About the MCAP Summer Institute
Our week-long Summer Institute on Longitudinal Data Analysis is designed to meet the needs of 80 participants each year, welcoming individuals from all career stages and backgrounds (e.g., graduate students, post-docs, faculty, industry researchers, and nonprofit and public sector workers), all of whom are eager to enhance their knowledge of longitudinal data analysis. The Summer Institute will be held at Purdue University in West Lafayette, IN. For those selected to receive travel funding, we will provide lodging in Purdue dorms as well as per diems to cover food costs. The Summer Institute is ideal for individuals with a foundational understanding of statistics who seek to learn and apply longitudinal methods in their work. We specifically encourage applicants who are not already experts in longitudinal data analysis but who see the potential for these skills to enhance their research or professional contributions.
For Participants
- Who would benefit most from this course?
- The Summer Institute is ideal for individuals with a strong foundational understanding of statistics and regression-based methods, who seek to learn and apply longitudinal methods in their work. We specifically encourage applicants who are not already experts in longitudinal data analysis but who see the potential for these skills to enhance their research or professional contributions and who have planned or active projects using longitudinal data. Some familiarity with longitudinal data and/or analysis is useful, but not required.
- A full schedule of 2026 course topics is included in the table below
- It may also be helpful to look at the course materials from 2025 at the bottom of this page to determine if you’d benefit from the summer institute—but note we have updated topics for 2026
- When is the course?
- July 12th-July 17th, 2026
- We are accepting applications as of February 1, 2026
- Where?
- Purdue University in West Lafayette, Indiana
- How much?
- $2,000 for the Registration Fee for applicants external to Purdue; $1,500 for applicants from Purdue
- The registration fee is fully covered for those who receive scholarships (see below)
- Scholarships
- With generous support from the National Institute on Drug Abuse (R25 DA061822), we are able to provide 40 participants with full support for living expenses, registration fees, and travel costs. We aim for these funds to cover all costs of attending the summer institute.
- With generous support from the National Institute on Drug Abuse (R25 DA061822), we are also able to provide support for registration fees for an additional 30 participants. These are typically Purdue-affiliated and/or local individuals who do not require lodging or travel expenses.
- Learning Outcome Goals
- 1) Expand interest and comfort applying longitudinal data to health and social science questions
- 2) Increase understanding and mastery of longitudinal data and models, including developing skills to make justified measurement and modeling decisions
- 3) Provide tools for data visualization for broad application of longitudinal analysis and dissemination of findings to interdisciplinary audiences
- How do I apply?
- We are currently accepting applications
- The deadline to apply is March 15
- The application will be used to select participants and to determine scholarship recipients
- We ask about your experience with a variety of quantitative methods and statistics topics
- We also ask for short statements about your reasons for attending the summer institute, your current level of methods/statistics expertise, longitudinal research project plans/interest, and interdisciplinary experience/interest
- Click this link to apply
Schedule: Summer Institute 2026
| Day | Topic | Instructor | Session Goals / Learning Objectives |
| --- | --- | --- | --- |
| SUNDAY | INTRODUCTION TO R | ||
| 11:30-12:00 WALC 1132 | Registration; snacks | ||
| 12:00-4:00 1:20-1:30 break 2:40-2:50 break WALC 1132 | 0: Intro to R for longitudinal data management (optional) | Dr. Katie Thompson | Introduction to the tools for efficiently managing data, unique data structures for repeated measures, tutorial for using various identifiers, transposing between long and wide formats |
| 4:30-6:30 Purdue Memorial Union (PMU) | Welcome Reception – Heavy appetizers and drinks provided; registration | ||
| MONDAY | UNDERSTANDING YOUR LONGITUDINAL DATA | ||
| 7:30-8:00 WALC 1018 | Registration; light breakfast, snacks, and coffee | ||
| 8:00-8:30 WALC 1018 | Introduction to the Summer Institute | Dr. Trent Mize & Dr. Kristine Marceau | Overview of course structure, events, and opportunities |
| 8:30-9:30 WALC 1018 9:30-9:45 break | 1: Introduction to Longitudinal Data Structures | Dr. Kristine Marceau | Long and wide data structures; combining data sources to study multilevel determinants and contextual effects |
| 9:45-10:45 WALC 1018 10:45-11:00 break | 2: Preparing your longitudinal data | Dr. Sharon Christ | Time-varying vs time-invariant variables; measured and/or measurable variables; merging data |
| 11:00-12:00 WALC 1018 | 3 Part A: Missing data in longitudinal datasets | Dr. James McCann | Sources of missing data in longitudinal studies (e.g., attrition, measure-specific), how to assess missing data patterns |
| 12:00-1:15 | Lunch Break | ||
| 1:15-2:15 WALC 1018 2:15-2:30 break | 3 Part B: Missing data in longitudinal datasets | Dr. James McCann | Introductory overview of missing data techniques available for longitudinal data |
| 2:30-4:30 WALC 1018 | 4: Data Visualization: getting to know your data | Dr. Trent Mize | Visualizing raw data (e.g., distributions, missing data, etc.). Incorporating longitudinal information in visualizations. Overview of best practices |
| 4:30-5:30 WALC 1018 | Office hours | Faculty instructors and TAs | Assignment 1: getting to know your longitudinal data; missing data considerations; visualizations |
| TUESDAY | OVERVIEW OF MODELS FOR LONGITUDINAL DATA | ||
| 8:00-8:30 WALC 1018 | Light breakfast, snacks, and coffee | ||
| 8:30-9:00 WALC 1018 | Analyses in R / overview of assignment 1 | TA | Implement the prior day's topics in R. Overview of assignment 1 |
| 9:00-12:00 10:30-10:45 break WALC 1018 | 5: Introduction to longitudinal data analytic methods | Dr. Rob Duncan | Broad goals and types of research questions and hypotheses applicable to large, complex longitudinal data, temporal ordering and causal inference, understand the classes of common longitudinal data analysis techniques |
| 12:00-1:15 | Lunch Break | ||
| 1:15-2:15 WALC 1018 2:15-2:25 break | 6: Longitudinal model typologies | Dr. Shawn Bauldry | Synthesizing terminology for longitudinal models; a framework for understanding estimators for longitudinal data |
| 2:25-3:25 WALC 1018 3:25-3:35 break | 7: Related models and quasi-longitudinal models | Dr. Kristine Marceau | A brief overview of related models we do not cover in depth at the summer institute: age-period-cohort analyses; models for time series data or intensive longitudinal designs; survival models. |
| 3:35-4:35 WALC 1018 | 8: Measurement error | Dr. James McCann | Overview of measurement error; specific concerns for longitudinal data; solutions |
| 4:35 – 5:30 WALC 1018 | Office hours | Faculty instructors and TAs | Assignment 2: fit core longitudinal models to your data; practice interpretation |
| WEDNESDAY | FIXED EFFECTS MODELS AND COMPLICATIONS | ||
| 8:00-8:30 WALC 1018 | Light breakfast, snacks, and coffee | ||
| 8:30-9:00 WALC 1018 | Analyses in R / overview of assignment 2 | TA | Implement the prior day's topics in R. Overview of assignment 2 |
| 9:00-12:00 10:15-11:30 break WALC 1018 | 9: Fixed effects models | Dr. Shawn Bauldry | Fixed effects models; time varying vs. time-invariant covariates; lagged variable predictors |
| 12:00-1:15 | Lunch Break | ||
| 1:15-4:15 2:30-2:45 break WALC 1018 4:15-4:30 break | 10: Complications: nonlinearities, categorical outcomes, moderation, and mediation | Dr. Trent Mize | Complications: modeling nonlinear effects and categorical outcome variables; analyses of moderation (interaction) and mediation (and other cross-model comparisons) |
| 4:30-5:30 WALC 1018 | Office hours | Faculty instructors and TAs | Assignment 3: fit fixed effects models to your data and interpret; add a complication to your model and interpret |
| THURSDAY | MULTILEVEL MODELS AND MARGINAL MODELING | ||
| 8:00-8:30 WALC 1018 | Light breakfast, snacks, and coffee | ||
| 8:30-9:00 WALC 1018 | Analyses in R / overview of assignment 3 | TA | Implement the prior day's topics in R. Overview of assignment 3 |
| 9:00-12:00 10:30-10:45 break WALC 1018 | 11: Multilevel modeling / random effects models | Dr. Kristine Marceau | Overview of multilevel and random effects models; growth curve models; cross-lagged panel models; studying multilevel determinants and contextual effects |
| 12:00-1:15 | Lunch Break | ||
| 1:15-2:15 WALC 1018 2:15-2:30 break | 12: Sampling weights | Dr. Donna Xu | When to use survey weights for analysis; issues of sample attrition |
| 2:30-4:30 WALC 1018 | 13: Marginal modeling using complex samples | Dr. Sharon Christ | Accounting for complex sampling techniques in longitudinal datasets |
| 4:30-5:30 WALC 1018 | Office hours/Open Consulting | Faculty instructors and TAs | – Assignment 4: fit a multilevel model to your data and interpret; account for complex sampling and compare results – Consult with TAs and instructors about projects you are working on |
| FRIDAY | SPECIAL TOPICS: GENE-ENVIRONMENT INTERPLAY, CAUSAL INFERENCE, AND MODEL VISUALIZATION | ||
| 8:00-8:30 WALC 1018 | Light breakfast, snacks, and coffee | ||
| 8:30-9:00 WALC 1018 | Analyses in R / overview of assignment 4 | TA | Implement the prior day's topics in R. Overview of assignment 4 |
| 9:00-10:00 WALC 1018 10:00-10:10 break | 14: Gene-environment interplay | Dr. Kristine Marceau | Overview of behavioral genetics theory; using polygenic scores as predictors; family-based designs; longitudinal considerations for studying gene-environment interplay |
| 10:10-11:10 WALC 1018 11:10-11:20 break | 15: Causal Inference | Dr. Rob Duncan | Asking causal questions; benefits and limitations of longitudinal data for determining causality; comparing models |
| 11:20-12:20 WALC 1018 | 16 Part A: Embedded family-based designs | Dr. Kristine Marceau | Understand why many large-scale longitudinal studies include embedded family-based designs (e.g., twins/siblings; data collected on parents and children); gain the tools to avoid non-independence in these types of studies |
| 12:20-1:35 | Lunch Break | ||
| 1:35-2:35 WALC 1018 2:35-2:45 break | 16 Part B: Embedded family-based designs | Dr. Kristine Marceau | Gain an introductory understanding of options for leveraging family-based subsets to inform research questions along with resources for more in-depth instruction; causal considerations |
| 2:45-4:30 WALC 1018 | 17: Model visualization | Dr. Trenton Mize | Visualizing model results; presenting complex results in an accessible way; coefficient plots; plots of predictions and marginal effects |
| 4:30-5:30 WALC 1018 | Office hours/Open Consulting | Faculty instructors and TAs | – Assignment 5: fit a model to a family-based subsample; interpret; identify causal inference benefits and limitations of your model – Consult with TAs and instructors about projects you are working on |
| 5:30-7:30 Marriott Hall, John Purdue Room | Closing Reception – Heavy appetizers and drinks provided | ||
- CITI Training (Human Research Protection Program)
- If you have not taken it before or yours is expired: complete the Biomedical Research for Investigators or Social Behavioral Research group, and then the Human Subjects Research – Initial (Basic) course. If you completed training within the past 4 years, you may take a refresher. If your certificate is current, no action is required.
- Office Hours: Each day immediately following the end of formal instruction, we will hold in-person office hours. Every day, the TAs and that day’s instructors will be available to answer questions on the day’s topics and the daily applied assignment. On Thursday and Friday, we will also include a table of faculty instructors available for open consulting about your research projects. Come with questions — we are excited to help!
- Slack:
- We will be actively monitoring Slack during the course and in the evenings to answer any questions. We encourage participants to post R- or Stata-related and assignment questions on Slack so as not to interrupt class sessions. You can join our Slack workspace using the link provided in your registration email.
- GENERAL EVENT INFORMATION
Campus Parking | Purdue Parking Map
Northwestern Parking Garage
This garage has limited Parkmobile spots; alternatively, attendees can purchase their own A-Permit for their stay.
Grant Street Garage
This garage is ticketed parking – attendees pay for parking when exiting the garage. *NOTE – For those staying in the Residence Hall: Parking is located on the top floor of the parking garage in First Street Towers. These spots are free of charge.
Residence Hall Dining | Earhart Dining Court
1275 1st Street, West Lafayette, IN 47906
*NOTE – If you are not staying at the dorms during your visit, you may eat at the dining hall. You will pay at the door for your meal every time you go to Earhart Dining Court.
Dining Hours
Breakfast: 7:00 am – 8:30 am
Lunch: 11:00 am – 1:30 pm
Dinner: 5:00 pm – 7:00 pm
Find Information Regarding Dietary Restrictions HERE.
Classroom Space | Wilmeth Active Learning Center (WALC), Room 1132
340 Centennial Mall Dr., West Lafayette, IN 47907
Reception Location | Purdue Memorial Union, West Faculty Lounge
201 Grant St., West Lafayette, IN 47906
WIFI | AT&T WIFI & Eduroam
AT&T WIFI – This is a free connection that does not require any credentials to sign into and can be used by anyone on campus
Eduroam – A secure, worldwide roaming access service developed for education and research communities. The credentials to sign into Eduroam are generally those of your home university account.
Registration | Locations & Times
Sunday, July 12 (11:30-12:30PM) | WALC 1132
Sunday, July 12 (4:30-5:30PM) | Purdue Memorial Union, West Faculty Lounge
Monday, July 13 (7:45-10AM) | WALC 1132
| Lodging | First Street Towers |
| 1250 1st Street, West Lafayette, IN 47906 |
| PARKING | McCutcheon Drive Parking Garage |
| *NOTE – Residence Hall Parking is located on the top floor of the parking garage. Please park in one of these parking spaces. These spots are free of charge but limited. Attendees are able to park in any residence hall parking spots. |
| View First Street Towers Orientation HERE before your check-in date. |
| DORM CHECK-IN | 11AM-5PM EST |
| DORM CHECK-OUT | 12PM EST |
IMPORTANT INFORMATION The front desk will be staffed for most of the week; however, there is no guarantee that someone will be at the desk 24/7. If you need to check in after hours and no one is at the front desk, please go to Meredith South, where staff will be on site to assist. You may also call (765) 496-5150. |
Meredith South Residence Hall 1225 1st Street, West Lafayette, IN 47906 Main entrance doors to First Street Towers lock at 11PM EST and unlock at 6AM EST every day. To enter the dorm after hours, please use your key card. |
| Be sure to pick up your folder at the front desk when checking in! |
| Explore the Greater Lafayette Area! |
| Events Calendar – Purdue University Events Calendar. Local Eateries and Activities – Home of Purdue. Purdue Memorial Union Dining – There are several options for dining in the PMU. See the campus options HERE. The document below shows events from 2025. Check it out for a sense of what there is to do around Greater Lafayette. We will update the document for 2026 before the summer institute and post it here. |
Featured Faculty and Teaching Assistants
- Dr. Kristine Marceau (MCAP Co-Director): Dr. Marceau is an Associate Professor of Human Development and Family Science who specializes in longitudinal methods emphasizing both developmental change and variability across multiple time-scales using and integrating SEM and multilevel modeling techniques. She frequently uses family-based designs and large datasets to explore developmental and behavioral trajectories. Dr. Marceau regularly teaches multilevel modeling and inferential statistics, and trains students in longitudinal data analysis.
- Dr. Trenton D. Mize (MCAP Co-Director): Dr. Mize is the Dean’s Associate Professor of Sociology and Statistics (by courtesy) and a quantitative methodologist with expertise in categorical data analysis, experimental design, latent variable modeling, and data visualization. His research develops and applies innovative methods for analyzing complex social data, and he regularly teaches graduate courses on categorical data, experimental design, and data visualization.
- Dr. James A. McCann (MCAP Co-Director): Dr. McCann is a Professor of Political Science with expertise in longitudinal survey analysis and latent variable modeling. He has led multiple large-N longitudinal studies on political behavior and representation and regularly applies advanced econometric and multilevel techniques in his research. Dr. McCann teaches graduate seminars on research design and quantitative analysis, focusing on panel data and survey methodologies.
- Dr. Sharon Christ (MCAP Co-Director): Dr. Christ is an Associate Professor of Human Development and Family Science specializing in emergent statistical models, particularly structural equation modeling (SEM) and complex sample designs. Her expertise in multilevel modeling, SEM, and growth models has been applied across numerous large-scale cohort studies. She has taught graduate-level courses on sample design, inferential statistics, and SEM.
- Dr. Robert Duncan: Dr. Duncan is an Associate Professor of Human Development and Family Science at Colorado State University with expertise in advanced longitudinal data analysis, including multilevel modeling, structural equation modeling (SEM), and growth curve modeling. His work focuses on children’s development within multilevel contexts like classrooms.
- Dr. Donna Xu: Dr. Xu, an Associate Professor in the School of Nursing, specializes in longitudinal cohort studies that evaluate the quality of care and outcomes for older adults. Her expertise spans applied biostatistics, epidemiological methods, and outcome evaluation. She regularly teaches graduate courses in these areas, incorporating advanced quantitative techniques into her instruction, such as weighting methods and sampling designs.
- Dr. Shawn Bauldry: Dr. Bauldry, a Professor of Sociology at Purdue, specializes in quantitative methods and statistics, primarily focusing on the development of structural equation models, a broad class of statistical models with wide applicability in the social sciences.
- Dr. Katie Thompson: Dr. Thompson is a postdoctoral researcher in the Department of Sociology. Her work intersects psychiatry, genomics, and sociology, using innovative statistical approaches to integrate large-scale longitudinal data to better understand mental health. She has specialized in complex longitudinal designs using structural equation models, multilevel and matrix-based mixed models, and genetic and family data. Dr. Thompson has taught on MSc statistics courses and led multiple intensive R workshops focused on family data at King’s College London. She has led projects using multiple longitudinal cohort studies across the USA and UK and has focused on creating open and reproducible code and analytical pipelines.
2025 Teaching Assistants listed below. We are now accepting applications for 2026 teaching assistants. Email mcap@purdue.edu if you are interested in working with us.
- Mallory Bell (Sociology): Mallory Bell is a dual-title PhD candidate in Sociology and Gerontology at Purdue University. Her research uses longitudinal data analysis to examine how social determinants of health help shape trajectories of well-being in later life.
- Bing Han (Sociology): Bing Han is a dual-title Ph.D. candidate in Sociology and Gerontology at Purdue University, where she has also earned graduate certificates in Applied Statistics and Advanced Methodology. Her research focuses on health behaviors and lifestyles, stigma, and aging, employing a wide range of methodological approaches, including machine learning, categorical data analysis, longitudinal modeling, latent variable analysis, and experimental design.
- Susmita Ghosh (Public Health): Susmita Ghosh is a PhD candidate in the Department of Nutrition Science at Purdue University, specializing in nutritional epidemiology with a focus on maternal and infant nutrition and food environments. Her research integrates advanced statistical techniques—including multilevel modeling, longitudinal analysis, and causal inference methods—to evaluate randomized controlled trials and social and behavior change interventions aimed at improving health and nutritional outcomes in low-resource settings.
- Yi Zhu (Education): Yi Zhu is a fifth-year PhD candidate in Mathematics Education at Purdue University. Her research focuses on early mathematics learning, spatial reasoning, and game-based learning environments, employing both quantitative and qualitative (mixed-methods) approaches to understand how children develop mathematical thinking.
- Amy Loviska (HDFS): Amy Loviska is a PhD candidate in Human Development and Family Science at Purdue University. Their research program applies advanced quantitative longitudinal methods alongside community-engaged qualitative work to understand effects of individual biology (i.e., hormones, genetics), proximal environments (i.e., prenatal, parents, peers), and sociocultural macroenvironments on adolescent substance use progression among youth of diverse gender and racial-ethnic backgrounds.
- Catalina Vega Méndez (Political Science): Catalina Vega Méndez is a Ph.D. candidate in the Department of Political Science at Purdue University. Her research focuses on comparative political behavior and migration policy, with a regional emphasis on Latin America. She studies public attitudes and policy responses to international migration using a range of methodological tools, with expertise in difference-in-differences designs as well as the analysis of international longitudinal survey and panel data.
2025 Course Materials
All course materials from the 2025 Summer Institute are included below. 2026 course materials will be posted to this site shortly before this year’s institute.
Contemporary large-scale NIH initiatives have led to the emergence of many high-quality publicly available longitudinal datasets that include complex data of various types, sources, and domains (e.g., biological, social, individual, family, and neighborhood). However, use of these datasets without training can lead to scientific setbacks, including work that is imperfect, misleading, or even incorrect. There is an urgent need for educational programming to train researchers both within and outside of academic careers on the innovative and responsible use of publicly available, large, and complex longitudinal datasets. This R25 grant develops and offers an “Interdisciplinary Summer Institute on the Analysis of Complex, Large-Scale Longitudinal Data”, refining it each year based on evaluation data (aim 1). We will also leverage this program to train graduate students to teach advanced longitudinal methods to participants from multiple disciplines (aim 2). Thus, we will serve two groups: program participants (aim 1) and Purdue graduate student teaching assistants (TAs, aim 2). During an immersive week-long summer institute each year, we will train 50 interdisciplinary participants including students, postdocs, and faculty across academic institutions (Y1-Y3), expanding to also include professionals in non-profits, governmental agencies, and industries (Y2, Y3). The course is organized in 10 topics: publicly available longitudinal data sources, introduction to longitudinal data analytic methods, data visualization, missing data, longitudinal categorical data analysis, sampling weights and clustering/stratification, time-varying and time-invariant covariate inclusion, combining multiple data sources, embedded family-based designs, and an intro to sociogenomics. Cross-cutting themes emphasized throughout are data management, visualization and communication, causal inference, measurement and modeling decisions, meaningful effect sizes, and representativeness.
Lecture examples and assignments will focus on substance use and associated factors and will use the Adolescent Brain Cognitive Development (ABCD) Study data, although participants will be encouraged to use whatever dataset is most relevant to their own research interests. The summer institute will also feature TAs and additional faculty instructors circulating the room in each session to support students in need of extra assistance in real time, as well as review and office hour sessions, experience in interdisciplinary environments, networking, and joint practice opportunities to help establish collaborations. We will also train 6 graduate student TAs each year, who will gain supervised experience in content development, instruction (via review sessions), consulting, course evaluation, and leadership within interdisciplinary environments. We have carefully designed recruitment strategies to train a diverse workforce (e.g., in terms of under-represented groups, discipline, and career stage and path), as well as a multi-pronged evaluation plan. Our program faculty includes 8 faculty experts in longitudinal data analysis and instruction, representing different fields, genders, and career stages.