When data is missing in a research study, it can sometimes jeopardize the quality of the results, so researchers study ways to account for missing data.

In many fields, some of the data that scientists want to measure ends up missing. When researchers study people over time, for example to learn how parenting styles during high school affect drinking during college, a large amount of data is likely to be missing. Some people might not answer certain questions, or they might complete a survey on one occasion but not on another. This missing data creates a problem that researchers in the Penn State Department of Human Development and Family Studies are working to solve.

Researchers could simply omit people who have missing data, but that could make a study less accurate, because people who did not answer certain questions or who missed one of the surveys might differ in some way from those who responded.

For example, if a researcher is studying employee wellness and only collects data from employees when they are in the office, the study will have no responses from anyone who was too sick to come to work. The people with missing data would be different from the people who responded, and so the results of the research would be inaccurate.

In new research published in Multivariate Behavioral Research, the researchers proposed a new method for handling missing data in specific circumstances.  

“When someone misses a day of responding, such as when they are on vacation, they often miss not just one questionnaire but all of the questionnaires and for several days in a streak,” said Sy-Miin Chow, professor of human development and family studies and coauthor on this research. “But why? Is that person sick, having a difficult time in life, traveling? This could matter to the study and should not be ignored. Knowing what the person was like in the past tells us something about what they are like now, and why they might miss some days’ responses. 

“This paper proposes that the mechanisms triggering missingness may be shared across multiple data streams or over time,” she continued. “This means we can leverage data that were collected earlier to make inferences about current behavior and change, even when there are missing data.”

Chow, Zita Oravecz, associate professor of human development and family studies, graduate student Yangling Li, and a colleague from Montana State University collaborated on this project. They examined how to handle missing data with multiple measurements of the same people over a long period of time. Specifically, they were examining instances when certain items in a measurement were missing on multiple occasions.  

The researchers investigated whether it was practical to incorporate factor scores, which are single numbers that summarize a person's responses to several related items, into a technique called multiple imputation. In multiple imputation, each missing value is estimated several times to produce several completed versions of the dataset. Each version is analyzed, and the results are combined so that the uncertainty about the missing values carries through to the conclusions.
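For readers who want to see the general idea of multiple imputation in action, here is a minimal sketch in Python. It is a generic textbook version, not the researchers' method, and it does not use factor scores. The data are simulated and entirely made up: `x` stands in for an earlier, fully observed score, and `y` for a later score that is more likely to be missing when `x` is high. All function names are hypothetical. Each missing value is filled in several times from a regression on the earlier score, and the results are combined using the standard pooling formulas known as Rubin's rules.

```python
import math
import random
import statistics


def simulate(n=500, seed=1):
    """Made-up data: x is an earlier, fully observed score; y is a later score
    that is more likely to be missing (None) when x is high."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]
    y_true = [2.0 + 1.5 * xi + rng.gauss(0, 1) for xi in x]
    y = [None if rng.random() < 1 / (1 + math.exp(-2 * xi)) else yi
         for xi, yi in zip(x, y_true)]
    return x, y_true, y


def fit_line(xs, ys):
    """Least-squares fit of y = intercept + slope * x, plus the residual SD."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((a - mx) ** 2 for a in xs)
    slope = sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    rss = sum((b - intercept - slope * a) ** 2 for a, b in zip(xs, ys))
    return intercept, slope, math.sqrt(rss / (len(xs) - 2))


def impute_once(x, y, rng):
    """Return one completed copy of y, with each missing value replaced by a
    regression prediction from x plus random noise."""
    observed = [(a, b) for a, b in zip(x, y) if b is not None]
    # Resample the observed cases so each imputation uses a slightly different line.
    boot = rng.choices(observed, k=len(observed))
    intercept, slope, sd = fit_line([a for a, _ in boot], [b for _, b in boot])
    return [b if b is not None else intercept + slope * a + rng.gauss(0, sd)
            for a, b in zip(x, y)]


def pooled_mean(x, y, m=20, seed=0):
    """Estimate the mean of y from m imputed datasets, pooled with Rubin's rules."""
    rng = random.Random(seed)
    estimates, variances = [], []
    for _ in range(m):
        filled = impute_once(x, y, rng)
        estimates.append(statistics.fmean(filled))
        variances.append(statistics.variance(filled) / len(filled))
    within = statistics.fmean(variances)       # average sampling variance
    between = statistics.variance(estimates)   # spread across imputations
    total = within + (1 + 1 / m) * between
    return statistics.fmean(estimates), math.sqrt(total)


if __name__ == "__main__":
    x, y_true, y = simulate()
    complete_cases = statistics.fmean([v for v in y if v is not None])
    estimate, std_error = pooled_mean(x, y)
    print("mean with no data missing:", round(statistics.fmean(y_true), 2))
    print("mean after dropping missing cases:", round(complete_cases, 2))
    print("pooled imputation mean:", round(estimate, 2), "+/-", round(std_error, 2))
```

In this simulated setup, dropping the incomplete cases leaves out mostly people with high earlier scores, so the simple average is pulled away from the truth. The imputed estimate uses the earlier score to partly correct for that, which echoes the article's point about leveraging data collected earlier.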

They found that incorporating factor scores into multiple imputation could be useful. Though this has little impact on day-to-day life for non-scientists, it could be significant for researchers who study how people change over time.  

“When data is missing, the researcher still has other sources of information from which they can construct their own understanding,” Oravecz said. “And the missing data could be the most important data. Imagine someone struggling with alcohol addiction who is using a smartphone app to record their cravings. What if they skip surveys when their cravings are the highest? We need to understand missing data so that we can better understand and better help people.” 

Originally published in October 2024.