Missing data

Missing data occurs in almost all studies, whether observational studies or clinical trials. It is most common in studies that involve long-term follow-up, where people miss follow-up visits or leave the study for various reasons. This creates problems for analysis, especially if a large share of the data is missing or if the missingness follows a pattern rather than occurring at random.

COVID-19 data has had missing data from the start and continues to have it. Early in the pandemic, it was hard to collect all measurements on patients because of factors like large patient influx and patient overflow. Data collection on COVID-19 patients improved once the strain on capacity eased and the handling of these patients was better understood. Missingness now appears to be arising in another area: COVID-19 vaccination programs. The New York Times reported on April 25, 2021 that nearly 8% of people who received a first dose of the Pfizer or Moderna vaccine, a figure in the millions, were missing their second dose for reasons such as skepticism, fear, and lack of supply. If researchers want to analyze follow-up data on vaccine recipients, how will they deal with the missing second doses? Will the missingness follow a recognizable mechanism, such as missing at random?

In general, handling missing data is a central theme in data analysis that cannot be ignored, and it presents a particular challenge in COVID-19 studies, since the situation is new and it is hard to know how to handle it. At this time, it is hard to ascertain patient behaviors or to link patient actions to a single reason in order to understand the missingness. However, before performing any kind of imputation, one needs to understand whether the missingness followed a pattern or was skewed toward a particular group. Such patterns could mean that the data should be considered not missing at random. Once statisticians start to analyze outcomes of the vaccination programs, with many recipients missing the second dose, they will have to decide how to handle such data in practice. Ultimately, the analysis plan depends on what is assumed about the missingness, and these are tough decisions to make.
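
As a first look at whether missingness is skewed toward a particular group, one can compare the proportion of missing values across groups. The sketch below is only an illustration: the records, the field names `age_group` and `second_dose_date`, and the helper `missing_rate_by_group` are all hypothetical and are not taken from any real vaccination dataset.

```python
from collections import defaultdict


def missing_rate_by_group(records, group_key, value_key):
    """Return {group: fraction of records whose value_key is missing (None)}."""
    total = defaultdict(int)
    missing = defaultdict(int)
    for rec in records:
        group = rec[group_key]
        total[group] += 1
        if rec.get(value_key) is None:
            missing[group] += 1
    return {group: missing[group] / total[group] for group in total}


# Hypothetical toy records; None marks a missing second dose.
records = [
    {"age_group": "18-39", "second_dose_date": None},
    {"age_group": "18-39", "second_dose_date": "2021-04-01"},
    {"age_group": "18-39", "second_dose_date": None},
    {"age_group": "18-39", "second_dose_date": "2021-04-03"},
    {"age_group": "65+", "second_dose_date": "2021-03-20"},
    {"age_group": "65+", "second_dose_date": "2021-03-22"},
    {"age_group": "65+", "second_dose_date": "2021-03-25"},
    {"age_group": "65+", "second_dose_date": None},
]

rates = missing_rate_by_group(records, "age_group", "second_dose_date")
for group, rate in sorted(rates.items()):
    print(f"{group}: {rate:.0%} missing")
```

Clearly different rates across groups would suggest that the data are not missing completely at random. A comparison like this cannot, on its own, separate missing at random from not missing at random, because that distinction depends on the values that were never observed.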

Written by Usha Govindarajulu, PhD