NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.

Mack C, Su Z, Westreich D. Managing Missing Data in Patient Registries: Addendum to Registries for Evaluating Patient Outcomes: A User’s Guide, Third Edition [Internet]. Rockville (MD): Agency for Healthcare Research and Quality (US); 2018 Feb.


Considerations for Reporting Findings From Studies With Missing Data

Reporting Guidelines

Missing data are common in patient registries and, depending on their extent and type, may affect the interpretation of results. Documenting how missing data were addressed is therefore important when reporting registry findings, both for transparency and so that readers can interpret the findings accurately. Two useful guidelines for this purpose are the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) Statement and the Patient-Centered Outcomes Research Institute (PCORI) Methodology Report.

The STROBE Statement consists of a checklist of 22 items to address in reports of observational studies, and it includes missing data as one of the essential items to document. In the accompanying explanation, the STROBE authors provide the following guidance:

“We advise that authors report the number of missing values for each variable of interest (exposures, outcomes, confounders) and for each step in the analysis. Authors should give reasons for missing values if possible, and indicate how many individuals were excluded because of missing data when describing the flow of participants through the study. For analyses that account for missing data, authors should describe the nature of the analysis (e.g., multiple imputation) and the assumptions that were made (e.g., missing at random).”30
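As a minimal sketch of the tallies this guidance asks for, the following Python example counts missing values per variable and the number of individuals a complete-case analysis would exclude. The records, variable names, and helper functions are hypothetical and chosen only for illustration; they do not come from STROBE or from any particular registry.

```python
# Hypothetical registry records; None marks a missing value.
records = [
    {"exposure": 1, "outcome": 0, "age": 64},
    {"exposure": 0, "outcome": None, "age": 71},
    {"exposure": 1, "outcome": 1, "age": None},
    {"exposure": None, "outcome": 0, "age": 58},
    {"exposure": 0, "outcome": 1, "age": 49},
    {"exposure": 1, "outcome": None, "age": None},
]
variables = ["exposure", "outcome", "age"]


def missing_counts(rows, names):
    """Return the number of missing (None) values for each variable."""
    return {name: sum(1 for row in rows if row[name] is None) for name in names}


def complete_cases(rows, names):
    """Return the rows that have no missing value in any named variable."""
    return [row for row in rows if all(row[name] is not None for name in names)]


counts = missing_counts(records, variables)
kept = complete_cases(records, variables)
excluded = len(records) - len(kept)

for name in variables:
    print(f"{name}: {counts[name]} of {len(records)} missing")
print(f"Excluded from a complete-case analysis: {excluded} of {len(records)}")
```

In a real report these counts would be produced for each step of the analysis, alongside the reasons for missingness where they are known.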

The PCORI Methodology Report, which describes standards for the conduct of patient-centered outcomes research, places particular emphasis on missing data, with an entire section and five standards devoted to the topic. While the STROBE Statement focuses on items to include in reports of study findings, the PCORI standards address prevention of missing data, analytic approaches for addressing missing data, and reporting of findings from studies with missing data. In addition to the items covered by the STROBE Statement, the PCORI standards require investigators to consider the potential for missing data when developing a study protocol and to plan appropriate steps to minimize the likelihood of missing data. Expected rates of missing data should also be set at the study outset and compared with the actual rates of missing data during the analysis phase. Conducting sensitivity analyses (see sections below) is also considered a mandatory component of study analysis and reporting in the PCORI standards, as is comparing the baseline characteristics of patients with and without missing data. In terms of reporting results, the PCORI standards require the information included in the STROBE Statement, plus a discussion of the potential impact of both the extent of missing data and the approach used to address them, and incorporation of this information into the interpretation of the study findings.31
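Two of the checks described above, comparing the actual rate of missing data with the rate expected at study outset and comparing baseline characteristics of patients with and without missing data, might be sketched as follows. The patient values, the expected rate of 20 percent, and the function names are assumptions made for illustration; the PCORI standards do not prescribe a specific threshold or implementation.

```python
from statistics import mean

# Hypothetical patients as (baseline age, outcome); None marks a missing outcome.
patients = [
    (64, 1), (71, None), (58, 0), (80, None),
    (49, 0), (67, 1), (75, None), (52, 0),
]

# Assumed value that a protocol might have set at study outset.
EXPECTED_MISSING_RATE = 0.20


def missing_rate(rows):
    """Proportion of patients whose outcome is missing."""
    return sum(1 for _, outcome in rows if outcome is None) / len(rows)


def baseline_age_by_missingness(rows):
    """Mean baseline age for patients with and without a missing outcome."""
    ages_missing = [age for age, outcome in rows if outcome is None]
    ages_observed = [age for age, outcome in rows if outcome is not None]
    return mean(ages_missing), mean(ages_observed)


actual_rate = missing_rate(patients)
mean_age_missing, mean_age_observed = baseline_age_by_missingness(patients)

print(f"Expected missing rate: {EXPECTED_MISSING_RATE:.1%}")
print(f"Actual missing rate:   {actual_rate:.1%}")
print(f"Mean age, outcome missing:  {mean_age_missing:.1f}")
print(f"Mean age, outcome observed: {mean_age_observed:.1f}")
```

A marked difference in baseline characteristics between the two groups, as in this made-up example, would suggest that the data are unlikely to be missing completely at random and would belong in the discussion of results.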

Recommended Sensitivity Analyses

In addition to the above approaches, “scenario-based” sensitivity analyses should be considered for missing data. Investigators can identify “worst case” scenarios for the missing data: for missing outcomes, one such scenario might be to assume that all missing outcomes among the exposed are events, while all missing outcomes among the unexposed are nonevents (or vice versa). Such scenario-based approaches can help set bounds on the causal effect size in ways that are useful for contextualizing the main results. However, because scenario-based analyses are by their nature specific to the data and situation under study, it is important to consider carefully which scenarios are most relevant to the study question at hand. Ideally, sensitivity analyses using different analytic approaches for missing data should be prespecified in the protocol or a separate data analysis plan, and not done post hoc.
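The “worst case” scenarios described above can be sketched in a few lines of Python. The counts below are hypothetical, the risk difference is used only as one possible effect measure, and the function name is an assumption introduced for this example. The resulting bounds reflect missing outcomes only and do not account for confounding or other sources of bias.

```python
def risk_difference_bounds(exp_events, exp_nonevents, exp_missing,
                           unexp_events, unexp_nonevents, unexp_missing):
    """Return (lower, complete_case, upper) risk differences.

    lower: all unexposed missing outcomes are events, exposed missing are nonevents.
    upper: all exposed missing outcomes are events, unexposed missing are nonevents.
    complete_case: individuals with missing outcomes are dropped.
    """
    n_exp = exp_events + exp_nonevents + exp_missing
    n_unexp = unexp_events + unexp_nonevents + unexp_missing

    upper = (exp_events + exp_missing) / n_exp - unexp_events / n_unexp
    lower = exp_events / n_exp - (unexp_events + unexp_missing) / n_unexp
    complete_case = (exp_events / (exp_events + exp_nonevents)
                     - unexp_events / (unexp_events + unexp_nonevents))
    return lower, complete_case, upper


# Hypothetical counts: 100 exposed and 100 unexposed, 10 missing outcomes in each.
lower, complete_case, upper = risk_difference_bounds(
    exp_events=30, exp_nonevents=60, exp_missing=10,
    unexp_events=20, unexp_nonevents=70, unexp_missing=10,
)

print(f"Complete-case risk difference: {complete_case:.3f}")
print(f"Worst-case bounds: {lower:.3f} to {upper:.3f}")
```

If the bounds are narrow relative to the main estimate, missing outcomes are unlikely to change the conclusion. Wide bounds, or bounds that span the null as in this illustration, indicate that the result depends on what is assumed about the missing values.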

Missing Potential Outcomes and Causal Inference

To put the issue of missing data into perspective, it is useful to remember that, from the perspective of potential outcomes, causal effects can be defined as the expectation (over a population) of a contrast in individual potential outcomes; for example, the average risk of an outcome if the entire population had been exposed, contrasted with the average risk of an outcome if the entire population had been unexposed.18 The central problem of causal inference is that it is not possible to observe more than a single potential outcome for any individual under study: that is, it is possible to observe what happens when an individual is exposed to X=1, but not what would have happened to that same individual under X=0 (or under any other value of X). As such, the central problem of causal inference is a problem of missing potential outcomes, that is, a missing data problem.

Thus, techniques for missing data have numerous parallels with the techniques used in “regular” regression analysis to deal with confounding, including close parallels between assumptions such as MCAR, MAR, and MNAR and confounding.
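The framing of causal inference as a missing data problem can be illustrated with a small, entirely hypothetical population in which both potential outcomes are written down for every individual, something that is never possible in a real registry. The data and function names below are assumptions made for this sketch.

```python
# Hypothetical population: (x, outcome if unexposed, outcome if exposed).
# In real data only the potential outcome matching x is ever observed.
population = [
    (1, 0, 1),
    (1, 0, 0),
    (0, 0, 1),
    (0, 1, 1),
    (1, 1, 1),
    (0, 0, 0),
]


def true_average_effect(people):
    """Risk if everyone were exposed minus risk if everyone were unexposed."""
    risk_all_exposed = sum(y1 for _, _, y1 in people) / len(people)
    risk_all_unexposed = sum(y0 for _, y0, _ in people) / len(people)
    return risk_all_exposed - risk_all_unexposed


def observed_table(people):
    """Blank out (None) the potential outcome for the exposure not received."""
    return [
        {"x": x, "y0": y0 if x == 0 else None, "y1": y1 if x == 1 else None}
        for x, y0, y1 in people
    ]


effect = true_average_effect(population)
observed = observed_table(population)

print(f"Average causal effect (needs both potential outcomes): {effect:.3f}")
for row in observed:
    print(row)
```

Every row of the observed table has exactly one missing potential outcome, so half of the values needed to compute the average causal effect directly are missing by design. Estimating that effect from observed data therefore relies on assumptions about the unobserved half, much as missing data methods rely on assumptions about the missingness mechanism.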

