Stat Med. 2003 Feb 28;22(4):517-34.

Missing data in the 2 x 2 table: patterns and likelihood-based analysis for cross-sectional studies with supplemental sampling.

Author information

Department of Biostatistics, The Rollins School of Public Health of Emory University, 1518 Clifton Rd. N.E., Atlanta, GA 30322, U.S.A.


Standard measures of crude association in the context of a cross-sectional study are the risk difference, relative risk and odds ratio as derived from a 2 x 2 table. Most such studies are subject to missing data on disease, exposure, or both, introducing bias into the usual complete-case analysis. We describe several scenarios distinguished by the manner in which missing data arise, and for each we adjust the natural multinomial likelihood to properly account for missing data. The situations presented allow for increasing levels of generality with regard to the missing data mechanism. The final case, quite conceivable in epidemiologic studies, assumes that the probability of missing exposure depends on true exposure and disease status, as well as upon whether disease status is missing (and conversely for the probability of missing disease information). When parameters relating to the missing data process are inestimable without strong assumptions, we propose maximum likelihood analysis subsequent to collecting supplemental data in the spirit of a validation study. Analytical results give insight into the bias inherent in complete-case analysis for each scenario, and numerical results illustrate the performance of likelihood-based point and interval estimates in the most general case. Adjustment for potential confounders via stratified analysis is also discussed.
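
As a rough illustration of why complete-case analysis can be biased, the sketch below computes the odds ratio expected from complete cases when the chance of a subject being fully observed differs across the four cells of the 2 x 2 table. This is a deliberately simplified mechanism, not the likelihood-based method of the paper: the cell probabilities in `p`, the observation probabilities in `obs`, and both function names are hypothetical values and helpers chosen for the example.

```python
def odds_ratio(table):
    """Odds ratio for a 2 x 2 table laid out as
    [[exposed & diseased, exposed & not diseased],
     [unexposed & diseased, unexposed & not diseased]]."""
    (a, b), (c, d) = table
    return (a * d) / (b * c)


def complete_case_table(p, obs):
    """Expected cell proportions among complete cases.

    p[i][j]   : true cell probability (assumed values, not from the paper)
    obs[i][j] : probability that a subject in that cell is fully observed
    """
    weighted = [[p[i][j] * obs[i][j] for j in range(2)] for i in range(2)]
    total = sum(sum(row) for row in weighted)
    return [[w / total for w in row] for row in weighted]


# Hypothetical true cell probabilities (they sum to 1).
p = [[0.10, 0.20],
     [0.10, 0.60]]

# Hypothetical probabilities of being a complete case, varying by cell.
obs = [[0.9, 0.6],
       [0.7, 0.8]]

true_or = odds_ratio(p)
cc_or = odds_ratio(complete_case_table(p, obs))

# Same observation probability in every cell: the odds ratio is preserved.
uniform = [[0.5, 0.5], [0.5, 0.5]]
cc_or_uniform = odds_ratio(complete_case_table(p, uniform))

print(f"true OR: {true_or:.3f}")
print(f"complete-case OR (cell-dependent missingness): {cc_or:.3f}")
print(f"complete-case OR (uniform missingness): {cc_or_uniform:.3f}")
```

Under this simplified mechanism the complete-case odds ratio equals the true odds ratio multiplied by the cross-product ratio of the observation probabilities, so it is unbiased only when that ratio equals 1.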

[Indexed for MEDLINE]
