Stat Med. 2000 Jan 15;19(1):99-111.

Utilization of multiple imperfect assessments of the dependent variable in a logistic regression analysis.

Author information

Department of Epidemiology and Preventive Medicine, University of Maryland School of Medicine, Baltimore, MD 21201, USA.


Often, in biomedical research, there are multiple sources of imperfect information regarding a dichotomous variable of interest. For example, in a study we are conducting on the relationship between cocaine use and stroke risk, information on the cocaine use of each study patient is available from three fallible sources: patient interviews, urine toxicology testing, and medical record review. Regression analyses based on a rule for classifying patients from this information can result in biased estimation of associations and variances, due both to the misclassification of some subjects and to the assumption that the classification is certain. We describe a likelihood-based method that directly incorporates multiple sources of information regarding an outcome variable into a regression analysis and takes into account the uncertainty in the classification. The method can be applied when some sources of information are missing for some subjects. We show how the availability of multiple sources can be exploited to generate estimates of the quality (for example, sensitivity and specificity) of each source and to model the degree to which missing data are informative. A fitting algorithm and issues of identifiability are discussed. We illustrate the method using data from our study.
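
As an illustration of the general idea only, the following is a minimal sketch of an observed-data log-likelihood in which the true dichotomous outcome is latent and each source reports it with its own sensitivity and specificity. It is not the authors' algorithm: the function name `log_likelihood`, its argument layout, the assumption that sources are conditionally independent given the true outcome, and the assumption that missing reports are non-informative are all hypothetical simplifications introduced here (the abstract states that the method can also model informative missingness, which this sketch omits).

```python
import math

def log_likelihood(beta, sens, spec, X, R):
    """Log-likelihood of source reports with a latent binary outcome.

    beta: logistic regression coefficients (one per covariate).
    sens, spec: per-source sensitivity and specificity.
    X: list of covariate rows; R: list of report rows with entries
       1, 0, or None (source missing for that subject).
    Assumptions (hypothetical, not from the source): reports are
    conditionally independent given the true outcome, and missing
    reports carry no information.
    """
    total = 0.0
    for x, reports in zip(X, R):
        eta = sum(b * v for b, v in zip(beta, x))
        p = 1.0 / (1.0 + math.exp(-eta))  # P(true outcome = 1 | x)
        pos, neg = 1.0, 1.0  # P(reports | true = 1), P(reports | true = 0)
        for r, se, sp in zip(reports, sens, spec):
            if r is None:
                continue  # skip missing sources
            pos *= se if r == 1 else 1.0 - se
            neg *= 1.0 - sp if r == 1 else sp
        # Marginalize over the unobserved true outcome.
        total += math.log(p * pos + (1.0 - p) * neg)
    return total
```

Under these assumptions, estimates of the regression coefficients and of each source's sensitivity and specificity could be sought by maximizing this function with a general-purpose optimizer, subject to the identifiability issues the abstract mentions.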

[Indexed for MEDLINE]