Epidemiology. 2016 Jan;27(1):105-15. doi: 10.1097/EDE.0000000000000403.

Evaluating the Validity of a Two-stage Sample in a Birth Cohort Established from Administrative Databases.

Author information

From the (a) Epidemiology and Biostatistics Unit, INRS-Institut Armand-Frappier, Université du Québec, Laval, QC, Canada; (b) Respiratory Epidemiology and Clinical Research Unit, McGill University Health Centre, Montreal, QC, Canada; and (c) Department of Epidemiology, Biostatistics and Occupational Health, Faculty of Medicine, McGill University, Montreal, QC, Canada.



When using administrative databases for epidemiologic research, a subsample of subjects can be interviewed, eliciting information on undocumented confounders. This article presents a thorough investigation of the validity of a two-stage sample encompassing an assessment of nonparticipation and quantification of the extent of bias.


Established through record linkage of administrative databases, the Québec Birth Cohort on Immunity and Health (n = 81,496) aims to study the association between Bacillus Calmette-Guérin vaccination and asthma. Among 76,623 subjects classified in four Bacillus Calmette-Guérin-asthma strata, a two-stage sampling strategy with a balanced design was used to randomly select individuals for interviews. We compared stratum-specific sociodemographic characteristics and healthcare utilization of stage 2 participants (n = 1,643) with those of eligible nonparticipants (n = 74,980) and nonrespondents (n = 3,157). We used logistic regression to determine whether participation varied across strata according to these characteristics. The effect of nonparticipation was described by the relative odds ratio (ROR = OR(participants) / OR(source population)) for the association between sociodemographic characteristics and asthma.
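The relative odds ratio described above can be illustrated with a minimal Python sketch. The 2x2 counts are invented for illustration and are not taken from the study. The helper names (`odds_ratio`, `relative_odds_ratio`, `percent_deviation`) are hypothetical. Expressing bias as the percent deviation of the ROR from 1 is an assumption, since the abstract does not state how bias was quantified.

```python
def odds_ratio(a: float, b: float, c: float, d: float) -> float:
    """Odds ratio from a 2x2 table.

    a: characteristic present, asthma
    b: characteristic present, no asthma
    c: characteristic absent, asthma
    d: characteristic absent, no asthma
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    return (a * d) / (b * c)


def relative_odds_ratio(or_participants: float, or_source: float) -> float:
    """ROR = OR among participants / OR in the source population."""
    return or_participants / or_source


def percent_deviation(ror: float) -> float:
    """Percent deviation of the ROR from 1 (assumed bias measure, not from the source)."""
    return (ror - 1.0) * 100.0


# Hypothetical counts, for illustration only.
or_source = odds_ratio(400, 1600, 300, 1700)   # source population
or_participants = odds_ratio(45, 155, 40, 160)  # stage 2 participants

ror = relative_odds_ratio(or_participants, or_source)
bias_pct = percent_deviation(ror)

print(f"OR source = {or_source:.3f}")
print(f"OR participants = {or_participants:.3f}")
print(f"ROR = {ror:.3f}, deviation from 1 = {bias_pct:.1f}%")
```

An ROR of 1 means the association estimated among participants matches the one in the source population; values below 1 mean the participant-based odds ratio is smaller than the source-population odds ratio.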


Parental age at childbirth, area of residence, family income, and healthcare utilization were comparable between groups. Participants were slightly more likely to be women and have a mother born in Québec. Participation did not vary across strata by sex, parental birthplace, or material and social deprivation. Estimates were not biased by nonparticipation; most RORs were below one and bias never exceeded 20%.


Our analyses evaluate and provide a detailed demonstration of the validity of a two-stage sample for researchers assembling similar research infrastructures.

[Indexed for MEDLINE]
