Lung Cancer. 2019 Aug;134:16-24. doi: 10.1016/j.lungcan.2019.05.016. Epub 2019 May 16.

Misclassification of the actual causes of death and its impact on analysis: A case study in non-small cell lung cancer.

Author information

Department of Biostatistics and Epidemiology, Memorial Sloan Kettering Cancer Center, 485 Lexington Ave, 2nd Floor, New York, NY, 10017, United States. Electronic address:



Cumulative incidence of lung cancer deaths (LC-CID) is an important metric to understand cancer prognosis and to determine treatment options. However, credible estimates of LC-CID rely on accurate cause-of-death coding in death certificates. Results from lung cancer screening trials estimated 15% under-reporting and 1% over-reporting of lung cancer deaths due to misclassification. This study investigated the impact of cause-of-death misclassification on the estimation of LC-CID.


Patients with stage I/II non-small cell lung cancer (NSCLC) from the Surveillance, Epidemiology, and End Results registry were included. LC-CID was estimated using the competing-risk approach in two ways: (1) reporting observed estimates that ignore potential cause-of-death misclassification and (2) correcting for plausible misclassification rates reported in the literature (15% under-reporting and 1% over-reporting). Bias was quantified as the difference between observed and corrected 10-year LC-CIDs: positive values indicated that observed LC-CID overestimated true LC-CID, whereas negative values indicated the opposite.
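The correction step can be illustrated with a small sketch. The abstract does not spell out the correction procedure, so the two-equation model below is an assumption, not the authors' method: at a fixed time point, a fraction `under` of true lung cancer deaths is recorded as other-cause deaths, and a fraction `over` of true other-cause deaths is recorded as lung cancer deaths. The function name `correct_lc_cid` and the input values are hypothetical.

```python
def correct_lc_cid(obs_lc, obs_other, under=0.15, over=0.01):
    """Back out a corrected LC-CID from observed cumulative incidences.

    Assumed misclassification model (illustrative, not taken from the paper):
        obs_lc    = (1 - under) * true_lc + over * true_other
        obs_other = under * true_lc + (1 - over) * true_other
    Returns (true_lc, bias), where bias = observed - corrected, so a
    negative bias means the observed LC-CID underestimates the true one.
    """
    # Misclassification only moves deaths between causes, so the
    # all-cause cumulative incidence is the same before and after.
    total = obs_lc + obs_other
    # Substitute true_other = total - true_lc into the first equation
    # and solve for true_lc.
    true_lc = (obs_lc - over * total) / (1.0 - under - over)
    return true_lc, obs_lc - true_lc


# Hypothetical inputs: 41% observed lung cancer deaths and 20% observed
# other-cause deaths at 10 years (the 20% is made up for illustration).
true_lc, bias = correct_lc_cid(obs_lc=0.41, obs_other=0.20)
print(f"corrected LC-CID: {true_lc:.3f}, bias: {bias:+.3f}")
```

Under this model the correction depends on the other-cause cumulative incidence only through the `over` term, which is why a 1% over-reporting rate shifts the result far less than a 15% under-reporting rate.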


Among 66,179 patients, the impact of over-reporting on 10-year LC-CID was negligible across all age groups. In contrast, under-reporting resulted in substantial underestimation of 10-year LC-CID. The magnitude of the bias increased with age because LC-CIDs were higher in older patients: 10-year LC-CIDs among stage I patients aged 18-44, 45-59, 60-74, and ≥75 years were 25%, 32%, 41%, and 50%, respectively, and the corresponding biases given the plausible misclassification rates were -4.4%, -5.6%, -7.1%, and -8.6%. Because the observed LC-CIDs were higher among patients with stage II disease than among those with stage I disease, the biases were greater among stage II patients, reaching -12.5% in the oldest age group.
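As a rough arithmetic check on the stage I figures, the sketch below applies an under-reporting-only approximation, corrected = observed / (1 - 0.15), to the four reported LC-CIDs. This is a simplification assumed here for illustration: it ignores the 1% over-reporting rate and any rounding in the published values, so exact agreement with the reported biases is not expected.

```python
UNDER = 0.15  # under-reporting rate quoted in the abstract

# Observed 10-year LC-CIDs and reported biases for stage I patients,
# copied from the abstract and expressed as proportions.
observed = {"18-44": 0.25, "45-59": 0.32, "60-74": 0.41, ">=75": 0.50}
reported_bias = {"18-44": -0.044, "45-59": -0.056, "60-74": -0.071, ">=75": -0.086}

# Under-reporting-only approximation: if only (1 - UNDER) of true lung
# cancer deaths are recorded as such, corrected = observed / (1 - UNDER).
approx_bias = {age: cid - cid / (1.0 - UNDER) for age, cid in observed.items()}

for age in observed:
    print(f"{age:>6}: approx {approx_bias[age]:+.3f}, reported {reported_bias[age]:+.3f}")
```

The approximate biases track the reported ones closely and are slightly larger in magnitude in the older groups, which is the direction one would expect if the over-reporting correction offsets part of the adjustment, though the abstract does not give enough detail to confirm this.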


In lung cancer, LC-CID may be severely underestimated due to under-reporting of lung cancer deaths, particularly among older patients or those with late-stage disease. Future studies that involve such subpopulations should present the corrected LC-CIDs based on plausible misclassification rates alongside the observed LC-CIDs.


Competing risk events; cause of failure; cause-specific survival; cumulative incidence; death certificate

[Available on 2020-08-01]
