BMC Res Notes. 2019 Oct 21;12(1):674. doi: 10.1186/s13104-019-4726-x.

Identifying incident cancer cases in routinely collected hospital data: a retrospective validation study.

Author information

Cancer Research Division, Cancer Council NSW, PO Box 572, Kings Cross, Sydney, NSW, 1340, Australia.
Sydney School of Public Health, University of Sydney, Sydney, NSW, Australia.
Prince of Wales Clinical School, UNSW Medicine, Sydney, NSW, Australia.
School of Medicine and Public Health, University of Newcastle, Newcastle, NSW, Australia.



Population-level cancer incidence data are critical for epidemiological cancer research; however, provision of cancer registry data can be delayed. We previously reported that, in a large population-based Australian cohort, registry-based incidence data were well matched by routinely collected hospital diagnosis data (sensitivities and positive predictive values (PPVs) > 80%) for six of the 12 most common cancer types: breast, colorectum, kidney, lung, pancreas and uterus. The available hospital data covered more recent time periods than the registry data. We have since obtained more recent cancer registry data, allowing us to further test the validity of hospital diagnosis records in identifying incident cases.


The more recent hospital diagnosis data were valid for identifying incident cases for the six cancer types, with sensitivities of 81-94% and PPVs of 86-96%. However, 2-10% of cases were identified > 3 months after the registry's diagnosis date, and detailed clinical cancer information was unavailable. The level of identification was generally higher for cases aged < 80 years, cases with known disease stage and cases living in higher socioeconomic areas. Including death records increased sensitivity for some cancer types, but this requires caution because of potential false-positive cases. This study validates the use of hospital diagnosis records for identifying incident cancer cases.
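For readers less familiar with the validation measures reported above, the following is a minimal sketch of how sensitivity, PPV and the share of late-identified cases could be computed once registry and hospital records have been linked. It is not the study's actual method: the function name, the dictionary layout, the 91-day cut-off standing in for "3 months", the choice of matched cases as the denominator for the late share, and the toy data are all assumptions made for illustration.

```python
from datetime import date


def validate_hospital_cases(registry, hospital, max_delay_days=91):
    """Compare hospital-identified cases against registry cases (the reference).

    registry, hospital: dicts mapping a person ID to a diagnosis date
    (registry) or first relevant admission date (hospital). Hypothetical
    layout, assumed for illustration only.
    """
    if not registry or not hospital:
        raise ValueError("both data sources must contain at least one case")

    # True positives: people identified in both sources.
    matched = registry.keys() & hospital.keys()

    # Sensitivity: share of registry cases also found in hospital data.
    sensitivity = len(matched) / len(registry)
    # PPV: share of hospital-identified cases confirmed by the registry.
    ppv = len(matched) / len(hospital)

    # Matched cases whose hospital date falls more than max_delay_days
    # after the registry diagnosis date (assumed denominator: matched cases).
    late = sum(
        1 for pid in matched
        if (hospital[pid] - registry[pid]).days > max_delay_days
    )
    late_share = late / len(matched) if matched else 0.0

    return sensitivity, ppv, late_share


# Toy data, invented purely to exercise the function.
registry = {
    1: date(2010, 1, 10),
    2: date(2010, 3, 1),
    3: date(2010, 5, 1),
    4: date(2010, 6, 1),
    5: date(2010, 7, 1),
}
hospital = {
    1: date(2010, 1, 15),
    2: date(2010, 8, 1),   # identified well over 3 months late
    3: date(2010, 5, 2),
    6: date(2010, 6, 1),   # no registry record: counts against PPV
}

sensitivity, ppv, late_share = validate_hospital_cases(registry, hospital)
print(f"sensitivity={sensitivity:.2f} ppv={ppv:.2f} late_share={late_share:.2f}")
```

In this sketch the registry is treated as the reference standard, so hospital-only cases lower the PPV and registry-only cases lower the sensitivity.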


Keywords: Cancer incidence; Case ascertainment; Record linkage; Routinely collected data; Validation
