J Public Health (Oxf). 2014 Dec;36(4):684-92. doi: 10.1093/pubmed/fdt116. Epub 2013 Dec 8.

Completeness and usability of ethnicity data in UK-based primary care and hospital databases.

Author information

Department of Non-Communicable Disease Epidemiology, London School of Hygiene and Tropical Medicine, London WC1E 7HT, UK.
NHLI Division, Faculty of Medicine, International Centre for Circulatory Health, London W2 1LA, UK.
Department of Non-Communicable Disease Epidemiology, London School of Hygiene and Tropical Medicine, London WC1E 7HT, UK; Utrecht Institute for Pharmaceutical Sciences, Utrecht University, Utrecht TB 3508, The Netherlands; Medicines and Healthcare Products Regulatory Agency, London SW1W 9SZ, UK.
Department of Social Policy, London School of Economics and Political Science, London WC2A 2AE, UK.



Ethnicity recording across the National Health Service (NHS) has improved dramatically over the past decade. This study profiles the completeness, consistency and representativeness of routinely collected ethnicity data in both primary care and hospital settings.


Completeness and consistency of ethnicity recording were examined in the Clinical Practice Research Datalink (CPRD) and Hospital Episode Statistics (HES), and the ethnic breakdown of the CPRD was compared with that of the 2011 UK censuses.


27.1% of all patients in the CPRD (1990-2012) have ethnicity recorded. This proportion rises to 78.3% for patients registered since April 2006. The ethnic breakdown of the CPRD is comparable to the UK censuses. 79.4% of HES inpatients, 46.8% of outpatients and 26.8% of A&E patients had their ethnicity recorded. Amongst those with ethnicity recorded on >1 occasion, consistency was over 90% in all data sets except for HES inpatients. Combining CPRD and HES increased completeness to 97%, with 85% of patients having the same ethnicity recorded in both databases.
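
The completeness and concordance measures reported above can be illustrated with a minimal Python sketch. It is not the authors' method: the function names, the rule of preferring the primary care record and falling back to the hospital record, the choice of denominator for concordance (patients with ethnicity recorded in both sources), and the four toy patients are all assumptions made for illustration.

```python
from typing import Dict, Optional

# Hypothetical record type: patient ID -> ethnicity label, None when not recorded.
Records = Dict[str, Optional[str]]


def completeness(records: Records) -> float:
    """Proportion of patients with a non-missing ethnicity record."""
    if not records:
        return 0.0
    return sum(value is not None for value in records.values()) / len(records)


def combine(primary: Records, hospital: Records) -> Records:
    """Merge two sources (assumed rule: keep the primary care record,
    fall back to the hospital record only when the primary one is missing)."""
    combined: Records = {}
    for pid in primary.keys() | hospital.keys():
        value = primary.get(pid)
        combined[pid] = value if value is not None else hospital.get(pid)
    return combined


def concordance(primary: Records, hospital: Records) -> float:
    """Among patients recorded in BOTH sources, the share whose labels agree."""
    both = [
        pid
        for pid in primary.keys() & hospital.keys()
        if primary[pid] is not None and hospital[pid] is not None
    ]
    if not both:
        return 0.0
    return sum(primary[pid] == hospital[pid] for pid in both) / len(both)


# Toy data, invented for illustration only (not taken from CPRD or HES).
cprd: Records = {"p1": "White", "p2": None, "p3": "South Asian", "p4": None}
hes: Records = {"p1": "White", "p2": "Black", "p3": "Other", "p4": None}

print(completeness(cprd))                # 0.5
print(completeness(combine(cprd, hes)))  # 0.75
print(concordance(cprd, hes))            # 0.5
```

In this toy example, linking the two sources raises completeness from 50% to 75%, mirroring in miniature how combining CPRD and HES raised completeness in the study.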


Using CPRD ethnicity from 2006 onwards maximizes completeness and comparability with the UK population. High concordance within and across NHS sources suggests these data are of high value when examining the continuum of care. Poor completeness and consistency of A&E and outpatient data render these sources unreliable.


epidemiology; ethnicity; methods

[Indexed for MEDLINE]
Free PMC Article
