Reflections on modern methods: linkage error bias

Int J Epidemiol. 2019 Dec 1;48(6):2050-2060. doi: 10.1093/ije/dyz203.

Abstract

Linked data are increasingly used in epidemiological research, to enhance primary research, and in planning, monitoring and evaluating public policy and services. Linkage error (missed links between records that relate to the same person, or false links between unrelated records) can manifest in many ways: as missing data, as measurement error and misclassification, as unrepresentative sampling, or as a special combination of these that is specific to the analysis of linked data, namely the merging and splitting of people. For example, two hospital admission records are counted as one person admitted twice if they are linked, and as two people admitted once if they are not. Through these mechanisms, linkage error can ultimately lead to information bias and selection bias, so identifying the relevant mechanisms is key in quantitative bias analysis. In this article we introduce five key concepts and a study classification system for identifying which mechanisms are relevant to any given analysis. We provide examples and discuss options for estimating parameters for bias analysis. This conceptual framework provides the 'links' between linkage error, information bias and selection bias, and lays the groundwork for quantitative bias analysis for linkage error.
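
The hospital admission example can be made concrete with a minimal Python sketch. It is an illustration only, not the authors' method: the record labels (`adm_A`, `adm_B`), the person identifiers and the `summarize` helper are all hypothetical. It shows how one missed link changes both the number of people counted and the number counted as readmitted.

```python
from collections import Counter


def summarize(link_ids):
    """Return (people counted, people with more than one admission).

    `link_ids` maps each admission record to the person ID that the
    linkage step assigned to it (hypothetical structure).
    """
    admissions_per_person = Counter(link_ids.values())
    n_people = len(admissions_per_person)
    n_readmitted = sum(1 for n in admissions_per_person.values() if n > 1)
    return n_people, n_readmitted


# Both records truly belong to the same person and are correctly linked:
# one person, admitted twice.
linked = {"adm_A": "person_1", "adm_B": "person_1"}

# The same two records with a missed link: the person is "split" into
# two people, each admitted once.
missed_link = {"adm_A": "person_1", "adm_B": "person_2"}

people_linked, readmitted_linked = summarize(linked)
people_missed, readmitted_missed = summarize(missed_link)

print("linked:     ", people_linked, "person(s),", readmitted_linked, "readmitted")
print("missed link:", people_missed, "person(s),", readmitted_missed, "readmitted")
```

In this toy case the missed link inflates the count of people (the denominator) and removes the readmission (the numerator). A false link between two unrelated people would do the reverse, merging them into one person who appears to have been admitted twice.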

Keywords: Linkage error; bias; bias analysis; data linkage; information bias; missing data; quantitative bias analysis; record linkage; selection bias; sensitivity analysis.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Data Accuracy
  • Hospitalization / statistics & numerical data
  • Humans
  • Medical Record Linkage / methods*
  • Selection Bias
  • Semantic Web / statistics & numerical data*