Inj Prev. Jun 2004; 10(3): 186–191.
PMCID: PMC1730090

Practical introduction to record linkage for injury research


Because injured people often die early and emergency medical care is transient, a single database will rarely suffice for population-based injury research. Linking records from multiple data sources is therefore a promising method for injury surveillance or trauma system evaluation. This article reviews the historical development of record linkage, provides a basic mathematical foundation, discusses some practical issues, and considers some ethical concerns.

Clerical or computer-assisted deterministic record linkage methods may suffice for some applications, but probabilistic methods are particularly useful for larger studies. The probabilistic method attempts to simulate human reasoning by comparing each of several data elements from the two records. The basic mathematical specifications are derived algebraically from fundamental concepts of probability, although the theory can be extended to include more advanced mathematics.
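
The field-by-field comparison described above can be illustrated with a minimal sketch. It assumes the commonly used log-likelihood-ratio formulation of probabilistic linkage, which this abstract does not spell out. The field names and the m- and u-probabilities (the chance that a field agrees for a true match and for a non-match, respectively) are hypothetical values chosen only for illustration.

```python
import math

# Hypothetical (m, u) pairs for each compared field (illustrative values only):
#   m = P(field agrees | records belong to the same person)
#   u = P(field agrees | records belong to different people)
FIELDS = {
    "surname": (0.95, 0.01),
    "birth_date": (0.97, 0.003),
    "sex": (0.99, 0.5),
}


def match_weight(rec_a, rec_b, fields=FIELDS):
    """Sum the log2 likelihood ratios over all compared fields.

    Agreement on a field adds log2(m / u); disagreement adds
    log2((1 - m) / (1 - u)), which is negative when m > u.
    Missing values are not treated specially in this sketch.
    """
    total = 0.0
    for name, (m, u) in fields.items():
        if rec_a[name] == rec_b[name]:
            total += math.log2(m / u)
        else:
            total += math.log2((1 - m) / (1 - u))
    return total


ambulance = {"surname": "SMITH", "birth_date": "1970-02-14", "sex": "F"}
hospital = {"surname": "SMITH", "birth_date": "1970-02-14", "sex": "F"}
other = {"surname": "JONES", "birth_date": "1988-11-02", "sex": "M"}

print(match_weight(ambulance, hospital))  # positive: all fields agree
print(match_weight(ambulance, other))     # negative: all fields disagree
```

In this sketch, agreement on a rare value such as a full birth date contributes far more weight than agreement on sex, which mirrors how a human reviewer would judge the same pair.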

Probabilistic, deterministic, and clerical techniques may be combined in different ways depending upon the goal of the record linkage project. If a population parameter is being estimated for a purely statistical study, a completely probabilistic approach may be most efficient; for other applications, where the purpose is to make inferences about specific individuals based upon their data contained in two or more files, the need for a high positive predictive value would favor a deterministic method or a probabilistic method with careful clerical review. Whatever techniques are used, researchers must realize that the combination of data sources entails additional ethical obligations beyond the use of each source alone.
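
One way to see how probabilistic scoring and clerical review might be combined is to convert a pair's total weight into a probability that the pair is a true match, then send uncertain pairs to a reviewer. The sketch below applies Bayes' rule to a log2 likelihood-ratio weight. The file sizes, expected number of matches, and the two cutoffs are hypothetical assumptions, not values from the article.

```python
def posterior_match_probability(weight_bits, expected_matches, size_a, size_b):
    """Convert a log2 likelihood-ratio weight into P(true match).

    The prior is the chance that a randomly chosen pair of records
    (one from each file) is a true match; Bayes' rule is applied on
    the odds scale: posterior odds = prior odds * 2 ** weight.
    """
    prior = expected_matches / (size_a * size_b)
    prior_odds = prior / (1 - prior)
    posterior_odds = prior_odds * 2 ** weight_bits
    return posterior_odds / (1 + posterior_odds)


def classify(prob, upper=0.95, lower=0.05):
    """Three-way decision with hypothetical cutoffs."""
    if prob >= upper:
        return "link"
    if prob <= lower:
        return "non-link"
    return "clerical review"


# Hypothetical files of 1000 records each, with about 100 true matches expected.
for weight in (30, 13, -10):
    p = posterior_match_probability(weight, 100, 1000, 1000)
    print(weight, round(p, 4), classify(p))
```

Raising the upper cutoff increases the positive predictive value of the accepted links at the cost of more clerical review, which is the trade-off the paragraph above describes for studies that draw inferences about specific individuals.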



Articles from Injury Prevention are provided here courtesy of BMJ Group

