J Clin Epidemiol. 2007 Sep;60(9):883-91. Epub 2007 May 17.

Probabilistic record linkage is a valid and transparent tool to combine databases without a patient identification number.

Author information

  • Academic Medical Centrum (AMC), Department of Medical Informatics, Amsterdam, The Netherlands.



To describe the technical approach and subsequent validation of the probabilistic linkage of the three anonymous, population-based Dutch Perinatal Registries (LVR1 of midwives, LVR2 of obstetricians, and LNR of pediatricians/neonatologists). These registries do not share a unique identification number.


A combination of probabilistic and deterministic record linkage techniques was applied, using information about the mother, delivery, and child(ren), to link the three registries. Rewards for agreement and penalties for disagreement between corresponding variables were calculated from the observed patterns of agreement and disagreement using maximum likelihood estimation. Special measures were developed to overcome linking difficulties in twins. A subsample of linked and nonlinked pairs was validated.
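The rewards and penalties described above can be illustrated with the standard Fellegi-Sunter weighting scheme, which is a common basis for probabilistic linkage; the abstract does not state that this exact formulation was used. In the sketch below, the field names and the m- and u-probabilities (the chance that a field agrees among true matches and among non-matches, respectively) are hypothetical values chosen for illustration. In the study such parameters were estimated from the data by maximum likelihood rather than fixed by hand.

```python
import math

# Hypothetical (m, u) probabilities per linking variable:
#   m = P(field agrees | records belong to the same delivery)
#   u = P(field agrees | records belong to different deliveries)
# These numbers are illustrative assumptions, not values from the paper.
M_U = {
    "maternal_birth_date": (0.95, 0.01),
    "postal_code": (0.90, 0.05),
    "child_sex": (0.98, 0.50),
}


def agreement_weight(m, u):
    """Reward (positive when m > u) added when a field agrees."""
    return math.log2(m / u)


def disagreement_weight(m, u):
    """Penalty (negative when m > u) added when a field disagrees."""
    return math.log2((1 - m) / (1 - u))


def pair_score(rec_a, rec_b, m_u=M_U):
    """Sum the weights over all fields present in both records."""
    score = 0.0
    for field, (m, u) in m_u.items():
        a, b = rec_a.get(field), rec_b.get(field)
        if a is None or b is None:
            continue  # a missing value contributes no evidence either way
        if a == b:
            score += agreement_weight(m, u)
        else:
            score += disagreement_weight(m, u)
    return score


# Hypothetical records from two registries.
midwife_rec = {"maternal_birth_date": "1975-03-02", "postal_code": "1105", "child_sex": "F"}
obstetric_rec = {"maternal_birth_date": "1975-03-02", "postal_code": "1105", "child_sex": "F"}
other_rec = {"maternal_birth_date": "1980-11-19", "postal_code": "1105", "child_sex": "F"}

same_score = pair_score(midwife_rec, obstetric_rec)
diff_score = pair_score(midwife_rec, other_rec)
```

A pair whose total score exceeds a chosen threshold would be treated as a link. Highly discriminating variables such as a birth date carry a larger reward than a variable like sex, which agrees by chance in about half of all non-matching pairs.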


Independent validation confirmed that the procedure successfully linked the three Dutch perinatal registries despite nontrivial error rates in the linking variables.


Probabilistic linkage techniques allowed the creation of a high-quality linked database from crude registry data. The developed procedures are generally applicable to the linkage of health data with partially identifying information. They provide useful source data even if cohorts overlap only partly and if multiple entities and twins exist within a cohort.

[PubMed - indexed for MEDLINE]