J Healthc Inf Manag. 1998 Fall;12(3):43-52.

Issues in identification and linkage of patient records across an integrated delivery system.

Author information

Advanced Linkage Technologies of America, Inc., USA.


Historically, the health information systems community has viewed linking personal records as a mundane task. The view that routine database manipulation can accurately identify multiple records for a single individual is an oversimplification that rests on a misperception of the quality of the underlying data. Such data have been adversely affected by the evolution of individual facility patient indexes from multiple systems, by the results of backload procedures, and by a lack of focus on data integrity among users of the automated systems. Much of the random, invalid data we identify on a daily basis arises when system users need to place data in the patient record but have no obvious data field in which to place them. Combined with an underlying lack of standards for the collection of personal identification information, this results in pure chaos when reviewing an MPI file containing a million records at the start of a linkage evaluation project. We have documented the considerable effort that must therefore be made to standardize the MPI files, using stringent analytical procedures and applying common edit routines, before commencing record linkage. This preprocessing effort must then be supplemented with sophisticated matching procedures that can handle the dual challenge of minimizing false negatives (the failure to identify true linkages) and false positives (the incorrect linking of records that do not represent the same person). The identification of pairs of linked records does not, however, complete an EPI loading.
Because a multiple-facility linkage evaluation fairly commonly identifies more than two medical record numbers for the same patient, and the primary goal of an EPI is to assign the patient a unique identifier that links that patient's multiple files, it becomes necessary to develop a means of readily associating three or more records for the same patient. One approach we have used with great success is to assign a common, sequential identification number to all linked medical record numbers for the same patient, regardless of facility. The assignment of linkage identification numbers is computer-intensive and is generally accomplished with a highly iterative process. Both system memory and hard disk resources are strained as the number of good linkages in an overlap evaluation reaches the half-million mark or greater. Because the primary goal of linkage analysis is to develop linkages on pairs of records, with confidence levels based on the comparison of information for those two records, thresholds must be set to decide which linkages should be accepted as true without any human evaluation. If the threshold is set too low, the defined linkage groups may incorrectly join the medical record numbers for different persons. But if the threshold is set too high, there will be undesired duplication of persons in the enterprise system. As in the identification of the underlying linkage pairs, the development of a confidence measure greatly facilitates the assignment of the unique identification numbers needed in the EPI implementation.
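
The grouping step described above can be illustrated with a minimal sketch. It is not the authors' procedure: the abstract says only that the assignment is highly iterative, and this example swaps in a union-find (disjoint-set) structure as one common way to turn accepted pairs into groups. The function name `assign_linkage_ids`, the `(facility, medical record number)` record shape, the 0-to-1 confidence scale, and the sample values are all assumptions made for illustration.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical record key: (facility code, medical record number).
Record = Tuple[str, str]
# Hypothetical scored pair: two records plus a confidence in [0, 1].
ScoredPair = Tuple[Record, Record, float]


def assign_linkage_ids(pairs: Iterable[ScoredPair],
                       threshold: float) -> Dict[Record, int]:
    """Give every record a sequential ID shared by all records linked to it.

    Only pairs whose confidence is at or above `threshold` are accepted
    as true linkages. Grouping uses union-find, an assumption of this
    sketch rather than the method described in the abstract.
    """
    parent: Dict[Record, Record] = {}

    def find(rec: Record) -> Record:
        # Register unseen records as their own group, then walk to the
        # root, shortening the path along the way.
        parent.setdefault(rec, rec)
        while parent[rec] != rec:
            parent[rec] = parent[parent[rec]]
            rec = parent[rec]
        return rec

    for a, b, confidence in pairs:
        root_a, root_b = find(a), find(b)  # registers both records
        if confidence >= threshold and root_a != root_b:
            parent[root_b] = root_a        # merge the two groups

    # Number the groups sequentially, in sorted record order.
    group_ids: Dict[Record, int] = {}
    result: Dict[Record, int] = {}
    for rec in sorted(parent):
        root = find(rec)
        if root not in group_ids:
            group_ids[root] = len(group_ids) + 1
        result[rec] = group_ids[root]
    return result


# Made-up example: three facilities, one patient seen at all three.
sample_pairs = [
    (("A", "100"), ("B", "200"), 0.95),
    (("B", "200"), ("C", "300"), 0.90),
    (("A", "101"), ("C", "301"), 0.40),
]
strict = assign_linkage_ids(sample_pairs, threshold=0.8)
loose = assign_linkage_ids(sample_pairs, threshold=0.3)
```

With the stricter threshold, the low-confidence pair is rejected and its two records keep separate identification numbers, which risks duplicating a person in the enterprise system. With the looser threshold they are merged into one group, which risks joining the records of different persons. This is the trade-off the abstract describes.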

[Indexed for MEDLINE]
