Int J Med Inform. 2016 Jun;90:40-7. doi: 10.1016/j.ijmedinf.2016.03.006. Epub 2016 Mar 24.

Data quality assessment framework to assess electronic medical record data for use in research.

Author information

1. Frances Payne Bolton School of Nursing, Case Western Reserve University, 10900 Euclid Ave, Cleveland, OH 44106, United States; Cleveland Clinic, 10900 Euclid Avenue, Cleveland, OH 44195, United States. Electronic address: axr62@cwru.edu.
2. Cleveland Clinic, 10900 Euclid Avenue, Cleveland, OH 44195, United States. Electronic address: milinoa@ccf.org.
3. Frances Payne Bolton School of Nursing, Case Western Reserve University, 10900 Euclid Ave, Cleveland, OH 44106, United States. Electronic address: elizabeth.madigan@case.edu.

Abstract

INTRODUCTION:

The proliferation and use of electronic medical records (EMR) in the clinical setting now provide a rich source of clinical data that can be leveraged to support research on patient outcomes, comparative effectiveness, and health systems research. Once the large volume and variety of data that robust clinical EMRs provide is aggregated, the suitability of the data for research purposes must be addressed. Therefore, the purpose of this paper is two-fold. First, we present a stepwise framework capable of guiding initial data quality assessment when matching multiple data sources regardless of context or application. Then, we demonstrate a use case of initial analysis of a longitudinal data repository of electronic health record data that illustrates the first four steps of the framework, and report results.

METHODS:

A six-step data quality assessment framework is proposed and described, comprising the following steps: (1) preliminary analysis, (2) documentation-longitudinal concordance, (3) breadth, (4) data element presence, (5) density, and (6) prediction. The six-step framework was applied to the Transport Data Mart, a data repository containing over 28,000 records for patients who underwent interhospital transfer, including EMRs from the sending hospitalization, transport, and receiving hospitalization.
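As a rough illustration of what linking the three record sources involves, the sketch below counts how many transport log entries can be tied to any hospital encounter, to both the sending and receiving encounters, and to all three sources. It is not the authors' procedure: the function name `summarize_matches`, the `MatchSummary` type, and the idea of a single shared transfer identifier are assumptions made only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MatchSummary:
    total: int            # transport log entries examined
    any_encounter: int    # matched to a sending or a receiving encounter
    both_encounters: int  # matched to sending and receiving encounters
    complete_cases: int   # both encounters plus a transport EMR record


def summarize_matches(log_ids, sending_ids, receiving_ids, transport_ids):
    """Count match tiers for log entries keyed by a hypothetical shared ID."""
    sending = set(sending_ids)
    receiving = set(receiving_ids)
    transport = set(transport_ids)
    log_ids = list(log_ids)

    any_enc = [i for i in log_ids if i in sending or i in receiving]
    both = [i for i in any_enc if i in sending and i in receiving]
    complete = [i for i in both if i in transport]
    return MatchSummary(len(log_ids), len(any_enc), len(both), len(complete))


if __name__ == "__main__":
    # Toy identifiers, not study data.
    summary = summarize_matches(
        log_ids=[1, 2, 3, 4, 5],
        sending_ids=[1, 2, 3],
        receiving_ids=[2, 3, 4],
        transport_ids=[3],
    )
    print(summary)
```

In practice, records from different systems rarely share one clean key, so a real linkage step would likely need to match on several fields (for example dates and facility codes) before counts like these are meaningful.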

RESULTS:

There were a total of 9557 log entries, of which 8139 were successfully matched to corresponding hospital encounters. A total of 2832 were successfully mapped to both the sending and receiving hospital encounters (resulting in a 93% automatic matching rate), with 590 including air medical transport EMR data and therefore representing a complete case for testing. Results from Step 2 indicate that once records are identified and matched, there is relatively limited drop-off of additional records as the criteria for matching increase, indicating that a proportion of records consistently contain nearly complete data. Measures of central tendency used in Steps 3 and 4 exhibit right skewness, suggesting that a small proportion of records contain the highest number of repeated measures for the measured variables.
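The right skewness reported for the repeated-measure counts can be checked in a few lines. The sketch below is a generic illustration rather than the authors' analysis: it computes the population moment coefficient of skewness for a list of per-record measurement counts and compares the mean with the median. The function name `skewness` and the toy counts are assumptions introduced here.

```python
import statistics


def skewness(values):
    """Population moment coefficient of skewness (third standardized moment)."""
    n = len(values)
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        return 0.0  # all values identical, so there is no asymmetry
    return sum((v - mean) ** 3 for v in values) / (n * sd ** 3)


if __name__ == "__main__":
    # Toy counts of repeated measures per record, not study data:
    # most records have a few measurements, one has many.
    counts = [1, 1, 2, 2, 2, 3, 3, 40]
    print("mean:", statistics.fmean(counts))
    print("median:", statistics.median(counts))
    print("skewness:", skewness(counts))
```

A positive coefficient together with a mean well above the median is the pattern described above: a small share of records contributes most of the repeated measures.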

CONCLUSIONS:

The proposed six-step data quality assessment framework is useful in establishing the metadata for a longitudinal data repository that can be replicated by other studies. Practical issues remain to be addressed, including the data quality assessments; the most pressing is the need to establish data quality metrics for benchmarking acceptable levels of EMR data inclusiveness through testing and application.

KEYWORDS:

Electronic medical records; Evaluation & assessment; Information storage; Retrieval & integration

PMID: 27103196
PMCID: PMC4845906
DOI: 10.1016/j.ijmedinf.2016.03.006
[Indexed for MEDLINE]