J Biomed Inform. 2019 Dec 19;102:103363. doi: 10.1016/j.jbi.2019.103363. [Epub ahead of print]

Adapting electronic health records-derived phenotypes to claims data: Lessons learned in using limited clinical data for phenotyping.

Author information

1. Columbia University Medical Center, New York, NY, USA; Observational Health Data Sciences and Informatics (OHDSI), New York, NY, USA.
2. IQVIA, Cambridge, MA, USA; Observational Health Data Sciences and Informatics (OHDSI), New York, NY, USA.
3. Janssen Research & Development, Raritan, NJ, USA; Observational Health Data Sciences and Informatics (OHDSI), New York, NY, USA.
4. Columbia University Medical Center, New York, NY, USA; Observational Health Data Sciences and Informatics (OHDSI), New York, NY, USA. Electronic address: hripicsak@columbia.edu.
5. Columbia University Medical Center, New York, NY, USA; Observational Health Data Sciences and Informatics (OHDSI), New York, NY, USA. Electronic address: chunhua@columbia.edu.

Abstract

Algorithms for identifying patients of interest from observational data must address missing and inaccurate data, and should ideally achieve comparable performance on both administrative claims and electronic health record data. However, administrative claims data lack the information needed to develop accurate algorithms for disorders that require laboratory results, and this omission can make algorithms based on diagnostic codes insensitive. In this paper, we tested our hypothesis that the performance of a diagnosis code-based algorithm for chronic kidney disorder (CKD) can be improved by adding other codes indirectly related to CKD (e.g., codes for dialysis, kidney transplant, suspicious kidney disorders). Following best practices from Observational Health Data Sciences and Informatics (OHDSI), we adapted an electronic health record-based gold standard algorithm for CKD and then created algorithms that can be executed on administrative claims data and that account for related data quality issues. We externally validated our algorithms on four electronic health record datasets in the OHDSI network. The positive predictive value of the algorithms that use additional codes was slightly higher than that of the algorithm that uses CKD diagnostic codes only (47.9-48.5% vs. 47.4%, respectively). The algorithms adapted from the gold standard algorithm can be used to infer chronic kidney disorder from administrative claims data. We improved the generalizability and consistency of the CKD phenotypes by using data and vocabulary standardized across the OHDSI network, although performance still varies across datasets. We showed that identifying and addressing coding and data heterogeneity can improve the performance of the algorithms.
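
The comparison described above, a diagnosis-code-only algorithm versus one that also uses indirectly related codes, can be illustrated with a minimal sketch. This is not the authors' implementation: the code labels (`CKD_DX_A`, `DIALYSIS`, and so on), the function names, and the four toy patient records are all hypothetical placeholders, and the abstract does not list the actual concept sets. The sketch only shows how such a code-based phenotype might flag patients and how positive predictive value (PPV) could be computed against a reference standard.

```python
# Hypothetical code sets; the actual concept sets are not given in the abstract.
CKD_DX_CODES = {"CKD_DX_A", "CKD_DX_B"}
RELATED_CODES = {"DIALYSIS", "KIDNEY_TRANSPLANT"}


def flag_patients(records, code_set):
    """Return the ids of patients with at least one code in code_set."""
    return {pid for pid, codes in records.items() if codes & code_set}


def positive_predictive_value(flagged, true_cases):
    """Fraction of flagged patients who are true cases (NaN if none flagged)."""
    if not flagged:
        return float("nan")
    return len(flagged & true_cases) / len(flagged)


# Toy data for illustration only: patient id -> set of recorded codes.
records = {
    "p1": {"CKD_DX_A"},
    "p2": {"DIALYSIS"},
    "p3": {"CKD_DX_B", "OTHER"},
    "p4": {"OTHER"},
}
# Toy reference standard (e.g., cases confirmed by a gold standard algorithm).
gold = {"p1", "p2"}

dx_only = flag_patients(records, CKD_DX_CODES)
expanded = flag_patients(records, CKD_DX_CODES | RELATED_CODES)

ppv_dx_only = positive_predictive_value(dx_only, gold)
ppv_expanded = positive_predictive_value(expanded, gold)
```

In this toy example the expanded code set picks up a patient who has only a dialysis code, which changes both the flagged cohort and the PPV. The numbers carry no relation to the results reported in the abstract.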

KEYWORDS:

Chronic kidney disorder; Data quality; Observational Health Data Sciences and Informatics (OHDSI); Phenotyping; Portability; Reproducibility

PMID: 31866433
DOI: 10.1016/j.jbi.2019.103363
