Sci Data. 2018 Nov 20;5:180258. doi: 10.1038/sdata.2018.258.

A reference set of curated biomedical data and metadata from clinical case reports.

Caufield JH1,2, Zhou Y1,2,3, Garlid AO1,2, Setty SP4, Liem DA1,2,5, Cao Q1,2, Lee JM1,2, Murali S1,2, Spendlove S1,2, Wang W1,6,7,8, Zhang L3, Sun Y1,7, Bui A1,6,9, Hermjakob H1,10, Watson KE1,5, Ping P1,2,5,6,8.

Author information

1. The NIH BD2K Center of Excellence in Biomedical Computing, University of California at Los Angeles, Los Angeles, CA 90095, USA.
2. Department of Physiology, University of California at Los Angeles, Los Angeles, CA 90095, USA.
3. Department of Cardiology, First Affiliated Hospital, Zhejiang University School of Medicine, 310003, Hangzhou, Zhejiang, P.R. China.
4. Department of Pediatric and Adult Congenital Cardiac Surgery, Miller Children's and Women's Hospital and Long Beach Memorial Hospital, Long Beach, CA 90806, USA.
5. Department of Medicine/Cardiology, University of California at Los Angeles, Los Angeles, CA 90095, USA.
6. Department of Bioinformatics, University of California at Los Angeles, Los Angeles, CA 90095, USA.
7. Department of Computer Science, University of California at Los Angeles, Los Angeles, CA 90095, USA.
8. Scalable Analytics Institute (ScAi), University of California at Los Angeles, Los Angeles, CA 90095, USA.
9. Department of Radiological Sciences, University of California at Los Angeles, Los Angeles, CA 90095, USA.
10. Molecular Systems Cluster, European Molecular Biology Laboratory-European Bioinformatics Institute, Wellcome Genome Campus, Cambridge, UK.

Abstract

Clinical case reports (CCRs) provide an important means of sharing clinical experiences about atypical disease phenotypes and new therapies. However, published case reports contain largely unstructured and heterogeneous clinical data, posing a challenge to mining relevant information. Current indexing approaches generally concern document-level features and have not been specifically designed for CCRs. To address this disparity, we developed a standardized metadata template and identified text corresponding to medical concepts within 3,100 curated CCRs spanning 15 disease groups and more than 750 reports of rare diseases. We also prepared a subset of metadata on reports on selected mitochondrial diseases and assigned ICD-10 diagnostic codes to each. The resulting resource, Metadata Acquired from Clinical Case Reports (MACCRs), contains text associated with high-level clinical concepts, including demographics, disease presentation, treatments, and outcomes for each report. Our template and MACCR set render CCRs more findable, accessible, interoperable, and reusable (FAIR) while serving as valuable resources for key user groups, including researchers, physician investigators, clinicians, data scientists, and those shaping government policies for clinical trials.
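
As a rough illustration of what a per-report metadata record built around the high-level concepts named above (demographics, disease presentation, treatments, outcomes, and optional ICD-10 codes) might look like, here is a minimal Python sketch. The class name, field names, helper method, and example values are all hypothetical and are not taken from the published MACCR template; the PMID shown is a placeholder, not a real report.

```python
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class CaseReportMetadata:
    """Hypothetical record; field names are illustrative, not the MACCR template's."""

    pmid: str
    demographics: str = ""
    disease_presentation: str = ""
    treatments: str = ""
    outcomes: str = ""
    icd10_codes: List[str] = field(default_factory=list)

    def missing_concepts(self) -> List[str]:
        """Return the names of the high-level concepts that have no text yet."""
        concepts = ("demographics", "disease_presentation", "treatments", "outcomes")
        return [name for name in concepts if not getattr(self, name).strip()]


# Invented placeholder values, used only to exercise the structure.
record = CaseReportMetadata(
    pmid="00000000",
    demographics="45-year-old woman",
    disease_presentation="progressive muscle weakness",
)

print(record.missing_concepts())
print(asdict(record))
```

Keeping each concept as free text mirrors the abstract's description of "text associated with high-level clinical concepts"; a real template would likely define more fields and controlled vocabularies than this sketch assumes.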

PMID: 30457569
PMCID: PMC6244181
DOI: 10.1038/sdata.2018.258
[Indexed for MEDLINE]
Free PMC Article
