J Vis Exp. 2018 Sep 20;(139). doi: 10.3791/58392.

A Metadata Extraction Approach for Clinical Case Reports to Enable Advanced Understanding of Biomedical Concepts.

Author information

1. The NIH BD2K Center of Excellence in Biomedical Computing, University of California, Los Angeles; Department of Physiology, University of California, Los Angeles; jcaufield@mednet.ucla.edu.
2. The NIH BD2K Center of Excellence in Biomedical Computing, University of California, Los Angeles; Department of Physiology, University of California, Los Angeles; Department of Medicine/Cardiology, University of California, Los Angeles.
3. The NIH BD2K Center of Excellence in Biomedical Computing, University of California, Los Angeles; Department of Physiology, University of California, Los Angeles.
4. Department of Cardiology, First Affiliated Hospital, Zhejiang University School of Medicine.
5. The NIH BD2K Center of Excellence in Biomedical Computing, University of California, Los Angeles; Department of Medicine/Cardiology, University of California, Los Angeles.
6. The NIH BD2K Center of Excellence in Biomedical Computing, University of California, Los Angeles; Department of Radiological Sciences, University of California, Los Angeles; Department of Bioengineering, University of California, Los Angeles; Scalable Analytics Institute (ScAi), University of California, Los Angeles.
7. The NIH BD2K Center of Excellence in Biomedical Computing, University of California, Los Angeles; Scalable Analytics Institute (ScAi), University of California, Los Angeles; Department of Bioinformatics, University of California, Los Angeles; Department of Computer Science, University of California, Los Angeles.
8. The NIH BD2K Center of Excellence in Biomedical Computing, University of California, Los Angeles; Department of Physiology, University of California, Los Angeles; Department of Medicine/Cardiology, University of California, Los Angeles; Scalable Analytics Institute (ScAi), University of California, Los Angeles; Department of Bioinformatics, University of California, Los Angeles.

Abstract

Clinical case reports (CCRs) are a valuable means of sharing observations and insights in medicine. The form of these documents varies, and their content includes descriptions of numerous novel disease presentations and treatments. Thus far, the text data within CCRs are largely unstructured, and significant human and computational effort is required to render these data useful for in-depth analysis. In this protocol, we describe methods for identifying metadata corresponding to specific biomedical concepts frequently observed within CCRs. We provide a metadata template as a guide for document annotation, recognizing that structure may be imposed on CCRs through combinations of manual and automated effort. The approach presented here is appropriate for organizing concept-related text from a large literature corpus (e.g., thousands of CCRs) but may be easily adapted to more focused tasks or small sets of reports. The resulting structured text data include sufficient semantic context to support a variety of subsequent text analysis workflows: meta-analyses to determine how to maximize CCR detail, epidemiological studies of rare diseases, and the development of models of medical language may all become more feasible and manageable through the use of structured text data.
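
The idea of a metadata template for annotation can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the class name `CaseReportMetadata`, the field names, the concept categories ("diagnosis", "treatment"), and the placeholder identifier are hypothetical and are not taken from the template described in the protocol.

```python
from dataclasses import dataclass, field, asdict
from typing import Dict, List
import json


@dataclass
class CaseReportMetadata:
    """Hypothetical annotation record for one clinical case report.

    Field names are illustrative only; they are NOT the authors' template.
    """
    pmid: str
    title: str
    # Maps a concept category to the text spans annotated for it.
    concepts: Dict[str, List[str]] = field(default_factory=dict)

    def annotate(self, category: str, text: str) -> None:
        """Store one text span under a concept category."""
        self.concepts.setdefault(category, []).append(text.strip())

    def missing(self, required: List[str]) -> List[str]:
        """Return the required categories that have no annotation yet."""
        return [c for c in required if not self.concepts.get(c)]


# Placeholder values, not a real report.
record = CaseReportMetadata(pmid="00000000", title="Example case report")
record.annotate("diagnosis", "  example diagnosis text ")

# Categories still lacking annotations, under the assumed category names.
unfilled = record.missing(["diagnosis", "treatment"])

# Structured records serialize directly for downstream text analysis.
serialized = json.dumps(asdict(record))
```

Keeping each concept category as a list of text spans lets manual and automated annotation write to the same record, and the `missing` check gives a simple way to flag reports that lack detail in a given category.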

PMID: 30295669
PMCID: PMC6235242
DOI: 10.3791/58392
[Indexed for MEDLINE]
Free PMC Article
