Sci Data. 2018 Jan 30;5:180001. doi: 10.1038/sdata.2018.1.

A dataset of 200 structured product labels annotated for adverse drug reactions.

Author information

U.S. National Library of Medicine, NIH, 8600 Rockville Pike, Bethesda, MD 20894, USA.
UT Health School of Biomedical Informatics, 7000 Fannin St., Houston, TX 77030, USA.
Office of New Drugs, Center for Drug Evaluation and Research, U.S. Food and Drug Administration, 10001 New Hampshire Ave, Silver Spring, MD 20903, USA.


Adverse drug reactions (ADRs), unintended and sometimes dangerous effects that a drug may have, are one of the leading causes of morbidity and mortality during medical care. To date, there is no structured machine-readable authoritative source of known ADRs. The United States Food and Drug Administration (FDA) partnered with the National Library of Medicine to create a pilot dataset containing standardised information about known adverse reactions for 200 FDA-approved drugs. The Structured Product Labels (SPLs), the documents FDA uses to exchange information about drugs and other products, were manually annotated for adverse reactions at the mention level to facilitate development and evaluation of text mining tools for extraction of ADRs from all SPLs. The ADRs were then normalised to the Unified Medical Language System (UMLS) and to the Medical Dictionary for Regulatory Activities (MedDRA). We present the curation process and the structure of the publicly available database SPL-ADR-200db containing 5,098 distinct ADRs. The database and the code for preparing and validating the data are both available online.

[Indexed for MEDLINE]
Free PMC Article
