Data Brief. 2019 Mar 15;24:103838. doi: 10.1016/j.dib.2019.103838. eCollection 2019 Jun.

The PsyTAR dataset: From patients generated narratives to a corpus of adverse drug events and effectiveness of psychiatric medications.

Author information

Lister Hill National Center for Biomedical Communications, National Library of Medicine, National Institutes of Health, Bethesda, MD, United States.
Department of Health Informatics & Administration, University of Wisconsin Milwaukee, Milwaukee, WI, United States.
Department of Health Sciences Research, Mayo Clinic, Rochester, MN, United States.
Department of Health Policy and Management, Johns Hopkins University, Baltimore, MD, United States.
Department of Biomedical and Health Information Sciences, University of Illinois at Chicago, Chicago, IL, United States.
School of Pharmacy, University of Pittsburgh, Pittsburgh, PA, United States.
School of Information, University of South Florida, Tampa, FL, United States.
Department of Biomedical Informatics, Utah University, Salt Lake City, UT, United States.
Emmes Corporation, Rockville, MD, United States.
Department of Epidemiology, Johns Hopkins University, Baltimore, MD, United States.
College of Letters and Science, University of Wisconsin Milwaukee, WI, United States.


The "Psychiatric Treatment Adverse Reactions" (PsyTAR) dataset contains patients' expressions of effectiveness and adverse drug events associated with psychiatric medications. The PsyTAR dataset was generated in four phases. In the first phase, a sample of 891 drug reviews posted by patients on an online healthcare forum, "", was collected for four psychiatric drugs: Zoloft, Lexapro, Cymbalta, and Effexor XR. For each drug review, patient demographic information, duration of treatment, and satisfaction with the drug were reported. In the second phase, sentence classification, the drug reviews were split into 6009 sentences, and each sentence was labeled for the presence of Adverse Drug Reaction (ADR), Withdrawal Symptoms (WDs), Sign/Symptoms/Illness (SSIs), Drug Indications (DIs), Drug Effectiveness (EF), Drug Ineffectiveness (INF), and Others (not applicable). In the third phase, entities including ADRs (4813 mentions), WDs (590 mentions), SSIs (1219 mentions), and DIs (792 mentions) were identified and extracted from the sentences. In the fourth phase, all the identified entities were mapped to the corresponding UMLS Metathesaurus concepts (916) and SNOMED CT concepts (755). In this phase, qualifiers representing severity and persistency of ADRs, WDs, SSIs, and DIs (e.g., mild, short term) were also identified. All sentences and identified entities were linked to the original post using IDs (e.g., Zoloft.1, Effexor.29, Cymbalta.31). The PsyTAR dataset can be accessed via Online Supplement #1 under the CC BY 4.0 Data license. The updated versions of the dataset would also be accessible in
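
The ID-based linkage described above can be illustrated with a minimal sketch. It assumes only the ID pattern shown in the abstract (a drug name, a period, and a review number, as in Zoloft.1). The names `ReviewId`, `parse_review_id`, and `group_by_review` are hypothetical, the input rows are invented placeholders, and nothing here reflects the actual file layout of the dataset.

```python
from collections import defaultdict
from typing import Iterable, NamedTuple


class ReviewId(NamedTuple):
    """Hypothetical container for an ID such as 'Zoloft.1'."""
    drug: str
    number: int


def parse_review_id(raw: str) -> ReviewId:
    """Split an ID of the assumed form '<drug>.<number>' into its parts."""
    # rpartition splits on the last period, so a drug name that itself
    # contained a period would still be kept whole.
    drug, sep, number = raw.strip().rpartition(".")
    if not sep or not drug or not number.isdigit():
        raise ValueError(f"unexpected review ID format: {raw!r}")
    return ReviewId(drug, int(number))


def group_by_review(rows: Iterable[tuple[str, str]]) -> dict[ReviewId, list[str]]:
    """Group (review_id, sentence) pairs by the review they came from."""
    grouped: dict[ReviewId, list[str]] = defaultdict(list)
    for raw_id, sentence in rows:
        grouped[parse_review_id(raw_id)].append(sentence)
    return dict(grouped)


# Placeholder rows for illustration only; these are not taken from the dataset.
example_rows = [
    ("Zoloft.1", "placeholder sentence A"),
    ("Zoloft.1", "placeholder sentence B"),
    ("Cymbalta.31", "placeholder sentence C"),
]
sentences_by_review = group_by_review(example_rows)
```

Keying on the parsed pair rather than the raw string makes it straightforward to filter by drug or to sort reviews numerically, which a plain string comparison would not do correctly (for example, "Zoloft.10" would sort before "Zoloft.2").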
