PLoS One. 2019 Jun 17;14(6):e0215476. doi: 10.1371/journal.pone.0215476. eCollection 2019.

Evaluating the predictability of medical conditions from social media posts.

Author information

1. Penn Medicine Center for Digital Health, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America.
2. Department of Emergency Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America.
3. Penn Medicine Center for Health Care Innovation, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America.
4. The Center for Health Equity Research and Promotion-Philadelphia Veterans Affairs Medical Center, Philadelphia, Pennsylvania, United States of America.
5. The Wharton School, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America.
6. Positive Psychology Center, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America.
7. Department of Computer and Information Science, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America.
8. Microsoft Research, New York, New York, United States of America.
9. Department of Computer Science, Stony Brook University, Stony Brook, New York, United States of America.

Abstract

We studied whether medical conditions across 21 broad categories were predictable from social media content across approximately 20 million words written by 999 consenting patients. Facebook language significantly improved upon the prediction accuracy of demographic variables for 18 of the 21 disease categories; it was particularly effective at predicting diabetes and mental health conditions including anxiety, depression and psychoses. Social media data are a quantifiable link into the otherwise elusive daily lives of patients, providing an avenue for study and assessment of behavioral and environmental disease risk factors. Analogous to the genome, social media data linked to medical diagnoses can be banked with patients' consent, and an encoding of social media language can be used as markers of disease risk, serve as a screening tool, and elucidate disease epidemiology. In what we believe to be the first report linking electronic medical record data with social media data from consenting patients, we identified that patients' Facebook status updates can predict many health conditions, suggesting opportunities to use social media data to determine disease onset or exacerbation and to conduct social media-based health interventions.

Conflict of interest statement

Regarding the commercial affiliation of author SH with Microsoft: This does not alter our adherence to PLOS ONE policies on sharing data and materials.
