J Biomed Inform. 2019 Jun;94:103205. doi: 10.1016/j.jbi.2019.103205. Epub 2019 May 11.

A distant supervision based approach to medical persona classification.

Author information

1. Information Retrieval and Extraction Lab, Kohli Center for Intelligent Systems, International Institute of Information Technology Hyderabad, 500032, India. Electronic address: nikhil.pattisapu@research.iiit.ac.in.
2. Information Retrieval and Extraction Lab, Kohli Center for Intelligent Systems, International Institute of Information Technology Hyderabad, 500032, India. Electronic address: manish.gupta@iiit.ac.in.
3. Information Retrieval and Extraction Lab, Kohli Center for Intelligent Systems, International Institute of Information Technology Hyderabad, 500032, India. Electronic address: pk.guru@iiit.ac.in.
4. Information Retrieval and Extraction Lab, Kohli Center for Intelligent Systems, International Institute of Information Technology Hyderabad, 500032, India. Electronic address: vv@iiit.ac.in.

Abstract

Identifying the medical persona from a social media post is critical for drug marketing, pharmacovigilance and patient recruitment. Medical persona classification aims to computationally model the medical persona associated with a social media post. We present a novel deep learning model for this task which consists of two parts: Convolutional Neural Networks (CNNs), which extract highly relevant features from the sentences of a social media post, and average pooling, which aggregates the sentence embeddings to obtain a task-specific document embedding. We compare our approach against standard baselines, such as Term Frequency-Inverse Document Frequency (TF-IDF) and averaged word embedding based methods, and against popular neural architectures, such as CNN-Long Short Term Memory (CNN-LSTM) and Hierarchical Attention Networks (HANs). Our model achieves an improvement of 19.7% in classification accuracy and 20.1% in micro F1 measure over the current state-of-the-art. We eliminate the need for manual labeling by employing a distant supervision based method to obtain labeled examples for training the models. We thoroughly analyze our model to discover cues that are indicative of a particular persona. In particular, we use first derivative saliency to identify the salient words in a given social media post.
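
The two-part architecture described in the abstract (a convolutional sentence encoder followed by average pooling over sentences) can be illustrated with a minimal, dependency-free sketch. This is not the authors' implementation: the function names, the ReLU activation, the max-over-time pooling inside each sentence, the zero-padding of short sentences, and the toy filters and word vectors are all assumptions made for illustration. The abstract specifies only that CNNs produce sentence embeddings and that average pooling combines them into a document embedding.

```python
def conv_sentence_embedding(word_vectors, filters, width):
    """Encode one sentence: slide each filter over windows of `width`
    consecutive word vectors, apply ReLU, then max-pool over positions.
    (ReLU and max-over-time pooling are illustrative assumptions.)"""
    dim = len(filters[0]) // width
    padded = list(word_vectors)
    # Pad short sentences with zero vectors so at least one window exists.
    while len(padded) < width:
        padded.append([0.0] * dim)
    features = []
    for f in filters:
        best = 0.0  # starting at 0 makes the running max equal to max(ReLU(.))
        for start in range(len(padded) - width + 1):
            # Flatten the window of word vectors into one long vector.
            window = [x for vec in padded[start:start + width] for x in vec]
            activation = sum(w * x for w, x in zip(f, window))
            best = max(best, activation)
        features.append(best)
    return features


def average_pool(sentence_embeddings):
    """Average the sentence embeddings dimension by dimension."""
    n = len(sentence_embeddings)
    return [sum(column) / n for column in zip(*sentence_embeddings)]


def document_embedding(sentences, filters, width):
    """Document embedding = average of the CNN sentence embeddings."""
    return average_pool(
        [conv_sentence_embedding(s, filters, width) for s in sentences]
    )


# Toy post with two sentences of 2-dimensional word vectors (hypothetical data).
toy_post = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.0, 1.0], [1.0, 0.0]],
]
# Two hypothetical filters spanning 2 words (2 words x 2 dims = 4 weights each).
toy_filters = [[1.0, 0.0, 0.0, 1.0], [-1.0, 0.0, 0.0, -1.0]]
toy_embedding = document_embedding(toy_post, toy_filters, width=2)
```

In a trained model the filter weights would be learned, and the resulting document embedding would feed a classification layer over persona labels. Both steps are omitted here.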

KEYWORDS:

Convolutional neural network; Deep learning; Distant supervision; Hierarchical attention network; Long short term memory network; Medical personae; Medical social media; Persona

PMID: 31085324
DOI: 10.1016/j.jbi.2019.103205
