Bioinformatics. 2018 May 1;34(9):1547-1554. doi: 10.1093/bioinformatics/btx815.

GRAM-CNN: a deep learning approach with local context for named entity recognition in biomedical text.

Author information

National Science Foundation Center for Big Learning, University of Florida, Gainesville, FL 32611, USA.
Department of Computer & Information Science & Engineering, University of Florida, Gainesville, FL 32611, USA.
Department of Electrical and Computer Engineering, University of Florida, Gainesville, FL 32611, USA.
Department of Microbiology and Cell Science, Institute for Food and Agricultural Sciences, University of Florida, Gainesville, FL 32611, USA.
Genomics of Gene Expression Laboratory, Centro de Investigación Príncipe Felipe, Valencia 46012, Spain.


Motivation:

The best-performing named entity recognition (NER) methods for biomedical literature are based on hand-crafted features or task-specific rules, which are costly to produce and difficult to generalize to other corpora. In non-biomedical NER tasks, end-to-end neural networks achieve state-of-the-art performance without hand-crafted features or task-specific knowledge. However, in the biomedical domain, using the same architecture does not yield competitive performance compared with conventional machine learning models.

Results:

We propose a novel end-to-end deep learning approach for biomedical NER tasks that leverages local context based on n-gram character and word embeddings via a convolutional neural network (CNN). We call this approach GRAM-CNN. To automatically label a word, this method uses the local information around that word. The GRAM-CNN method therefore does not require any specific knowledge or feature engineering and can, in principle, be applied to a wide range of existing NER problems. The GRAM-CNN approach was evaluated on three well-known biomedical datasets containing different BioNER entities. It obtained an F1-score of 87.26% on the Biocreative II dataset, 87.26% on the NCBI dataset and 72.57% on the JNLPBA dataset. These results put GRAM-CNN in the lead among biological NER methods. To the best of our knowledge, we are the first to apply CNN-based structures to BioNER problems.
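The idea of labeling a word from the local context around it can be illustrated with a minimal sketch of n-gram convolution over token embeddings. This is not the GRAM-CNN implementation: the function names `ngram_conv` and `local_context_features`, the toy embeddings, the filter widths (1 and 3) and the filter weights are all assumptions made for illustration, and a real model would learn its filters and feed the resulting features to a tagging layer.

```python
def ngram_conv(embeddings, filt, bias=0.0):
    """Slide one filter of odd width n over a zero-padded sequence.

    Returns one ReLU activation per token, so each token's value
    depends only on the n-gram window centred on that token.
    """
    n = len(filt)                      # filter width = n-gram size (assumed odd)
    dim = len(embeddings[0])           # embedding dimension
    pad = n // 2
    zero = [0.0] * dim
    padded = [zero] * pad + list(embeddings) + [zero] * pad
    out = []
    for i in range(len(embeddings)):
        window = padded[i:i + n]
        total = bias + sum(
            w * x
            for filt_row, vec in zip(filt, window)
            for w, x in zip(filt_row, vec)
        )
        out.append(max(0.0, total))    # ReLU non-linearity
    return out


def local_context_features(embeddings, filters):
    """Concatenate, per token, the activations of filters of different widths."""
    columns = [ngram_conv(embeddings, f) for f in filters]
    return [list(feats) for feats in zip(*columns)]


# Hypothetical 2-dimensional embeddings for a three-token sentence.
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Hypothetical hand-picked filters: one unigram filter and one trigram filter.
filters = [
    [[1.0, 1.0]],
    [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]],
]

features = local_context_features(embeddings, filters)
for token_index, feats in enumerate(features):
    print(token_index, feats)
```

Each token ends up with one feature per filter, computed only from its neighbouring tokens, which is the sense in which no hand-crafted features are needed in this sketch.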

Availability and implementation:

The GRAM-CNN source code, datasets and pre-trained model are available online at:

Contact: or

Supplementary information:

Supplementary data are available at Bioinformatics online.

[Indexed for MEDLINE]
Free PMC Article
