J Biomed Inform. 2016 Dec;64:288-295. doi: 10.1016/j.jbi.2016.10.015. Epub 2016 Oct 31.

Evaluation of relational and NoSQL database architectures to manage genomic annotations.

Author information

1. Yale University, Department of Laboratory Medicine, New Haven, CT, United States. Electronic address: wade.schulz@yale.edu.
2. University of Minnesota, Department of Psychiatry, Minneapolis, MN, United States.
3. Agilevent, Scottsdale, AZ, United States.
4. Yale University, Department of Laboratory Medicine, New Haven, CT, United States.

Abstract

While the adoption of next-generation sequencing has rapidly expanded, the informatics infrastructure used to manage the data generated by this technology has not kept pace. Historically, relational databases have provided much of the framework for data storage and retrieval. Newer technologies based on NoSQL architectures may provide significant advantages in storage and query efficiency, thereby reducing the cost of data management, but their relative advantage when applied to biomedical data sets, such as genetic data, has not been characterized. To this end, we compared the storage, indexing, and query efficiency of a common relational database (MySQL), a document-oriented NoSQL database (MongoDB), and a relational database with NoSQL support (PostgreSQL). When used to store genomic annotations from the dbSNP database, we found that the NoSQL architectures outperformed traditional relational models in speed of data storage, indexing, and query retrieval in nearly every operation. These findings strongly support the use of novel database technologies to improve the efficiency of data management within the biological sciences.
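To make the relational versus document-oriented distinction concrete, the following is a minimal sketch and not the schema or code used in the study. It stores one nested annotation record in two ways: split across normalized tables (using Python's built-in `sqlite3` as a stand-in for a relational engine) and as a single serialized document (using a plain dictionary as a stand-in for a document store). The record, its field names (`rs_id`, `chrom`, `pos`, `alleles`), and all function names are hypothetical and chosen only for illustration.

```python
import json
import sqlite3

# Hypothetical annotation record. Field names and values are illustrative
# assumptions, not the dbSNP schema or the schema used in the study.
EXAMPLE_DOC = {
    "rs_id": "rs0000001",
    "chrom": "1",
    "pos": 12345,
    "alleles": [
        {"allele": "A", "freq": 0.75},
        {"allele": "G", "freq": 0.25},
    ],
}


def store_relational(conn, doc):
    """Split one nested record across two normalized tables."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS snp "
        "(rs_id TEXT PRIMARY KEY, chrom TEXT, pos INTEGER)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS allele "
        "(rs_id TEXT REFERENCES snp(rs_id), allele TEXT, freq REAL)"
    )
    conn.execute(
        "INSERT INTO snp VALUES (?, ?, ?)",
        (doc["rs_id"], doc["chrom"], doc["pos"]),
    )
    conn.executemany(
        "INSERT INTO allele VALUES (?, ?, ?)",
        [(doc["rs_id"], a["allele"], a["freq"]) for a in doc["alleles"]],
    )


def load_relational(conn, rs_id):
    """Reassemble the nested record; this requires a join across tables."""
    rows = conn.execute(
        "SELECT s.rs_id, s.chrom, s.pos, a.allele, a.freq "
        "FROM snp s JOIN allele a ON a.rs_id = s.rs_id "
        "WHERE s.rs_id = ? ORDER BY a.rowid",
        (rs_id,),
    ).fetchall()
    if not rows:
        return None
    first = rows[0]
    return {
        "rs_id": first[0],
        "chrom": first[1],
        "pos": first[2],
        "alleles": [{"allele": r[3], "freq": r[4]} for r in rows],
    }


def store_document(store, doc):
    """Keep the whole nested record as one serialized document."""
    store[doc["rs_id"]] = json.dumps(doc)


def load_document(store, rs_id):
    """Fetch the record with a single key lookup; no join is needed."""
    raw = store.get(rs_id)
    return json.loads(raw) if raw is not None else None
```

Both layouts return the same record; the difference is that the relational layout needs one insert per table and a join to read the record back, whereas the document layout writes and reads it as a unit. This sketch says nothing about relative speed, which is what the study itself measured.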

KEYWORDS: Genomics; MongoDB; MySQL; NoSQL; PostgreSQL; Relational database

PMID: 27810480
DOI: 10.1016/j.jbi.2016.10.015
[Indexed for MEDLINE]
Free full text
