BMC Bioinformatics. 2015 Jul 25;16:228. doi: 10.1186/s12859-015-0673-2.

GESPA: classifying nsSNPs to predict disease association.

Author information

1. Department of Urology, SUNY Upstate Medical University, Syracuse, NY, USA. jaykk128@yahoo.com.
2. Department of Obstetrics and Gynecology, University of Rochester, Rochester, NY, USA. jay_reeder@urmc.rochester.edu.
3. Department of Pathology, SUNY Upstate Medical University, Syracuse, NY, USA. shrimpta@upstate.edu.
4. Department of Microbiology and Immunology, University of Rochester, Rochester, NY, USA. Juilee_Thakar@urmc.rochester.edu.
5. Department of Biostatistics and Computational Biology, University of Rochester, Rochester, NY, USA. Juilee_Thakar@urmc.rochester.edu.

Abstract

BACKGROUND:

Non-synonymous single nucleotide polymorphisms (nsSNPs) are the most common DNA sequence variation associated with disease in humans. Thus, determining the clinical significance of each nsSNP is of great importance. Potentially detrimental nsSNPs may be identified by genetic association studies or by functional analysis in the laboratory, both of which are expensive and time-consuming. Existing computational methods lack the accuracy and features needed to facilitate nsSNP classification for clinical use. We developed the GESPA (GEnomic Single nucleotide Polymorphism Analyzer) program to predict the pathogenicity and disease phenotype of nsSNPs.

RESULTS:

GESPA is a user-friendly software package for classifying the disease association of nsSNPs. It accepts a flexible range of input formats and predicts the pathogenicity of a given nsSNP by assessing the conservation of amino acids in orthologs and paralogs and supplementing this information with data from the medical literature. GESPA was developed and tested using the humsavar, ClinVar and humvar datasets. GESPA also predicts the disease phenotype associated with an nsSNP with high accuracy, a feature unavailable in existing software. GESPA's overall accuracy exceeds that of existing computational methods for predicting nsSNP pathogenicity. The usability of GESPA is enhanced by fast SQL-based cloud storage and retrieval of data.
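
To illustrate the general idea of conservation-based assessment mentioned above, the following is a minimal sketch in Python. It is not GESPA's algorithm, which the abstract does not specify: the function names, the toy alignment, the 0.9 threshold, and the three output labels are all assumptions chosen for illustration. The sketch scores how conserved an alignment column is across homologous sequences and flags a substitution when the column is highly conserved and the variant residue never appears in it.

```python
from collections import Counter

GAP = "-"


def column_conservation(aligned_seqs, pos):
    """Fraction of non-gap residues at `pos` that match the most common residue."""
    column = [seq[pos] for seq in aligned_seqs if seq[pos] != GAP]
    if not column:
        return 0.0
    _, count = Counter(column).most_common(1)[0]
    return count / len(column)


def classify_substitution(aligned_seqs, pos, variant_aa, threshold=0.9):
    """Toy rule (hypothetical, not GESPA's): label a substitution by conservation.

    - variant residue already observed in a homolog -> "likely tolerated"
    - column conservation >= threshold              -> "possibly deleterious"
    - otherwise                                     -> "uncertain"
    """
    observed = {seq[pos] for seq in aligned_seqs if seq[pos] != GAP}
    if variant_aa in observed:
        return "likely tolerated"
    if column_conservation(aligned_seqs, pos) >= threshold:
        return "possibly deleterious"
    return "uncertain"


# Hypothetical pre-aligned homologs (equal length, '-' marks a gap).
homologs = ["MKTAY", "MKTAY", "MRTAY", "MKTSY"]

print(column_conservation(homologs, 1))
print(classify_substitution(homologs, 0, "V"))
print(classify_substitution(homologs, 1, "R"))
```

A real classifier would need a proper multiple sequence alignment and, as the abstract describes for GESPA, supporting evidence from the medical literature; this sketch covers only the conservation step under the stated assumptions.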

CONCLUSIONS:

GESPA is a novel bioinformatics tool to determine the pathogenicity and phenotypes of nsSNPs. We anticipate that GESPA will become a useful clinical framework for predicting the disease association of nsSNPs. The program, executable jar file, source code, GPL 3.0 license, user guide, and test data with instructions are available at http://sourceforge.net/projects/gespa.

PMID: 26206375
PMCID: PMC4513380
DOI: 10.1186/s12859-015-0673-2
[Indexed for MEDLINE]
Free PMC Article
