Bioinformatics. 2018 Apr 15;34(8):1278-1286. doi: 10.1093/bioinformatics/btx779.

Identifying functionally informative evolutionary sequence profiles.

Author information

Department of Systems & Computational Biology, Albert Einstein College of Medicine, Bronx, NY 10461, USA.

Abstract

Motivation:

Multiple sequence alignments (MSAs) can provide essential input to many bioinformatics applications, including protein structure prediction and functional annotation. However, the optimal selection of sequences to obtain biologically informative MSAs for such purposes is poorly explored, and has traditionally been performed manually.

Results:

We present Selection of Alignment by Maximal Mutual Information (SAMMI), an automated, sequence-based approach to objectively select an optimal MSA from a large set of alternatives sampled from a general sequence database search. The hypothesis of this approach is that the mutual information among MSA columns will be maximal for those MSAs that contain the most diverse set possible of the most structurally and functionally homogeneous protein sequences. SAMMI was tested to select MSAs for functional site residue prediction by analysis of conservation patterns on a set of 435 proteins obtained from protein-ligand (peptides, nucleic acids and small substrates) and protein-protein interaction databases.

Availability and implementation:

A freely accessible program, including source code, implementing SAMMI is available at https://github.com/nelsongil92/SAMMI.git.
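
The selection criterion stated in the Results, choosing the MSA whose columns share the most mutual information, can be illustrated with a minimal sketch. This is not the SAMMI implementation; the function names `column_mutual_information`, `mean_pairwise_mi` and `select_msa` are hypothetical. Averaging over all column pairs and treating gap characters as ordinary symbols are assumptions made only for illustration, and the published method may weight, normalize or filter columns differently.

```python
import math
from collections import Counter
from itertools import combinations


def column_mutual_information(col_a, col_b):
    """Mutual information (in bits) between two equally long alignment columns."""
    n = len(col_a)
    count_a = Counter(col_a)
    count_b = Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), n_ab in count_ab.items():
        # p(a,b) / (p(a) * p(b)) simplifies to n_ab * n / (n_a * n_b)
        ratio = n_ab * n / (count_a[a] * count_b[b])
        mi += (n_ab / n) * math.log2(ratio)
    return mi


def mean_pairwise_mi(msa):
    """Average mutual information over all column pairs of an MSA.

    `msa` is a list of aligned sequences (strings of equal length).
    Gap characters are treated like any other symbol (an assumption).
    """
    columns = list(zip(*msa))
    pairs = list(combinations(columns, 2))
    if not pairs:
        return 0.0
    return sum(column_mutual_information(a, b) for a, b in pairs) / len(pairs)


def select_msa(candidates):
    """Return the candidate MSA with the highest mean pairwise column MI."""
    return max(candidates, key=mean_pairwise_mi)


# Toy example: two candidate alignments of four sequences each.
covarying = ["AC", "AC", "GT", "GT"]    # columns change together
independent = ["AC", "AT", "GC", "GT"]  # columns vary independently
best = select_msa([independent, covarying])
```

In this toy case the first alignment scores 1 bit because its two columns vary in lockstep, while the second scores 0 bits, so `select_msa` picks the first. A real application would score many candidate MSAs drawn from a database search rather than two hand-built ones.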

Contact:

andras.fiser@einstein.yu.edu.

Supplementary information:

Supplementary data are available at Bioinformatics online.

PMID: 29211823
PMCID: PMC5905606 [Available on 2019-04-15]
DOI: 10.1093/bioinformatics/btx779
