Syst Biol. 2017 Nov 1;66(6):1054-1064. doi: 10.1093/sysbio/syw121.

ProtASR: An Evolutionary Framework for Ancestral Protein Reconstruction with Selection on Folding Stability.

Author information

1. Instituto de Investigação e Inovação em Saúde (i3S), University of Porto, Porto, Portugal.
2. Institute of Molecular Pathology and Immunology of the University of Porto (IPATIMUP), Porto, Portugal.
3. Centre for Molecular Biology Severo Ochoa (CBMSO), Consejo Superior de Investigaciones Científicas (CSIC), Madrid, Spain.
4. Department of Biochemistry, Genetics and Immunology, University of Vigo, Vigo, Spain.
5. Department of Biology and Center for Computational Genetics and Genomics, Temple University, Philadelphia, PA 19122, USA.
6. Department of Molecular Biology, University of Wyoming, Laramie, WY 82071, USA.

Abstract

The computational reconstruction of ancestral proteins provides information on past biological events and has practical implications for biomedicine and biotechnology. Currently available tools for ancestral sequence reconstruction (ASR) are often based on empirical amino acid substitution models that assume that all sites evolve at the same rate and under the same process. However, this assumption is frequently violated because protein evolution is highly heterogeneous due to different selective constraints among sites. Here, we present ProtASR, a new evolutionary framework to infer ancestral protein sequences accounting for selection on protein stability. First, ProtASR generates site-specific substitution matrices through the structurally constrained mean-field (MF) substitution model, which considers both unfolding and misfolding stability. We previously showed that MF models outperform empirical amino acid substitution models, as well as other structurally constrained substitution models, both in terms of likelihood and correctly inferring amino acid distributions across sites. In the second step, ProtASR adapts a well-established maximum-likelihood (ML) ASR procedure to infer ancestral proteins under MF models. A known bias of ML ASR methods is that they tend to overestimate the stability of ancestral proteins by underestimating the frequency of deleterious mutations. We compared ProtASR under MF to two empirical substitution models (JTT and CAT), reconstructing the ancestral sequences of simulated proteins. ProtASR yields reconstructed proteins with less biased stabilities, which are significantly closer to those of the simulated proteins. Analysis of extant protein families suggests that folding stability evolves through time across protein families, potentially reflecting neutral fluctuation. Some families exhibit a more constant protein folding stability, while others are more variable. 
ProtASR is freely available from https://github.com/miguelarenas/protasr and includes detailed documentation and ready-to-use examples. It runs in seconds to minutes, depending on protein length and alignment size. [Ancestral sequence reconstruction; folding stability; molecular adaptation; phylogenetics; protein evolution; protein structure.]
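
The two-step idea in the abstract (one substitution matrix per site, then ML reconstruction on the tree) can be illustrated with a minimal sketch. It is not ProtASR's code and does not implement the MF model: as a stand-in, each site gets an F81-like matrix built from an assumed site-specific amino acid profile, and the root state is inferred by marginal reconstruction with Felsenstein's pruning algorithm. All function names, the tree encoding, and the example profile are hypothetical.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def site_frequencies(favoured, weight=0.9):
    """Hypothetical site profile: `weight` is shared by the favoured
    residues, the remainder is spread evenly over all other residues."""
    rest = [a for a in AMINO_ACIDS if a not in favoured]
    freqs = {a: weight / len(favoured) for a in favoured}
    freqs.update({a: (1.0 - weight) / len(rest) for a in rest})
    return freqs

def transition_matrix(freqs, t):
    """F81-like P(t), an assumed stand-in for a site-specific matrix:
    keep the state with probability e^-t, otherwise redraw from freqs."""
    decay = math.exp(-t)
    return {a: {b: (decay if a == b else 0.0) + (1.0 - decay) * freqs[b]
                for b in AMINO_ACIDS}
            for a in AMINO_ACIDS}

def partial_likelihood(node, freqs):
    """Felsenstein pruning for one site. A leaf is the observed residue
    (a string); an internal node is a list of (child, branch_length)."""
    if isinstance(node, str):
        return {a: 1.0 if a == node else 0.0 for a in AMINO_ACIDS}
    result = {a: 1.0 for a in AMINO_ACIDS}
    for child, branch_length in node:
        child_like = partial_likelihood(child, freqs)
        p = transition_matrix(freqs, branch_length)
        for a in AMINO_ACIDS:
            result[a] *= sum(p[a][b] * child_like[b] for b in AMINO_ACIDS)
    return result

def root_posterior(tree, freqs):
    """Marginal posterior of the root state at one site."""
    like = partial_likelihood(tree, freqs)
    weighted = {a: freqs[a] * like[a] for a in AMINO_ACIDS}
    total = sum(weighted.values())
    return {a: w / total for a, w in weighted.items()}

if __name__ == "__main__":
    # Two extant sequences show A and V at this site, on equal branches.
    tree = [("A", 0.5), ("V", 0.5)]
    uniform = {a: 1.0 / len(AMINO_ACIDS) for a in AMINO_ACIDS}
    buried = site_frequencies("VI")  # assumed profile for a buried site
    for label, freqs in (("uniform", uniform), ("site-specific", buried)):
        post = root_posterior(tree, freqs)
        best = max(post, key=post.get)
        print(label, best, round(post[best], 3))
```

Under the uniform profile, A and V are equally probable at the root; under the assumed site-specific profile, the reconstruction prefers V. This is the kind of difference a per-site model can make, although the direction and size of the effect in ProtASR depend on the MF model itself.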

PMID: 28057858
DOI: 10.1093/sysbio/syw121
[Indexed for MEDLINE]
