A sliding window-based method to detect selective constraints in protein-coding genes and its application to RNA viruses

J Mol Evol. 2002 Nov;55(5):509-21. doi: 10.1007/s00239-002-2346-9.

Abstract

Here we present a new sliding window-based method specially designed to detect selective constraints in specific regions of a multiple protein-coding sequence alignment. In contrast to previous window-based procedures, our method is based on a nonarbitrary statistical approach to find the appropriate codon-window size to test deviations of synonymous (d(S)) and nonsynonymous (d(N)) nucleotide substitutions from the expectation. The probabilities of d(N) and d(S) are obtained from simulated data and used to detect significant deviations of d(N) and d(S) in a specific window region of the real sequence alignment. The nonsynonymous-to-synonymous rate ratio (w = d(N)/d(S)) was used to highlight selective constraints in any window wherein d(S) or d(N) was significantly different from the expectation. In these significant windows, w and its variance [V(w)] were calculated and used to test the neutral hypothesis. Computer simulations showed that the method is accurate even for highly divergent sequences. The main advantages of the new method are that it (i) uses a statistically appropriate window size to detect different selective patterns, (ii) is computationally less intensive than maximum likelihood methods, and (iii) detects saturation of synonymous sites, which can give deviations from neutrality. Hence, it allows the analysis of highly divergent sequences as well as the testing of different alternative hypotheses. The application of the method to different human immunodeficiency virus type 1 and to foot-and-mouth disease virus genes confirms the action of positive selection on previously described regions as well as on new regions.
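
The window scan described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes per-codon estimates of d(N) and d(S) are already available as plain lists, that the simulated data have been reduced to lists of window-level values (`null_dn`, `null_ds`), and that a window is flagged by a simple two-sided empirical p-value. All function and parameter names are hypothetical, the window size is passed in rather than chosen statistically, and the test of w against neutrality using V(w) is omitted because the abstract does not give its formula.

```python
def window_means(values, size):
    """Mean of each contiguous window of `size` codons."""
    return [sum(values[i:i + size]) / size
            for i in range(len(values) - size + 1)]


def empirical_p(observed, null_values):
    """Two-sided empirical p-value (assumed test, not from the paper).

    Counts how many simulated values lie at least as far from the
    simulated mean as the observed value does.
    """
    mu = sum(null_values) / len(null_values)
    deviation = abs(observed - mu)
    hits = sum(1 for v in null_values if abs(v - mu) >= deviation)
    # Add-one correction so the p-value is never exactly zero.
    return (hits + 1) / (len(null_values) + 1)


def scan(dn, ds, null_dn, null_ds, size, alpha=0.05):
    """Flag windows where d(N) or d(S) deviates from the simulated expectation.

    Returns a list of (window_start, w, p_dn, p_ds) tuples,
    where w = d(N)/d(S) for that window.
    """
    flagged = []
    pairs = zip(window_means(dn, size), window_means(ds, size))
    for start, (n, s) in enumerate(pairs):
        p_n = empirical_p(n, null_dn)
        p_s = empirical_p(s, null_ds)
        if min(p_n, p_s) < alpha:
            # w is undefined when no synonymous change is estimated.
            w = n / s if s > 0 else float("inf")
            flagged.append((start, w, p_n, p_s))
    return flagged
```

Calling `scan(dn, ds, null_dn, null_ds, size=4)` returns only the windows in which at least one of the two rates departs from the simulated distribution. A window with w above 1 would then be a candidate for positive selection, and one with w below 1 a candidate for purifying selection, subject to the significance test on w that this sketch leaves out.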

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Base Sequence
  • Capsid Proteins / genetics
  • DNA, Viral / genetics
  • Evolution, Molecular
  • Foot-and-Mouth Disease Virus / genetics
  • Genes, Viral*
  • Genes, env
  • Genes, gag
  • HIV-1 / genetics
  • Phylogeny
  • RNA Viruses / genetics*
  • Reproducibility of Results
  • Selection, Genetic
  • Sequence Alignment / methods*
  • Sequence Alignment / statistics & numerical data
  • Viral Proteins / genetics*

Substances

  • Capsid Proteins
  • DNA, Viral
  • Viral Proteins