Proc Natl Acad Sci U S A. Sep 8, 2009; 106(36): E95.
Published online Aug 31, 2009. doi:  10.1073/pnas.0904550106
PMCID: PMC2741286

In defense of statistical methods for detecting positive selection

In a highly publicized article, Nozawa et al. (1) claimed that the branch-site model (BSM) (2, 3) was unreliable because it produced excessive false positives in their simulation experiment. BSM uses a likelihood ratio test to detect positive selection that affects particular branches and codons in protein-coding genes, indicated by accelerated nonsynonymous substitution rates. The authors' conclusion, if true, would be important. But it is contradicted by their simulation results.

The study generated 14,000 datasets under a null model that postulated no positive selection and found that BSM falsely detected positive selection in 32 cases. Nozawa et al. (1) claimed that those false positives were “not supposed to be obtained theoretically” and indicated “abnormal behaviors” of the likelihood ratio test. Those claims are false: the false-positive rate is only 0.23% (32 of 14,000), much lower than the nominal significance level (5%). Contrary to Nozawa et al.'s claims, the test is thus conservative. Nozawa et al. preferred a parsimony-based approach, which averages rates over the whole protein and achieved a 0% false-positive rate in their simulation. The authors did not examine the power of the tests. In previous simulations (4), such parsimony-based methods were found to have little power, even when the likelihood ratio tests detected positive selection with ≈100% power.
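The arithmetic behind this comparison can be checked with a short Python sketch. The counts (32 false positives in 14,000 datasets) and the 5% nominal level are taken from the text above; the variable names are illustrative, and the binomial tail at the end assumes the simulated datasets are independent, which is an assumption made here rather than something stated in the letter.

```python
import math

# Figures quoted in the text above.
N_DATASETS = 14_000
FALSE_POSITIVES = 32
NOMINAL_ALPHA = 0.05

# Observed false-positive rate and the count expected if the test
# rejected at exactly the nominal level.
rate = FALSE_POSITIVES / N_DATASETS
expected = N_DATASETS * NOMINAL_ALPHA

def log_binom_pmf(k: int, n: int, p: float) -> float:
    """Log of the binomial probability mass at k, computed via lgamma."""
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_choose + k * math.log(p) + (n - k) * math.log(1 - p)

# Illustrative only (assumes independent datasets): probability of seeing
# 32 or fewer rejections if the true rejection rate were the nominal 5%.
tail = sum(
    math.exp(log_binom_pmf(k, N_DATASETS, NOMINAL_ALPHA))
    for k in range(FALSE_POSITIVES + 1)
)

print(f"observed false-positive rate: {rate:.2%}")
print(f"expected false positives at 5%: {expected:.0f}")
print(f"P(X <= {FALSE_POSITIVES} | rate = 5%): {tail:.3g}")
```

An observed count far below the roughly 700 rejections expected at the nominal level is what a conservative test looks like, not an inflated false-positive rate.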

We suggest that sensible use of statistical methods for detecting positive selection such as BSM (5) is valuable in comparative analysis of genomic data. Such methods can generate biological hypotheses for experimental verification, narrowing down the possibilities for testing in the laboratory. Nozawa et al.'s results, interpreted correctly, support this view, as do many studies in which the statistical predictions were validated in the laboratory.

Footnotes

The authors declare no conflict of interest.

References

1. Nozawa M, Suzuki Y, Nei M. Reliabilities of identifying positive selection by the branch-site and the site-prediction methods. Proc Natl Acad Sci USA. 2009;106:6700–6705.
2. Yang Z, Wong WSW, Nielsen R. Bayes empirical Bayes inference of amino acid sites under positive selection. Mol Biol Evol. 2005;22:1107–1118.
3. Zhang J, Nielsen R, Yang Z. Evaluation of an improved branch-site likelihood method for detecting positive selection at the molecular level. Mol Biol Evol. 2005;22:2472–2479.
4. Wong WSW, et al. Accuracy and power of statistical methods for detecting adaptive evolution in protein coding sequences and for identifying positively selected sites. Genetics. 2004;168:1041–1051.
5. Yang Z. PAML 4: Phylogenetic analysis by maximum likelihood. Mol Biol Evol. 2007;24:1586–1591.

Articles from Proceedings of the National Academy of Sciences of the United States of America are provided here courtesy of National Academy of Sciences