Stat Appl Genet Mol Biol. 2013 Jun;12(3):361-74. doi: 10.1515/sagmb-2012-0033.

Statistical issues associated with modeling of synonymous mutation data.

Author information

  • Statistical and Applied Mathematical Sciences Institute, 19 T.W. Alexander Drive, P.O. Box 14006, Research Triangle Park, NC 27709-4006, USA.


The explosion of data in evolutionary bioinformatics has led to data analyses that are sometimes ad hoc, incomplete and even inaccurate. Using dS data, that is, data on the number of synonymous substitutions per synonymous site, we work through a statistical analysis for modeling the time since gene duplication. We examine the shortcomings of previous analyses, particularly their effect on inference for the gene duplication process. We present a statistical analysis that respects the assumptions of the models and the integrity of the data, and we emphasize that exploratory data analysis, formulation of a data model, its estimation and, finally, assessment of the model are all important steps in a complete data analysis. Furthermore, for dS data, we develop Bayesian discrete-continuous mixture models and present analyses using two genomes.

[PubMed - indexed for MEDLINE]