Proc Natl Acad Sci U S A. 2019 Oct 15;116(42):21094-21103. doi: 10.1073/pnas.1818532116. Epub 2019 Sep 30.

A functional enrichment test for molecular convergent evolution finds a clear protein-coding signal in echolocating bats and whales.

Author information

Department of Developmental Biology, Stanford University, Stanford, CA 94305.
Department of Electrical Engineering, Stanford University, Stanford, CA 94305.
Biomedical Informatics Program, Stanford University, Stanford, CA 94305.
Department of Computer Science, Stanford University, Stanford, CA 94305.
Department of Molecular and Cellular Physiology, Stanford University School of Medicine, Stanford, CA 94305.
Department of Developmental Biology, Stanford University, Stanford, CA 94305;
Department of Pediatrics, Stanford University, Stanford, CA 94305.
Department of Biomedical Data Science, Stanford University, Stanford, CA 94305.


Distantly related species entering similar biological niches often adapt by evolving similar morphological and physiological characters. How much genomic molecular convergence (particularly of highly constrained coding sequence) contributes to convergent phenotypic evolution, such as echolocation in bats and whales, is a long-standing fundamental question. Like others, we find that convergent amino acid substitutions are not more abundant in echolocating mammals compared to their outgroups. However, we also ask a more informative question about the genomic distribution of convergent substitutions by devising a test to determine which, if any, of more than 4,000 tissue-affecting gene sets is most statistically enriched with convergent substitutions. We find that the gene set most overrepresented (q-value = 2.2e-3) with convergent substitutions in echolocators, affecting 18 genes, regulates development of the cochlear ganglion, a structure with empirically supported relevance to echolocation. Conversely, when comparing to nonecholocating outgroups, no significant gene set enrichment exists. For aquatic and high-altitude mammals, our analysis highlights 15 and 16 genes from the gene sets most affected by molecular convergence which regulate skin and lung physiology, respectively. Importantly, our test requires that the most convergence-enriched set cannot also be enriched for divergent substitutions, such as in the pattern produced by inactivated vision genes in subterranean mammals. Showing a clear role for adaptive protein-coding molecular convergence, we discover nearly 2,600 convergent positions, highlight 77 of them in 3 organs, and provide code to investigate other clades across the tree of life.
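The gene-set enrichment idea described in the abstract can be illustrated with a small sketch. This is not the authors' released code or their exact statistic: it assumes a simplified gene-level model (each gene either carries a convergent substitution or does not), a one-sided hypergeometric over-representation test, and Benjamini-Hochberg correction to obtain q-values. The function names, the toy gene identifiers, the set names, and the `alpha` threshold are all hypothetical. The final step mimics the stated requirement that the top convergence-enriched set must not also be enriched for divergent substitutions.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when n genes are drawn from N, of which K belong to the set."""
    total = comb(N, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1)) / total


def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted q-values in the original order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    qvals = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        qvals[i] = running_min
    return qvals


def enrichment_qvalues(gene_sets, hit_genes, background):
    """q-value per gene set for over-representation of hit_genes (assumed model)."""
    background = set(background)
    hits = set(hit_genes) & background
    names = list(gene_sets)
    pvals = []
    for name in names:
        members = set(gene_sets[name]) & background
        k = len(members & hits)
        pvals.append(hypergeom_sf(k, len(background), len(members), len(hits)))
    return dict(zip(names, benjamini_hochberg(pvals)))


def top_convergent_set(gene_sets, convergent, divergent, background, alpha=0.05):
    """Return (set name, q-value) for the most convergence-enriched set,
    or None if it is not significant or is also divergence-enriched."""
    q_conv = enrichment_qvalues(gene_sets, convergent, background)
    q_div = enrichment_qvalues(gene_sets, divergent, background)
    best = min(q_conv, key=q_conv.get)
    if q_conv[best] < alpha and q_div[best] >= alpha:
        return best, q_conv[best]
    return None


# Toy data (hypothetical gene IDs and set names, for illustration only).
background = [f"g{i}" for i in range(20)]
gene_sets = {
    "cochlear_ganglion": {"g0", "g1", "g2", "g3", "g4"},
    "skin": {"g5", "g6", "g7", "g8", "g9"},
}
convergent = {"g0", "g1", "g2", "g3", "g10"}
divergent = {"g11", "g12"}

result = top_convergent_set(gene_sets, convergent, divergent, background)
```

In this toy setup, `result` names the set whose members are over-represented among the convergent genes. Passing a `divergent` list that hits the same set makes the function return `None`, which is the kind of pattern the abstract attributes to inactivated vision genes in subterranean mammals.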


aquatic; coding; convergent evolution; echolocation; genome-wide functional enrichment tests

[Available on 2020-03-30]
