Mol Biol Evol. 2015 Jan;32(1):244-57. doi: 10.1093/molbev/msu269. Epub 2014 Sep 22.

Evaluating the use of ABBA-BABA statistics to locate introgressed loci.


Department of Zoology, University of Cambridge, Cambridge, United Kingdom


Several methods have been proposed to test for introgression across genomes. One method tests for a genome-wide excess of shared derived alleles between taxa using Patterson's D statistic, but does not establish which loci show such an excess or whether the excess is due to introgression or ancestral population structure. Several recent studies have extended the use of D by applying the statistic to small genomic regions, rather than genome-wide. Here, we use simulations and whole-genome data from Heliconius butterflies to investigate the behavior of D in small genomic regions. We find that D is unreliable in this situation as it gives inflated values when effective population size is low, causing D outliers to cluster in genomic regions of reduced diversity. As an alternative, we propose a related statistic f_d, a modified version of a statistic originally developed to estimate the genome-wide fraction of admixture. f_d is not subject to the same biases as D, and is better at identifying introgressed loci. Finally, we show that both D and f_d outliers tend to cluster in regions of low absolute divergence (d_XY), which can confound a recently proposed test for differentiating introgression from shared ancestral variation at individual loci.
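
The abstract does not give formulas, so the following is only a minimal sketch of how the two statistics are commonly computed for one genomic window, assuming the usual four-taxon arrangement (((P1, P2), P3), O) and per-site derived allele frequencies as input. The function names (`abba_baba_sums`, `patterson_d`, `f_d`) and the list-based inputs are hypothetical choices for illustration, not the authors' code. The second function follows the usual description of the proposed alternative statistic, whose denominator replaces P2 and P3 at each site with whichever of the two has the higher derived allele frequency.

```python
import math

def abba_baba_sums(p1, p2, p3, po):
    """Sum the expected ABBA and BABA pattern counts over sites.

    Each argument is a list of derived allele frequencies (0..1),
    one value per site, for populations P1, P2, P3 and the outgroup O.
    """
    abba = sum((1 - a) * b * c * (1 - o) for a, b, c, o in zip(p1, p2, p3, po))
    baba = sum(a * (1 - b) * c * (1 - o) for a, b, c, o in zip(p1, p2, p3, po))
    return abba, baba

def patterson_d(p1, p2, p3, po):
    """Patterson's D for one window: (ABBA - BABA) / (ABBA + BABA)."""
    abba, baba = abba_baba_sums(p1, p2, p3, po)
    denom = abba + baba
    # A window with no informative sites has no defined D.
    return (abba - baba) / denom if denom else math.nan

def f_d(p1, p2, p3, po):
    """Sketch of the modified admixture statistic for one window.

    Numerator: ABBA - BABA for (P1, P2, P3, O).
    Denominator: ABBA - BABA for (P1, PD, PD, O), where PD is, at each
    site, whichever of P2 and P3 has the higher derived allele frequency.
    """
    num_abba, num_baba = abba_baba_sums(p1, p2, p3, po)
    pd = [max(b, c) for b, c in zip(p2, p3)]
    den_abba, den_baba = abba_baba_sums(p1, pd, pd, po)
    denom = den_abba - den_baba
    return (num_abba - num_baba) / denom if denom else math.nan
```

In a window scan, both functions would be applied to the sites falling in each window. Because the denominator of D shrinks when few sites are informative, a handful of ABBA sites can push D toward 1, which is one way to picture the inflation in low-diversity regions that the abstract reports.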


ABBA–BABA; Heliconius; gene flow; introgression; population structure; simulation

[Indexed for MEDLINE]