Simulated genetic structure of a clonal population (**A-C**) and a sexual population (**D-F**). All populations are evolving under neutral drift and are homogeneously mixing. Genetic maps (**A, D, G**), which are determined by principal co-ordinate analysis, represent the genetic distances between 1,000 randomly chosen isolates from the simulated population after 10^{6} generations have elapsed. Co-ordinates are expressed in units of sequence divergence. An alternative way to represent clustering is the distribution of sequence divergence between pairs of isolates in the population (**B, E, H**). The thin lines show the distances between each of five random strains and all the other strains in the sample, while the thick red line shows the distribution of all pairwise distances. Where there is little clustering (**E**), all pairwise distances are similar and the distribution has a single peak, while where there is strong clustering (**B, H**), the distribution has multiple peaks corresponding to pairwise comparisons within and between clusters. (**C, F, I**) show this distribution of pairwise comparisons evolving over 10^{6} generations. To normalise the distribution, pairs of isolates are compared for the number of alleles that differ, between 0 and 70, rather than for the proportion of base pairs that differ, as in (**B, E, H**). The height of the distribution is represented by color shade, ranging from black (0.0) to red (>0.1), so that peaks in (**B, E, H**) correspond to red shaded areas in (**C, F, I**). **C** and **I** show clusters moving apart, visible as red peaks moving up through time. When a cluster splits, a new peak appears at the bottom, while extinctions are apparent from peaks disappearing. **F** instead shows a more stable population structure, with a single diffuse cluster maintained throughout the simulation.
Parameter values for θ and ρ, the population mutation and recombination rates, are θ=2, ρ=0.01 (**A-C**) and ρ=20*10^{-18x}, where *x* is the sequence divergence (**D-F**). We also explored the conditions under which clustering could occur in the presence of high recombination rates (**G-I**). Clusters with high within-cluster recombination can be generated, mimicking spontaneous speciation, but this requires that the recombination rate declines with sequence divergence far more steeply than is characteristic of most bacteria studied to date, such that ρ=20*10^{-300x}.
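The two quantities this caption relies on can be illustrated with a minimal sketch: the divergence-dependent recombination rate ρ(x)=20*10^{-kx} (k=18 in **D-F**, k=300 in **G-I**) and the normalised distribution of allele differences between pairs of isolates. This is not the simulation code used for the figure. The function names, the toy allele profiles and the reading of *x* as a proportion (0.01 meaning 1% divergence) are assumptions made for illustration only.

```python
from collections import Counter
from itertools import combinations


def recombination_rate(divergence, decay, rho0=20.0):
    """rho(x) = rho0 * 10^(-decay * x).

    Assumption: `divergence` is a proportion (0.01 = 1% of sites differ).
    """
    return rho0 * 10.0 ** (-decay * divergence)


def allele_difference_distribution(profiles):
    """Normalised histogram of the number of differing alleles over all pairs.

    `profiles` is a list of equal-length allele tuples, one per isolate.
    Entry k of the result is the fraction of pairs differing at k loci.
    """
    n_loci = len(profiles[0])
    counts = Counter(
        sum(a != b for a, b in zip(p, q)) for p, q in combinations(profiles, 2)
    )
    n_pairs = sum(counts.values())
    return [counts.get(k, 0) / n_pairs for k in range(n_loci + 1)]


# Hypothetical toy data: 4 isolates typed at 3 loci (the figure uses 70 loci).
# The first two and the last two isolates form two tight clusters.
profiles = [(1, 1, 1), (1, 1, 2), (3, 4, 5), (3, 4, 6)]
dist = allele_difference_distribution(profiles)
print(dist)

# Compare how quickly the two decay constants suppress recombination.
for x in (0.0, 0.01, 0.05):
    print(x, recombination_rate(x, 18), recombination_rate(x, 300))
```

In the toy data the distribution has two peaks, one for within-cluster pairs (one differing allele) and one for between-cluster pairs (three differing alleles), which is the multi-peaked pattern described for (**B, H**). With k=300 the rate at 1% divergence is already a thousandfold lower than at zero divergence, whereas with k=18 it remains of the same order of magnitude.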
