Advances in DNA sequencing have made it possible to sample metagenomes,
amplicons, and multiple whole genomes from natural microbial populations, but it is a
major challenge to glean from such data the key evolutionary and ecological forces
that have shaped a population. This challenge is sharpened when diverse members of
a single species are found to coexist. We studied the cyanobacterium Synechococcus sp.,
a dominant member of the dense biofilms in the hot springs of Yellowstone National
Park, focusing on the extensive diversity found within a single sample. We carried out
deep amplicon sequencing of many loci and analyzed multiple statistical properties of
the data. We previously showed that the population has undergone an unexpectedly
high degree of homologous recombination, unlinking synonymous SNP-pair
correlations even on intragenic length scales. Here we also investigate amino acid
and genic-level diversity, focusing on evidence of selection and hints about the evolutionary
history. Surprisingly, some features of the data, including the spectrum of distances
between the genic-alleles, appear consistent with primarily asexual neutral drift. Yet,
the non-synonymous site frequency spectrum has too large an excess of low-frequency
polymorphisms to be explained by purifying selection on deleterious mutations, given the
distribution of coalescent times that we infer. And the population is not asexual.
Taken together, these seemingly contradictory data imply that selection, epistasis,
and hitchhiking must be playing essential roles in creating and stabilizing the diversity.
We discuss potential roles that ecological sub-division at the organismal or genic level
may also play. From quantitative properties, including comparisons between two fully
sequenced genomes and previous metagenome data, we infer aspects of the history and
inter-spring dispersal of the meta-population since it was established in the
Yellowstone caldera. Our investigations illustrate the need for combining multiple
types of sequencing data and statistical analyses for developing understanding of
sub-species-level diversity.