Figure 1. Recurrent themes in the analysis of microbial habitat adaptation. From: Combined phylogenetic and genomic approaches for the high-throughput study of microbial habitat adaptation.

Numbered topics in bold correspond to sections in the text (see main text for additional detail). To compare microbes across habitats, it is first necessary to define the environmental factors that structure microbial communities. Insights into this question can be gained by combining sequence data from community surveys (e.g. 16S rRNA or other marker gene sequences) with rich metadata (Topic 1), using ordination techniques (Topic 2). These results can then help to define (and refine) important habitat categories. Interactions between organisms (such as competition or cooperation) can be characterized using co-occurrence analysis (Topic 3). When well-defined and annotated habitat categories (or data on environmental parameters) are available, surveys of microbial communities can be combined with genome sequence data and phylogenetic trees to allow more detailed study of habitat adaptation. Such studies include phylogenetic comparative measures (Topic 4), detection of horizontal gene transfer (Topic 5), and ancestral state reconstruction (Topic 6). Applying these techniques in combination allows inference of traits involved in habitat adaptation; these trait-habitat associations can then be put into a predictive framework using machine learning techniques (Topic 7) or ecological modeling. Traits predicted to be important for habitat adaptation can be selected for detailed experimental study (for example, by mutagenesis followed by competition in microcosms).

Jesse RR. Zaneveld, et al. Trends Microbiol. ;19(10):472-482.

Figure 2. The importance of phylogenetic correction in comparing traits across habitats. From: Combined phylogenetic and genomic approaches for the high-throughput study of microbial habitat adaptation.

Consider the problem faced by an investigator seeking to test whether adaptation to a copiotrophic environment (Habitat 2) is correlated with acquisition of additional metabolic genes relative to an oligotrophic environment (Habitat 1). To illustrate how phylogenetic structure can complicate such analyses, panels (a) and (b) summarize hypothetical (simulated) data representing a case in which habitat adaptation and metabolic gene evolution are purely independent. Given gene presence/absence data derived from whole genome sequences (a), it may be tempting to use traditional statistical methods without phylogenetic correction to test whether an organism's habitat influences the number of metabolic genes in its genome. Naïve assessment of the effect of habitat on gene content using a non-phylogenetic test may lead an investigator to conclude that the higher representation of metabolic genes in organisms found in Habitat 2 relative to Habitat 1 (58.53% vs. 49.7%) is statistically significant. In the example illustrated in (a), the G-test for independence yields a highly significant p value (p = 0.00249), despite no actual connection between habitat and gene content. Examination of the phylogeny relating the genomes (b) reveals a great deal of phylogenetic structure that is ignored by any statistical test that does not incorporate evolutionary relatedness (including, but not limited to, the G-test). Instead, such non-phylogenetic statistical tests implicitly assume the unstructured ‘star’ phylogeny (panel b, inset). Ignoring the hidden patterns of correlation caused by shared evolutionary history in this manner frequently produces false positive results such as that in (a). For example, non-phylogenetic tests would ignore the correlations caused by the close phylogenetic relationship between lineages I and J as well as K and L (thus overcounting genes from these lineages).
In this simple hypothetical example, we can readily observe that phylogenetically unaware statistical methods can generate false positive results. This phenomenon is widely recognized in the literature on phylogenetic comparative methods (see references [59–64, 81] for more detailed discussion). To illustrate that the false positive result obtained in this hypothetical example is specific neither to the details of the tree nor to the small number of genes depicted, we repeated the procedure depicted in (a) and (b) across 1000 simulated 256-taxon trees. In each case, 5000 binary characters (representing gene presence/absence) plus one habitat character were simulated in a purely neutral fashion. Because there was no genuine correlation between habitat and gene content, we would expect no more than a 5% false positive rate from a valid statistical test. However, in 38.4% (384/1000) of these hypothetical situations, a G-test of gene content versus habitat would falsely yield a statistically significant result (p < 0.05). This simple example illustrates that the application of phylogenetic comparative measures in studies of microbial habitat adaptation should be considered essential (see Table 1 for available software, and references [59–64, 81] for studies that address this issue).
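
The logic of this simulation can be sketched in a few lines of Python. The sketch below is illustrative only and is not the authors' code: the balanced tree, the symmetric per-branch flip probability, the pooling of all (taxon, gene) pairs into a single 2x2 table, and every function name are assumptions made for the example. The run sizes are far smaller than the 1000 trees of 256 taxa and 5000 characters described above, so the output will not reproduce the 38.4% figure. Setting the flip probability to 0.5 erases all phylogenetic signal and serves as a stand-in for the 'star' phylogeny.

```python
import math
import random


def g_test_2x2(table):
    """G-test of independence on a 2x2 table of counts; returns (G, p)."""
    row = [sum(r) for r in table]
    col = [sum(c) for c in zip(*table)]
    total = sum(row)
    g = 0.0
    for i in range(2):
        for j in range(2):
            observed = table[i][j]
            if observed > 0:  # empty cells contribute nothing to G
                expected = row[i] * col[j] / total
                g += 2.0 * observed * math.log(observed / expected)
    g = max(g, 0.0)  # guard against tiny negative rounding error
    # With 1 degree of freedom, the chi-square tail is erfc(sqrt(G / 2)).
    return g, math.erfc(math.sqrt(g / 2.0))


def evolve_on_balanced_tree(depth, flip, rng):
    """Tip states of one binary character on a balanced tree (2**depth tips).

    Assumed model: random root state, and an independent flip with
    probability `flip` on every branch. Tips are returned in a fixed order,
    so characters evolved separately share the same tree.
    """
    states = [rng.random() < 0.5]
    for _ in range(depth):
        states = [s != (rng.random() < flip) for s in states for _ in range(2)]
    return states


def false_positive_rate(n_reps, depth, n_genes, flip, rng, alpha=0.05):
    """Fraction of neutral replicates in which the G-test gives p < alpha."""
    hits = 0
    for _ in range(n_reps):
        habitat = evolve_on_balanced_tree(depth, flip, rng)
        table = [[0, 0], [0, 0]]  # rows: habitat, columns: gene absent/present
        for _ in range(n_genes):
            gene = evolve_on_balanced_tree(depth, flip, rng)
            for h, present in zip(habitat, gene):
                table[int(h)][int(present)] += 1
        _, p = g_test_2x2(table)
        hits += p < alpha
    return hits / n_reps


if __name__ == "__main__":
    # flip=0.05 keeps strong phylogenetic signal; flip=0.5 removes it.
    with_signal = false_positive_rate(300, 6, 20, 0.05, random.Random(1))
    no_signal = false_positive_rate(300, 6, 20, 0.5, random.Random(2))
    print(f"false positive rate, structured tree: {with_signal:.3f}")
    print(f"false positive rate, no phylogenetic signal: {no_signal:.3f}")
```

Habitat and gene characters are generated independently here, so every significant result is a false positive. Under these assumptions the structured-tree rate should sit well above the nominal 5%, while the no-signal control should stay close to it.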

Jesse RR. Zaneveld, et al. Trends Microbiol. ;19(10):472-482.
