Proc Natl Acad Sci U S A. 2011 Mar 15; 108(Suppl 1): 4578–4585.
Published online 2010 Jul 28. doi:  10.1073/pnas.1000081107
PMCID: PMC3063592
Colloquium Paper

Succession of microbial consortia in the developing infant gut microbiome


The colonization process of the infant gut microbiome has been called chaotic, but this view could reflect insufficient documentation of the factors affecting the microbiome. We performed a 2.5-y case study of the assembly of the human infant gut microbiome, to relate life events to microbiome composition and function. Sixty fecal samples were collected from a healthy infant along with a diary of diet and health status. Analysis of >300,000 16S rRNA genes indicated that the phylogenetic diversity of the microbiome increased gradually over time and that changes in community composition conformed to a smooth temporal gradient. In contrast, major taxonomic groups showed abrupt shifts in abundance corresponding to changes in diet or health. Community assembly was nonrandom: we observed discrete steps of bacterial succession punctuated by life events. Furthermore, analysis of ≈500,000 DNA metagenomic reads from 12 fecal samples revealed that the earliest microbiome was enriched in genes facilitating lactate utilization, and that functional genes involved in plant polysaccharide metabolism were present before the introduction of solid food, priming the infant gut for an adult diet. However, ingestion of table foods caused a sustained increase in the abundance of Bacteroidetes, elevated fecal short chain fatty acid levels, enrichment of genes associated with carbohydrate utilization, vitamin biosynthesis, and xenobiotic degradation, and a more stable community composition, all of which are characteristic of the adult microbiome. This study revealed that seemingly chaotic shifts in the microbiome are associated with life events; however, additional experiments ought to be conducted to assess how different infants respond to similar life events.

Keywords: human gut, metagenomics, microbial diversity, community assembly, short chain fatty acids

The assembly of the human gut microbiota begins during birth with colonization by microbes from the environment. In the first few hours of life, the mother's vaginal and fecal microbiomes are usually the most important source of inoculum (1, 2). During the initial few months of a milk diet, bacteria such as Bifidobacteria, highly adapted to process milk oligosaccharides, can be abundant (3). The introduction of solid foods heralds a shift toward bacterial consortia characteristic of the adult microbiota (4).

Although before weaning, the diet is a relatively constant supply of milk, during this time the microbiome can display large shifts in the abundances of bacterial taxa. For instance, in a time series analysis of 14 infants, Palmer et al. (4) documented fluctuations in the abundances of major bacterial taxonomic groups, and the temporal patterns of variation differed between individuals. Interpersonal variation in gut microbial diversity is greater between infants than between adults, and furthermore, the infant microbiome displays more interpersonal variability in functional gene content than the adult microbiome (5). The large functional and phylogenetic variation observed between infant gut microbiomes may be due to random colonization events, differences in immune responses to the colonizing microbes, changes in host behavior, or other aspects of host lifestyle (4, 6). How each of these factors contributes to shaping the infant microbiome remains unclear.

To investigate how life events impact the developing infant gut microbiome, we performed a case study to monitor the gut microbial composition of one infant over a period of 2.5 y. We analyzed a set of more than 60 fecal samples collected concurrently with detailed information regarding diet, health status, and activities. The infant was a full-term, vaginally delivered healthy male. He was placed in a daycare facility during weekdays starting at 3 mo and then removed from group care at 1 y. His diet regimen consisted of exclusive breast-feeding for the first 134 d of life, supplemented with formula until he was no longer breast-fed at 9 mo. The first solid food introduced to the diet was rice cereal at 4 mo, followed by table foods, and the replacement of formula with cow milk at 1 y. The child suffered from several ear infections for which he was treated with antibiotics, but was otherwise healthy, and he was immunized according to the US Centers for Disease Control and Prevention's recommended schedule.

We profiled the bacterial diversity of the fecal samples with 454-pyrosequencing: First, we generated 318,620 16S rRNA gene sequences (Table S1), which we used to map the dynamics of the developing microbiota onto a timeline of changes in diet and other life events. On the basis of the patterns observed from the 16S rRNA gene analysis, we performed a metagenomic analysis of >500,000 sequences from 12 samples to study in greater detail key transitions in microbial community composition triggered by life events (Table S2). These data were used to address the following questions: How does the diversity of the microbiota relate to the functional gene content of the microbiome over time? How are the communities that constitute the microbiota structured? How do changes in diet and events, such as antibiotic treatment, affect the succession and functions of bacteria consortia? This analysis allowed us to pinpoint specific events (e.g., illness, diet change, and antibiotic treatment) likely to have triggered significant changes in this infant's intestinal microbiota.


16S rRNA Gene Analysis Reveals Temporal Patterns of Qualitative Diversity.

For each sample, we measured phylogenetic diversity (PD), the sum of all of the branch lengths in a 16S rRNA gene phylogenetic tree: the greater the PD, the greater the diversity represented in the sample (7). As expected, PD increased over time and was positively correlated with age (R² = 0.5; Fig. 1). The first stool sample produced by the infant (meconium, a tarry substance consisting of the in utero accumulation of gut luminal material) had the lowest PD, and the sample with the highest PD was the mother's sample collected on the same day. There are several time points that deviate from the general trend of increasing PD over time (Fig. 1). Day 85, a time point just before a fever, had a low PD compared with preceding days; day 168, when peas and formula were introduced to the diet, had a relatively high PD compared with the previous sample day; and day 195 also had a high PD; however, this was not associated with any documented changes. Two of the three antibiotic treatments are followed by a decrease in PD relative to previous sample days. Although PD for day 244 is located on the trend line illustrated in Fig. 1, it is lower than previous sample days. The second treatment with amoxicillin, however, does not seem to affect the PD of the infant's microbiome as judged by 16S rRNA sequence analysis of sample day 297; this may be an indication of the adaptive power of the human microbiome as it pertains to multiple exposures to the same antibiotic. Consistent with the infant's first amoxicillin treatment, a low PD is observed on days 413, 432, and 441 after the infant's first exposure to the antibiotic cefdinir (a broad-spectrum cephalosporin).
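
As an illustration of the PD measure described above, the following minimal sketch sums the branch lengths that connect a sample's OTUs to the root of a tree. The tree, OTU names, and branch lengths are hypothetical, and counting the path to the root is one common convention assumed here; this is not the QIIME implementation used in the study.

```python
# Hypothetical toy tree: node -> (parent, length of the branch to the parent).
# The root has parent None. All names and lengths are made up for illustration.
TREE = {
    "root": (None, 0.0),
    "n1": ("root", 0.2),
    "n2": ("root", 0.3),
    "otuA": ("n1", 0.1),
    "otuB": ("n1", 0.4),
    "otuC": ("n2", 0.5),
}


def branches_to_root(tree, tip):
    """Return the nodes whose parent branch lies on the path from tip to root."""
    path = set()
    node = tip
    while tree[node][0] is not None:
        path.add(node)
        node = tree[node][0]
    return path


def phylogenetic_diversity(tree, observed_tips):
    """Sum the lengths of all branches linking the observed tips to the root.

    Each branch is counted once, even when several tips share it.
    """
    covered = set()
    for tip in observed_tips:
        covered |= branches_to_root(tree, tip)
    return sum(tree[node][1] for node in covered)


if __name__ == "__main__":
    # A sample containing more of the tree has a larger PD.
    print(phylogenetic_diversity(TREE, ["otuA"]))
    print(phylogenetic_diversity(TREE, ["otuA", "otuB", "otuC"]))
```

Because shared branches are counted only once, adding a close relative of an OTU already present raises PD less than adding a distant lineage.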

Fig. 1.
Bacterial PD of the infant gut microbiota over time. PD provides a measure of the diversity within a community based on the extent of the 16S rRNA phylogenetic tree that is represented by that community. Symbols are fecal samples. The mother's fecal sample, ...

In addition to comparing samples using measures of PD, we performed a principal coordinates analysis (PCoA) of unweighted UniFrac (8) to determine how the diversity among samples changed during the sampling period. This analysis showed that the diversity changed gradually over time (Fig. 2 A–D). Fecal samples collected early in the time series harbored microbial communities more similar to one another than to samples collected later on, and vice versa. The samples that deviate from this diversity gradient, days 413, 432, and 441, are the samples noted above with a lower relative PD. Samples associated with the same diet are adjacent in the gradient because they were collected from the same period in the infant's life. For instance, breast-milk, formula, and solid food associated samples form a contiguous pattern in the PCoA plot (Fig. 2 B–D).
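
The unweighted UniFrac distance used above can be sketched as the fraction of branch length that is covered by only one of two samples. The toy tree and sample memberships below are hypothetical, and the code is a simplified illustration under that assumption rather than the implementation used for Fig. 2.

```python
# Hypothetical toy tree: node -> (parent, length of the branch to the parent).
TREE = {
    "root": (None, 0.0),
    "n1": ("root", 0.2),
    "n2": ("root", 0.3),
    "otuA": ("n1", 0.1),
    "otuB": ("n1", 0.4),
    "otuC": ("n2", 0.5),
}


def covered_branches(tree, tips):
    """Return the nodes whose parent branch leads to at least one observed tip."""
    covered = set()
    for tip in tips:
        node = tip
        while tree[node][0] is not None:
            covered.add(node)
            node = tree[node][0]
    return covered


def unweighted_unifrac(tree, tips_a, tips_b):
    """Branch length unique to one sample divided by branch length in either."""
    a = covered_branches(tree, tips_a)
    b = covered_branches(tree, tips_b)
    total = sum(tree[node][1] for node in a | b)
    if total == 0:
        return 0.0
    unique = sum(tree[node][1] for node in a ^ b)
    return unique / total


if __name__ == "__main__":
    early = ["otuA", "otuB"]  # hypothetical early sample
    late = ["otuB", "otuC"]   # hypothetical later sample
    print(unweighted_unifrac(TREE, early, late))
```

A matrix of such pairwise distances is the input to the PCoA; only presence or absence of lineages matters, not their abundances.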

Fig. 2.
Community composition changes over time conform to a smooth temporal gradient. Time and PC1 from a PCoA of bacterial communities determined from 16S rRNA genes are plotted. (A) The color gradient corresponds to time (days): earlier samples are darkest ...

Succession of Bacterial Consortia and Patterns of Quantitative Diversity.

The abundance of operational taxonomic units (OTUs) was assessed across all samples, and OTUs were clustered in a heat map according to their cooccurrence (Fig. 3A). This clustering analysis revealed a succession of bacterial communities that resolved four discrete phases (steps) initiated by life events (e.g., fever at day 92 separates step 1 from step 2, diet change at day 161 divides steps 2 and 3, and antibiotic treatment and adult diet at day 371 divides steps 3 and 4). A linear discriminant analysis (LDA) was carried out to assess the statistical significance of these four steps: the a posteriori assignment probabilities of the steps indicate whether the fecal samples can be properly assigned to the steps given their community structure. Thus, we assigned the four steps as a priori categories in the LDA, and the resulting posterior probabilities for steps 1–4 were 0.90, 0.64, 0.76, and 0.71, respectively (Table S3). These results indicate that the steps can be differentiated according to the bacterial consortia of their respective fecal samples.

Fig. 3.
OTU-based community structure and composition in the gut microbiota. (A) Each vertical lane corresponds to a sample day, and the gray-scale shaded rectangles represent the abundance of the different OTUs. The dendrogram on the left shows how the OTUs are ...

In step 1 (days 3–84; Fig. 3A), the gut microbiome comprises a specific suite of Firmicute OTUs. Step 2 is preceded by an increase in the abundance of proteobacterial OTUs (days 92–100), which coincided with fever symptoms. Actinobacterial and proteobacterial OTU abundances increased in step 2, and the suite of Firmicute OTUs observed in step 1 differed from those observed in step 2. The introduction of formula and peas to the infant's diet is associated with an increase in Bacteroidetes in step 3 (days 172–297) that continues in step 4 (days 454–838); however, the specific Bacteroidetes OTUs enriched differ between these two steps (Fig. 3 A and B). The transition phase (days 371–441) from steps 3 to 4 is characterized by a number of environmental changes, including cefdinir treatment for an ear infection, exclusion of breast milk and formula from the diet, and an introduction of cow milk and a full adult diet. Interestingly, the transition phase preceding step 4 comprises OTUs that are typical of those observed during step 1, and therefore appear as outliers; again, these are the same samples that are outliers in the PD and UniFrac patterns (Figs. 1 and 2). Because this is a case study, we cannot attribute any single life event as the definitive pressure leading to the formation of the gut microbiomes defined within step 4. One scenario is that this change in the infant's microbiome may have been induced by a purge in PD as a result of cefdinir treatment. The microbial landscape in the gut could then reform according to substrates that are typical of an adult diet. Regardless, the abundances of bacterial phyla are relatively constant in step 4: this constancy among samples collected over more than 400 d is an indication that the infant gut microbiome has reached a stable state.

Species Cooccurrence and Exclusions.

Because our OTU-based cluster analysis revealed a succession of different microbial consortia over time (Fig. 3A), we tested whether the developing infant's gut microbiota was subject to community assembly rules. Specifically, we invoked two measures that assess OTU cooccurrence: the C-score and checkerboard measures. The C-score assesses the tendency for species to exclude one another from a given niche (9), whereas the number of checkerboard pairs corresponds to the number of species pairs that never cooccur (10). To assess the significance of the scores obtained from the dataset, we compared the C-score and checkerboard indices from actual data with scores obtained from 5,000 communities assembled randomly from the same OTU data. The C-score for the real dataset was 38.97, which is significantly greater than the simulated mean C-score of 35.98 obtained from the randomized data (P < 0.0002). The checkerboard measure for the microbial communities (2,561.00) was also significantly greater than the randomized mean checkerboard measure (2,321.03; P < 0.0002; Fig. 4 A and B). Together, these ecological measures indicate that the developing infant gut microbiota is composed of interacting bacterial consortia, not of randomly assembled suites of bacteria.
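
To make the two cooccurrence measures concrete, the sketch below computes them for a small presence/absence matrix (rows are OTUs, columns are samples). It assumes the usual definition of the C-score as the mean number of "checkerboard units" per OTU pair, (r_i − S)(r_j − S), where r_i and r_j are the numbers of samples containing each OTU and S is the number of samples containing both. The matrix is hypothetical, and the code is an illustration, not the EcoSim implementation.

```python
from itertools import combinations


def c_score(matrix):
    """Mean number of checkerboard units over all pairs of OTUs (rows)."""
    units = []
    for row_i, row_j in combinations(matrix, 2):
        shared = sum(1 for a, b in zip(row_i, row_j) if a and b)
        units.append((sum(row_i) - shared) * (sum(row_j) - shared))
    return sum(units) / len(units)


def checkerboard_pairs(matrix):
    """Number of OTU pairs that never occur together in any sample."""
    count = 0
    for row_i, row_j in combinations(matrix, 2):
        if not any(a and b for a, b in zip(row_i, row_j)):
            count += 1
    return count


# Hypothetical presence/absence data: 3 OTUs across 4 samples.
TOY_MATRIX = [
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 0, 1, 0],
]

if __name__ == "__main__":
    print(c_score(TOY_MATRIX))             # mean checkerboard units per pair
    print(checkerboard_pairs(TOY_MATRIX))  # pairs with no shared sample
```

Larger values of either index than expected by chance indicate that OTUs segregate among samples more than a random assembly would produce.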

Fig. 4.
Community assembly is nonrandom. (A) C-score distributions for observed and randomized OTU occurrence in each sample. (B) Checkerboard indices for observed and randomized OTU occurrence. Values for the observed distributions are indicated with arrows. ...

Bacterial Load and Diversity in Relation to Short Chain Fatty Acid Concentrations.

To gain insight into how community structure relates to microbial metabolite pools, we checked for relationships between bacterial diversity and short chain fatty acid (SCFA) concentrations in fecal samples (Fig. 5 A–C). Specifically, we measured the concentration of acetate, propionate, and butyrate in 56 fecal samples by GC-MS, and bacterial load by quantitative PCR. Overall, levels of acetate were highest and butyrate lowest, and levels of all three SCFAs were highly correlated with each other (Fig. 5B). SCFA levels and bacterial load were generally higher after the introduction of solid foods (Figs. S1 and S2). Bacterial diversity was correlated with all three SCFAs: PC1 of the unweighted UniFrac PCoA was negatively correlated with all three SCFAs (R²: 0.3, 0.4, and 0.1 for acetate, propionate, and butyrate, respectively; P < 0.001). A regularized canonical correlation analysis (RCCA) indicated that Bacteroidetes abundances were positively correlated with all three SCFAs and most strongly with propionate levels (Fig. 5C and Fig. S3). Verrucomicrobia were also positively correlated with acetate and propionate levels. In contrast, the abundance of Firmicutes correlated negatively with all three SCFAs and most strongly with propionate. Collectively, these measures suggest that community assembly is nonrandom and likely reflects syntrophic and antagonistic relationships mediated by microbial metabolites.
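
For readers less familiar with the R² values reported above, the following sketch computes a Pearson correlation coefficient and its square for two paired series, such as PC1 scores and an SCFA concentration. The numbers are invented for illustration and are not data from this study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def r_squared(x, y):
    """Fraction of variance in y explained by a linear fit on x."""
    return pearson_r(x, y) ** 2


if __name__ == "__main__":
    # Hypothetical values: ordination scores and metabolite concentrations.
    pc1 = [0.4, 0.2, 0.0, -0.3, -0.5]
    scfa = [5.0, 9.0, 8.0, 15.0, 14.0]
    print(pearson_r(pc1, scfa), r_squared(pc1, scfa))
```

The sign of the coefficient gives the direction of the relationship (negative in the PC1 comparison reported here), whereas R² reports only its strength.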

Fig. 5.
Relationships between phyla abundances and levels of SCFAs in feces. (A–C) Correlation matrices. (A) Phyla (V, Verrucomicrobia; P, Proteobacteria; F, Firmicutes; C, Cyanobacteria; B, Bacteroidetes; A, Actinobacteria). (B) SCFAs (A, acetate; P, ...

PD of Metagenomic Sequences.

Taxonomic assignment was determined using BLASTX (11), and the majority of sequences were bacterial genes. Low levels of fungi and viruses were also detected, and Euryarchaeota (Archaea) were detected in all samples including meconium (<0.01% of sequences). The majority of DNA sequences extracted from fecal samples collected at the beginning of the time series (meconium and day 6) were assigned to the Firmicute phylum (Fig. 6A), which is consistent with our PCR-based 16S rRNA gene survey for day 6 (Fig. S4). However, our 16S rRNA gene results for days 92–118, which show an abundance of Firmicute OTUs, are inconsistent with the abundance of actinobacterial genes obtained from these samples, likely reflecting 16S rRNA gene primer bias. Furthermore, the metagenomic analysis recovered fewer proteobacterial genes compared with the 16S rRNA gene-based analysis. Nevertheless, the patterns obtained from these two methods are consistent overall for this time period: the highest levels of actinobacterial and proteobacterial sequences were observed on sample days 92–118 in both analyses (Fig. S4). Interestingly, day 92, which was associated with fever, has the highest viral and fungal levels (Fig. 6A). Later in the time series (days 413, 432, and 441 after diet change and cefdinir treatment), the relative decrease in levels of Bacteroidetes OTUs observed by 16S rRNA analysis was not observed in our BLASTX taxonomic assignment of metagenomic sequences (Fig. S4).

Fig. 6.
Metagenomic analysis of DNA sequences extracted from infant fecal DNA. (A) Taxonomic assignment of metagenomic sequences. (B) Heat map and hierarchical clustering of samples based on MG-RAST subsystem gene content.

Functional Gene Dynamics in the Developing Infant Gut Microbiome.

We used the Meta Genome Rapid Annotation using Subsystem Technology (MG-RAST) (12) to assign gene functions to the 12 metagenomic samples. A summary of these results is represented as normalized heat maps, also generated using MG-RAST (Fig. 6B). According to relative abundances of subsystems, samples clustered into three main groups that reflect the time period of sample collection (Fig. 6B). We ran bootstrapping and resampling analyses to identify genes that were enriched in samples relative to an average representation of genes across the 12 samples (Table S4). Analysis of the meconium sample (day 3) revealed an enrichment of carbohydrate-metabolizing genes involved in lactose/galactose and sucrose uptake and utilization, genes involved in antibiotic resistance (e.g., ABC transporters), and virulence genes (e.g., multidrug resistance efflux genes, adhesion proteins, and pathogenicity islands). Day 6 also had many of the same enriched gene functions as the meconium samples, in addition to gene functions associated with cell membrane and cell wall components (Table S4). Furthermore, on day 6, genes associated with vitamin biosynthesis (e.g., vitamin B12, folate) were already present in the infant microbiome. By day 85, carbohydrate-using genes for amylose, arabinose, and maltose degradation, and virulence genes such as type III and IV secretion systems are enriched (Table S4). At day 92, when fever occurred, enriched eukaryotic rRNA modification genes likely reflect higher relative levels of fungi (Table S4). Carbohydrate-using genes enriched on day 92 include rhamnose, fructooligosaccharide and raffinose-utilization pathways, and xylose-degradation genes. Enrichment of sialic acid metabolism genes (day 85) and β-glucuronide utilization genes (day 100) may indicate that the microbiota is capable of using or mimicking host glycans early in life. Furthermore, in days 98, 100, and 118, before the introduction of the first solid food, additional genes for the utilization of plant-derived glycans, such as xylitol, are present (Table S4).

At the later time points (days 371, 432, 441, and 454) a complement of genes associated with the adult microbiome's core metabolic functions, namely polysaccharide breakdown, vitamin biosynthesis, and xenobiotic degradation, are evident. For instance, on day 371, genes for the utilization of maltose, maltodextrin, xylose, and mannose, which are polysaccharide breakdown products, are enriched. In addition, vitamin and cofactor biosynthesis genes including vitamin B6, thiamin, and flavodoxin are enriched on these sample days. Finally, genes reflecting the diversity of substrates in an adult diet were recovered; for example, genes for cinnamic acid degradation (day 432), benzoate catabolism (day 441), and additional enzymes involved in the anaerobic degradation of aromatic compounds (day 454) are present.

Relating Function to Phylogeny in the Infant Gut Microbiome.

We used RCCA to compare samples according to their gene content (Fig. S3 A–D). One step in RCCA is to correlate the abundances of phyla (from the phylogenetic assignment of genes) across samples; this revealed that the abundances of genes assigned to the Firmicute, Bacteroidetes, and Euryarchaeal phyla were positively correlated. In addition, the actinobacterial and proteobacterial gene content of samples was positively correlated (Fig. S3). RCCA resolved clusters of samples and indicated which functional genes were driving the clustering (Fig. S3 and Tables S5 and S6). Meconium, day 6, and day 85 form a cluster because they are enriched in genes taxonomically assigned to the Firmicutes, and their functions include Gram-positive cell wall components and central carbohydrate and organic acid metabolism (Fig. S3 A and B and Table S6). The sample from day 92, associated with fever, is clearly separated from the other samples in the analysis because it is enriched with genes assigned to the fungal phylum (Fig. S3 C and D and Table S6). The following days (98, 100, and 118) also separate from other metagenomic samples and are characterized by genes encoding ABC transporters and assigned phylogenetically to the Actinobacteria and Proteobacteria (Fig. S3D and Table S6). Interestingly, the abundance of actinobacterial and proteobacterial genes is strongly negatively correlated to the abundance of Firmicute genes. Days 371, 432, 441, and 454 are clustered because of their Bacteroidetes gene content, and this pattern is driven by an enrichment in genes related to carbohydrate fermentation, Gram-negative cell wall, and capsule formation (Fig. S3 A and B and Table S6).


An essential goal of the human microbiome project is to understand the assembly and community composition of the microbiota, not only to gain a better understanding of our own biology, but also because the microbiome is implicated in human health (13). Gut microbiotas can contribute to excess host adiposity (14–16), protect against the development of type 1 diabetes (17), and induce colitis (18) and metabolic syndrome (19). Thus, the microbiota has been suggested as a target for therapeutic intervention for several chronic diseases (13, 20–22). Adult microbiotas are thought to be relatively stable over time (14, 23, 24); this stability imparts resilience to disturbance, ensuring continued gut function. In a disease context, however, such stability and resilience could be detrimental if the gut community is pathogenic. Understanding the succession of bacterial consortia in the human gut during childhood may help in the development of strategies to guide the formation of health-promoting microbiotas that could then be maintained throughout the life of the host.

Our study of the gut microbiome of one infant followed over a 2.5-y period allowed an in-depth look into the dynamics of a developing intestinal ecosystem in relation to known disturbances. We observed a gradual increase in diversity over time, related to a gradual change in community diversity. Superimposed on these patterns of gradual change are the effects of life events, such as drastic diet changes or antibiotic treatments, which result in large shifts in the relative abundances of taxonomic groups. The qualitative measures of diversity, such as PD and UniFrac, responded to time, but the quantitative measures, such as the specific abundances of OTUs assembled into consortia of interacting species, responded to life events. Additional studies considering multiple subjects will assess whether infant microbiomes respond consistently to the same life events.

Our metagenomic analyses provided additional insight into the dynamics of the developing microbiome. For instance, the infant suffered a fever at day 92, during the exclusively breast-milk–fed period, which is followed by a shift in the abundances of a specific suite of OTUs. Fungal and viral genes were enriched at that time, suggesting a transient imbalance in the microbiota that might have been directly related to the fever. Another noteworthy observation was that genes facilitating the breakdown of plant-derived polysaccharides were present during this period, despite an exclusive breast-milk diet. This second observation is consistent with other metagenomic analyses of infant gut microbiomes, which reported microbial enzymes that degrade nondigestible polysaccharides of plant origin (2, 5). Together these studies suggest that the infant microbiome is metabolically ready for receiving simple plant-derived foods, such as rice cereal. This may explain why the introduction of rice cereal did not result in detectable changes in the 16S rRNA gene profiles in this infant's gut microbiome.

The introduction of peas and formula, followed by other table foods, may have been the cause of a codominance of the Bacteroidetes and Firmicutes and enrichment in functional genes characteristic of the adult gut microbiome. In addition to carbohydrate-using genes used for the breakdown of plant polysaccharides, functional genes present in the weaned infant microbiome included those involved in the breakdown of xenobiotic compounds and in vitamin biosynthesis. The abundances of bacterial phyla were relatively constant after weaning, indicating that the infant gut microbiome has reached a stable state. Together these results suggest that the 2.5-y-old human gut microbiome has many of the functional attributes of the adult microbiome.

The fine-scale temporal sampling allowed us to test whether the gut microbial community was subject to ecological assembly rules over time. The C-score and checkerboard analyses, which test for species cooccurrence and exclusion, strongly support a nonrandom pattern of community assembly. The human gut microbiota is known to be composed of syntrophic partners (25), as well as competing members (26, 27). Such ecological interactions likely underlie the nonrandom associations of species constituting the microbiota.

The introduction of table foods was followed by a large shift in phyla abundances within the infant's microbiome, in addition to increased bacterial loads and SCFA levels. Although specific members of the Firmicute phylum, such as Roseburia spp., are known to produce butyrate and respond to carbohydrate levels in the diet (28), this analysis did not detect positive relationships between Firmicute OTUs and SCFA levels, perhaps because a wide variety of gut bacteria can produce these metabolites. However, our 16S rRNA gene analysis showed a dramatic and sustained increase in the abundance of Bacteroidetes immediately after the introduction of peas and other table foods to the diet. The Bacteroidetes are specialized in the breakdown of complex plant polysaccharides (29); the introduction of plant-derived carbohydrates into the diet could have boosted populations of Bacteroidetes, which is consistent with mouse microbiome studies (30). The metabolic activities of these Bacteroidetes may have either directly or indirectly increased production of SCFAs. Consistent with these observations, low levels of Bacteroidetes in the gut are correlated with obesity, which itself may result from a diet low in plant-derived polysaccharides (23, 31). Thus, together these results further support the notion that a diet high in plant material promotes a microbial community structure and metabolite production that is beneficial to the human host.

This study revealed the power of sampling a microbiome over time to gain insight into the events that can alter its phylogenetic and functional composition. Our results complement those of Palmer et al. (4), who documented large compositional shifts in the abundances of major bacterial taxa over time in 14 babies, which they postulated could be a reflection of life events. We also observed large shifts in the abundances of major groups; interestingly, these shifts are associated with life events, such as illnesses, dietary changes, and antibiotic treatment, suggesting that differences in the colonization patterns of multiple babies would most likely reflect differences in their daily lives. Indeed, future temporal human microbiome studies should be performed in parallel to assess whether individual microbiomes respond differently to the same disturbances.


Samples and DNA Extraction.

This study was approved by the Internal Review Board of Washington University in St. Louis (protocol no. 09-0039), and samples were transferred to Cornell University under protocol no. 0910000952. Fecal samples were collected from a full-term, healthy infant during diaper changes. The birth was vaginal, no antibiotics were administered to the mother or the baby at birth, and the mother was antibiotic-free for the duration of the pregnancy. Samples were immediately frozen upon collection at −20 °C, then transferred to the laboratory and maintained at −80 °C until processing. Frozen samples were ground under liquid N2, then a subsample of ≈100 mg was used for whole-community DNA extraction. A 100-mg aliquot of each homogenized sample was suspended while frozen in a solution containing 500 μL of DNA extraction buffer [200 mM Tris (pH 8.0), 200 mM NaCl, and 20 mM EDTA], 210 μL of 20% SDS, 500 μL of a mixture of phenol/chloroform/isoamyl alcohol (25:24:1), and 500 μL of a slurry of 0.1-mm-diameter zirconia/silica beads (BioSpec Products). Microbial cells were then lysed by mechanical disruption with a bead beater (BioSpec Products) set on high for 2 min (22 °C), followed by extraction with phenol/chloroform/isoamyl alcohol and precipitation with isopropanol. The quantity and quality of purified DNA were assessed using the Quant-iT PicoGreen dsDNA Assay Kit (Invitrogen) and a plate reader.

Sample Preparation for 454 Pyrosequencing of 16S rRNA Genes.

16S rRNA genes were amplified from each sample using a composite forward primer and a reverse primer containing a unique 12-base barcode, which was used to tag PCR products from respective samples (31). We used the forward primer 5′-GCCTTGCCAGCCCGCTCAGTCAGAGTTTGATCCTGGCTCAG-3′: the leading sequence GCCTTGCCAGCCCGCTCAG is 454 Life Sciences primer B, and the trailing sequence AGAGTTTGATCCTGGCTCAG is the broadly conserved bacterial primer 27F. The reverse primer used was 5′-GCCTCCCTCGCGCCATCAGNNNNNNNNNNNNCATGCTGCCTCCCGTAGGAGT-3′: the leading sequence GCCTCCCTCGCGCCATCAG is 454 Life Sciences primer A, and the trailing sequence TGCTGCCTCCCGTAGGAGT is the broad-range bacterial primer 338R. NNNNNNNNNNNN designates the unique 12-base barcode used to tag each PCR product (31, 32), with “CA” inserted as a linker between the barcode and rRNA primer. PCR reactions consisted of HotMaster PCR mix (Eppendorf), 200 μM of each primer, and 10–100 ng template, and reaction conditions were 2 min at 95 °C, followed by 30 cycles of 20 s at 95 °C, 20 s at 52 °C, and 60 s at 65 °C on an Eppendorf thermocycler. Three independent PCRs were performed for each sample, combined and purified with Ampure magnetic purification beads (Agencourt), and products visualized by gel electrophoresis. No-template extraction controls were analyzed for lack of visible PCR products. Products were quantified using Quant-iT PicoGreen dsDNA assay as described above. A master DNA pool was generated from the purified products in equimolar ratios to a final concentration of 21.5 ng mL−1. The pooled products were sequenced using a Roche 454 FLX pyrosequencer at the Cornell University Life Sciences Core Laboratories Center.
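
The layout of the barcoded reverse primer (454 primer A, 12-base barcode, "CA" linker, 338R) can be sketched as follows, together with a simple way to assign a read to its sample. The barcode, the sample name, and the assumption that each read has already had the adaptor removed and therefore begins with the barcode are hypothetical; this is not the demultiplexing code used in the study.

```python
ADAPTER_A = "GCCTCCCTCGCGCCATCAG"    # 454 primer A, as given in the text
LINKER = "CA"                        # linker between barcode and rRNA primer
PRIMER_338R = "TGCTGCCTCCCGTAGGAGT"  # broad-range bacterial primer 338R
BARCODE_LENGTH = 12


def build_reverse_primer(barcode):
    """Assemble adaptor + barcode + linker + 338R for one sample."""
    if len(barcode) != BARCODE_LENGTH or set(barcode) - set("ACGT"):
        raise ValueError("barcode must be 12 unambiguous bases")
    return ADAPTER_A + barcode + LINKER + PRIMER_338R


def assign_read(read, barcode_to_sample):
    """Return (sample, amplicon) for a read, or None if it cannot be assigned.

    Assumes the read starts with the barcode, followed by the linker and primer.
    """
    barcode = read[:BARCODE_LENGTH]
    rest = read[BARCODE_LENGTH:]
    expected = LINKER + PRIMER_338R
    if barcode not in barcode_to_sample or not rest.startswith(expected):
        return None
    return barcode_to_sample[barcode], rest[len(expected):]


if __name__ == "__main__":
    samples = {"ACGTACGTACGT": "day_6"}  # hypothetical barcode map
    print(build_reverse_primer("ACGTACGTACGT"))
    read = "ACGTACGTACGT" + LINKER + PRIMER_338R + "GGGTTT"
    print(assign_read(read, samples))
```

Exact matching is used here for simplicity; production pipelines typically tolerate a small number of barcode or primer mismatches.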

16S rRNA Gene Sequence Analysis.

Sequences generated from pyrosequencing barcoded 16S rRNA gene PCR amplicons (average length 237 nt; Table S1) were analyzed using default settings in the open source software package Quantitative Insights Into Microbial Ecology (QIIME; http://qiime.sourceforge.net). 16S rRNA gene sequences were assigned to OTUs using the QIIME implementation of cd-hit (33) and a threshold of 97% pairwise identity. OTUs were classified taxonomically using the Ribosomal Database Project (RDP) classifier 2.0 (34). A single representative from each OTU was aligned using PyNAST (35) to build the phylogenetic tree used for measuring the PD of samples (7) and unweighted UniFrac (36).

Cooccurrence analysis.

The C-score and checkerboard indices (9) were determined using a null hypothesis of random community assembly, whereby 5,000 matrices were randomly generated from the 16S rRNA gene 0.97 OTU data with EcoSim Version 7.0. C-score and checkerboard distributions and P values were determined from the simulations using EcoSim's default settings.
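To make the C-score concrete, the sketch below computes it from a presence/absence matrix (rows are OTUs, columns are samples) as the mean number of checkerboard units over all OTU pairs (9). The null model shown simply shuffles each row independently, which is an assumption made for brevity and is not claimed to match the randomization algorithm used by EcoSim. The example matrix is hypothetical.

```python
from itertools import combinations
import random


def c_score(matrix):
    """Mean number of checkerboard units over all pairs of rows (OTUs)."""
    units = []
    for row_i, row_j in combinations(matrix, 2):
        shared = sum(a and b for a, b in zip(row_i, row_j))
        units.append((sum(row_i) - shared) * (sum(row_j) - shared))
    return sum(units) / len(units)


def null_p_value(matrix, n_random=5000, seed=0):
    """Fraction of randomized matrices with a C-score >= the observed one.

    Assumption: each row is shuffled independently, which keeps the number
    of samples occupied by each OTU but not the OTU count of each sample.
    """
    rng = random.Random(seed)
    observed = c_score(matrix)
    at_least = 0
    for _ in range(n_random):
        shuffled = [rng.sample(row, len(row)) for row in matrix]
        if c_score(shuffled) >= observed:
            at_least += 1
    return observed, at_least / n_random


# Hypothetical presence/absence data: 3 OTUs across 4 samples.
example = [
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 0, 1, 0],
]
observed_score, p_value = null_p_value(example, n_random=1000)
```

An observed C-score larger than most of the randomized values indicates that OTUs co-occur less often than expected under random assembly.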

Clustering analysis.

Rarefied (randomly subsampled to normalize sequence counts) OTUs, with an abundance greater than 5% and present in two or more samples, were hierarchically clustered using Kendall's τ similarity metric. The Self Organizing Map was generated using 20,000 iterations, also using the Kendall's τ similarity metric, in the freeware Cluster 3.0 (http://www.falw.vu/~huik/cluster.htm). Heat map graphics were generated using JavaTreeView (37). An LDA was carried out in R to study multivariate clustering of fecal samples according to their associated microbiotas (abundances of different RDP-assigned classes).
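The two steps named above, rarefaction and a Kendall's τ similarity, can be sketched as follows. This is a minimal illustration rather than the Cluster 3.0 or QIIME code: the τ shown is the simple tau-a form without tie correction (an assumption), and the OTU counts are hypothetical.

```python
import random


def rarefy(counts, depth, seed=0):
    """Randomly subsample `depth` sequences without replacement from OTU counts."""
    pool = [otu for otu, n in counts.items() for _ in range(n)]
    if depth > len(pool):
        raise ValueError("sample has fewer sequences than the requested depth")
    rng = random.Random(seed)
    rarefied = {otu: 0 for otu in counts}
    for otu in rng.sample(pool, depth):
        rarefied[otu] += 1
    return rarefied


def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    concordant = discordant = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            sign = (x[i] - x[j]) * (y[i] - y[j])
            if sign > 0:
                concordant += 1
            elif sign < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)


# Hypothetical OTU counts for one sample, rarefied to 100 sequences.
sample_counts = {"otu_1": 250, "otu_2": 40, "otu_3": 10}
rarefied_counts = rarefy(sample_counts, depth=100)
```

Rarefying every sample to the same depth before computing similarities keeps differences in sequencing effort from being mistaken for differences in community composition.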

Metagenomic Analysis of the Infant Gut Microbiome.

A metagenomic analysis was used to assess the diversity of microbial genes within the infant gut microbiome at different sample days. We studied three time periods: the early infant gut microbial communities (the meconium at day 3, and day 6), days associated with fever (days 85–118), and one time range associated with cefdinir treatment and diet change (days 371–454). Twelve whole-community fecal DNA samples were barcoded, pooled, and shotgun sequenced using the Roche-454 Titanium pyrosequencer. After filtering low-quality reads, we obtained a total of 482,919 sequences (Table S2).

Metagenomic sequences were trimmed using the CLC Genomic Work Bench 3.0. The minimum allowable sequence length was 100 bp, the quality score limit was 0.05, only two ambiguous nucleotides were permitted per sequence, and a hit limit of moderate was used to identify and remove vector sequences. The 454 replicate filter software (38) was used to remove sequences that were artificially replicated during the sequencing protocol. Filtered nucleotide metagenomic sequences were compared with the September 27th, 2009 version of the National Center for Biotechnology Information nonredundant database (nr) using BLASTX (11), and results were visualized in MEGAN (39) to determine the taxonomic distribution of genes in each library (i.e., the best BLASTX result using a maximum e-score of 10^−5 was used as an approximation for the taxonomic origin of a given sequence). Metagenomic sequences were functionally annotated using MG-RAST (http://metagenomics.nmpdr.org), built as a modified version of the RAST server (12). Normalized heat maps were also generated using MG-RAST, and the different gene pool arrays were hierarchically clustered using Cluster 3.0. An RCCA was performed to highlight correlations between the phylum abundance matrix (X of order n × p) and the gene functions matrix (Y of order n × q) retrieved from metagenomics, as well as between bacterial phyla and SCFAs, using the R package CCA (40). Regularization parameters λ1 and λ2 were chosen to maximize the leave-one-out cross-validation score (41).
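The length and ambiguity thresholds used for read trimming above can be sketched as a simple filter. This is only an illustration of those two criteria under stated assumptions: it does not reproduce the quality-score trimming or vector removal performed by the CLC software, nor the 454 replicate filter, and the function names and read names are hypothetical.

```python
MIN_LENGTH = 100     # minimum allowable sequence length (bp), from the text
MAX_AMBIGUOUS = 2    # maximum ambiguous nucleotides per sequence, from the text


def passes_filters(seq):
    """Return True if a read meets the length and ambiguity thresholds."""
    seq = seq.upper()
    # Assumption: any character other than A, C, G, or T counts as ambiguous.
    ambiguous = sum(1 for base in seq if base not in "ACGT")
    return len(seq) >= MIN_LENGTH and ambiguous <= MAX_AMBIGUOUS


def filter_reads(reads):
    """Keep only the reads (name -> sequence) that pass both thresholds."""
    return {name: seq for name, seq in reads.items() if passes_filters(seq)}


# Hypothetical reads, for illustration only.
reads = {
    "read_ok": "ACGT" * 30,                 # 120 bp, no ambiguous bases
    "read_short": "ACGT" * 20,              # 80 bp, too short
    "read_ambiguous": "ACGT" * 30 + "NNN",  # three ambiguous bases
}
kept = filter_reads(reads)
```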

Quantitative PCR Analysis.

Real-time PCR amplification and detection were performed using an ABI 7300 Real Time PCR System (Applied Biosystems). We used the Power SYBR Green PCR Master Mix (Applied Biosystems), including 0.2 μM of 16S rRNA primers 8F (5′ AGAGTTTGATCCTGGCTCAG) and 338R (5′ CTGCTGCCTCCCGTAGGAGT). Cycling conditions included an initial incubation at 50 °C for 2 min, denaturing at 95 °C for 10 min, then 40 cycles of 95 °C for 15 s and 60 °C for 1 min, followed by a dissociation curve step of 95 °C for 15 s, 60 °C for 30 s, and 95 °C for 15 s.

SCFA Analysis.

For each sample, 200 mg of frozen feces was vortexed for 1 min in 1% HCl. Isotope-labeled SCFAs were added in a final concentration of 5 mM [1-13C] acetate, 1 mM [2H5] propionate (Cambridge Isotopes), and 1 mM [2H5] propionate (Sigma Aldrich). Homogenized samples were centrifuged at 2,350 × g for 30 s. Supernatant was acidified to pH 0 with HCl. Each sample was partitioned into four aliquots and extracted at 4 °C with an equal volume of diethyl ether. Samples were incubated with 1-tertbutyl-dimethyl-silyl-imidazole (Sigma Aldrich) at 60 °C for 30 min before GC-MS analysis (Agilent 5975C Series; Agilent Technologies).


We thank Jeffrey Gordon for his support and Jeffrey Werner for comments on the manuscript. This research was supported by National Human Genome Research Institute grants (to R.K.) and an Arnold and Mabel Beckman Foundation Young Investigator award (to R.E.L.).


The authors declare no conflict of interest.

This article is a PNAS Direct Submission.

Data deposition: All 16S rRNA gene and metagenomic sequence data are archived in GenBank (accession no. SRA012472).

This paper results from the Arthur M. Sackler Colloquium of the National Academy of Sciences, “Microbes and Health,” held November 2–3, 2009 at the Arnold and Mabel Beckman Center of the National Academies of Sciences and Engineering in Irvine, CA. The complete program and audio files of most presentations are available on the NAS Web site at http://www.nasonline.org/SACKLER_Microbes_and_Health.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1000081107/-/DCSupplemental and http://microbe.calsnet.cornell.edu/leylab/fileshare/ITS_PNAS_SI/.


1. Gueimonde M, et al. Effect of maternal consumption of Lactobacillus GG on transfer and establishment of fecal bifidobacterial microbiota in neonates. J Ped Gastroenterol Nutr. 2006;42:166–170. [PubMed]
2. Vaishampayan PA, et al. Comparative metagenomics and population dynamics of the gut microbiota in mother and infant. Genome Biol Evol. 2010;2010:53–66. [PMC free article] [PubMed]
3. Sela DA, et al. The genome sequence of Bifidobacterium longum subsp. infantis reveals adaptations for milk utilization within the infant microbiome. Proc Natl Acad Sci USA. 2008;105:18964–18969. [PMC free article] [PubMed]
4. Palmer C, Bik EM, DiGiulio DB, Relman DA, Brown PO. Development of the human infant intestinal microbiota. PLoS Biol. 2007;5:e177. 10.1371/journal.pbio.0050177. [PMC free article] [PubMed]
5. Kurokawa K, et al. Comparative metagenomics revealed commonly enriched gene sets in human gut microbiomes. DNA Res. 2007;14:169–181. [PMC free article] [PubMed]
6. Dethlefsen L, Eckburg PB, Bik EM, Relman DA. Assembly of the human intestinal microbiota. Trends Ecol Evol. 2006;21:517–523. [PubMed]
7. Faith DP. Conservation evaluation and phylogenetic diversity. Biol Conserv. 1992;61:1–10.
8. Lozupone C, Knight R. UniFrac: A new phylogenetic method for comparing microbial communities. Appl Environ Microbiol. 2005;71:8228–8235. [PMC free article] [PubMed]
9. Stone L, Roberts A. The checkerboard score and species distributions. Oecologia. 1990;85:74–79.
10. Diamond JM. Assembly of species community. In: Cody ML, Diamond JM, editors. Ecology and Evolution of Communities. Cambridge, MA: Harvard Univ Press; 1975. pp. 342–444.
11. Altschul SF, et al. Gapped BLAST and PSI-BLAST: A new generation of protein database search programs. Nucleic Acids Res. 1997;25:3389–3402. [PMC free article] [PubMed]
12. Meyer F, et al. The metagenomics RAST server—a public resource for the automatic phylogenetic and functional analysis of metagenomes. BMC Bioinformatics. 2008;9:386. 10.1186/1471-2105-9-386. [PMC free article] [PubMed]
13. Turnbaugh PJ, et al. The human microbiome project. Nature. 2007;449:804–810. [PMC free article] [PubMed]
14. Turnbaugh PJ, et al. A core gut microbiome in obese and lean twins. Nature. 2009;457:480–484. [PMC free article] [PubMed]
15. Ley RE, et al. Obesity alters gut microbial ecology. Proc Natl Acad Sci USA. 2005;102:11070–11075. [PMC free article] [PubMed]
16. Turnbaugh PJ, et al. An obesity-associated gut microbiome with increased capacity for energy harvest. Nature. 2006;444:1027–1031. [PubMed]
17. Wen L, et al. Innate immunity and intestinal microbiota in the development of Type 1 diabetes. Nature. 2008;455:1109–1113. [PMC free article] [PubMed]
18. Garrett WS, et al. Communicable ulcerative colitis induced by T-bet deficiency in the innate immune system. Cell. 2007;131:33–45. [PMC free article] [PubMed]
19. Vijay-Kumar M, et al. Altered gut microbiota in toll-like receptor-5 deficient mice results in metabolic syndrome. Science. 2010;328:228–231. [PubMed]
20. Zaneveld J, et al. Host-bacterial coevolution and the search for new drug targets. Curr Opin Chem Biol. 2008;12:109–114. [PMC free article] [PubMed]
21. Jia W, Li H, Zhao L, Nicholson JK. Gut microbiota: A potential new territory for drug targeting. Nat Rev Drug Discov. 2008;7:123–129. [PubMed]
22. Rautava S, Kalliomäki M, Isolauri E. New therapeutic strategy for combating the increasing burden of allergic disease: Probiotics—a Nutrition, Allergy, Mucosal Immunology and Intestinal Microbiota (NAMI) Research Group report. J Allergy Clin Immunol. 2005;116:31–37. [PubMed]
23. Ley RE, Turnbaugh PJ, Klein S, Gordon JI. Microbial ecology: Human gut microbes associated with obesity. Nature. 2006;444:1022–1023. [PubMed]
24. Dethlefsen L, Huse S, Sogin ML, Relman DA. The pervasive effects of an antibiotic on the human gut microbiota, as revealed by deep 16S rRNA sequencing. PLoS Biol. 2008;6:e280. 10.1371/journal.pbio.0060280. [PMC free article] [PubMed]
25. Gibson GR, Macfarlane GT, Cummings JH. Sulphate reducing bacteria and hydrogen metabolism in the human large intestine. Gut. 1993;34:437–439. [PMC free article] [PubMed]
26. Duncan SH, et al. Effects of alternative dietary substrates on competition between human colonic bacteria in an anaerobic fermentor system. Appl Environ Microbiol. 2003;69:1136–1142. [PMC free article] [PubMed]
27. Flint HJ, Duncan SH, Scott KP, Louis P. Interactions and competition within the microbial community of the human colon: Links between diet and health. Environ Microbiol. 2007;9:1101–1111. [PubMed]
28. Duncan SH, et al. Reduced dietary intake of carbohydrates by obese subjects results in decreased concentrations of butyrate and butyrate-producing bacteria in feces. Appl Environ Microbiol. 2007;73:1073–1078. [PMC free article] [PubMed]
29. Xu J, et al. A genomic view of the human-Bacteroides thetaiotaomicron symbiosis. Science. 2003;299:2074–2076. [PubMed]
30. Turnbaugh PJ, et al. The effect of diet on the human gut microbiome: A metagenomic analysis in humanized gnotobiotic mice. Science Transl Med. 2009;1:16ra14. [PMC free article] [PubMed]
31. Costello EK, et al. Bacterial community variation in human body habitats across space and time. Science. 2009;326:1694–1697. [PMC free article] [PubMed]
32. Hamady M, Walker JJ, Harris JK, Gold NJ, Knight R. Error-correcting barcoded primers for pyrosequencing hundreds of samples in multiplex. Nat Methods. 2008;5:235–237. [PMC free article] [PubMed]
33. Li W, Godzik A. Cd-hit: A fast program for clustering and comparing large sets of protein or nucleotide sequences. Bioinformatics. 2006;22:1658–1659. [PubMed]
34. Wang Q, Garrity GM, Tiedje JM, Cole JR. Naive Bayesian classifier for rapid assignment of rRNA sequences into the new bacterial taxonomy. Appl Environ Microbiol. 2007;73:5261–5267. [PMC free article] [PubMed]
35. Caporaso JG, et al. PyNAST: A flexible tool for aligning sequences to a template alignment. Bioinformatics. 2010;26:266–267. [PMC free article] [PubMed]
36. Hamady M, Lozupone C, Knight R. Fast UniFrac: Facilitating high-throughput phylogenetic analyses of microbial communities including analysis of pyrosequencing and PhyloChip data. ISME J. 2010;4:17–27. [PMC free article] [PubMed]
37. Saldanha AJ. Java Treeview—extensible visualization of microarray data. Bioinformatics. 2004;20:3246–3248. [PubMed]
38. Gomez-Alvarez V, Teal TK, Schmidt TM. Systematic artifacts in metagenomes from complex microbial communities. ISME J. 2009;3:1314–1317. [PubMed]
39. Huson DH, Auch AF, Qi J, Schuster SC. MEGAN analysis of metagenomic data. Genome Res. 2007;17:377–386. [PMC free article] [PubMed]
40. González I, Déjean S, Martin PGP, Baccini A. CCA: An R package to extend canonical correlation analysis. J Stat Softw. 2008;23:1–14.
41. Leurgans S, Moyeed R, Silverman B. Canonical correlation analysis when the data are curves. J R Stat Soc Ser B. 1993;55:725–740.

Articles from Proceedings of the National Academy of Sciences of the United States of America are provided here courtesy of National Academy of Sciences