Proc Natl Acad Sci U S A. 2002 Jun 11; 99(12): 7832–7835.
Published online 2002 Jun 4. doi:  10.1073/pnas.122225499
PMCID: PMC122979
Geology, Evolution

Documenting a significant relationship between macroevolutionary origination rates and Phanerozoic pCO2 levels


We show that the rates of diversification of the marine fauna and the levels of atmospheric CO2 have been closely correlated for the past 545 million years. These results, using two of the fundamental databases of the Earth's biota and the Earth's atmospheric composition, respectively, are highly statistically significant (P < 0.001). The strength of the correlation suggests that one or more environmental variables controlling CO2 levels have had a profound impact on evolution throughout the history of metazoan life. Comparing our work with highly significant correlations described by D. H. Rothman [Rothman, D. H. (2001) Proc. Natl. Acad. Sci. USA 98, 4305–4310] between total biological diversity and a measure of stable carbon isotope fractionation, we find that the rates of diversification rather than total diversification correlate with environmental variables, and that the rate of diversification follows the record of CO2 projected by R. A. Berner and Z. Kothavala [Berner, R. A. & Kothavala, Z. (2001) Am. J. Sci. 301, 182–204] more closely than that predicted by Rothman.

A striking correspondence exists between measures of Phanerozoic macroevolution and environmental dynamics, exemplified by the rate of origination of new forms of marine animals and the level of CO2 (Fig. 1). The Pearson correlation coefficient between the two records is 0.66 with significance P < 0.001. Such convergence between biological and geochemical history over so long a period [545 million years (my)] suggests interesting macroevolutionary hypotheses (described below). The macroevolutionary data in Fig. 1 are derived from Sepkoski's (1, 2) extensive database of the records of first and last appearances of fossil marine genera. Our measure of the intensity of diversification of the marine fauna is the genera fractional origination rate, the rate at which new genera appear divided by the number of genera present, a measure also used by Sepkoski (1). The estimate of CO2 levels during the Phanerozoic is Berner and Kothavala's (3) computation based on their model GEOCARB III. The two records are quite congruent. For both, high levels persist during the early Paleozoic, falling to low levels in the late Paleozoic. Elevated levels occur again in the early Mesozoic trailing down into the Holocene.

Figure 1
Macroevolutionary origination rate and pCO2. The genera fractional origination rate, F_i [based on Sepkoski (1, 2); see text], is shown by red dots; the five-point-centered moving average of those data is shown by the black solid line. ...

Sepkoski's (1, 2) database has gone through several iterations, most recently and thoroughly when it was increased from family-level data to genus-level data, and it has frequently been used to evaluate hypotheses about the fossil record (e.g., refs. 2, 4, and 5). GEOCARB III incorporated information about continental area, elevation, and position; seafloor subduction and spreading rate; the variation of solar radiation; the rise of vascular plants and angiosperms; and shallow- vs. deep-water carbonate deposition rates. Berner and Kothavala (3) showed that CO2 levels computed by GEOCARB III are particularly sensitive to the effects of plant-mediated weathering and paleotemperature. None of the GEOCARB III source information was derived directly from the diversity of marine fossils, preventing spurious autocorrelation between this database and the data of Sepkoski (1, 2).

Both data sets were considered by their authors to be reasonable representations of their respective underlying systems, albeit with potentially large margins of error. The fossil data are characterized by typical paleontological incompleteness, variable taxonomic practice, and inaccuracies or biases (6), but they derive from direct observation and, according to Raup and Sepkoski (7) and Miller (5), the biological signal from the fossil record overwhelms any noise in the database. Berner (3) showed large margins of error for his calculations but stated that although exact values of CO2 should not be taken literally, the overall trend seemed valid.

Data Analysis

We explain the computation of the observed 0.66 correlation, examine the robustness of that correlation in the context of potential errors in the two data sets, estimate the significance of the correlation, and compare genera fractional extinction rate to genera fractional origination rate.

Sepkoski partitioned the Phanerozoic into 108 stages and substages of average length 5 my. Three numbers are recorded for each (sub)stage [t_i, t_{i+1}] (my): G_i, the total number of genera recorded at some time in [t_i, t_{i+1}]; O_i, the number of genera that first appeared in [t_i, t_{i+1}]; and E_i, the number of genera that last appeared in [t_i, t_{i+1}]. For a (sub)stage [t_i, t_{i+1}] the genera fractional origination rate, F_i, is (O_i/G_i)/(t_{i+1} − t_i). With unit 1/my, F_i is defined as the genera fractional origination rate at the midpoint τ_i = (t_i + t_{i+1})/2 of [t_i, t_{i+1}], and the data {(τ_i, F_i)} are plotted as the red dotted line in Fig. 1. The solid line in Fig. 1 is the centered five-point moving average, A_i, of the F_i. Berner and Kothavala used a time scale with the Phanerozoic beginning at 570 million years ago and computed 57 estimates of atmospheric pCO2 at 10 my intervals. The ratios, RCO2, of historical pCO2 to recent pCO2 are plotted as blue circles in Fig. 1. The shaded region marks RCO2 error margins suggested by Berner and Kothavala (3). Because the time scales used by Sepkoski (1) and Berner and Kothavala (3) were slightly different, the data were aligned by matching the endpoints of the geologic periods (Ca, Cambrian; O, Ordovician … ) and interpolating linearly within periods. Values of RCO2 were then computed by linear interpolation for the appropriate times, τ_i, in the macroevolutionary database. These RCO2 values have a correlation of 0.66 with the F_i and a correlation of 0.75 with the A_i.
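The computation described in this paragraph can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' code: the function names are hypothetical, (sub)stage boundaries are assumed to be listed in increasing order so that each duration is positive, the treatment of the first and last two points of the moving average (left undefined here) is an assumption, and the period-by-period alignment of the two time scales is not reproduced.

```python
from math import sqrt


def fractional_rates(counts, totals, boundaries):
    """Fractional rate (count_i / total_i) / (t_{i+1} - t_i) per (sub)stage, in 1/my.

    Assumes boundaries are in increasing order, one more entry than counts.
    """
    return [
        (c / g) / (boundaries[i + 1] - boundaries[i])
        for i, (c, g) in enumerate(zip(counts, totals))
    ]


def midpoints(boundaries):
    """Midpoint tau_i = (t_i + t_{i+1}) / 2 of each (sub)stage."""
    return [(a + b) / 2 for a, b in zip(boundaries, boundaries[1:])]


def centered_moving_average(values, window=5):
    """Centered moving average; positions without a full window are None (assumption)."""
    half = window // 2
    out = [None] * len(values)
    for i in range(half, len(values) - half):
        out[i] = sum(values[i - half:i + half + 1]) / window
    return out


def interpolate(x, xs, ys):
    """Linear interpolation of the table (xs, ys) at x; xs must be increasing."""
    for k in range(len(xs) - 1):
        x0, x1 = xs[k], xs[k + 1]
        if x0 <= x <= x1:
            return ys[k] + (ys[k + 1] - ys[k]) * (x - x0) / (x1 - x0)
    raise ValueError("x lies outside the tabulated range")


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)
```

With real data, one would compute F_i from the O_i, G_i, and stage boundaries, interpolate the RCO2 table at each τ_i, and pass the two resulting lists to `pearson`; passing E_i in place of O_i to `fractional_rates` would give the corresponding extinction rate.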

Error-Probe Correlations.

The possible uncertainty of the correlations induced by the potential errors in the data sets was explored by computing the correlations of randomly generated RCO2 profiles bound only by the error margins shown in Fig. 1 and randomly generated origination profiles bound by minus and plus 50% error margins in genera fractional origination rates. We created a hypothetical RCO2 profile by selecting for each τ_i a value of RCO2 from the uniform distribution between the lower and upper error margins of RCO2 at τ_i. Similarly, a hypothetical origination profile was created by selecting for each τ_i an origination value from a uniform distribution between 0.5F_i and 1.5F_i, which allowed for a potential 50% error in the genera fractional origination rate F_i. The correlation between the two hypothetical profiles was computed. Ten thousand such error-probe correlations were computed, and a histogram of the correlations is shown as the shaded regions in Fig. 2. All of the error-probe correlations fell in the interval 0.33–0.75. Hence, the overall trends in the RCO2 profile and the genera fractional origination rate profile are sufficiently similar to force a correlation of at least 0.33 in profiles that are only loosely bound to the measured profiles by large error margins.
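A minimal sketch of the error-probe procedure follows, assuming the RCO2 error margins and the F_i are already available as lists evaluated at the same times τ_i. The function names, the fixed seed, and the use of Python's `random` module are illustrative choices, not details taken from the original analysis.

```python
import random
from math import sqrt


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def error_probe_correlations(f, rco2_low, rco2_high,
                             trials=10_000, frac_error=0.5, seed=0):
    """Correlations between randomly perturbed RCO2 and origination profiles.

    f          -- genera fractional origination rates F_i
    rco2_low   -- lower RCO2 error margin at each tau_i
    rco2_high  -- upper RCO2 error margin at each tau_i
    frac_error -- fractional error allowed in F_i (0.5 gives 0.5*F_i .. 1.5*F_i)
    seed       -- hypothetical parameter, added only for reproducibility
    """
    rng = random.Random(seed)
    correlations = []
    for _ in range(trials):
        # Hypothetical RCO2 profile: uniform draw between the margins at each time.
        rco2_draw = [rng.uniform(lo, hi) for lo, hi in zip(rco2_low, rco2_high)]
        # Hypothetical origination profile: uniform draw within +/- frac_error of F_i.
        f_draw = [rng.uniform((1 - frac_error) * fi, (1 + frac_error) * fi)
                  for fi in f]
        correlations.append(pearson(rco2_draw, f_draw))
    return correlations
```

The minimum and maximum of the returned list correspond to the interval reported in the text; a histogram of the list corresponds to the shaded histogram of Fig. 2.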

Figure 2
Histograms showing correlations between random profiles. The shaded histogram shows the distribution of error-probe correlations between 10,000 pairs of hypothetical profiles of genera fractional origination rate that explore 50% error margins ...

Independence-Probe Correlations.

We used replacement sampling to test the null hypothesis that genera fractional origination rates were independent of RCO2 levels. The usual t test of significance of correlations between two variables assumes that at least one of the variables is normally distributed, an assumption not upheld here. The observed distribution of genera fractional origination rates was used to generate a hypothetical origination rate profile that was independent of RCO2 values. For each time τ_i, a value R_i was randomly selected from {F_j}; the profile was the collection {(τ_i, R_i)}. The correlation between the observed RCO2 profile and the hypothetical origination profile was then computed. Ten thousand such independence-probe correlations were computed, and the unshaded histogram in Fig. 2 shows the distribution of these correlations. All of the independence-probe correlations fell in the interval −0.38 to 0.34; only 3 of these correlations were greater than 0.33, the smallest of the error-probe correlations of the previous paragraph, and one was less than −0.33. For a two-sided test of independence, given a correlation of 0.33 between two profiles, we would reject the null hypothesis that the profiles were independent with a confidence of 9,996/10,000 = 0.9996 and would assign a significance to a correlation of 0.33 of P = 0.0004. Therefore, a conservative estimate of the significance of our 0.66 correlation between the observed RCO2 and fractional rate of genera origination distributions is P < 0.001.
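The resampling test can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the function names and seed are hypothetical, and returning 0.0 for a constant resampled profile (which is vanishingly unlikely with 108 values) is a convention adopted here only to keep the sketch from dividing by zero.

```python
import random
from math import sqrt


def pearson(a, b):
    """Pearson correlation; returns 0.0 if either sequence is constant (assumed convention)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    denom = sqrt(sum((x - mean_a) ** 2 for x in a) *
                 sum((y - mean_b) ** 2 for y in b))
    return cov / denom if denom > 0 else 0.0


def independence_probe_correlations(f, rco2, trials=10_000, seed=0):
    """Correlations of the observed RCO2 profile with resampled origination profiles.

    Each hypothetical profile draws one value R_i per time tau_i, with
    replacement, from the observed rates {F_j}.
    """
    rng = random.Random(seed)
    n = len(f)
    return [pearson(rco2, rng.choices(f, k=n)) for _ in range(trials)]


def two_sided_p(null_correlations, threshold):
    """Fraction of null correlations whose magnitude exceeds the threshold."""
    extreme = sum(1 for r in null_correlations if abs(r) > threshold)
    return extreme / len(null_correlations)
```

Applied to the counts reported above, 3 correlations above 0.33 and 1 below −0.33 out of 10,000 give 4/10,000 = 0.0004, which is the P value the text assigns to a correlation of 0.33.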

Genera Fractional Extinction Rate.

Sepkoski's data for the number of genera extinctions {E_i} during (sub)stages have a correlation of 0.61 with the number of genera originations {O_i}. A genera fractional extinction rate profile may be computed as (E_i/G_i)/(t_{i+1} − t_i), and the graph (not shown) also follows the profile of RCO2. The statistics of this curve are similar to but less robust than those of the origination curve. The correlation between genera fractional extinction rate and RCO2 is 0.64 (P < 0.01); the correlation of the five-point moving average of genera fractional extinction rate and RCO2 is 0.78.

Comparison with a Previous Study

Rothman (8) has related similarly large-scale evolutionary and geological records. He showed that the diversities of marine animals and land plants have highly significant (P < 0.001) negative correlations with a measure of stable carbon isotope fractionation between total organic carbon and sedimentary carbonates over the last 400 my. That the diversity of marine animals should correlate so strongly with carbon isotope fractionation is surprising. It was explained by Rothman in the following way. Increasing plant diversity beginning in the Silurian (425 million years ago) led to increasing weathering of rocks that had two effects: atmospheric CO2 levels decreased, causing a decrease in carbon isotope fractionation in marine deposits; simultaneously, critical nutrients such as phosphorus were released to the marine environment, causing an increase in marine animal diversity. Thus, carbon isotope fractionation decreased and marine animal diversity increased.

Rothman used Sepkoski's data that we have used and defined marine diversity at the end of a (sub)stage as the total abundance during the (sub)stage diminished by the number of genera with last record during the (sub)stage (G_i − E_i in the notation above). Land plant diversity is diversity at the family level from Benton (9). The carbon isotope fractionation data were assembled by Hayes et al. (10) from analyses of the abundance of 13C in marine organic matter and in sedimentary carbonates. The same isotope data were part of the source material for GEOCARB III.

Rothman extended his model ultimately to estimate a linear relation RCO2(t) = a − bn(t) for the last 370 my, where t is time, n(t) is marine animal diversity, and a and b are constants. His estimate of RCO2(t) reasonably tracks the GEOCARB III estimate after about 200 million years ago but is almost constant during the Carboniferous through the Triassic, a finding very different from that computed by GEOCARB III. In Fig. 1, RCO2 drops from about 6 to just above 1 and then rises to about 5 during the Carboniferous through the Triassic. During this time, the genera fractional origination rate shows a better comparison to GEOCARB III than to Rothman's model. In a commentary on Rothman's article, Falkowski and Rosenthal (11) concluded that Rothman's correlations are significant but also should not be considered causal relationships. They argued that tectonics mediated the geochemical signature and the biological processes.

Analyses by Berner and Kothavala (3) showed that estimates of RCO2 are sensitive to changing sets of factors over the Phanerozoic. They emphasized the importance of including all factors affecting CO2 when modeling the long-term carbon cycle. They showed, for example, that over the Mesozoic and Cenozoic, the effect of the intensities of weathering by different types of plants and the effect of the proportion of plants that responds to changes in atmospheric CO2 are potentially more influential on RCO2 values than the effect of the global degassing. The observed correlation between RCO2 computed by GEOCARB III and the genera fractional origination rate demonstrates that GEOCARB III encompasses mechanisms that relate the geological, geochemical, and terrestrial plant records to the marine animal record.


That the two overall trends of genera origination and RCO2 should be so similar from disparate sources is remarkable. Conceivably there are similar systematic biases in the two databases. We propose, however, hypotheses linking macroevolution and paleoenvironment. The simplest hypothesis is that macroevolution is directly affected by CO2 levels. Alternatively, paleotemperature may be an intermediary between the two systems. Global warming is often associated with high CO2 levels, and the two most extensive and long-lasting glaciations during the Phanerozoic occurred at times of low CO2 levels (12). One might even hypothesize that high temperatures directly increase marine diversification or that low temperatures and specifically glaciations inhibit marine diversification, a variant on an idea of Stanley (13). Additionally, one might pose a hypothesis that some factors that enhanced plant diversification inhibited marine diversification. For example, the downward trend in CO2 from the Early Devonian to the Early Permian was primarily due to the rise of vascular plants (12) and was accompanied by a drop in the fractional origination rate of genera. Yet another hypothesis is that enhanced CO2 levels may be associated with increased sea-floor spreading rates that could encourage biological diversification by isolating faunas. We anticipate that refinements of these hypotheses and additional hypotheses may be used to show that the paleoenvironment guided much of macroevolutionary development.

Finally, the correspondence between geochemical and biological history documented here and the two instances documented by Rothman (8) strongly suggest that the overall controls on most of the macroevolutionary record are environmental variables controlling CO2 levels. We have found that CO2 levels correlate with the dynamics of the origination and extinction of genera, whereas Rothman showed a correlation between CO2 levels and total diversity. In the first case, CO2 levels influence diversity dynamics whereas in the second case CO2 levels prescribe the absolute diversity levels. The two are ultimately related but imply different mechanisms or time scales.


We are indebted to innumerable scientists whose work contributed to the data shown, and specifically to John J. Sepkoski, Jr., for his work on the marine fossil record and Robert A. Berner for his computation of historical CO2 values. We thank Robert A. Berner, Roger Kaesler, Roy Plotnick, and Linda Young for helpful communications, and Doug Jones and Matt Saltzman for their helpful comments on the article. We also thank the Department of Geology of the Univ. of Kansas and the National Science Foundation for financial support.


RCO2, the ratio of atmospheric partial pressure of CO2 at a time in the past to that of the present
my, million years


1. Sepkoski J J, Jr. Philos Trans R Soc London B. 1998;353:315–326.
2. Plotnick R E, Sepkoski J J, Jr. Paleobiology. 2001;27:126–139.
3. Berner R A, Kothavala Z. Am J Sci. 2001;301:182–204.
4. Sepkoski J J, Jr. Paleobiology. 1993;19:43–51.
5. Miller A I. Paleobiology. 2000;26, Suppl. 4:53–73.
6. Peters S E, Foote M. Paleobiology. 2001;27:583–601.
7. Raup D M, Sepkoski J J, Jr. Science. 1982;215:1501–1503.
8. Rothman D H. Proc Natl Acad Sci USA. 2001;98:4305–4310.
9. Benton M J, editor. The Fossil Record 2. London: Chapman & Hall; 1993.
10. Hayes J M, Strauss H, Kaufman A J. Chem Geol. 1999;161:103–125.
11. Falkowski P G, Rosenthal Y. Proc Natl Acad Sci USA. 2001;98:4290–4292.
12. Berner R A. Philos Trans R Soc London B. 1998;353:75–82.
13. Stanley S M. Paleobiology. 1990;16:401–414.

Articles from Proceedings of the National Academy of Sciences of the United States of America are provided here courtesy of National Academy of Sciences