Air Pollution and Birth Weight in Connecticut and Massachusetts

In their paper, Bell et al. (2007) examined the effects of ambient air pollutants on birth weight in children born in Connecticut and Massachusetts between 1999 and 2002. The study is the largest among the studies conducted in the United States and has provided evidence that, even with pollutant levels that met the air quality standards, significant reductions in birth weight occurred. 
 
The study findings are in line with other articles; however, there are several concerns that need further attention. Infants with preterm birth (born at 32–37 weeks of gestation) were included in the analysis, and this may have affected the overall and the third trimester–specific results. Although these children comprised only 6.7% of the sample, their birth weights were greatly affected (i.e., mean birth weight was 585–1,050 g less than those born at 39–40 weeks of gestation). 
 
In the discussion, Bell et al. (2007) pointed out that the effect of air pollutants on birth weight could be mediated by the effect of these pollutants on preterm birth and/or on fetal growth. It is unclear whether the effects observed in this study were mediated by one or both competing mechanisms. Bell et al.’s Table 6 compared studies that tested pollutant effects mediated only by fetal growth (Basu et al. 2004; Parker et al. 2005; Ritz and Yu 1999; Salam et al. 2005), only by preterm birth (Rogers and Dunlop 2006; Rogers et al. 2000), or by both mechanisms separately (Wilhelm and Ritz 2003, 2005; Woodruff et al. 2003). Bell et al. (2007) could have further benefited readers if they had evaluated pollutant effects on intrauterine growth restriction (IUGR), a metric of pathological fetal growth. Although IUGR is conventionally defined based on the bottom 10th percentile of the birth weight distribution, a better approach would be to define it by < 15th percentile of predicted birth weight based on gestational age, infant sex, and maternal race. 
 
Pollutant data were not available from all counties. As a result, different counties contributed to the estimates for different exposures. This makes the comparison between single- and two-pollutant models difficult. In Figure 2 (Bell et al. 2007), for example, the effect of particulate matter < 2.5 μm in aerodynamic diameter (PM2.5; measured in 13 counties) seems to have been significantly changed when adjusted for carbon monoxide (measured in 7 counties). It would have been more meaningful if sensitivity analyses had been conducted comparing one- and two-pollutant models restricted to those counties that had data on all exposures. 
 
The detrimental effect of PM2.5 on birth weight was significantly larger in black infants than in white infants. Hispanic white infants were at increased risk of low birth weight in a similar population (Maisonet et al. 2001). Thus, because non-Hispanic and Hispanic whites were combined into one group, it is not possible to determine whether Hispanic infants were disproportionately affected by ambient pollutants. There is significant heterogeneity in the percentages of African-American and Hispanic white populations by county (U.S. Census Bureau 2007). For example, within the state of Massachusetts, 25% of residents were African American and about 18% were Hispanic in Suffolk, whereas about 2% each were African American and Hispanic in Barnstable. In addition, in counties with higher proportions of African-American and Hispanic populations, the percentages living in poverty were also much higher. Although Bell et al. (2007) acknowledged within-county heterogeneity, questions remain about whether significant heterogeneity in effects existed across counties. 
 
In summary, the results would have been more appealing if Bell et al. (2007) had provided data to show whether pollutant effects were mediated by preterm birth and/or by effects on fetal growth, and whether these effects were greater in Hispanics, in mothers who smoked, and in counties with a higher proportion of people living in poverty.

In our recent study (Bell et al. 2007) we identified associations between air pollution and low birth weight in Connecticut and Massachusetts based on 358,504 births from 1999 to 2002. Salam raises several important concerns about potential limitations in the analysis and interpretation of our results. In particular, he discusses gestational period, effects by race, confounding by co-pollutants, and competing biological mechanisms, among other issues. We conducted several sensitivity analyses as suggested by Salam in his letter, and detailed results are available upon request.
The original study considered confounding by co-pollutants for pairs of pollutants that were not highly correlated, and found that model results were robust to adjustment by other pollutants. Salam correctly notes that pollutant data were not available from all counties; therefore, a given observation may have data for some exposure variables and not others. We performed a new analysis comparing results based only on observations with data for the two pollutants considered for gestational exposure. For example, we calculated the association between particulate matter < 2.5 µm in aerodynamic diameter (PM2.5) and birth weight, adjusted by carbon monoxide, and the association between PM2.5 and birth weight not adjusted by CO, but including only the subset of observations with CO data available. The new analysis did not change the results or interpretation.
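The restricted comparison described in this paragraph can be illustrated with a minimal sketch. It is not the model used in the study, which adjusted for many covariates; it only shows the idea of fitting the one-pollutant and two-pollutant models on the same subset of observations that have data for both exposures. The records, the field names (`pm25`, `co`, `bw`), and the helper functions are all hypothetical.

```python
from statistics import mean

# Hypothetical records: PM2.5 (ug/m3), CO (ppm; None where no monitor data exist),
# and birth weight (g). These values are invented for illustration only.
births = [
    {"pm25": 10, "co": 0.5, "bw": 3350},
    {"pm25": 12, "co": 0.4, "bw": 3340},
    {"pm25": 14, "co": 0.8, "bw": 3280},
    {"pm25": 16, "co": 0.6, "bw": 3280},
    {"pm25": 11, "co": None, "bw": 3400},
    {"pm25": 15, "co": None, "bw": 3300},
]

def complete_cases(rows, keys):
    """Keep only rows in which every listed variable is non-missing."""
    return [r for r in rows if all(r.get(k) is not None for k in keys)]

def centered_cross_sum(rows, a, b):
    """Sum of (a - mean(a)) * (b - mean(b)) over the rows."""
    mean_a = mean(r[a] for r in rows)
    mean_b = mean(r[b] for r in rows)
    return sum((r[a] - mean_a) * (r[b] - mean_b) for r in rows)

def slope_single(rows, y, x):
    """Least-squares slope of y on x alone (one-pollutant model)."""
    return centered_cross_sum(rows, x, y) / centered_cross_sum(rows, x, x)

def slope_adjusted(rows, y, x, z):
    """Least-squares coefficient of x when y is regressed on x and z together."""
    sxx = centered_cross_sum(rows, x, x)
    szz = centered_cross_sum(rows, z, z)
    sxz = centered_cross_sum(rows, x, z)
    sxy = centered_cross_sum(rows, x, y)
    szy = centered_cross_sum(rows, z, y)
    return (sxy * szz - szy * sxz) / (sxx * szz - sxz ** 2)

# Both models are fitted on the same observations, so any difference between
# the two PM2.5 coefficients cannot come from a change in the contributing counties.
subset = complete_cases(births, ["pm25", "co", "bw"])
unadjusted = slope_single(subset, "bw", "pm25")
adjusted = slope_adjusted(subset, "bw", "pm25", "co")
print(len(subset), unadjusted, adjusted)
```

Comparing `unadjusted` and `adjusted` on `subset`, rather than on all records, separates the effect of co-pollutant adjustment from the effect of a changing sample.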
In our initial analysis we omitted births with gestational length < 32 weeks or > 44 weeks, and adjusted for gestational length at 2-week intervals. As noted in our article (Bell et al. 2007), births with gestational length of 32-36 weeks accounted for 6.7% of the original observations. Salam proposed analysis of gestational and third-trimester exposure restricted to observations with 37-44 weeks gestation. We performed this analysis and generated effect estimates for first- and second-trimester exposure as well. Effect estimates based on the subset analysis (37-44 weeks) were very similar to those from the original analysis.
Salam notes that combining non-Hispanic and Hispanic whites, as done in our study, does not allow for distinction between Hispanics and non-Hispanics, and that race may be associated with socioeconomic status. Although we controlled for race and for socioeconomic status (through mother's education), we agree that the analysis has limitations. Research on effects by race is further complicated by the distinction of non-Hispanic black versus Hispanic black and by other subdivisions of racial and ethnic categories (e.g., Mexican vs. Cuban heritage), as well as multiracial infants. Other restrictions may arise from lack of sufficient sample size to investigate various racial categories. To date, few low birth weight and air pollution studies have specifically investigated race, although others have included race as a covariate or restricted observations by race.
Although Salam's letter is directed at an individual study, it highlights some of the challenges of air pollution and pregnancy outcome studies more broadly. Many epidemiologic studies evaluate exposure from monitoring networks implemented for regulatory compliance, and not all areas have monitors. Multipollutant analysis is additionally complex because of the high correlation of some pollutants and the variation in the chemical composition of particulate matter. Factor analysis and other source apportionment techniques (e.g., Thurston et al. 2005) that have been used to investigate the association between particles and other health outcomes may be used to estimate exposure based on sources for birth outcomes research. A recent study of 1,016 births in the Munich, Germany, metropolitan area assessed exposure to traffic-related pollution accounting for PM2.5 and nitrogen dioxide levels, land use, road characteristics, and population density (Slama et al. 2007).
A critical question is the biological mechanism through which air pollution affects fetal growth, and the potential competing mechanisms of impacts on preterm delivery and fetal growth, as mentioned by Salam. He also notes that other useful results would include analyses of mothers who smoke and of counties with a higher proportion of people living in poverty. In our study (Bell et al. 2007), we adjusted for mother's smoking and education, as indicators of socioeconomic status, in all models; however, we did not investigate the effect modification of smoking or economic conditions. Although such a study would be informative, our data set is not well suited to this analysis because of the lack of extensive data on smoking or socioeconomic status. More detailed and accurate information on these variables may be available from cohort studies.
Limitations of the current scientific literature on birth outcomes and air pollution prompted two recent international workshops, the International Workshop on Air Pollution and Human Reproduction in Munich in May 2007, and the Methodological Issues in Studies of Air Pollution and Perinatal Outcomes Workshop in Mexico City in September 2007. These workshops explored exposure assessment, confounding and effect modification, the relevant window of exposure before or during pregnancy, biological mechanisms, spatial analysis, and the public health implications of observed associations. Reports from the workshops are forthcoming.

Contribution of Dental Amalgam to Urinary Mercury Excretion in Children
Woods et al. (2007) studied a group of children (n = 507) who were exposed by inhalation to elemental mercury (Hg0) from dental amalgam fillings. In the study, 253 subjects were exposed, whereas the remaining 254 children, the control group, were exposed to composite resin. We consider the experimental design of their study to be adequate, but we do have questions about their methods of data handling and interpretation.
For example, we do not understand why, instead of always using creatinine-adjusted Hg levels, they used, in some instances, unadjusted Hg levels. In fact, there is continuous alternation and exchange between the two biological concepts (i.e., between the unadjusted and the adjusted concentrations). There are at least three well-grounded and well-known reasons that creatinine adjustment is essential: a) urinary creatinine accounts for variations in 24-hr excretion (Aitio et al. 1983); b) urinary creatinine adjustment reportedly reflects Hg blood levels (Smith et al. 1970) and possibly Hg body burden; and c) in the light of established knowledge, Hg blood levels reflect recent exposure (Piotrowski et al. 1975).
Accordingly, the lack of a significant difference between the Hg levels (not adjusted for creatinine) of the amalgam and control subgroups at year 7, the final year of the study by Woods et al. (2007), probably reflects a bias: the overlap would likely have disappeared if creatinine adjustment had been performed, as suggested by Aitio et al. (1983) as long as 25 years ago. Also, because no adjusted data were reported for male and female levels, the impact of such an adjustment cannot be conjectured by the reader. Consequently, this prevents accurate evaluation of the Hg level trend over the years.
It should be pointed out that the data of Woods et al. (2007) do not allow us to extrapolate whether or not the exposed subgroup is in the steady state, because this condition depends on the time lag between urine collection and the last amalgam treatment(s). This limitation prevents an accurate interpretation of the decrease in urinary Hg levels over years. Geller (1976) reported that Hg sulfide can coat Hg0, thereby slowing down the release of Hg vapor. Although no specific study has determined whether this is true for amalgams, we cannot exclude that Hg oxidation may yield Hg sulfide on the amalgam surface. Woods et al. (2007) speculated about the decrease in Hg urinary excretion over years, but they did not consider the possibility of sulfide formation. Moreover, they did not explain the decrease in Hg levels over time after year 2 but simply stated that "the rate of urinary [Hg] excretion exceeds the rate of [Hg] exposure from dental amalgam." The formation of a thin film of Hg sulfide on amalgam surfaces could be an explanation, especially since the Hg body burden, and consequently Hg urinary levels, may be either in the steady state or at an increasing elimination rate because of the addition of new fillings.
Environmental Health Perspectives • VOLUME 116 | NUMBER 3 | March 2008
Furthermore, we feel that the use of the term "dose-effect relationship" by Woods et al. (2007) is questionable. Also, it is not clear if the term "dose" refers to the number of additional amalgam fillings over the years or to the difference between means. Also, "effect" has a completely different meaning in toxicology. In this case, another term should be used to more accurately indicate the difference in two urinary Hg levels. In our opinion, "differential dose minus follow-up years" would be more appropriate in the text than "dose effect." Woods et al. (2007) stated that in children who received "up to 9 initial amalgam fillings, urinary Hg returned to pretreatment value within one year," but this statement is not clear because this trend applies only to children who received 0-4 amalgam fillings at baseline but not to the group that received 5-9 [Figure 4; Woods et al. (2007)].
Finally, Woods et al. (2007) omitted error bars from their Figure 4; SE or SD could have been easily calculated by the theory of error propagation and would probably have addressed the discussion more accurately, or at least would have tempered some conclusions, especially with regard to confirmation of the "whole-body biological half-time of Hg on the order of 60-70 days." This half-time is correct but there is a large margin of uncertainty based on the experimental data.
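As an illustration of the error-propagation argument, the following minimal sketch computes the standard error of a difference between two group means. It assumes the two groups are independent; the letter does not state which formula the authors have in mind, and paired or repeated measurements would need a covariance term. The function names and numerical inputs are hypothetical.

```python
import math

def se_of_mean(sd, n):
    """Standard error of a mean from the group SD and sample size."""
    return sd / math.sqrt(n)

def se_of_difference(sd_a, n_a, sd_b, n_b):
    """SE of (mean_a - mean_b) for independent groups.

    Error propagation for a difference: variances add, so
    SE_diff = sqrt(SE_a**2 + SE_b**2).
    """
    return math.sqrt(se_of_mean(sd_a, n_a) ** 2 + se_of_mean(sd_b, n_b) ** 2)

# Hypothetical inputs (not taken from the study): SD and n for two groups.
se_diff = se_of_difference(3.0, 9, 4.0, 16)

# Half-width of an approximate 95% interval, usable as an error bar.
half_width = 1.96 * se_diff
print(se_diff, half_width)
```

An error bar of this kind would show directly whether differences between yearly urinary Hg levels exceed their sampling uncertainty.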
In conclusion, although Woods et al. (2007) used a well-structured experimental design, their conclusions are not accurate because of their handling of the experimental results and their use of basic toxicology terminology.
doi:10.1289/ehp.11013R

Contribution of Dental Amalgam to Urinary Mercury Excretion in Children: Woods et al. Respond
We are pleased to respond to Iavicoli and Carelli's comments on our article (Woods et al. 2007).
Iavicoli and Carelli question our use of creatinine adjustment; in previous studies, we addressed the pros and cons of this issue in depth (Heyer et al. 2007; Martin et al. 1996; Woods et al. 1998). In terms of the present study (Woods et al. 2007), instead of advocating one approach over the other, we presented data, where appropriate (e.g., Figure 2), as both adjusted and unadjusted measures to provide the reader access to both sets of results. Iavicoli and Carelli note only a slight difference in Figure 2 between comparisons of adjusted and unadjusted urinary Hg levels at year 7; in our article we explained the higher variability (not bias) in the unadjusted values. Because creatinine adjustment did not otherwise alter the urinary Hg findings in the study, as stated in our Figure 3 (Woods et al. 2007), we did not present adjusted values for data described in Figures 3 or 4. Iavicoli and Carelli's comments about blood Hg levels are not relevant because we did not measure blood Hg concentrations in this study.
Regarding Iavicoli and Carelli's comment on "steady state," data presented in our Figure 4 (Woods et al. 2007) allow the reader to discern the influence of additional amalgam treatment on Hg body burden over time, as inferred from annual urinary Hg levels. A comprehensive pharmacokinetic evaluation of Hg body burden was neither intended nor within the scope of this study.
Iavicoli and Carelli speculate that Hg sulfide on teeth surfaces might have affected observed changes in urinary Hg levels; however, others (Brune 1986; Brune and Evje 1985; Gebel and Dunkelberg 1996) have clearly shown that sulfide (and other oxidation) layers are continuously removed by the effects of mastication, as well as by hot foods and liquids. These layers do regenerate but are in a constant state of flux. Because all of the amalgam used in this study (Woods et al. 2007) was of a single formulation, there would have been no variation in the tendency to form sulfide layers from amalgam treatment. Additionally, although there may have been some variation in oral pH among study subjects that could have influenced this process, Hg elimination still occurs at a relatively constant state. Therefore, we do not consider Hg sulfide film formation to be a significant factor in urinary Hg excretion over time.
Iavicoli and Carelli question our use of "dose effect" to describe the relationship of amount of amalgam treatment received with urinary Hg concentration. However, we consider this depiction appropriate.
Finally, Iavicoli and Carelli point out that the trend that children who received up to nine initial amalgam fillings but no subsequent treatment returned to pretreatment values within 1 year is not clear. This statement should have been "within 1 additional year" (i.e., by year 3 of follow-up). Urinary Hg levels were highest approximately 2 years after initial amalgam treatment for those with no subsequent treatment; those with ≥ 10 fillings at initial treatment and no subsequent treatment took > 3 years (approximately 5 years) to return to pretreatment levels. For Figure 4 (Woods et al. 2007), we obtained confidence intervals, but they were not included because of the clutter they created in the figures. Thus, the statements refer to trends and not statistical significance.

Relationship between Tibia Lead and Cumulative Blood
We thank Healey et al. for their thoughtful comments on our papers in the mini-monograph on adult lead exposure. We agree with the points they raised concerning uncertainties in the relation of cumulative blood lead index (CBLI) with tibia lead, the need to address the possibility that the slope of the relation may not be constant across the range of tibia lead values, and uncertainty about how sex may influence the relation, as most data were derived from studies of men. Given the changing metabolism of bone across the life span, age must ultimately be factored in as well.
Healey et al. posit that the relation of CBLI with tibia lead may be nonlinear by presenting summary data from eight studies; they show that the estimated slope of the CBLI and tibia lead relation is relatively low in studies with lower mean tibia lead levels, whereas the estimated slope appears to be higher in studies with higher mean tibia lead levels. A problem with this argument is that it is prone to the ecologic fallacy of using summary data of CBLI and mean tibia lead from groups across studies to make inferences in individuals about the relation of CBLI with tibia lead. All of the literature evaluating relations of CBLI with tibia lead is based on measurements in only approximately 500 subjects. A rigorous assessment for possible nonlinearities in such a relation would require a pooled analysis of the original data, not an ecologic analysis of the summary results across studies (Lanphear et al. 1998). Several statistical techniques could then be used to evaluate possible departures from linearity in the CBLI versus tibia lead relation using the individual level data.
Concerning the range of slopes we reported, we wrote "Each study also reported sample size and the SE of the slope, which, across the studies, ranged from 0.028 to 0.067." Healey et al. correctly report that the slopes ranged from 0.022 to 0.10 µg/g per µg-years/dL across the eight studies they included. This discrepancy is easily explained. Concerning the high end of their range, we chose not to use the Armstrong et al. slope estimates for 1983 or 1988 (both 0.10, based on 15 and 11 subjects, respectively), instead relying on their estimate (0.052) for the 11 subjects for both 1983 and 1988. Concerning the low end of their range (0.022), the corresponding study did not present an SE of the slope, which is needed, along with the slope estimate, for use in a meta-analysis.
Given the ecologic fallacy issue discussed above, we respectfully disagree with the recommendation of Healey et al. that we should rely only on the two studies that reported the smallest slopes for the relation of CBLI with tibia lead simply because, as they note, these studies had average tibia lead levels closest to the 15 µg/g tibia lead limit we proposed. In addition, we believe that relying on only two relatively small studies does not provide sufficient margin of safety for lead-exposed workers to accept Healey et al.'s recommendation at this time. Because our primary goal is preventing departures from health associated with cumulative lead dose in adults exposed to lead, we instead choose to rely on an estimate derived from a weighted mean across studies. At the present time, we believe we are justified in standing by our recommendation of a CBLI of 200-400 µg-years/dL.
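To make concrete why a slope without an SE cannot enter the pooled estimate, the following minimal sketch computes a weighted mean of study slopes. It assumes inverse-variance weighting, a common meta-analytic choice; the response does not state which weights were used. The function name and the input values are hypothetical and are not the published study results.

```python
import math

def inverse_variance_mean(estimates, std_errors):
    """Fixed-effect pooled estimate: each study is weighted by 1 / SE**2.

    A study that reports no SE has no defined weight and cannot be included.
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se

# Hypothetical slopes (ug/g per ug-years/dL) and their SEs, for illustration only.
slopes = [0.05, 0.08]
slope_ses = [0.01, 0.02]

pooled_slope, pooled_slope_se = inverse_variance_mean(slopes, slope_ses)
print(pooled_slope, pooled_slope_se)
```

Under this weighting, the more precisely estimated slope dominates the pooled value, which is one reason the choice of included studies matters more than the simple range of reported slopes.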

ERRATA
In the February Spheres of Influence article [Environ Health Perspect 116:A78-A81 (2008)], the first sentence of the second paragraph under the subhead "Community Response" should have read "In the 1990s, just a few groups such as the Sierra Club, the Environmental Health Coalition, the Center for Community Action and Environmental Justice, and homeowners near the ports were focused on the effects of the global supply chain." In the February Science Selection article "Exposure Under Pressure: Lead Linked to Release of Cortisol in Children" [Environ Health Perspect 116:A83 (2008)], the range of the known prenatal blood lead levels was ≤ 1.0-6.3 µg/dL. EHP regrets the errors.