Benchmark concentrations for methylmercury obtained from the Seychelles Child Development Study.

Methylmercury is a neurotoxin at high exposures, and the developing fetus is particularly susceptible. Because exposure to methylmercury is primarily through fish, concern has been expressed that the consumption of fish by pregnant women could adversely affect their fetuses. The reference dose for methylmercury established by the U.S. Environmental Protection Agency was based on a benchmark analysis of data from a poisoning episode in Iraq in which mothers consumed seed grain treated with methylmercury during pregnancy. However, exposures in this study were short term and at much higher levels than those that result from fish consumption. In contrast, the Agency for Toxic Substances and Disease Registry (ATSDR) based its proposed minimal risk level on a no-observed-adverse-effect level (NOAEL) derived from neurologic testing of children in the Seychelles Islands, where fish is an important dietary staple. Because no adverse effects from mercury were seen in the Seychelles study, the ATSDR considered the mean exposure in the study to be a NOAEL. However, a mean exposure may not be a good indicator of a no-effect exposure level. To provide an alternative basis for deriving an appropriate human exposure level from the Seychelles study, we conducted a benchmark analysis on these data. Our analysis included responses from batteries of neurologic tests applied to children at 6, 19, 29, and 66 months of age. We also analyzed developmental milestones (age first walked and first talked). We explored a number of dose-response models, sets of covariates to include in the models, and definitions of background response. Our analysis also involved modeling responses expressed as both continuous and quantal data. 
The most reliable analyses were considered to be represented by 144 calculated lower statistical bounds on the benchmark dose (BMDLs; the lower statistical bound on maternal mercury hair level corresponding to an increase of 0.1 in the probability of an adverse response) derived from the modeling of continuous responses. The average value of the BMDL in these 144 analyses was 25 ppm mercury in maternal hair, with a range of 19 to 30 ppm.

Key words: benchmark dose, child development, fetal exposure, fish, mercury, methylmercury, neurologic development, Seychelles. Environ Health Perspect 108:257-263 (2000). [Online 3 February 2000] http://ehpnet1.niehs.nih.gov/docs/2000/108p257-263crump/abstract.html

Methylmercury is a neurotoxin that at high exposures can cause microcephaly, cerebral palsy, seizures, mental retardation, and death (1,2). Several poisoning episodes have confirmed that the fetal brain is particularly susceptible to methylmercury. The most severe effects were seen in Minamata and Niigata, Japan (3), where children were born with severe cerebral palsy in a population that consumed seafood contaminated with methylmercury from an industrial operation. Because exposure to methylmercury is primarily through the consumption of fish, concern has been raised regarding the health of children whose mothers consumed fish during pregnancy.
A reference dose (RfD) for methylmercury of 0.1 µg/kg/day was established by the U.S. Environmental Protection Agency (EPA) (4). This RfD was based on a benchmark analysis of data from a study of 81 Iraqi children whose mothers consumed seed grain treated with methylmercury as a fungicide (5). In this study an Iraqi mother's peak hair mercury concentration during pregnancy was associated with her child's neurodevelopment, defined by the age at which the child first walked, the age at which the child first talked, and a score derived from a neurologic examination. Although this study provided insight into exposure levels that can cause neurologic effects in the developing fetus, it has several features that may limit its usefulness in determining a dose response for chronic environmental exposure to methylmercury, such as might occur from eating fish. The ages at which the Iraqi children first walked or first talked were based on family members' memories obtained at a mean age of 19 months after delivery, and were not known precisely. Even birth dates were not recorded precisely but were established in relation to religious holidays, seasons, etc. There was limited information available on the child's environment, which could have influenced his or her neurologic development. The measures of effect in the study (late walking, late talking, and neurologic score) were relatively nonspecific to neurologic deficit and may not have been sensitive to methylmercury. Perhaps most important, the exposures in the Iraqi study occurred at a high level for a limited period of time, and consequently were not typical of the lower level and often chronic exposures associated with fish consumption. In contrast to the EPA's use of the Iraqi study, the Agency for Toxic Substances and Disease Registry (ATSDR) (6) developed its proposed minimal risk level (MRL) of 0.3 µg/kg/day for methylmercury from the Seychelles Child Development Study data (7,8).
The Seychelles study examined longitudinally the neurologic and psychologic responses of 708 children whose mothers consumed a relatively large amount of fish during pregnancy (7,8). The children were given a range of sensitive tests of neurologic and psychologic development. Information was available on a number of health, social, and demographic variables that could have been associated with test performance. Most important, exposures in this study were through fish, which is the route of concern in human populations. Because there was no association between methylmercury exposure and neurologic or psychologic outcomes, ATSDR investigators considered the mean maternal hair concentration of 15.3 ppm from the Seychelles group with the highest exposure to be a no-observed-adverse-effect level (NOAEL), and used that NOAEL as its basis for deriving the MRL. In an alternative derivation of its MRL, ATSDR investigators considered the mean maternal hair concentration of 6.8 ppm from the entire Seychelles cohort at 66 months to be a NOAEL.
The NOAEL approach applied by the ATSDR is the traditional method for deriving an exposure judged to be without appreciable risk of noncancer health effects to humans (e.g., an RfD or an MRL). This approach is most appropriate for experimental studies in which animals are placed in a few discrete dose groups. Determination of a NOAEL is more problematic in epidemiologic studies such as the Iraqi or Seychelles studies, in which exposures do not assume a few discrete values. When an effect is seen in an epidemiologic study, a mean exposure has often been assumed to be a lowest-observed-adverse-effect level (LOAEL). However, the analysis may have demonstrated only that an effect occurred somewhere within the range of exposures in the study, a range that might span several orders of magnitude.
Similarly, when an effect is not seen in an epidemiologic study, a mean exposure is sometimes assumed to be a NOAEL, as was the case in the use of the Seychelles study by ATSDR investigators. However, the mean exposure may not be a good indicator of the no-effect exposure. For example, adding a large number of individuals to the cohort with very low exposures could dramatically affect the mean exposure, but would add little information on the appropriate value for a NOAEL.
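The dilution effect described above can be illustrated with a few lines of arithmetic. The exposure values in this sketch are hypothetical and are not taken from the Seychelles cohort; they serve only to show that the mean can fall sharply while the exposure range that carries the dose-response information is unchanged.

```python
from statistics import mean

# Hypothetical maternal hair mercury levels (ppm); illustrative values only.
cohort = [5.0, 10.0, 15.0, 20.0, 25.0]
# Hypothetical additional subjects, all with very low exposure.
low_exposure_additions = [0.5] * 20

enlarged = cohort + low_exposure_additions
mean_before = mean(cohort)
mean_after = mean(enlarged)

print(f"mean exposure before additions: {mean_before:.1f} ppm")
print(f"mean exposure after additions:  {mean_after:.1f} ppm")
print(f"highest exposure (unchanged):   {max(enlarged):.1f} ppm")
```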
The benchmark approach applied to the Iraqi data by EPA investigators provides an alternative to the NOAEL in determining an RfD (9-11). The benchmark dose (BMD) is the dose that corresponds to a specified level of additional response called the benchmark response (BMR). The BMD is calculated by fitting a mathematical dose-response model to the data. A lower statistical confidence bound on the BMD (BMDL) replaces the NOAEL in the calculation of an RfD. In addition to methylmercury, as documented in the EPA Integrated Risk Information System (IRIS) database (12), the EPA has used this approach to determine RfDs or reference air concentrations for a number of other chemicals.
The BMD method has several advantages over the NOAEL method, including making better use of dose-response information and reflecting sample size more appropriately (9,11). When applied to epidemiologic data, the BMD allows for consideration of the dose response over the entire exposure range rather than assuming that the mean exposure is either a NOAEL or LOAEL.
Another advantage of the BMD method is that a BMDL can be calculated from negative data (data in which no statistically significant dose-related trend is present). With negative data, the point estimate of the dose-response trend may be positive (in the direction of an adverse effect), zero, or negative (in the direction of a beneficial effect), although any deviation from zero may reflect only random variation and not a real effect of exposure. If the point estimate of the trend is zero or negative, the point estimate of the BMD will be infinite (undefined), whereas if the point estimate of the trend is positive, the point estimate of the BMD will be finite. A statistical upper bound on the BMD obtained from negative data will be infinite; otherwise the data would, by definition, not be negative. However, a BMDL derived from negative data will be a finite number as long as the data do not demonstrate a statistically significant negative trend. When data are negative, it is possible that there is no effect of treatment, in which case a BMDL reflects only the statistical constraints imposed by the experimental design. However, a BMDL represents a conservative (in the health-protective sense) value. Even though the data were negative, exposure could have caused a small undetected increase in an adverse health effect. The BMDL is a statistical bound that reflects the potential size of such an increase.
We present benchmark analyses of the Seychelles data similar to those applied to the Iraqi data by the EPA in the development of the methylmercury RfD. These analyses may provide a sounder scientific basis for developing an RfD or MRL from the Seychelles data than a NOAEL estimated from this study. In addition to the Seychelles study, neurologic studies of children from high fish-consuming populations have also been conducted in New Zealand (13,14) and the Faroe Islands (15). These studies should also be considered when developing an exposure advisory for methylmercury. A benchmark analysis similar to ours from the Seychelles data has been developed from the New Zealand data (14).

The Seychelles Study
The Seychelles Child Development Study (7,8,16,17) began in 1989 in the Republic of Seychelles, an archipelago in the Indian Ocean, to measure developmental outcomes in children whose mothers consumed fish containing methylmercury during their pregnancy. More than 700 mother/infant pairs were enrolled in the main study during pregnancy. The children's prenatal mercury exposure was assessed by measuring the concentration of mercury in a segment of maternal hair representative of the hair formed during pregnancy. At 6.5 months of age, the children were given a neurologic examination, tests of visual recognition memory (18) and visual attention (19), and the Denver Developmental Screening Test-Revised (20). The Bayley Scales of Infant Development (21) were given at 19 months and again at 29 months. The ages at which a child first walked and first talked were also recorded. At [...] (27)], home environment [HOME (28)], parental education, and others shown in Table 1.
Mercury concentrations in maternal hair ranged from 0.5 to 26.7 ppm, and averaged 6.8 ppm. By contrast, hair mercury concentrations in the U.S. are estimated to average 0.5 ppm (4). No adverse outcomes were associated with mercury exposure at any age in the Seychelles study.

Benchmark Methodology
Most of the measured end points in the Seychelles study were recorded as continuous responses, and several dose-response models were applied to these continuous end points. In the k-power model (10), the mean test score (or a transformation thereof) was assumed to be of the form

$$\mu(d) = \pm(\beta d)^{K} + \beta_0 + \beta_1 C_1 + \cdots + \beta_m C_m, \qquad [1]$$

where $d$ is the average maternal hair mercury level during pregnancy in milligrams per kilogram, $C_1, \ldots, C_m$ are covariates that may also be correlated with test scores, and $\beta \ge 0$, $\beta_0, \ldots, \beta_m$, and $K \ge 1$ are parameters to be estimated. The negative sign was assumed whenever smaller responses were considered more adverse. We also assumed that the test scores were normally distributed (possibly after transformation) with a mean, $\mu(d)$, given by Equation 1 and a SD, $\sigma$, that was independent of all covariates, including the hair mercury concentration. Inclusion of the parameter, $K$, allowed a nonlinear dose response. If $K$ is fixed at $K = 1$, the k-power model is identical to that used in ordinary multiple linear regression. If test scores below a predetermined value, x0, are considered abnormal, the probability of an abnormal test score from a child with a mercury hair concentration of $d$ mg/kg is

$$P(d) = N\{[x_0 - \mu(d)]/\sigma\}, \qquad [2]$$

where $N$ is the standard normal cumulative distribution function. The probability of an abnormal test score in an unexposed child is

$$p_0 = P(0) = N\{[x_0 - \mu(0)]/\sigma\}. \qquad [3]$$

Similar formulas apply when larger scores are more adverse (10). The BMD was defined as the maternal mercury hair concentration that caused the probability of an abnormal test score to increase by an amount BMR; i.e., as the hair concentration that satisfied

$$P(\mathrm{BMD}) - P(0) = \mathrm{BMR}. \qquad [4]$$

Although it is not explicit in the notation, whenever a single cutoff, x0, is used to define an abnormal response, both p0 and the BMD will be dependent on the covariates.
Alternatively, instead of specifying the cutoff, x0, for an abnormal response, it is possible to specify the percentage of unexposed individuals who have an abnormal response, i.e., specify p0. If p0 is specified under the k-power model, the definition of the BMD (Equation 4) is equivalent to defining the BMD as the exposure that corresponds to a given change (BMRC) in the mean response normalized by the SD; i.e., as the exposure that satisfies (10)

$$[\mu(\mathrm{BMD}) - \mu(0)]/\sigma = \mathrm{BMRC}. \qquad [5]$$

For given values of p0 and BMR used in Equation 4 there is a corresponding value of BMRC in Equation 5 such that Equations 4 and 5 are equivalent. For example, for p0 = 0.05 and BMR = 0.1, the corresponding value is BMRC = 0.61. Unlike when x0 is specified, when p0 is specified, the BMD does not depend on the covariates (10). This is obvious from the equivalent definition of the BMD given by Equation 5.
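The correspondence between (p0, BMR) and BMRC can be checked numerically. The sketch below is an illustration, not the authors' software: it uses the relation BMRC = N⁻¹(1 − p0) − N⁻¹(1 − p0 − BMR) (10) and then solves Equation 5 for the BMD under the k-power model, taking the magnitude of the change in the mean, (β·BMD)^K, in the adverse direction. The function names and the parameter values passed to `k_power_bmd` are hypothetical.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, SD 1)

def bmrc(p0: float, bmr: float) -> float:
    """Change in the mean, in SD units, equivalent to background p0 and risk BMR."""
    return _STD_NORMAL.inv_cdf(1 - p0) - _STD_NORMAL.inv_cdf(1 - p0 - bmr)

def k_power_bmd(beta: float, k: float, sigma: float, p0: float, bmr: float) -> float:
    """Solve (beta * BMD)**k / sigma = BMRC for the BMD (k-power model, p0 specified)."""
    if beta <= 0:
        return float("inf")  # no trend in the adverse direction: the BMD is infinite
    return (bmrc(p0, bmr) * sigma) ** (1 / k) / beta

# Value quoted in the text for p0 = 0.05 and BMR = 0.1.
print(round(bmrc(0.05, 0.10), 2))

# Hypothetical parameter values, chosen only to exercise the function.
print(k_power_bmd(beta=0.02, k=1.0, sigma=1.0, p0=0.05, bmr=0.10))
```

Because the covariates enter Equation 1 additively, they cancel in the difference μ(BMD) − μ(0), which is why no covariate values appear in `k_power_bmd`.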
In addition to the k-power model, two other dose-response models were applied to continuous data from the Seychelles study. As previously noted, if the underlying data are normally distributed with a common SD, specification of a dose-response model, $\mu(d)$, for the mean response and specification of a dividing point, x0, between normal and abnormal (or, alternatively, of a proportion, p0, of unexposed subjects whose responses are considered abnormal) determines the probability, $P(d)$, of an adverse response as a function of exposure. Alternatively, we can specify x0 or p0 and a functional form for the probability, $P(d)$, of an adverse response; this determines the mean response, $\mu(d)$, where $\mu(0)$ is a linear function of the covariates, as in the k-power model. This approach allows us to define a model for continuous data that corresponds to a predetermined functional form for the probability of an adverse response, and consequently permits the same mathematical dose-response model to be applied to quantal and continuous data (10). Two such dose-response models for the mean response were applied to the Seychelles data. One model was determined by the Weibull model for quantal data,

$$P(d) = p_0 + (1 - p_0)\{1 - \exp[-(\beta d)^{K}]\}, \qquad [6]$$

and one determined by the logistic model,

$$P(d) = p_0 + (1 - p_0)\{1 - 1/[1 + (\beta d)^{K}]\}, \qquad [7]$$

where $\beta \ge 0$ and $K \ge 1$ are estimated parameters that have roughly the same interpretation as the corresponding parameters in the k-power model. As in the k-power model, with these models, the BMD will be a function of the covariates when x0 is specified, but not when p0 is specified. Each of the three models for continuous data (k-power, Weibull, and logistic) was applied to the continuous responses from the Seychelles data both with p0 specified and with x0 specified. In addition, a linear model was applied. The linear model was defined by fixing $K = 1$ in the k-power model.
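As a minimal sketch of how Equations 6 and 7 lead to a BMD, the code below evaluates both probability models, inverts them in closed form for the hair level at which the probability of an adverse response exceeds p0 by the BMR, and converts a probability into the implied shift of the mean in SD units (assuming normality and smaller scores being adverse). The function names and the parameter values are hypothetical and are not estimates from the Seychelles data.

```python
import math
from statistics import NormalDist

def weibull_p(d, p0, beta, k):
    """Equation 6: probability of an adverse response at hair level d."""
    return p0 + (1 - p0) * (1 - math.exp(-((beta * d) ** k)))

def logistic_p(d, p0, beta, k):
    """Equation 7: probability of an adverse response at hair level d."""
    return p0 + (1 - p0) * (1 - 1 / (1 + (beta * d) ** k))

def weibull_bmd(p0, beta, k, bmr):
    """Hair level at which Equation 6 exceeds p0 by bmr."""
    extra = bmr / (1 - p0)  # added risk as a fraction of the non-background probability
    return (-math.log(1 - extra)) ** (1 / k) / beta

def logistic_bmd(p0, beta, k, bmr):
    """Hair level at which Equation 7 exceeds p0 by bmr."""
    extra = bmr / (1 - p0)
    return (extra / (1 - extra)) ** (1 / k) / beta

def mean_shift_in_sd(p_d, p0):
    """Shift of the mean in the adverse direction, in SD units, implied by P(d)."""
    std_normal = NormalDist()
    return std_normal.inv_cdf(p_d) - std_normal.inv_cdf(p0)

# Hypothetical parameter values, for illustration only.
P0, BETA, K, BMR = 0.05, 0.02, 1.5, 0.10
for name, prob, solve in (("Weibull", weibull_p, weibull_bmd),
                          ("logistic", logistic_p, logistic_bmd)):
    dose = solve(P0, BETA, K, BMR)
    p_at_dose = prob(dose, P0, BETA, K)
    print(name, round(dose, 2), round(p_at_dose - P0, 4),
          round(mean_shift_in_sd(p_at_dose, P0), 2))
```

At the BMD both models give the same shift in the mean, because that shift depends only on p0 and the BMR and not on the functional form.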
In analyses in which p0 was specified, it was fixed at 0.05, as suggested by the convention of considering 95% of clinical responses in healthy individuals to define the normal range. In analyses in which x0 was specified, it was fixed at 2 SDs (in the adverse direction) from the overall mean response.
For a few of the tests a child's response was recorded as a quantal response (e.g., abnormal/normal), and the results from these tests were modeled using the Weibull (Equation 6) dose-response model for quantal data. In addition, each of the continuous responses was converted into a quantal response by considering a response abnormal if it was more than 2 SDs away (in the adverse direction) from the mean response of the entire cohort, and then analyzed using the Weibull model. In these analyses the BMD was defined in the same way as in the analyses of the continuous responses, i.e., as the mercury hair concentration that corresponded to an increase of 0.1 in the probability of an abnormal response (Equation 4 with BMR = 0.1).
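The conversion of a continuous response into a quantal one can be sketched as follows. The scores are hypothetical, and the use of the sample SD (n − 1 denominator) is an assumption of this sketch, since the text does not say which SD estimator was used.

```python
from statistics import mean, stdev

def to_quantal(scores, lower_is_adverse=True, n_sd=2.0):
    """Flag a response as abnormal if it lies more than n_sd SDs from the
    cohort mean in the adverse direction."""
    center, spread = mean(scores), stdev(scores)  # sample SD: an assumption
    if lower_is_adverse:
        cutoff = center - n_sd * spread
        return [score < cutoff for score in scores]
    cutoff = center + n_sd * spread
    return [score > cutoff for score in scores]

# Hypothetical test scores for which lower values are adverse.
scores = [98, 102, 101, 99, 100, 103, 97, 100, 101, 99, 60]
abnormal = to_quantal(scores)
print(sum(abnormal), "of", len(scores), "responses flagged as abnormal")
```

Each child then contributes only a yes/no value, which is the loss of information that the comparison of quantal and continuous BMDLs reflects.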
We conducted each analysis of a continuous response without covariates. In addition, analyses with p0 specified were also conducted using both an expanded set and a reduced set of covariates (Table 1). These sets of covariates have essentially the same definitions as those in the original analyses (7,8,16,17). In all models for continuous responses with p0 specified, the BMD, as well as the corresponding point estimate and confidence limit, does not depend on the covariates. Covariates were not included in analyses of quantal responses or in analyses of continuous responses in which x0 was specified because, in both of these cases, the BMD would be different for different values of the covariates; therefore, to calculate a BMDL using covariates, values would have to be specified for each of the covariates. As in the original analyses, certain responses were transformed to make them conform more closely to a normal distribution, and outliers were eliminated. A logarithmic transformation was applied to outcomes of age first walked and first talked, and a negative reciprocal transformation was applied to the data on the Fagan attention outcome (18) at 6 months. The data from the psychomotor index were highly skewed at both 19 and 29 months, and were only analyzed in quantal form. An outlier was defined as a response for which the absolute value of the residual was greater than 3 times the SD of the residuals.
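A minimal sketch of the two transformations and of the outlier rule is given below. The residuals are hypothetical, the function names are illustrative, and the use of the sample SD of the residuals is an assumption; the text does not specify the estimator or whether the rule was applied more than once.

```python
import math
from statistics import stdev

def log_transform(age_months):
    """Logarithmic transformation, as applied to age first walked or talked."""
    return math.log(age_months)

def negative_reciprocal(score):
    """Negative reciprocal transformation, as applied to the Fagan attention outcome."""
    return -1.0 / score

def flag_outliers(residuals, n_sd=3.0):
    """Flag residuals whose absolute value exceeds n_sd times the SD of the residuals."""
    limit = n_sd * stdev(residuals)  # sample SD: an assumption of this sketch
    return [abs(r) > limit for r in residuals]

# Hypothetical residuals from a fitted model: thirty small ones and two large ones.
residuals = [1.0, -1.0] * 15 + [12.0, -12.0]
flags = flag_outliers(residuals)
print([i for i, flagged in enumerate(flags) if flagged])
```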
Parameter estimates were obtained using the maximum likelihood method, and statistical confidence bounds (e.g., the BMDL) were computed by the profile likelihood method (29). The BMDL was defined conventionally as the 95% statistical lower confidence bound on the BMD. In the implementation of the profile likelihood method, a model was first reparameterized so that the BMD was included explicitly as a parameter in place of $\beta$. Next, values (maximum likelihood estimates) of the parameters were obtained so that the log-likelihood, $L$, of the data attained its largest possible value, $L_0$.
The maximum likelihood estimate of the BMD could theoretically be infinite. The BMDL was then determined as the smallest value of the BMD that satisfied the equation

$$\{2[L_0 - L(\mathrm{BMD})]\}^{1/2} = 1.645,$$

where 1.645 is the 95th percentile of the standard normal distribution, and $L(\mathrm{BMD})$ is the function of the BMD (only) obtained by fixing the value of the BMD and maximizing the log-likelihood with respect to the remaining parameters. These calculations were made using computer programs specifically designed for performing benchmark calculations (30).

Table 1 provides specific BMDLs obtained from each neurologic test using the Weibull model (applied to both continuous and quantal responses) and the k-power model. Table 2 provides averages and ranges of BMDLs over all neurologic tests for each type of benchmark analysis applied. Figure 1 illustrates the calculation of the BMDL for five continuous responses, based on the k-power model, with p0 specified and using the full set of covariates. In this case (using p0 = 0.05 and BMR = 0.1), an equivalent definition of the BMD is the hair concentration that causes the mean response to differ from the mean response of unexposed individuals, relative to $\sigma$, by BMRC = 0.61 (Equation 5). Each plotted point in Figure 1 represents a residual (measured response minus the mean response predicted by the model) with the mercury effect and the constant term added back in [i.e., (residual) + $\beta_0 \pm (\beta d)^{K}$]. The solid line represents the estimated mean response as a function of hair mercury, $\beta_0 \pm (\beta d)^{K}$. The dotted curve represents the statistical bounding mean dose response associated with the BMDL [i.e., $\beta_0 \pm (\beta' d)^{K'}$, where $\beta'$ and $K'$ are the values of $\beta$ and $K$ associated with the calculation of the BMDL]. The dotted curve may be used to approximate BMDLs associated with other values of p0 and BMR by using the corresponding value for BMRC.
However, this approach provides only an approximation because $\beta'$ and $K'$ may be slightly different for different values of p0 and BMR. Values of BMRC that correspond to some specific values for p0 and BMR are shown in Table 3. Equivalent BMRCs that correspond to other values of p0 and BMR may be obtained using the formula $\mathrm{BMRC} = N^{-1}(1 - p_0) - N^{-1}(1 - p_0 - \mathrm{BMR})$, where $N^{-1}$ is the inverse of the standard normal distribution function (10).
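The profile likelihood calculation of a BMDL can be sketched for the simplest case. The code below is an illustration under stated assumptions and is not the software cited as (30): it treats only the linear model (K = 1) without covariates, assumes that smaller scores are adverse, and uses hypothetical data constructed to have a slightly beneficial (nonadverse) trend. With c = β/σ, so that BMD = BMRC/c, the inner maximization over the intercept and σ has a closed form in this special case; the function names are illustrative.

```python
import math

BMRC = 0.61   # corresponds to p0 = 0.05 and BMR = 0.1
Z95 = 1.645   # 95th percentile of the standard normal distribution

def centered_sums(d, y):
    """Return n and the centered sums of squares and cross-products."""
    n = len(d)
    d_bar, y_bar = sum(d) / n, sum(y) / n
    s_dd = sum((di - d_bar) ** 2 for di in d)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    s_dy = sum((di - d_bar) * (yi - y_bar) for di, yi in zip(d, y))
    return n, s_dd, s_yy, s_dy

def profile_loglik(c, n, s_dd, s_yy, s_dy):
    """Log-likelihood maximized over intercept and SD for fixed c = beta / sigma.

    Model: y ~ Normal(b0 - c * sigma * d, sigma); additive constants are dropped.
    """
    sigma = (c * s_dy + math.sqrt((c * s_dy) ** 2 + 4 * n * s_yy)) / (2 * n)
    ssr = s_yy + 2 * c * sigma * s_dy + (c * sigma) ** 2 * s_dd
    return -n * math.log(sigma) - ssr / (2 * sigma ** 2)

def bmd_and_bmdl(d, y):
    """Return the maximum likelihood BMD and the 95% lower bound (BMDL)."""
    n, s_dd, s_yy, s_dy = centered_sums(d, y)
    beta_hat = max(0.0, -s_dy / s_dd)  # slope constrained to the adverse direction
    sigma_hat = math.sqrt((s_yy - beta_hat ** 2 * s_dd) / n)
    c_hat = beta_hat / sigma_hat
    l_max = profile_loglik(c_hat, n, s_dd, s_yy, s_dy)
    bmd = BMRC / c_hat if c_hat > 0 else math.inf

    def gap(c):
        deviance = 2 * (l_max - profile_loglik(c, n, s_dd, s_yy, s_dy))
        return math.sqrt(max(deviance, 0.0)) - Z95

    lo, hi = c_hat, c_hat + 1e-3
    while gap(hi) < 0:               # bracket the root by doubling the step
        hi = c_hat + 2 * (hi - c_hat)
    for _ in range(200):             # bisection on c; the largest c gives the smallest BMD
        mid = 0.5 * (lo + hi)
        if gap(mid) < 0:
            lo = mid
        else:
            hi = mid
    return bmd, BMRC / hi

# Hypothetical data: hair levels of 1 to 20 ppm and scores with a slight upward trend.
hair_ppm = [float(i) for i in range(1, 21)]
scores = [100 + 0.2 * d + (10 if i % 2 == 0 else -10) for i, d in enumerate(hair_ppm)]
bmd, bmdl = bmd_and_bmdl(hair_ppm, scores)
print(bmd, round(bmdl, 1))
```

In this example the point estimate of the BMD is infinite because the fitted trend is not in the adverse direction, yet the BMDL is finite, which is the behavior described for negative data.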

Discussion
The point estimates of the BMDs (data not shown) were in many instances theoretically infinite because many of the dose-response trends, although nonsignificant, were in the direction of a beneficial effect of mercury. However, the BMDLs were all finite, and their values reflect the potential magnitude of any small effect of mercury that went undetected in this study.
BMDLs obtained from continuous data using the Weibull, logistic, and k-power models were similar. Application of the linear model resulted in larger BMDLs (data not shown) that were outside the range of maternal hair levels observed in the study. The inclusion of covariates had little effect on the BMDLs. Likewise, BMDLs were similar whether based on specification of the background response, p0, or of x0, the magnitude of a response considered abnormal. However, BMDLs based on quantal data were consistently smaller than those based on the corresponding continuous data. For example, the BMDLs computed from 12 quantal responses using the Weibull model (Table 4) had a mean of 22.0 ppm and a range of 19.4-23.7 ppm, whereas the 12 BMDLs computed in like manner (Weibull model, no covariates) from the corresponding continuous data had a mean of 25.0 ppm and a range of 23.1-27.2 ppm. One possible reason for this difference is that collapsing a continuous data point into a quantal (yes/no) response results in a loss of information, which can cause confidence intervals on the BMD to become wider (i.e., cause the BMDL to become smaller). For this reason, we believe that a BMDL derived from a continuous response is preferred over one derived from a quantal end point obtained by collapsing the corresponding continuous response, provided the distributional [...] (Table 4).

Several choices were necessary to calculate our BMDLs, including how the background response was specified (e.g., the values of p0 or x0 selected), the value selected for the benchmark response (BMR), and the dose-response model used. These decisions are generic in the sense that they are largely unrelated to the type of exposure, health response, or quality of data. The Weibull dose-response model was used by the EPA to determine the current RfD (4), and was also used in a number of benchmark analyses listed in the IRIS database (12). Our [...] pregnancy.

Crump et al. (14) performed benchmark calculations on five end points from this study by applying the k-power model in the same manner as in the present paper. The results of this analysis were highly dependent on the test scores of one child whose mother had a hair mercury level of 86 ppm during pregnancy, which was more than 4 times the hair mercury level of any other mother. When this child's scores were included in the analysis, BMDLs ranged from 17 to 24 ppm, which is similar to the range of BMDLs determined in the present study from the Seychelles data. Although the scores of this child were not outliers, when they were omitted the BMDLs ranged from 7.4 to 10 ppm. Crump et al. (32) developed BMDLs from the Iraqi study (5) by applying the k-power and Weibull models to continuous data on age first walked, age first talked, and neurologic score. These calculations were made for a range of values of p0 and BMR.
Considering the analyses that were comparable to those in the present study (p0 = 0.05 and BMR = 0.1), the BMDLs obtained ranged from 54 to 152 ppm. This range was consistent with the conclusion by Crump et al. (32), based on other analyses of the Iraqi data, that there was no conclusive evidence of a mercury effect in this study below a maternal hair level of 80 ppm. The maternal hair level used in the Iraqi study was the peak level during pregnancy, whereas the average level during pregnancy was used in the Seychelles and New Zealand studies. BMDLs from the Iraqi study would have been smaller if they had been based on the average level.
To derive its RfD from the Iraqi study, the EPA (4) defined an overall quantal measure of neurologic health in terms of whether a child was abnormal in any one of three outcomes (first walked after 18 months of age, first talked after 24 months of age, or had a score > 3 on a neurologic examination). This combined response was grouped into five dose groups according to the mothers' peak mercury hair concentration during pregnancy. A Weibull dose-response model applied to these grouped data resulted in a BMDL of 11 ppm mercury in mothers' hair. There are several potential reasons why this BMDL was lower than the range of 54 to 152 ppm obtained by Crump et al. (32). First, collapsing continuous responses into quantal responses results in a loss of information and consequently generally produces wider statistical confidence bounds. Second, combining three outcomes to produce an overall response resulted in a very liberal definition of abnormal. Based on this definition, 19% of the Iraqi children would be expected to be abnormal, even without any prenatal mercury exposure. In general, increasing the background response (i.e., increasing p0) results in a lower BMD and BMDL. Third, the EPA assigned the geometric average hair level to each exposure group. Crump (33) noted that, whereas grouping of data should be avoided whenever possible, assigning an arithmetic average exposure to grouped data generally provides a more accurate assessment of risk than using a geometric average. Because an arithmetic average is never smaller than a geometric average, the use of an arithmetic average exposure will generally result in a larger BMD. Whatever the reasons for the different BMDLs obtained by the EPA (4) and Crump et al. (32), the limitations in the Iraqi study suggest that this study provides a less adequate basis for deriving an RfD for methylmercury than the Seychelles study.
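The relation between the two averages mentioned in the third point above can be checked with a short example. The hair levels below are hypothetical and do not come from the Iraqi data; they only show that the geometric average assigned to a dose group is lower than the arithmetic average when exposures within the group are spread out.

```python
from statistics import geometric_mean, mean

# Hypothetical peak hair mercury levels (ppm) within one dose group.
group = [12.0, 18.0, 25.0, 40.0, 75.0]

arithmetic = mean(group)
geometric = geometric_mean(group)

print(f"arithmetic average: {arithmetic:.1f} ppm")
print(f"geometric average:  {geometric:.1f} ppm")
```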