NIH Public Access Author Manuscript; accepted for publication in a peer-reviewed journal.
Can J Stat. Author manuscript; available in PMC Apr 18, 2013.
Published in final edited form as:
Can J Stat. Sep 1, 2011; 39(3): 498–509.
Published online Jul 27, 2011. doi:  10.1002/cjs.10116
PMCID: PMC3630080

Measurement error modeling and nutritional epidemiology association analyses


This paper summarizes the results of a Nutrient Biomarker Study in the Women’s Health Initiative, and its application to studies of the association between energy and protein consumption and the risk of major cancers and cardiovascular diseases. The presentation emphasizes measurement error modeling and related data analysis methods, since addressing measurement issues appears to be central to these topics and to progress in nutritional epidemiology more generally. The manner in which body mass index is modeled in disease association analysis is particularly challenging, since it could serve as a mediator or as a confounder of the association, and at the same time contributes valuably to energy and protein consumption assessment. A hazard ratio parameter estimation procedure that acknowledges body mass index as a possible mediating variable is described and applied. Some aspects of the future nutritional epidemiology research agenda are briefly discussed, including an ongoing human feeding study to develop biomarkers for additional dietary components.

Keywords: cancer, cardiovascular disease, diet, epidemiology, failure time data, measurement error


It is a pleasure to contribute a paper in honour of our esteemed colleagues Drs. Jack Kalbfleisch and Jerry Lawless. Jack and Jerry have each had a tremendous impact on statistical and biostatistical research methods, and on the analysis of failure time and life history data in particular. Their theoretical and applied contributions have been recognized by many Canadian and international awards. I was fortunate to co-author two editions of a book on failure time methods with Jack (Kalbfleisch & Prentice, 2002), which summarized the literature on a broad range of failure time topics, with emphasis on hazard ratio regression methods emerging from the seminal paper of Cox (1972). One of those topics was that of estimating hazard ratio coefficients when there is measurement error in some elements of the regression variable. This topic is particularly central in the important public health area of nutritional epidemiology, which has depended almost exclusively on self-reported dietary consumption, which may be affected by complex and influential assessment errors.

There is a pressing need for reliable information on recommended dietary and physical activity patterns for body weight maintenance and for chronic disease risk reduction. While sensible guidelines and recommendations are available from various organizations, these may lack enough specificity and force to influence individual dietary choices and the type of societal changes that may be needed to begin to reverse the obesity epidemic in Western societies and to achieve related public health goals.

An international expert consultation that summarized the world literature on diet, nutrition and the prevention of chronic diseases (Diet, Nutrition and the Prevention of Chronic Diseases 2003) does not list energy consumption among the factors that are convincingly, probably or possibly associated with cardiovascular disease risk, though overweight is described as convincingly associated with increased risk. Similarly, an expert panel reviewing the world literature on nutrition and cancer prevention (World Cancer Research Fund/American Institute for Cancer Research, 1997) writes that ‘In the view of the panel, the effect of energy on cancer is best assessed by examining the data on related factors: rate of growth, body mass, and physical activity’, and that the ‘significance of the data on energy intake and cancer risk in humans is unclear’.

These summaries reflect considerable uncertainty about the reliability of nutrient consumption estimates from self-report dietary data, upon which dietary association studies are typically based. The food frequency questionnaire (FFQ) has been ubiquitous in nutritional epidemiology for the past 25 years, in part because its self-administered and machine-readable features make it practical for application to large study cohorts. The FFQ is believed to have better measurement properties for nutrient densities than for absolute nutrient consumption (Willett et al. 1985), and most epidemiologic studies only report associations for nutrients having some form of energy consumption adjustment.

There are a small number of nutrients for which a well established biomarker of short-term consumption has been developed, including a doubly-labeled water (DLW) assessment of energy consumption over a 14 day period (Schoeller 1999), and a urinary nitrogen (UN) assessment of protein consumption from a 24 hour urine collection (Bingham 2003). These urinary recovery biomarkers (Kaaks et al. 2002) provide objective estimates of short-term consumption among persons in energy balance. Their measurement error is plausibly independent of study subject characteristics such as age, gender, and body mass index (BMI, defined as weight in kg divided by the square of height in meters), and importantly is plausibly independent also of dietary self-report measurement error. However, it is not practical to obtain these biomarkers for the tens of thousands of persons in a typical epidemiology cohort study, and application in advance of disease diagnosis is essential. It follows that the practical study design entails biomarker application to a randomly selected subcohort of a study cohort, in conjunction with concurrent self-report data, followed by use of the biomarker data to produce measurement error corrected consumption estimates throughout study cohorts for use in disease association analyses.

These estimates can be obtained by ordinary linear regression of log-transformed biomarker values on corresponding log-FFQ values and other study subject characteristics, including BMI and age. They arise from a measurement model (Prentice et al. 2002)

$$W = Z + e, \qquad Q = a_0 + a_1 Z + a_2^{T} V + \varepsilon, \tag{1}$$
where the biomarker W (e.g. log DLW energy) is assumed to adhere to a classical measurement model with error e, where Z is the targeted consumption (e.g., logarithm of average daily energy consumption over a certain time period); Q is the corresponding log FFQ energy consumption; V is a vector of characteristics that may also relate to Q, for example through systematic biases in the FFQ assessment; a0, a1, a2 are parameters to be estimated, and ε is a random error term that could include a random effect component. A joint normality assumption for (Z, V, ε) then gives

$$E(Z \mid Q, V) = b_0 + b_1 Q + b_2^{T} V,$$

and therefore

$$E(W \mid Q, V) = b_0 + b_1 Q + b_2^{T} V,$$
under the crucial assumption that the biomarker and self-report errors e and ε are statistically independent. It follows that regression of W on the self-report Q and pertinent characteristics V, as in Table 1, yields an estimate of Z that has been corrected for assessment error under model (1). This ‘calibrated’ consumption estimate is readily calculated from (Q, V) values for all members of the study cohort. Note from Table 1 that the FFQ provides only a weak signal for energy estimation, with a coefficient of 0.062 rather than a coefficient near 1 as would be expected from an accurate and precise consumption estimate. Note also that these calibrated estimates can be obtained under the less restrictive assumption of joint normality of (Z, ε) given V.
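The calibration step described above can be illustrated with a small simulation. The following is only a sketch under stated assumptions: the parameter values, error variances, and subcohort size are hypothetical and are not taken from the NBS, and the helper names `solve`, `ols`, and `calibrated` are introduced here for illustration. It generates data from a model of the form (1), regresses the biomarker W on (Q, V), and compares the result with the infeasible regression of the unobserved Z on (Q, V).

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(design, y):
    """Ordinary least squares through the normal equations (X'X) b = X'y."""
    p = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(p)]
    return solve(xtx, xty)

rng = random.Random(0)
n = 5000                      # hypothetical subcohort size (assumption, chosen for stability)
a0, a1, a2 = 0.1, 0.3, -0.2   # hypothetical parameters of the self-report model

V = [rng.gauss(0.0, 1.0) for _ in range(n)]        # a single characteristic, e.g. standardized BMI
Z = [0.5 * v + rng.gauss(0.0, 0.5) for v in V]     # targeted (unobserved) log consumption
Q = [a0 + a1 * z + a2 * v + rng.gauss(0.0, 0.3)    # self-report with systematic bias in V
     for z, v in zip(Z, V)]
W = [z + rng.gauss(0.0, 0.2) for z in Z]           # biomarker with classical, independent error

design = [[1.0, q, v] for q, v in zip(Q, V)]
b_from_w = ols(design, W)     # feasible calibration equation: uses the biomarker
b_from_z = ols(design, Z)     # infeasible benchmark: uses the unobserved Z

def calibrated(q, v, b=b_from_w):
    """Calibrated consumption estimate computed from self-report and characteristics."""
    return b[0] + b[1] * q + b[2] * v

print("coefficients from W:", [round(x, 3) for x in b_from_w])
print("coefficients from Z:", [round(x, 3) for x in b_from_z])
```

Because the simulated biomarker error is independent of the self-report error, the two sets of coefficients should agree up to sampling variation, which is the property that the calibration procedure relies upon.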

Table 1
Regression calibration coefficients for log-transformed total energy, total protein, and protein density (Neuhouser et al. 2008).


We pursued this strategy by conducting a Nutrient Biomarker Study (NBS) among 544 post-menopausal women enrolled in the Women’s Health Initiative (WHI) Dietary Modification (DM) Trial, during 2004–2005 (WHI Study Group 1998). The NBS involved DLW and UN biomarkers, a concurrent FFQ and some additional questionnaire information. The randomized controlled DM trial was conducted among 48,835 postmenopausal women in the age range 50–79 when enrolled during 1993–1998. It tested whether a low-fat dietary pattern (40% of women) would reduce cancer risk compared to a usual diet comparison group (60% of women). Weight-stable women without a diagnosis of cancer or cardiovascular disease during trial follow-up were randomly selected for NBS participation at a representative 12 of the 40 participating WHI clinical centers, 50% from the intervention and 50% from the comparison group. A 20% reliability subsample repeated the entire NBS protocol about 6 months after the original data collection.

Additional detail on the NBS is given in Neuhouser et al. (2008), where calibration equations for the assessment of energy, protein, and protein density (percent of energy from protein) are also given, a reduced form of which is provided here in Table 1.

In fact, FFQ energy consumption estimates exhibit strong systematic biases, with overweight and obese women underestimating to a much greater extent than normal weight women, and with younger postmenopausal women underestimating to a greater extent than older postmenopausal women. The full calibration equations included also some moderate dependencies on income and socioeconomic factors (Neuhouser et al. 2008). Note also from Table 1 that the FFQ assessment appears to have better measurement properties for protein density than for absolute protein or energy, as expected.


These calibration equations were applied to FFQ assessments and other study subject characteristics (Q, V) obtained early in WHI follow-up, and calibrated consumption estimates were associated with the subsequent incidence of invasive cancers and cardiovascular diseases in WHI cohorts. The WHI cohorts included the comparison group in the Dietary Modification Trial, and the companion WHI Observational Study, a prospective cohort study among 93,676 postmenopausal women drawn from the same catchment population with much commonality in data collection and outcome ascertainment with the clinical trial (WHI Study Group 1998). Table 2 shows estimated hazard ratios for a 20% consumption increment, with and without biomarker calibration of the FFQ consumption estimates, for several chronic disease categories, from Prentice et al. (2009a) and Prentice et al. (2011). These hazard ratio estimates arise from Cox models with log-hazard ratios that are linear in the estimated log-nutrient consumption, so that the hazard ratio for a fractional increase in consumption is assumed to be constant across the consumption distribution. A bootstrap procedure was used to estimate variances and confidence intervals for corresponding hazard ratio parameter estimates. Without calibration there was little evidence of association of any of the cancer or cardiovascular disease categories listed with energy, protein, or protein density. In contrast, calibrated energy consumption was positively associated with breast, colon, and total invasive cancer risk and with coronary heart disease, and was marginally inversely associated with stroke incidence. Calibrated protein consumption was also positively associated with breast and total cancer, while there was some evidence of an inverse association between calibrated protein density and total cancer incidence. 
These analyses did not include BMI in the disease risk model, and BMI was highly correlated (0.81) with biomarker log-energy consumption in the NBS, less so with log-protein consumption (0.46) and only weakly correlated (0.12) with log-protein density. Table 2 also gives hazard ratio estimates for calibrated energy in analyses that included BMI in the log-hazard ratio regression model (‘BMI adjusted’ analyses). The positive energy associations essentially disappear with this addition, while an inverse association with stroke becomes more pronounced.
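Since the log-hazard ratio is taken to be linear in log consumption, the hazard ratio for a fractional consumption increment follows directly from the regression coefficient. The short sketch below makes this conversion explicit; the function names and the coefficient value are hypothetical and are not estimates from Table 2.

```python
import math

def hazard_ratio(beta, increment=0.20):
    """Hazard ratio for a fractional consumption increase when the log-hazard
    ratio is linear in log consumption with coefficient beta."""
    return math.exp(beta * math.log(1.0 + increment))

def log_hr_coefficient(hr, increment=0.20):
    """Inverse map: the coefficient of log consumption implied by a hazard ratio."""
    return math.log(hr) / math.log(1.0 + increment)

beta = 0.5  # hypothetical coefficient, for illustration only
print(round(hazard_ratio(beta), 4))  # 1.0954, i.e. the square root of 1.2
```

The same transformation applied to the endpoints of a confidence interval for the coefficient gives a confidence interval for the hazard ratio, and the result does not depend on the baseline consumption level.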

Table 2
Hazard ratio estimates for a 20% increment in nutrient consumption in Women’s Health Initiative cohort study of over 80,000 women (Prentice et al. 2009a, Prentice et al. 2011).


The analyses of Table 2 present a rare look at absolute energy and absolute protein consumption in relation to the risk of important chronic diseases. However, the interpretation of these associations is complicated by the multiple possible roles and influences of BMI. First, dietary consumption patterns tend to track over many years, and body fat deposition and an increasing BMI are expected consequences of excessive energy consumption. Hence a disease association with energy may be substantially mediated by body fat accumulation, and it would represent over-control to include BMI in the hazard ratio regression analyses in assessing the full dietary association. On the other hand, overweight and obese persons expend relatively more energy carrying out routine tasks of daily living, so it is possible that there is some confounding associated with BMI, and its exclusion from the disease risk model may yield hazard ratio estimates that are biased away from the null. These mediating and confounding possibilities would exist even if energy consumption were measured without error. Here, however, BMI also serves as an important component of the energy calibration procedure, with possible further complexity in the interpretation of the calibrated energy association analysis.

Dr. Laurence Freedman (Gertner Institute, Tel Aviv, Israel) has pointed out to us (personal communication) that calibrated association analyses of the type shown in Table 2 without BMI adjustment may incorporate some bias due to the inclusion of BMI in the calibration equations. Furthermore, in his ongoing work with collaborators in the US National Cancer Institute it seemed that biases of this type could be substantially mitigated by obtaining association parameter estimates for the dietary variables from the BMI-adjusted analyses.

To examine this issue in the present context, consider BMI adjusted analyses of the type shown in Table 2. These arise from an underlying Cox hazard rate model that can be written

$$\lambda(t; U, X) = \lambda_0(t) \exp(U \beta_1 + X^{T} \beta_2), \tag{2}$$
where U is BMI and X is comprised of the unmeasured dietary variable Z of interest (first element) along with other variables V involved in developing the calibrated estimate (Z^) of Z, and other variables included in the disease risk model to control confounding (the elements of β2 for factors involved in V that are not included in the disease rate model are set to zero).

Data on (U, X) and disease incidence times, which may be subject to independent right censoring (e.g., Kalbfleisch & Prentice 2002), do not allow one to distinguish mediating from confounding roles for U (β1 ≠ 0). With T denoting the failure time, the induced hazard ratio model for X alone can be written

$$\lambda(t; X) = \lambda_0(t) \exp(X^{T} \beta_2)\, E\{\exp(U \beta_1) \mid X, T \ge t\},$$
which under a rare disease assumption and a joint normality assumption for (U, X) is well approximated by

$$\lambda(t; X) \approx \lambda_0^{*}(t) \exp(X^{T} \beta^{*}), \tag{3}$$
where $\beta^{*} = \beta_2 + (\mathrm{var}\,X)^{-1}\,\mathrm{cov}(X, U)\,\beta_1$. Hence we may consider the first element of β* as our target of estimation for the full dietary effect of Z on the hazard ratio. If Z, and hence X, were directly measurable one could reliably estimate β* from estimates of β1 and β2 using the sample covariance of X and U and the sample variance for X. This idea has been proposed by Freedman and colleagues in a draft paper they shared with us entitled ‘Using regression calibration equations that combine self-reported intake and biomarker measures to obtain unbiased estimates and more powerful tests of dietary associations.’
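The combination of β1 and β2 into β* involves only a linear solve against the variance matrix of X. The following sketch shows that calculation; the function names `solve` and `beta_star` and all numerical values are hypothetical and serve only to illustrate the formula, not to reproduce any analysis reported here.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def beta_star(beta1, beta2, var_x, cov_xu):
    """Return beta2 + (var X)^{-1} cov(X, U) beta1.

    beta1  : scalar coefficient of U (BMI) in the BMI adjusted model
    beta2  : list of coefficients of X in the BMI adjusted model
    var_x  : variance matrix of X as a list of rows
    cov_xu : list of covariances between each element of X and U
    """
    gamma = solve(var_x, cov_xu)  # regression coefficients of U on X
    return [b2 + g * beta1 for b2, g in zip(beta2, gamma)]

# Hypothetical inputs: two elements in X, the first being the dietary variable.
var_x = [[2.0, 1.0], [1.0, 2.0]]
cov_xu = [3.0, 3.0]
target = beta_star(0.5, [0.1, 0.2], var_x, cov_xu)
print([round(b, 3) for b in target])
```

The first element of the returned list corresponds to the dietary variable, which is the quantity described above as the target of estimation.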

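Now suppose that Z is measured by W and Q with error according to (1), including possible systematic bias related to V. One can write

$$Z = W - e = \hat{Z} + \hat{\varepsilon} - e, \tag{4}$$

where Z^ is the calibrated estimate of Z and ε^ is the residual from the regression of W on (Q, V). Under the natural assumption that Z^ is unrelated to disease risk given (X, U) one has that λ(t; U, X, Z^) equals (2), from which

$$\lambda(t; U, \hat{X}) = \lambda_0(t) \exp(U \beta_1 + \hat{X}^{T} \beta_2)\, E[\exp\{(\hat{\varepsilon} - e) \beta_{21}\} \mid U, \hat{X}, T \ge t],$$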
where X^ equals X except that Z^ is substituted for Z, and β21 is the corresponding (first) element of β2. Under the measurement model (1) e is independent of (U, Z^) and the residuals ε^ are uncorrelated with (U, X^) (assuming that all pertinent confounding factors are included in the calibration equation that generates Z^). Hence, under a rare disease assumption

$$\lambda(t; U, \hat{X}) \approx \lambda_0(t) \exp(U \beta_1 + \hat{X}^{T} \beta_2), \tag{5}$$
to a good approximation. It follows that the hazard ratio estimators from analyses of the calibrated estimator that include BMI should be approximately unbiased for the parameters of model (2) under our modeling assumptions.

The induced hazard ratio model for X^ from (5) is approximately

$$\lambda(t; \hat{X}) \approx \tilde{\lambda}_0(t) \exp(\hat{X}^{T} \tilde{\beta}), \tag{6}$$
where $\tilde{\beta} = \beta_2 + (\mathrm{var}\,\hat{X})^{-1}\,\mathrm{cov}(\hat{X}, U)\,\beta_1$. This parameter β~ is expected to agree closely with that based on application of the Cox model directly to Z^ and potential confounding factors, as led to the calibrated consumption hazard ratio estimators (without BMI adjustment) shown in Table 2.

However, β~ may differ somewhat from the desired target β* in (3). Specifically, even though cov(X^, U) would differ trivially from cov(X, U) for a sufficiently large biomarker sample, and is readily estimated by the sample covariances with W in place of Z, the variance matrix varX^ may differ from varX in (3), depending on the comparative magnitudes of the variances of e and ε^ in (4). To estimate varX, one can use sample variances and covariances for all elements except those in the first row and column. Except for the (1,1) element, varZ, one can estimate the first row and column of varX by biomarker study sample covariances between W and X. The estimation of varZ, however, is a more delicate issue that forces us to be specific about the time period that is intended to be covered by the dietary variable.

As noted above, the NBS included a reliability subsample among 111 women, with primary and reliability subsamples separated by about 6 months. If one defines the targeted dietary variable as that pertaining to the recent diet, say over the past 6 months, then it will be plausible for the measurement errors e1 and e2 that attend the two biomarker assessments W1 and W2 for women in the reliability subsample to be approximately uncorrelated, in which case varZ can be estimated by the sample variance of W in the biomarker study, minus 0.5 times the sample variance of (W1 − W2) in the reliability subsample. On the other hand, relative to a dietary variable that pertains to consumption over the preceding years or decades that may be relevant to disease risk and to body fat accumulation, the reliability biomarker measurement errors can be expected to have a correlation ρ that is positive, in which case varZ can be estimated by the sample variance of W minus 0.5(1 − ρ)^{-1} times the sample variance of (W1 − W2). Replicate biomarker data over the long period of time that may be relevant to those dietary associations would be needed to directly assess the temporal patterns of biomarker measurement error correlation. Here, instead, we provide sensitivity analyses for some choices of ρ ranging from ρ = 0, as may be appropriate for recent diet, to positive values as may be relevant to long-term dietary patterns. The sensitivity analyses target the coefficient of Z (first element of X) in β* in (3).
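The variance estimate just described can be written as a short function of the biomarker values and the assumed error correlation. This is a minimal sketch: the function name `estimate_var_z` and the toy data are hypothetical, and the sample variances use the usual n − 1 divisor, which is an assumption made here rather than a detail stated above.

```python
from statistics import variance

def estimate_var_z(w, w1, w2, rho=0.0):
    """Estimate var(Z) as var(W) - 0.5 * (1 - rho)^{-1} * var(W1 - W2).

    w      : biomarker values in the biomarker study
    w1, w2 : paired biomarker values in the reliability subsample
    rho    : assumed correlation between the two biomarker measurement errors
    """
    if rho >= 1.0:
        raise ValueError("rho must be less than 1")
    diffs = [a - b for a, b in zip(w1, w2)]
    error_variance = 0.5 * variance(diffs) / (1.0 - rho)
    return variance(w) - error_variance

# Toy data for illustration only (not NBS values).
w = [1.0, 2.0, 3.0, 4.0, 5.0]
w1 = [1.0, 2.0, 3.0, 4.0]
w2 = [1.5, 1.5, 3.5, 3.5]
for rho in (0.0, 0.375, 0.5):
    print(rho, round(estimate_var_z(w, w1, w2, rho), 4))
```

Larger assumed values of ρ attribute more of the biomarker variance to measurement error, so the estimate of varZ decreases as ρ increases.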


In Table 3 we provide energy and protein analyses using the methods of the previous section for the full range of cardiovascular disease outcomes considered in Prentice et al. (2011), where details of the variables included for confounding control for each outcome category and of the stratification of the Cox model can be found. The left column of Table 3 shows hazard ratios for a 20% consumption increment and estimated 95% confidence intervals from 500 bootstrap samples, as given in Prentice et al. (2011). The second column shows corresponding estimates based on the approximation (6). In applying this approximation varX^ was estimated as described in the preceding section, with varZ^ estimated by its sample variance. The good agreement between the two columns for each of energy, protein, and protein density attests to the adequacy of this approximation in this context, where about 5% of study subjects experienced the most global cardiovascular disease outcome (total CHD plus total stroke plus coronary artery bypass graft plus percutaneous coronary intervention). The following Table 3 columns present corresponding hazard ratio estimates based on the estimated β* in (3), for certain specific biomarker measurement error correlations in the NBS reliability subsample, with confidence intervals again based on 500 bootstrap samples.

Table 3
Hazard ratio estimates for a 20% increment in calibrated FFQ energy, protein and protein density in relation to various cardiovascular disease outcomes, from 80,330 women enrolled in the WHI Dietary Modification Trial Comparison Group or Observational ...

The fact that the underlying dietary variable has positive variance limits the possible values of the biomarker measurement error correlation to be <0.69 for energy, <0.59 for protein, and <0.53 for protein density. In fact the substantial local variation in protein over time presumably leads to measurement error correlations that are relatively small for both protein and protein density, even if Z is defined as the logarithm of dietary consumption over a period of time as short as 6 months or a year. The local temporal variations in energy consumption are likely to be much smaller, however. Energy biomarker measurement error correlations may be close to zero, if (log) consumption over the short biomarker ascertainment period of two weeks is compared to average (log) consumption over a time period as short as a few months. In contrast, biomarker measurement error correlations relative to a long-term average consumption over some years or decades are likely to be positive and sizeable, since the reliability sample assessments at two close points in time can be expected to have commonalities arising from age-related changes in energy consumption and from other dietary changes during adulthood. With these thoughts in mind we present results in Table 3 for ρ = 0 and ρ = 0.375 for all three dietary variables, and also for ρ = 0.5 for energy. The interpretation of the protein and protein density analyses is rather insensitive to biomarker measurement error correlation. The energy association analyses are more sensitive. At a value of ρ = 0.5, which we think to be plausible for energy in this setting for long-term energy consumption, the hazard ratio estimates agree closely with our published values given in the first column, though confidence intervals are somewhat wider.
Hence, these analyses provide some support for positive associations between long-term energy consumption and the risk of coronary heart disease, and the risk of total cardiovascular disease including coronary artery bypass graft and percutaneous coronary intervention, with those associations largely attributable to related temporal increases in BMI. They also provide some support for an inverse association between protein consumption, over the short or long term, and the risk of stroke.
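The upper limits on ρ quoted at the start of this discussion can be related to the varZ estimator of the preceding section. As a sketch, with s_W^2 denoting the sample variance of W and s_D^2 the sample variance of (W1 − W2), both symbols being notation introduced here for illustration, requiring the estimated variance of Z to be positive gives

$$s_W^2 - \frac{s_D^2}{2(1 - \rho)} > 0 \quad\Longleftrightarrow\quad \rho < 1 - \frac{s_D^2}{2 s_W^2}.$$

If the limits were obtained in this way, the bound of 0.69 for energy would correspond to a ratio s_D^2 / (2 s_W^2) of about 0.31, and the lower bounds for protein and protein density would reflect larger within-person variation relative to the total biomarker variance.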


Developments of the type described above can help to define the future research agenda in nutritional epidemiology. It is clear that association studies of energy and protein consumption require a careful account of measurement error in dietary assessment for reliable inferences. Studies using suitable biomarkers seem to provide the logical next step to enhancing the reliability of dietary association studies, especially for studies involving the absolute intake of nutrients or foods. The issues described in the preceding section imply a need for biomarker reliability studies that take place over some months or years, along with concurrent dietary and physical activity assessment data, to elucidate the complex issues related to diet, physical activity, energy balance, body fat deposition, and chronic disease risk.

A major limitation in pursuing the biomarker approach to nutritional epidemiology arises from the rather few nutrients or dietary components for which there is an established biomarker that plausibly adheres to a classical measurement model, as in (1). We have recently initiated a human feeding study among 150 WHI participants in Seattle as an attempt to develop and evaluate additional such biomarkers. A human feeding study provides the possibility of directly assessing short-term nutrient consumption by providing food and drink to each participant over the feeding period. We will employ a two week feeding period in which each woman will be provided a diet that approximates her usual diet so that blood and urine measures stabilize quickly, and so that the dietary variation in the study population is retained. Food and drink having well characterized nutrient composition will be used in diet formulation. Our plan is to examine the extent to which variation in the provided diet can be ‘explained’ by relevant urine and blood measures and individual characteristics, with regression equations that explain 50% or more of the actual variation being considered as potential biomarkers. Both candidate biomarkers and biomarker discovery efforts will be included. Some additional detail on these plans is given in Prentice et al. (2009b).


This work was partially supported by grants from the US National Cancer Institute, and by contract from the National Heart, Lung and Blood Institute for the Women’s Health Initiative.


MSC 2000 : Primary 92B15; secondary 92C60 or 92D30.


  • Bingham SA. Urine Nitrogen as a Biomarker for the Validation of Dietary Protein Intake. Journal of Nutrition. 2003;133:921S–924S.
  • Cox DR. Regression models and life tables (with discussion). Journal of the Royal Statistical Society, Series B. 1972;34:187–220.
  • Joint WHO/FAO Expert Consultation on Diet, Nutrition and the Prevention of Chronic Diseases. Diet, Nutrition and the Prevention of Chronic Diseases: Report of a joint WHO/FAO expert consultation. WHO Technical Report Series. 2003;916:88.
  • Kaaks R, Ferrari P, Ciampi A, Plummer M, Riboli E. Uses and limitations of statistical accounting for random error correlations, in the validation of dietary questionnaire assessments. Public Health Nutrition. 2002;5:969–976.
  • Kalbfleisch JD, Prentice RL. The statistical analysis of failure time data. Second edition. Wiley; New York: 2002.
  • Neuhouser ML, Tinker L, Shaw PA, Schoeller D, Bingham SA, Van Horn L, Beresford SA, Caan B, Thomson C, Satterfield S, Kuller L, Heiss G, Smit E, Sarto G, Ockene J, Stefanick ML, Assaf A, Runswick S, Prentice RL. Use of recovery biomarkers to calibrate nutrient consumption self-reports in the Women’s Health Initiative. American Journal of Epidemiology. 2008;167:1247–1259.
  • Prentice RL, Huang Y, Kuller LH, Tinker LF, Van Horn L, Stefanick ML, Sarto G, Ockene J, Johnson KC. Biomarker-calibrated energy and protein consumption and cardiovascular disease risk among postmenopausal women. Epidemiology. 2011;22:170–179.
  • Prentice RL, Huang Y, Tinker LF, Beresford SA, Lampe JW, Neuhouser ML. Statistical Aspects of the Use of Biomarkers in Nutritional Epidemiology Research. Statistics in BioSciences. 2009b;1:112–123.
  • Prentice RL, Shaw PA, Bingham SA, Beresford SA, Caan B, Neuhouser ML, Patterson RE, Stefanick ML, Satterfield S, Thomson CA, Snetselaar L, Thomas A, Tinker LF. Biomarker-calibrated energy and protein consumption and increased cancer risk among postmenopausal women. American Journal of Epidemiology. 2009a;169:977–989.
  • Prentice RL, Sugar E, Wang CY, Neuhouser M, Patterson R. Research strategies and the use of nutrient biomarkers in studies of diet and chronic disease. Public Health Nutrition. 2002;5:977–984.
  • Schoeller DA. Recent advances from application of doubly labeled water to measurement of human energy expenditure. Journal of Nutrition. 1999;129:1765–1768.
  • Willett WC, Sampson L, Stampfer MJ, Rosner B, Bain C, Witschi J, Hennekens CH, Speizer FE. Reproducibility and validity of a semiquantitative food frequency questionnaire. American Journal of Epidemiology. 1985;122:51–65.
  • The Women’s Health Initiative Study Group. Design of the Women’s Health Initiative clinical trial and observational study. Controlled Clinical Trials. 1998;19:61–109.
  • World Cancer Research Fund & American Institute for Cancer Research. Food, Nutrition and the Prevention of Cancer: a global perspective. American Institute for Cancer Research; Washington, DC: 1997. p. 371.