A Time and Place for Causal Inference Methods in Perinatal and Paediatric Epidemiology
In this issue of Paediatric and Perinatal Epidemiology, Sudan et al.1 describe their analysis of cell phone exposure and hearing loss among children enrolled in the Danish National Birth Cohort. Sudan and colleagues have provided a thoughtful analysis and incorporated several relatively new analytic methods, including directed acyclic graphs (DAGs),2 marginal structural models (MSM),3 and doubly robust estimators (DRE).4 The authors also present results from sensitivity analyses for unmeasured confounding and outcome misclassification.5 While we applaud their efforts, we would like to provide a rationale for the use – and misuse – of these and other methods for causal inference.
To illustrate the causal relationships assumed in their analysis, Sudan and colleagues provided a directed acyclic graph. DAGs (also known as causal diagrams) are useful tools for inferring statistical associations from assumed underlying causal relationships.2 Previous research, subject-matter knowledge, and relationships observed within the empirical data can inform the relationships in a DAG. Confounders can be identified from the DAG as factors along a biasing, or confounding, pathway. From a faithful DAG, one can discern the smallest set of covariates necessary to control for bias, referred to as the minimal sufficient adjustment set.6 DAGs can also be used to identify colliders (factors that lie along a pathway from the exposure to the outcome but are caused by two other factors along that pathway6). The presence of a collider blocks the pathway without need for covariate adjustment; in fact, adjustment for a collider can induce bias. Critics of DAGs have argued that causal reality is more complicated than a graph can represent and that it is impossible to distinguish among all the different pathways, rendering DAGs of little utility in practice. Fortunately, DAGitty (http://www.dagitty.net),7 a freely available tool for drawing DAGs and identifying the minimal sufficient set of confounders for adjustment, allows even very complicated scenarios to be evaluated with ease.
Using the DAG proposed by Sudan et al., we investigated two aspects of their analysis that may have led to bias in their findings. First, although reduced hearing at age 18 months (Y1) lies along its own biasing pathway from X2 to Y2 (Fig 1), the authors did not include this variable as a confounder in their models, which could have led to residual confounding. However, as evaluated in the first part of their analysis, there appeared to be no statistical association between Y1 and X2; the bias due to residual confounding was therefore likely minimal. Nonetheless, had the association between Y1 and Y2 been strong and the prevalence of Y1 high, the bias could have been greater. Second, Sudan et al. sensibly grouped factors that they presumed occurred together causally in the DAG, which reduced its complexity. For example, gestational age, breastfeeding and ear infection up to 18 months were grouped together as factor "B." Before grouping factors, care should be taken to determine that the grouped factors are all affected by exactly the same set of causes and, similarly, that they all cause exactly the same set of effects. One way to cross-check whether grouping factors may have introduced such a bias is to separate these variables in the DAG and add hypothetical biasing paths. For example, if an unmeasured confounder (U1) affected both the outcome (Y2) and gestational age (B2), and another unmeasured confounder (U2) affected both the exposure (X2) and gestational age (B2), then either of those confounders would need to be included in the minimal sufficient adjustment set to control for bias (Fig 2). A short code sketch following the figure legends below illustrates how such back-door checks can also be carried out programmatically.
Figure 1. Directed acyclic graph from Sudan et al.,1 modified to identify the exposure and outcome variables from their primary analysis, with the biasing path X2 ← Y1 → Y2 bolded
Minimal sufficient adjustment set for estimating the total effect of X2 on Y2: {A, B, X1, Y1}
X1: Prenatal cell phone exposure
X2: Postnatal cell phone exposure (cell phone exposure at age 7)
Y1: Reduced hearing at age 18 months
Y2: Hearing loss at age 7 years
A: Mother’s age, socio-occupational status, prenatal smoking, prenatal alcohol use, fever during pregnancy, sex of child
B: Gestational age at birth, breastfeeding, ear infection up to 18 months
C: Inner ear inflammation up to age 7 years
Figure 2. Directed acyclic graph from Sudan et al.,1 with factor B separated into B1, B2 and B3 and with hypothetical unmeasured confounders U1 and U2 added
Minimal sufficient adjustment sets for estimating the total effect of X2 on Y2: {A, B1, B2, B3, X1, Y1, U1}, or {A, B1, B2, B3, X1, Y1, U2}
B1: Gestational age at birth
B2: Breastfeeding
B3: Ear infection up to 18 months
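To make the graphical reasoning above concrete, the sketch below, written by us for illustration and not part of the authors' analysis, encodes an approximation of the DAG in Figure 1 in Python and applies the back-door criterion to the reported minimal sufficient adjustment set. The edge list is our reconstruction from the relationships described in the text (the exact published arrows should be read from the original figure or entered into DAGitty), the helper function is_valid_backdoor_set is hypothetical, and the check relies on the d_separated function available in recent versions of the networkx package.

```python
# A minimal sketch (ours, not the authors' code) approximating the Figure 1 DAG.
# The edge list is an assumption reconstructed from the text, omitting C for brevity.
# Requires a networkx version providing nx.d_separated (renamed is_d_separator in 3.3+).
import networkx as nx

edges = [
    ("A", "X1"), ("A", "X2"), ("A", "Y2"),     # baseline maternal/child factors
    ("X1", "X2"), ("X1", "B"), ("X1", "Y1"),   # prenatal exposure affects later factors
    ("B", "X2"), ("B", "Y2"),                  # grouped factor B confounds X2-Y2
    ("Y1", "X2"), ("Y1", "Y2"),                # reduced hearing at 18 months confounds X2-Y2
    ("X2", "Y2"),                              # exposure-outcome arrow of interest
]
dag = nx.DiGraph(edges)

def is_valid_backdoor_set(g, exposure, outcome, z):
    """Back-door criterion: z contains no descendant of the exposure, and z
    d-separates exposure from outcome once the arrows out of the exposure are cut."""
    if z & nx.descendants(g, exposure):
        return False
    g_backdoor = g.copy()
    g_backdoor.remove_edges_from(list(g.out_edges(exposure)))
    return nx.d_separated(g_backdoor, {exposure}, {outcome}, z)

print(is_valid_backdoor_set(dag, "X2", "Y2", {"A", "B", "X1", "Y1"}))  # True
print(is_valid_backdoor_set(dag, "X2", "Y2", {"A", "B", "X1"}))        # False: X2 <- Y1 -> Y2 stays open
```

Dropping Y1 from the set leaves the biasing path X2 ← Y1 → Y2 open, which is exactly the first concern raised above.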
Another important application of DAGs is to lay the foundation for sensitivity analyses for unmeasured confounding and misclassification. Sudan et al. considered the effect of an unmeasured confounder on the unadjusted relationship between X2 and Y2, assuming different scenarios based on the prevalence of the unmeasured confounder and the strength of the confounding associations. They speculated that possible unmeasured confounders include the use of headphones or other sound-delivery devices. Adding this unmeasured confounder to the DAG illustrates that such a confounder, presumed proximal to both X2 and Y2, would then need to be included in the minimal sufficient adjustment set. Adequately evaluating its potential for bias would therefore be conditional on adjustment for the other variables in the minimal sufficient set. Probabilistic methods have been developed to allow for just such an adjusted bias analysis.5, 8, 9 Further, because unmeasured confounding, selection bias and information bias (i.e. misclassification) often co-occur, adjusted sensitivity analyses for these biases should be performed in conjunction.5 Identifying the placement of the unmeasured confounder in the DAG may also reveal that the factor lies along a pathway between the exposure and the outcome that has already been blocked (by a collider or by adjustment), in which case the potential bias is no longer a concern. For unmeasured confounders with an unknown placement in the DAG, the analyst must first consider possible placements of the factor within the DAG before embarking on a sensitivity analysis; the set of covariates for adjustment will depend on the hypothesised placement(s) of the unknown confounder.
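As a numerical illustration of this type of sensitivity analysis, the brief sketch below implements the simplest external-adjustment formula for a single unadjusted binary confounder; the probabilistic methods cited above extend this by drawing the bias parameters from prior distributions and repeating the correction. The scenario values are hypothetical and are ours, not Sudan et al.'s.

```python
# A minimal sketch (our illustration, not the authors' analysis) of a simple
# external-adjustment sensitivity analysis for one unmeasured binary confounder U.
# All input values below are hypothetical.
def rr_adjusted_for_unmeasured_confounder(rr_obs, rr_ud, p_u_exposed, p_u_unexposed):
    """Divide the observed risk ratio by the bias factor implied by an unmeasured
    binary confounder U with risk ratio rr_ud for the outcome and prevalence
    p_u_exposed / p_u_unexposed among the exposed / unexposed."""
    bias = (rr_ud * p_u_exposed + (1 - p_u_exposed)) / (rr_ud * p_u_unexposed + (1 - p_u_unexposed))
    return rr_obs / bias

# Hypothetical scenario: headphone use is more common among exposed children
# and doubles the risk of hearing loss.
print(rr_adjusted_for_unmeasured_confounder(rr_obs=1.2, rr_ud=2.0,
                                            p_u_exposed=0.4, p_u_unexposed=0.2))
```

Under these assumed bias parameters, an observed risk ratio of 1.2 would shrink to roughly 1.03 after external adjustment for the unmeasured confounder.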
DAGs can also be used to identify situations in which the effect of the exposure on the outcome cannot be estimated without bias using conventional methods. For example, had Sudan et al. been interested in the cumulative or joint effects of prenatal cell phone exposure (X1) and cell phone use at age 7 (X2) on hearing loss at age 7 (Y2), there is no adjustment set of covariates that would yield an unbiased effect estimate using conventional methods with their presented data. To estimate the cumulative or joint effect, associations would first need to be "removed" from the DAG; in particular, the arrows B → X2 and Y1 → X2 would need to be removed, because B and Y1 are confounders that are themselves affected by the prior exposure (X1) (Fig 3). Including them as confounders in the model would adjust for confounding, but would at the same time block two causal pathways from X1 to Y2 and lead to over-adjustment bias.6 One way to handle this analytical problem is to reweight the data using the inverse of the probability of exposure (or treatment), which can further be stabilised to increase precision.3 After this reweighting, the distribution of the time-varying confounders within each exposure group in the pseudo-population is the same as that in the total original population. The MSM (usually in the form of a conventional regression model) can then be fit to the reweighted data. The form of the structural model should be considered carefully, as different functional forms of the exposure metric may affect the fit of the model.10 Further information on implementing stabilised weights and MSM for time-varying confounding, including SAS and STATA code, is available.11–14
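For readers who do not use SAS or STATA, the following minimal sketch shows the usual two-step fit of an MSM with stabilised inverse probability of exposure weights in Python. It is our illustration under stated assumptions: a data frame df with a binary exposure x2, binary outcome y2 and confounder columns (all column names are hypothetical), and the statsmodels package for the weighted regression.

```python
# A minimal sketch (ours) of stabilised inverse probability of exposure weights
# followed by a weighted outcome model. Assumes a pandas DataFrame `df` with
# 0/1 columns x2 and y2 and hypothetical confounder columns.
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Step 1: exposure (treatment) model for the denominator of the weights.
denom_model = smf.glm("x2 ~ a_mat_age + x1 + b_gest_age + y1", data=df,
                      family=sm.families.Binomial()).fit()
p_denom = denom_model.predict(df)

# Numerator of the stabilised weights: marginal probability of exposure.
num_model = smf.glm("x2 ~ 1", data=df, family=sm.families.Binomial()).fit()
p_num = num_model.predict(df)

# Stabilised weight = P(X2 = x) / P(X2 = x | confounders) for the observed x.
df["sw"] = ((df["x2"] * p_num + (1 - df["x2"]) * (1 - p_num)) /
            (df["x2"] * p_denom + (1 - df["x2"]) * (1 - p_denom)))

# Step 2: the marginal structural model -- a weighted regression of the outcome
# on exposure only, with a robust (sandwich) variance.
msm = smf.glm("y2 ~ x2", data=df, family=sm.families.Poisson(),
              freq_weights=df["sw"].to_numpy()).fit(cov_type="HC1")
print(msm.summary())  # exp(coefficient on x2) approximates the marginal risk ratio
```

The robust variance is requested because the weights are estimated and the weighted pseudo-population would otherwise understate the uncertainty of the exposure effect.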
Sudan et al. used MSM, with inverse probability of exposure weighting, in their analysis of cell phone use at age 7 (X2) and hearing loss at age 7 (Y2). Because the authors' research question concerned a time-fixed exposure, MSM was not necessary to control for time-varying confounding.15 Its use in this case was simply an alternative to traditional adjusted regression modelling. However, the effect estimate from an MSM differs conceptually from that obtained from adjusted relative risk regression modelling. An MSM estimates the marginal effect, which is similar to the effect estimate from a hypothetical randomised controlled trial, whereas an effect estimate from an adjusted relative risk regression model is the effect conditional on a set of factors.14, 16 The effect estimate Sudan et al. obtained from the MSM was very similar to their crude effect estimate using traditional regression; this was because the bias due to confounding by A, B and X1 was minimal. The penalty for using an MSM here was a slightly wider confidence interval compared with the crude analysis.
Doubly robust estimators, also employed by Sudan et al., build on the weighting procedure implemented for MSM.4 The authors do not specify how DRE was implemented in their analysis, so here we provide a brief overview. In DRE, two models are specified: a treatment model and an outcome model. As long as at least one of the two models is correctly specified and the necessary assumptions are met (see below), the effect estimate is unbiased. There are many ways to implement DRE. One approach, as outlined by Funk et al.,17 is a three-step process: (i) predicted outcomes are estimated for each individual under each exposure condition, (ii) a propensity score is calculated for each individual, and (iii) the two are combined using propensity score augmentation. When presenting methods and results from DRE, it is useful to describe the form of the treatment and outcome models and how covariates were selected for each. Excellent descriptions of DRE using SAS and STATA are available.14, 17–19
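The three steps above can be sketched as follows; this is our illustration of an augmented inverse probability weighted (AIPW) doubly robust estimator in the spirit of Funk et al., not the authors' implementation, and it reuses the hypothetical data frame and column names from the previous sketch.

```python
# A minimal sketch (ours) of an AIPW doubly robust estimator.
# Assumes the same hypothetical DataFrame `df` as the MSM sketch above.
import numpy as np
import statsmodels.api as sm
import statsmodels.formula.api as smf

confounders = "a_mat_age + x1 + b_gest_age + y1"

# (i) Outcome model: predict each child's risk under exposure and non-exposure.
out_model = smf.glm(f"y2 ~ x2 + {confounders}", data=df,
                    family=sm.families.Binomial()).fit()
m1 = out_model.predict(df.assign(x2=1))   # predicted risk had everyone been exposed
m0 = out_model.predict(df.assign(x2=0))   # predicted risk had no one been exposed

# (ii) Treatment model: propensity score for exposure.
ps = smf.glm(f"x2 ~ {confounders}", data=df,
             family=sm.families.Binomial()).fit().predict(df)

# (iii) Combine: propensity-score-augmented means of the two potential outcomes.
a, y = df["x2"], df["y2"]
psi1 = np.mean(a * y / ps - (a - ps) / ps * m1)
psi0 = np.mean((1 - a) * y / (1 - ps) + (a - ps) / (1 - ps) * m0)
print("Doubly robust risk difference:", psi1 - psi0)
print("Doubly robust risk ratio:", psi1 / psi0)
```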
Under certain assumptions, MSM and DRE can yield effect estimates with a causal interpretation; that is, they approximate the results of a hypothetical randomised controlled trial. These assumptions include exchangeability, positivity, consistency, no model misspecification, and temporality, and they are also necessary for an unbiased effect estimate using conventional methods.16, 20 Exchangeability refers to the absence of residual confounding or selection bias. Although this assumption is untestable, sensitivity analyses can help determine whether it might be violated. Positivity assumes that there are exposed and unexposed individuals for every confounder combination in the data, which is testable empirically; parametric regression methods can easily mask violations of this assumption. Consistency refers to the assumption that a subject's observed outcome reflects her counterfactual outcome given her observed exposure history; this assumption is untestable but is fundamental to our understanding of the causal relationship. No model misspecification is very broad and refers to all models specified in the analysis, including the weight (or treatment) model and the final structural model.10 This assumption is untestable, but by specifying different models one can evaluate how robust the findings are to different model specifications. Unlike MSM, DRE relaxes this assumption by requiring only that either the treatment or the outcome model be correctly specified.4 Correct temporal ordering is a key assumption for causal inference and one that has remained important since the early days of modern epidemiology.21 No analytical method can overcome fundamental design limitations of cross-sectional data; if the direction of the relationship between exposure and outcome cannot be determined, a causal relationship cannot be inferred.
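Because positivity is the one assumption in this list that can be examined directly in the data, a short sketch of such a check is given below; it is our illustration, reusing the hypothetical data frame and propensity scores from the doubly robust sketch above.

```python
# A minimal sketch (ours) of an empirical positivity check: compare the
# estimated propensity score distributions of exposed and unexposed children,
# and look for strata containing only exposed or only unexposed subjects.
import pandas as pd

df["ps"] = ps
print(df.groupby("x2")["ps"].describe())   # compare ranges and quantiles by exposure group

# Cross-tabulate exposure within propensity score deciles (a one-number summary
# of the confounders); a zero cell flags a possible positivity violation.
df["ps_decile"] = pd.qcut(df["ps"], 10, duplicates="drop")
print(pd.crosstab(df["ps_decile"], df["x2"]))
```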
In the presence of time-dependent confounding and effect measure modification, MSM and DRE may not be appropriate, and several other methods are available to the analyst.3 To illustrate this situation, we return to the DAG proposed by Sudan et al. and assume that the relationship between X2 and Y2 is modified within different levels of B. Methods to handle this analytic scenario include, but are not limited to, structural nested models, with parameters estimated using g-computation,22–24 and artificial censoring of subjects using inverse probability of exposure weighting.25 Didactic explanations of analyses using g-computation are available.14, 26
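To give a flavour of g-computation, the sketch below implements a simplified parametric g-formula for the joint effect of X1 and X2, in which the time-varying covariates are simulated from fitted models before the outcome risk is predicted and averaged. It is our illustration only: the model forms, the hypothetical column names and the assumed parent sets are not taken from Sudan et al., and a full analysis would also add bootstrap confidence intervals.

```python
# A minimal sketch (ours, for illustration only) of parametric g-computation for
# the joint effect of prenatal (x1) and age-7 (x2) cell phone exposure. Assumes a
# pandas DataFrame `df` with 0/1 columns x1, x2, y1, y2, b_ear_infection and a
# baseline covariate a_mat_age; all names and parent sets are hypothetical.
import numpy as np
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)

# Step 1: fit a model for each post-baseline variable given its assumed parents.
m_y1 = smf.glm("y1 ~ x1 + a_mat_age", data=df, family=sm.families.Binomial()).fit()
m_b = smf.glm("b_ear_infection ~ x1 + a_mat_age", data=df, family=sm.families.Binomial()).fit()
m_y2 = smf.glm("y2 ~ x2 + x1 + y1 + b_ear_infection + a_mat_age", data=df,
               family=sm.families.Binomial()).fit()

def gcomp_risk(x1_set, x2_set, n_sim=20_000):
    """Monte Carlo g-formula: simulate y1 and b under the exposure regime
    (x1_set, x2_set), then average the predicted risk of y2 over the baseline
    covariate distribution."""
    base = df.sample(n=n_sim, replace=True, random_state=0).assign(x1=x1_set, x2=x2_set)
    base["y1"] = rng.binomial(1, m_y1.predict(base).to_numpy())
    base["b_ear_infection"] = rng.binomial(1, m_b.predict(base).to_numpy())
    return m_y2.predict(base).mean()

# Joint (cumulative) effect: always exposed versus never exposed.
print("Risk ratio:", gcomp_risk(1, 1) / gcomp_risk(0, 0))
```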
Longitudinal data with repeated measurements, such as those encountered by Sudan et al. using the Danish National Birth Cohort, can lead to complex causal research questions. DAGs are a useful tool for identifying the covariates necessary for adjustment to control for bias, laying the foundation for sensitivity analyses, and determining when conventional regression models may not be appropriate. Causal inference models, such as MSM and DRE, can be appropriate methods for analyses involving cumulative or joint effects of exposures over time. It is important to consider the assumptions necessary for causal inference from these models and the situations in which these models may be inappropriate. Clear and well-defined research questions are needed to guide the analysis and determine the time and place for causal inference methods.
Acknowledgments
This work was supported by the Intramural Research Program of the Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health.
Footnotes
Conflict of interest:
None declared.



