Modern Epidemiologic Approaches to Interaction: Applications to the Study of Genetic Interactions

Schwartz S.



Epidemiology attempts to discern the causes of disease through an analysis of the patterns of exposure/disease relationships that are brought into view by our study designs. The types of designs and methods that are developed are largely influenced by the health challenges that the population faces as well as any methodologic and technological constraints.

Current epidemiologic methods were sparked by the rise of chronic diseases that did not fit well within the causal models underlying infectious disease epidemiology. Infectious disease models, based on the Henle-Koch postulates, reserved the term “cause” for factors that were both necessary and sufficient for disease occurrence. Although this assumption did not apply strictly to the identified causes of many infectious diseases, this model worked well enough to provide utility over time.

A crisis arose, however, over the study of the relationship between smoking and lung cancer. Although the association between smoking and lung cancer was strong and seemed persuasive, smoking clearly was neither necessary nor sufficient for the development of lung cancer. This led to a paradigmatic crisis that over time resulted in the development of a new framework for the identification of causes, which crystallized as “risk factor epidemiology.” This framework is rooted in the notion that there are multiple pathways to the same disease and that within each pathway there are multiple causes that work in tandem to lead to the disease. These types of causes are often referred to as “risk factors.”

The risk factor framework generally is “egalitarian” in its assumptions about causation; all types of factors that contribute to disease occurrence can be called a cause. There may be some factors that are necessary causes in the sense that the disease never occurs in their absence, but other causes may not be necessary at all. In addition, even necessary causes require the presence of causal partners to lead to disease occurrence. These causal partners also are considered to be causes of the disease.

The necessity of a causal partner for disease occurrence is what we mean by “biologic interaction.” Thus, the very definition of a cause in risk factor epidemiology places the issue of interaction front and center. It is assumed that virtually all diseases arise from the interaction of two or more causes.

Despite the centrality of interaction to this causal framework, methodologic advances have focused mainly on the isolation of single causes and the identification of individual risk factors that contribute to disease occurrence in a population. New designs were developed to allow us to see the relationships between exposures1 and disease in our data that would provide clues to the identification of these causes. Statistical methods were developed to aid in causal inference.

The identification of the causal partners of particular risk factors, the assessment of interaction, was a more complex notion that awaited conceptual clarification and methodological advances. Considerable progress has been made; however, often a lag occurs between the development of new methods and approaches and their application and appearance in the literature. Thus, the way in which interaction is assessed in epidemiologic studies is only now beginning to reflect these newer methods.

What follows is a discussion of this newer way of thinking about how to identify “biologic interaction.” I prefer the use of the term “synergy” in this discussion because it is more neutral to the level of organization at which interaction is being described. Although these methods have developed separately from those in the field of genetics, they are fully applicable to the field, and while genes have characteristics that are distinct from many of the risk factors studied in epidemiology, an epidemiologic approach to causation easily and naturally encompasses genes as causes.

However, this application requires a shift in perspective. From a genetic point of view there is a hierarchy of causes, with “the gene” having centrality as the defining cause and all other factors being ancillary to it. Factors that are considered equal causes from an epidemiologic frame are sometimes labeled in genetics in a way that gives them secondary status. One example is the use of the term “phenocopy” to denote a case of disease that arises in the absence of a putative genetic cause. Another example is the concept of “reduced penetrance.” This term refers to the inexact relationship between a genotype and a phenotype and implies that this slippage is a characteristic of a gene; the gene evidences “reduced penetrance,” or the gene is “fully penetrant.” From an epidemiologic perspective, reduced penetrance is simply a normal characteristic of all causes—the lack of a one-to-one relationship between causes and diseases due to interaction. Reduced penetrance is not a characteristic of the gene, but rather a characteristic of the distribution of the causal partners with which the gene works to cause disease. It is the natural state of most causal relationships.

Thus epidemiologic approaches to interaction provide an exciting perspective on genetic concepts that may shed new light on genetic issues. Likewise, the integration of genetic thinking into epidemiology can advance methodology.

I begin this task with a discussion of why the assessment of interaction is so problematic, and then I will discuss the current epidemiologic resolution to the problem. However, to fully understand the solution and its applicability to a genetic context, we need to probe the concept of causation in epidemiology more fully. Although this may seem a bit off topic, it is central to understanding the elements of the new ways of thinking about synergy. Finally, more specific problems of application and design will be addressed.


The Problem

Because the testing of our hypotheses and the assessment of our data rely on statistical tools, we already are most familiar with the concept of statistical interaction. From a statistical perspective, we can say that there is interaction when in the presence of two factors the outcome occurs more frequently than would be expected based on the independent effects of each factor. By independent effect, we mean the effect of one factor in the absence of the other factor. To make this more concrete, we would say that interaction can be identified when among people with both a genetic variant and an environmental exposure the disease rate is higher than would be expected if the genetic factor and environmental exposure each worked independently.

Although this definition is clear in statistical terms, it raises the question of what “would be expected.” As it turns out, what would be expected depends on the effect measure or statistical model used to express the relationship between exposures and disease. This can be seen in the data from a study in psychiatric epidemiology that proved to be very enlightening in this regard. Brown and Harris (1978) wanted to test the theory that stressful life events and problems with intimacy interacted in causing depression. They hypothesized that, while both stressful life events and intimacy problems each may confer a risk of depression, when they are both present they confer a greater risk than would be expected if each worked through a separate causal pathway. The data derived from a study to test this hypothesis are depicted in Figure E-1.

FIGURE E-1. Assessment of interaction: example from Brown and Harris (1978).



Brown and Harris interpreted these data as supporting their claim for an interaction between intimacy problems and stressful life events. The risk of depression in those with neither stressful life events nor intimacy problems was 1 percent, while among those with only stressful life events it was 10 percent, and among those with only intimacy problems it was 3 percent. The risk difference conferred by stressful life events alone was therefore 9 percent (10 percent − 1 percent), and the risk difference conferred by intimacy problems alone was 2 percent (3 percent − 1 percent). If there were no interaction, one would expect that when both factors were present the risk conferred would be 11 percent (9 percent + 2 percent). However, the data show that the risk conferred when both were present was 32 percent, which is substantially greater than would be expected based on the independent effects of each risk factor. Brown and Harris therefore concluded that these data supported their theory of an interaction between stressful life events and intimacy problems in causing depression.

Tennant and Bebbington (1978) challenged this conclusion. They reanalyzed these data using log linear modeling. This analysis calculated the effects on a different scale by calculating risk ratios. Using this model, life events acting alone increased the risk of depression by a factor of 10 (10 percent = 1 percent × 10). Intimacy problems alone increased the risk by a factor of 3 (3 percent = 1 percent × 3). Therefore, based on this calculus one would expect the co-presence of these risk factors to increase the risk by a factor of 30 (1 percent × 10 × 3 = 30 percent) if they were acting independently of each other. This is very close to the 32 percent risk actually found. Thus, Tennant and Bebbington concluded from these same data that there was no support for Brown and Harris’s conclusion.

What was not fully appreciated at the time was that both Brown and Harris and Tennant and Bebbington provided absolutely correct interpretations of the data based on the unarticulated statistical assumptions of their approaches. Brown and Harris, using risk differences to express the effects of risk factors, used a model that implicitly assumed that, absent interaction, risks add in their effects. They used an additive model. Tennant and Bebbington, on the other hand, analyzed the data using a log linear model that implicitly assumed that absent interaction, risks multiply in their effects. They used a multiplicative model. Thus, based on statistical definitions of interaction the same data both did and did not support a theory of interaction.
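The two competing expectations can be reproduced directly from the figures quoted above. The following sketch (in Python, using the risks reported in the text) computes the joint risk that each model predicts in the absence of interaction:

```python
# Risks reported by Brown and Harris (1978), as quoted in the text.
r00 = 0.01  # neither stressful life events nor intimacy problems
r10 = 0.10  # stressful life events only
r01 = 0.03  # intimacy problems only
r11 = 0.32  # both factors present

# Additive model (absent interaction, risk differences add):
# excess risks of 0.09 and 0.02 over the 0.01 baseline give 0.12 in total.
expected_additive = r00 + (r10 - r00) + (r01 - r00)

# Multiplicative model (absent interaction, risk ratios multiply):
# ratios of 10 and 3 applied to the 0.01 baseline give 0.30 in total.
expected_multiplicative = r00 * (r10 / r00) * (r01 / r00)

print(expected_additive, expected_multiplicative, r11)
```

The observed joint risk of 0.32 far exceeds the additive expectation of 0.12 yet sits close to the multiplicative expectation of 0.30, which is exactly the disagreement between the two analyses.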

This state of affairs is disconcerting, to say the least. We depend on our data and statistical tools to give us a rough estimate of the state of affairs in the real world, and it is problematic when the answers to our questions differ depending on the statistical model we use to assess our data. To make matters worse, the choice of statistical model often is based on statistical considerations. For example, we usually employ additive models, such as linear regression, when our outcome variables are continuous. When our outcomes are dichotomous, as they frequently are in genetic and epidemiologic contexts, we employ logistic regression, because such outcomes violate the statistical assumptions of linear regression models. Although this choice meets statistical requirements, it shifts us to a multiplicative model. Linear regression assumes that risks add in their effects, and thus interaction is indicated by an appreciable deviation from additivity (i.e., sub- or superadditivity). Logistic regression assumes that risks multiply in their effects, and thus interaction is indicated by an appreciable deviation from multiplicativity (i.e., sub- or supermultiplicativity).

The problem is that if both risk factors have an effect, there always will be interaction on at least one of these scales. As illustrated in Figure E-2, additivity implies submultiplicativity, and multiplicativity implies superadditivity. Thus, except in instances of supermultiplicativity (in which both models will index positive interaction) and subadditivity (in which both models index negative interaction), the answer to the question of whether or not there is interaction will depend on the statistical model that we choose.
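This relationship holds for any pair of risks in which both factors raise risk. A minimal sketch, using hypothetical risks (not from the text):

```python
# Hypothetical risks; both factors raise risk above the 0.01 baseline.
r00, r10, r01 = 0.01, 0.05, 0.03

additive_joint = r00 + (r10 - r00) + (r01 - r00)        # exactly additive joint risk
multiplicative_joint = r00 * (r10 / r00) * (r01 / r00)  # exactly multiplicative joint risk

# An exactly additive joint risk falls short of the product of the
# marginal risk ratios: additivity implies submultiplicativity.
rr_joint_if_additive = additive_joint / r00
assert rr_joint_if_additive < (r10 / r00) * (r01 / r00)

# An exactly multiplicative joint risk exceeds the sum of the marginal
# excess risks: multiplicativity implies superadditivity.
excess_if_multiplicative = multiplicative_joint - r00
assert excess_if_multiplicative > (r10 - r00) + (r01 - r00)
```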

FIGURE E-2. Relationship between additive and multiplicative interaction.



This is very unsettling, because we want our statistical models to represent our concepts rather than having them define our concepts. So, the question is what model best represents the “true” relationship between risk factors. That is, do risk factors really add or multiply in their effects? Darroch (1997), Rothman and Greenland (1998), and others have grappled with this problem. It appears that the additive model with a twist best represents what we mean by interaction. The twist is due to redundancy in causes, as we shall see.

To appreciate this argument, and to assess its applicability to the context of assessing interactions that include genetic factors, a fuller discussion of the causal model on which this assessment is based is necessary. This causal model—the counterfactual or potential outcomes model—developed in philosophy and statistics (Mackie, 1974; Maldonado and Greenland, 2002; Rubin, 2004; Shadish et al., 2002), underlies much of the causal thinking today in epidemiology and allied fields such as history, sociology, and economics.

The solution to the interaction problem derives from the application of this causal model to synergy. The advantage of this approach is obvious. It provides a way to assess what we mean by interaction conceptually and asks what mathematic representations support our concepts, rather than providing a statistical model and then contorting our concepts to fit the requirements of that model. Whether or not you agree that Darroch’s solution is correct, this approach toward the solution seems reasonable.

The Underlying Causal Model

The counterfactual or potential outcomes model underlies many current developments in epidemiologic methods. Although at first blush it sounds intimidating, this way of thinking about causes echoes simple notions that we apply in everyday circumstances. Nonetheless, its articulation has many interesting implications for causal thinking and is an immensely useful tool for grappling with difficult design decisions, and, as we shall see, for assessing the relationship between our conceptual and statistical tools.

From a counterfactual perspective, a cause is any factor without which the disease event would not have occurred, at least not when it did, given that all other conditions are fixed (Greenland and Robins, 1986; Maldonado and Greenland, 2002; Rothman and Greenland, 1998). Note that this defines causation at the level of the individual, with the definition applying to individual disease events.

The counterfactual way of thinking is familiar to all of us when we second guess our actions and think about what would have happened had we taken a different action. We compare what happened to what would have happened had we made a different choice. Similarly, when we try to make a decision about how to act in the future, we often imagine the outcome under alternative sets of actions. We compare what we think would happen under one action with what we think would happen under a different action.

We also use this type of thought experiment to conceptually separate co-occurrences that are coincidental from those that are causal. So, for example, if a teakettle whistles and then the doorbell rings, we do not assign causality to the teakettle’s whistle, because we think that the doorbell would have rung even without the teakettle whistling, assuming all else remained the same. This is the essence of causation from a counterfactual perspective.

Rothman (1976) has developed a heuristic based on this definition of a cause—referred to as causal pies—that provides a useful framework for understanding the implications of this approach.

In this heuristic, the causes of each disease event are depicted by a causal pie (a circle), cut into its constituent pieces. Each piece of the pie represents an exposure that contributes to the occurrence of the disease event. When all of the pieces of the pie are present, the disease occurs. Thus, each pie represents a sufficient cause of disease that is comprised of component causes each of which are necessary for the completion of this sufficient cause of disease. For example, as depicted in Figure E-3, there are three posited causal pathways to this disease outcome; individuals can get this disease from sufficient causes 1, 2, or 3. For sufficient cause 1 to occur, an individual must be exposed to components A, B, and C. If any one of the components is missing, the pie will not be complete and disease will not occur, at least not through this mechanism.

FIGURE E-3. Rothman’s causal pies.



Thus, each component in the pie is a cause according to the counterfactual definition, because given that all else is fixed (i.e., all of the causal partners are in place), if we remove component A, for example, the outcome would not have occurred.

Thus, from this perspective, biologic interaction is the relationship between two factors in the same causal pie. In more technical language, biologic interaction occurs when one risk factor allows the other to be expressed in a disease outcome. I prefer to refer to this process as synergy (a term also favored in the epidemiologic literature), because two factors may have causal effects when they influence each other on some level of organization other than the biologic. In the Brown and Harris example above, the interaction between stressful life events and intimacy problems in causing depression might be considered “psychologic interaction.” Of course, these psychological factors need to have biologic consequences to cause disease, but the joint effects occur at the psychological level. The counterfactual perspective and Rothman’s causal pies are neutral to the level of organization under discussion.2

Thus, for example, A could be a genetic mutation and B one of the environmental factors that stimulates synthesis of a detrimental gene product, or A and B could be two genes that interact to cause disease. Therefore, several concepts that are distinguished from one another in genetics (e.g., epistasis, gene-environment interaction) would be considered to be the same phenomenon in epidemiology. Note that because all of the components in the same causal pie interact in this way, when we ask about interaction, we always must specify the particular components for which we are assessing interaction.

What becomes apparent from this model is that the effect of an exposure depends on the presence of its causal partners. Thus, A will have an effect if and only if its causal partners B and C are present. In contexts in which the causal partners are ubiquitous, the exposure will have a huge effect, since the conditions that activate it always will be present. In contexts in which the causal partners are absent, the exposure will have no effect. In genetics, the classic example used to illustrate this point is phenylketonuria (PKU). The genetic variant that causes PKU has a huge effect in societies in which phenylalanine is a ubiquitous part of the human diet, but a small effect in those in which it is not. Thus, the effect of the “PKU gene” depends on the prevalence of its causal partners. Indeed, intervention on the causal partner is the way in which we largely prevent the deleterious effects of this genetic variant.

As noted above, causes are defined for the individual who gets the disease, which makes sense because the disease occurs in the body of the individual. However, although we use individuals as the units of our analysis, we cannot draw conclusions about the units, but only about the average of the units. Thus, the causal effect (also called the causal contrast) is indexed by the difference between the proportion of exposed people who got the disease at a particular moment in time and the proportion of these same people who would have gotten the disease at that particular moment in time had the exposure not occurred, all things being equal (Mackie, 1974; Rothman and Greenland, 1998), as illustrated in Figure E-4.

FIGURE E-4. Causal effect (causal contrast).



Estimating Causal Effects from a Counterfactual Perspective

It is apparent that although we can observe the amount of disease that exposed people experience, we cannot observe the amount of disease that they would have experienced during that same period had they not been exposed. We cannot see both the “fact” (the exposure and disease state of a person) and the “counterfactual” (the disease state under the condition of nonexposure). The counterfactual is, by definition, counter to the facts and therefore not visible. This is a reiteration of the central problem in disease etiology—that causation is not observable. We can see the co-occurrence of exposures and disease, but causation itself cannot be observed, it can only be inferred.

Since we cannot observe the counterfactual state, we select a group of unexposed people as a substitute, or proxy, for the unobservable counterfactual. This substitute gives us the “correct answer” (i.e., represents the true causal effect) to the extent that it is a good proxy. What we mean by a good proxy is that the disease proportion (i.e., disease risk) in this group of unexposed people represents the disease risk the exposed would have had had they not been exposed (i.e., the counterfactual risk).

For the unexposed to be a good proxy, the exposed and the unexposed should be equal on all causes of disease other than the exposure of interest. When this occurs, the exposed and unexposed are said to be “exchangeable.” A lack of exchangeability—that is, when the disease risk in the unexposed does not equal that of the exposed had they not been exposed— is what we mean by confounding. When there is confounding, we cannot see whether the exposure had an effect or not. However, assuming exchangeability, or assuming that the unexposed are a good proxy for the counterfactual, the difference in the disease risk between the exposed and unexposed provides an index of the effect of the exposure.

We will discuss this issue of confounding in a bit more detail in order to more fully understand the implications of the counterfactual approach for interaction. This simpler scenario, in which we are attempting to identify the causal effect of a single exposure, will ease the discussion of the application to the more complex scenario of synergy.

Suppose we have a disease such as depression, whose sufficient causes are depicted in Figure E-5. Our hypothesis is that A (perhaps some genetic variant) is a cause of depression. We assume that A has causal partners, which are unidentified but indicated in this model by B. Note that B is simply a stand-in for all of the factors that must be present for A to have an effect. We also assume that there are other pathways to the disease that do not include A. We will note all these other causal pathways by a causal pie with X. X is neither a single exposure nor a single causal pathway. Rather, X is a stand-in for all combinations of exposures that lead to disease that do not include A. Another complication is that it is possible for A to prevent disease in some situations. If so, this means that some people have a combination of exposures (depicted by Q) that require the absence of A to get the disease.

FIGURE E-5. Hypothetical example—causes of depression.



If we consider causation under the counterfactual model, we can imagine what would happen to people with different causal partners if they were exposed to the risk factor under investigation—A in this instance. These potential outcomes are depicted in Figure E-6.

FIGURE E-6. Exposure of interest A: Potential outcomes of people with different causal partners.



People exposed to X will get the disease if they are exposed to A or not exposed to A (i.e., under the counterfactual they will get disease as well). The exposure does not cause the disease for these people, since even without the exposure they would have gotten it. We label these people Type 1, Doomed.3 The word is a little stronger than the meaning implied. It simply means that during the period under consideration these people will get the disease under study with or without the exposure of interest. Types are also not inherent characteristics of people; rather, they are a categorization of people by the causal partners (i.e., all risk factors other than those under study) to which they have been exposed by the end of the study period.

People with B will get the disease if they are exposed but not if they are not exposed (i.e., under the counterfactual they will not get the disease). We call these people Type 2, Causal Types (i.e., the exposure under investigation is causal for them). When we ask the question, “Is A a cause of disease?” what we really want to know is whether there are any Causal Types in the population.

People with exposure Q will not get the disease if they are exposed, but under the counterfactual, if they were unexposed, they would get the disease. For these people the exposure has an effect, but it is preventive. They are called Type 3, Preventive Types. People who do not have B, X, or Q are labeled Type 4, Immune Types. Regardless of their exposure to A, they will not get the disease.

Thus, the true causal effect of the exposure is indicated by the proportion of exposed people who get the disease compared with the proportion of exposed people who would have gotten the disease without the exposure. This is the proportion of Doomed and Causal Types in the population compared with the proportion of Doomed and Preventive Types.

But types are unobservable. They represent the potential disease outcomes under the exposed and unexposed circumstances. If we could discern a person’s type, we would know how he or she got the disease and know whether or not the exposure under study is, in fact, a cause for him or her. What can be observed is the disease experience of a cohort of people under one of the two conditions, either exposed or unexposed, but not both. Thus, if we take a cohort of people who are exposed, we can assess their actual disease experience, but not their counterfactual disease experience (i.e., what the disease proportion would have been among them had they not been exposed).

We use a comparison of the proportion of disease in an exposed and unexposed group as our best representation of the causal contrast. For the purposes of this discussion, I will hereafter assume that the unexposed are a good proxy for the counterfactual (i.e., the exposed and unexposed are exchangeable, and there is no confounding or bias of any type). Exchangeability also can be understood in terms of the types: it means that the distribution of types is the same in the exposed and unexposed cohorts (or, more specifically, the proportions of Doomed and Preventive Types are the same in the two cohorts).

It should be noted again that the causal contrast of a group of people indexes the average effect of the exposure. If the exposure can have both causal and preventive effects, then our measures tell us only whether or not there are more people for whom the exposure is causal in the population than people for whom the exposure is preventive. For example, the risk difference is the difference in the proportion of Causal (Type 2) and Preventive Types (Type 3) in the population, and the risk ratio is the ratio of Types 1 and 2 (Doomed and Causal) to Types 1 and 3 (Doomed and Preventive). If we can assume that the exposure can have only a causal effect, and never a preventive effect, then the difference in these proportions tells us the proportion of people for whom the exposure is, in fact, causal.
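In terms of type proportions, the standard effect measures can be written down directly. A sketch with hypothetical proportions (not from the text; the four types sum to 1):

```python
# Hypothetical proportions of the four response types in the population.
p_doomed, p_causal, p_preventive, p_immune = 0.05, 0.10, 0.02, 0.83

# Among the exposed, the Doomed and Causal Types get the disease;
# among exchangeable unexposed, the Doomed and Preventive Types do.
risk_exposed = p_doomed + p_causal
risk_unexposed = p_doomed + p_preventive

# The risk difference reduces to the difference between the Causal and
# Preventive proportions; the risk ratio compares the two mixtures.
risk_difference = risk_exposed - risk_unexposed
risk_ratio = risk_exposed / risk_unexposed
```

If the exposure is assumed never preventive (p_preventive = 0), the risk difference equals the proportion of people for whom the exposure was, in fact, causal.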

In sum, any time that we calculate a risk ratio or a risk difference, we are using the data we have on patterns of exposure/disease relationships to infer something about the types of people in the population. In particular, we make inferences about the presence and proportion of Causal Types (Type 2’s) in the population. This is the basis for causal inference in epidemiology.

Darroch (1997) and Rothman and Greenland (1998) built on this insight to assess the particular mathematical representations that would arise in our data if, in fact, there were people in the population who got the disease because of the biological interaction of two particular exposures. From our definition of a cause, this has a particular meaning: There is biologic interaction (synergy) if and only if there are some people in the population who got the disease because they were exposed to both the exposures under consideration and who would not have gotten the disease otherwise. By “otherwise,” I mean they would not have gotten the disease if they were exposed to only one of the two exposures or if they were exposed to neither.

Estimating Synergy Under a Counterfactual Perspective

For purposes of exposition, I will assume that we are interested in the hypothesis that a particular genetic polymorphism (Gene A) interacts with a particular toxin (Environmental Exposure B) to cause a particular disease outcome. We can think about this as a genetic variant that makes the body’s cells unable to clear some toxin from the system. We do not assume that this is the only route to the disease.

One limitation to the detection of synergy is that we must assume that Gene A and Environmental Toxin B are not causal in some people and protective in others. In other words, to make the detection of synergy possible at all, we must assume that this particular genetic polymorphism and this particular environmental toxin can only cause damage and are never protective (although they may be neutral).

We can then conceptualize the causal pies that would depict the causes of disease from the point of view of interest in both Gene A and Environmental Exposure B and, in particular, in their interaction. There would then be four sufficient causes for this disease; one sufficient cause requires both A and B and their causal partners, one requires A and its causal partners, another B and its causal partners, and, finally, one requires neither A nor B (see Figure E-7). Our interest in synergy means that we want to know if there are any people, and, if so, how many, who got the disease from the first sufficient cause.

FIGURE E-7. Sufficient causes of an outcome: interest in the interaction of Gene A and Environmental Exposure B.



Based on this model, there are different response types to our two exposures of interest. People who are exposed to W will get the disease if and only if they are exposed to both Gene A and Environmental Exposure B. These are the Synergistic Types whose presence or absence in the population we want to detect. People with X will get the disease if they are exposed to A, regardless of their exposure to B. They are the Gene A Susceptible Types. Likewise, people exposed to Y will get the disease if they are exposed to B, regardless of exposure to A. They are the Environmental Exposure B Susceptible Types. People exposed to Z, the Doomed, will get the disease whether or not they are exposed to A or B; they will get the disease from a sufficient cause that does not include A or B in the causal pathway. The Immune Type will not get disease by the end of the study, regardless of their exposure to A or B. Finally, there is another type, more recently discovered, that provides the “twist” alluded to above—people who have both X and Y and who will get the disease if they are exposed to either A or B. Such people are called Parallel Types.

One can think of the four right-hand columns of the figure as different, exchangeable cohorts of people. By exchangeable, we mean that the distribution of types across the cohorts is the same. That is, if 20 percent of the cohort exposed to A is Doomed, then 20 percent of the cohort exposed to B also is Doomed. This is simply an expansion of the “no confounding” assumption described earlier in the context of the detection of single causes.

One cohort is exposed to A only, one to B only, one to both A and B, and one to neither. The notation, diseased or not diseased, in each column indicates the disease outcome for the type described in each row under each exposure condition. For example, row 2, the Gene A Susceptible Type, will get the disease if exposed to A or if exposed to both A and B, but not otherwise. It should be noted that if exposed to A and B, the causal effect for this type is still only A.

What we want to know are the types of people (or the proportion of each type) in the population. But types are not visible. The only thing that is visible is the pattern of exposure and disease experience; we can see the proportion of each exposure group that gets the disease. We want to use these patterns of exposure-disease association to identify the proportion of each type (and in particular the proportion of Synergistic Types) in the population.

By looking down the four right-hand columns in Figure E-8, we can easily identify the proportion of two types in the population—the Immune and the Doomed. The Doomed are represented by the proportion of people who get the disease among the cohort that is exposed to neither A nor B. Similarly, the Immune are the proportion of people exposed to both A and B who do not get the disease.

FIGURE E-8. Assessing interaction between Gene A and Environmental Exposure.



We also can see from Figure E-8 the types that contribute to disease under each exposure condition, which are summarized in Figure E-9. Among those exposed to both A and B, the Synergistic, Doomed, A Susceptible, B Susceptible, and Parallel types all get disease and thus contribute to the proportion of diseased people (i.e., the risk) in this exposure cohort. Among those exposed only to Gene A, the Doomed, A Susceptible Types, and Parallel Types contribute to the risk; among those exposed to B only, the Doomed, B Susceptible, and Parallel Types contribute to the risk; and among those exposed to neither A nor B, only the Doomed contribute to the risk.

FIGURE E-9. Relationship between observed risk (proportion with disease) and unobserved types for assessing synergy.



We can see the risk (the proportion diseased) in each exposure group for which we provide specific labels. R12 is the risk (the proportion diseased) among those exposed to both A and B; R1 is the risk for those exposed only to A; R2 the risk for those exposed only to B; and R the baseline risk (i.e., the risk among those exposed to neither A nor B). We can now translate the proportion diseased (the risk) we observe under each exposure category into the underlying types that contribute to the risk in each exposure category.

Using basic mathematical tools, we attempt to isolate the Synergistic Types from the others. The closest we can come is the isolation of the balance between Synergistic and Parallel Types. The proportion of (synergistic − parallel) types in the population = R12 − R1 − R2 + R. This is the additive model, (R12 − R) − (R1 − R) − (R2 − R), which assumes that risks add in their effects, with the twist that parallelism makes the relationships somewhat less than additive. Thus, if the risk of disease among those exposed to both factors is more than the sum of the risk differences for each factor alone, there is evidence of Synergistic Types in the population. This is evidence that Gene A and Environmental Factor B work in a synergistic way to cause disease for at least some people.
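
Darroch's cancellation can be checked numerically. The following is a minimal sketch, assuming purely hypothetical proportions for each response type (none of these numbers come from any study): the risk in each cohort is the sum of the types that get disease under that exposure condition, and the contrast R12 − R1 − R2 + R recovers exactly the synergistic minus parallel proportion.

```python
# Hypothetical proportions of each response type in the population.
# These values are invented for illustration; the remainder (0.72) are Immune.
types = {
    "synergistic": 0.05,    # diseased only if exposed to both A and B
    "a_susceptible": 0.10,  # diseased if exposed to A
    "b_susceptible": 0.08,  # diseased if exposed to B
    "parallel": 0.03,       # diseased if exposed to either A or B
    "doomed": 0.02,         # diseased regardless of exposure
}

# Risk in each cohort = sum of the proportions of types diseased there.
r12 = sum(types.values())                                           # A and B
r1 = types["a_susceptible"] + types["parallel"] + types["doomed"]   # A only
r2 = types["b_susceptible"] + types["parallel"] + types["doomed"]   # B only
r0 = types["doomed"]                                                # neither

# All other types cancel; only (synergistic - parallel) survives.
contrast = r12 - r1 - r2 + r0
print(round(contrast, 2))  # 0.02, i.e., 0.05 - 0.03
```

Changing the hypothetical proportions changes the risks, but the contrast always equals the synergistic proportion minus the parallel proportion.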

Note, however, that we cannot definitively state what proportion of the disease is due to synergy. We can only say that the proportion of Synergistic Types is greater than the proportion of Parallel Types. In addition, perfect additivity is compatible with either no Synergistic Types in the population or a perfect balance of Synergistic and Parallel Types. Just as in the simple case of identifying single causes, we only identify the average risk—that is, the preponderance of causal over protective effects of an exposure—so too in the face of parallelism, we cannot rule out synergy if we find less than superadditivity, but we do find support if there is superadditivity.

It is important to note the constraints on this conclusion. First, this analysis makes all of the usual assumptions that apply in the way we currently conduct research; it assumes such things as independence of outcomes between units and no feedback loops. Second, it makes the important assumption that the exposures under consideration express either synergy or antagonism, but not both; it is assumed that a risk factor has only a causal effect or a preventive effect, but not both. How realistic this assumption is depends on the exposures under consideration. In psychology, this assumption is often unrealistic. For example, there may be parenting practices (such as strict discipline) that would be beneficial for children with one type of temperament, but detrimental for children with another. In genetics, the “norm of reaction,” where a genetic factor has positive or negative effects depending on the context (Levins and Lewontin, 1985), could violate this assumption. However, this is simply a recognition that under these circumstances there are too many unknowns for any of our traditional mathematical models to handle. These caveats notwithstanding, Darroch’s argument begins with the conceptual model and then brings us to the mathematical model that represents synergy most closely.

Applications in Practice

The conclusion drawn from these analyses is that synergy is indexed by deviations from additivity. In practice then, how do we estimate synergy using this approach? One method is to calculate an “interaction contrast” (Rothman and Greenland, 1998). To illustrate how this is done, I will use an example based on the interaction between a serotonin transporter gene polymorphism and life stress in causing depression, as reported from the Dunedin birth cohort (Caspi et al., 2003). The hypothesis was that there is a synergistic relationship between a short “s” allele and multiple stressful life events in causing depression.

As illustrated in Figure E-10, the disease prevalence among those with neither the susceptible genotype nor life events was 10 percent; among those with only the susceptible genotype, 10 percent; among those with only life events, 17 percent; and among those with both life events and the susceptible genotype, 33 percent. In this instance the interaction contrast would be .33 − .17 − .10 + .10 = .16. The interaction contrast thus equals the risk among those with both factors (.33), minus the risk among those with one (.17), minus the risk among those with the other (.10), plus the baseline risk (.10). Since the interaction contrast here is greater than zero (.16), it indicates the presence of synergy in this population.
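
The arithmetic above can be reproduced directly from the reported prevalences; a short sketch (Python used purely for illustration):

```python
# Prevalences reported in Figure E-10 (Caspi et al., 2003).
r_both = 0.33     # "s" allele and stressful life events
r_events = 0.17   # stressful life events only
r_gene = 0.10     # susceptible genotype only
r_base = 0.10     # neither

# Interaction contrast: risk with both factors, minus each
# single-factor risk, plus the baseline risk.
ic = r_both - r_events - r_gene + r_base
print(round(ic, 2))  # 0.16 > 0: evidence of Synergistic Types
```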

FIGURE E-10. Estimation of the interaction contrast.



In this example, the risks required for the computation were directly provided by the report. However, in a cohort study we can compute the interaction contrast regardless of the form in which the results are analyzed and presented. Suppose we analyzed the data under a logistic regression model. The baseline odds of disease would be derived from the intercept. The odds ratios from the logistic regression then would be used to obtain the odds of disease under the other conditions. Finally, the odds would be converted back to risks (p = odds/(1 + odds)).
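
For illustration only, here is a sketch of that conversion. The intercept and odds ratios below are hypothetical, chosen to roughly reproduce the Figure E-10 prevalences; they are not the published model estimates.

```python
import math

# Hypothetical logistic regression output (NOT the published estimates):
# an intercept and odds ratios chosen to roughly match Figure E-10.
intercept = math.log(0.111)          # baseline log-odds, neither factor
or_gene, or_events, or_both = 1.0, 1.84, 4.43

def to_risk(odds):
    # Convert odds back to a risk: p = odds / (1 + odds).
    return odds / (1 + odds)

base_odds = math.exp(intercept)
risks = {
    "neither": to_risk(base_odds),
    "gene": to_risk(base_odds * or_gene),
    "events": to_risk(base_odds * or_events),
    "both": to_risk(base_odds * or_both),
}
ic = risks["both"] - risks["events"] - risks["gene"] + risks["neither"]
print(round(ic, 2))  # 0.16
```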

When we cannot estimate the baseline risk of disease, as in a case-control study, we can calculate an interaction contrast ratio using the odds ratios computed from a logistic regression analysis. The interaction contrast ratio is the odds ratio for those with both factors, minus the odds ratio for those with one factor, minus the odds ratio for those with the other factor, plus one. For illustration, I computed the odds ratios for the Dunedin study from the prevalence estimates given in Figure E-10. The baseline odds of depression among those with neither the “s” allele nor stressful life events are .11 (.10/.90). The odds for those with both factors are .49 (.33/.67); for those with only the “s” allele they are .11 (.10/.90), and for those with only life events they are .20 (.17/.83). Therefore the odds ratios would be 4.4 for those with both factors, 1.8 for life events alone, and 1 for the “s” allele alone. The interaction contrast ratio in this context would be 4.4 − 1.8 − 1 + 1 = 2.6. Since the interaction contrast ratio is greater than zero, this indicates the presence of Synergistic Types in the population. Several methods have been developed to calculate p values and confidence intervals around these estimates (see, e.g., Assmann et al., 1996; Hosmer and Lemeshow, 1992; Rothman and Greenland, 1998).
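
The same computation can be sketched in a few lines; the `odds` helper below is ours, not from any package, and the prevalences come from Figure E-10:

```python
# Recompute the case-control quantities from the Figure E-10 prevalences.
def odds(p):
    # odds = p / (1 - p)
    return p / (1 - p)

base = odds(0.10)                 # ~.11
or_both = odds(0.33) / base       # ~4.4
or_events = odds(0.17) / base     # ~1.8
or_gene = odds(0.10) / base       # 1.0

# Interaction contrast ratio: OR(both) - OR(one) - OR(other) + 1.
icr = or_both - or_events - or_gene + 1
print(round(icr, 1))  # 2.6 > 0: evidence of Synergistic Types
```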

Although this is the understanding of synergy that is accepted in the methodologic literature, it has begun to filter down into actual research articles only recently (e.g., Li et al., 2005; Olshan et al., 2001; Rauscher et al., 2003; Shen et al., 2005). It is interesting to note that many of these articles assess gene-environment interactions. However, this model of assessing synergy is applicable to genetic interactions only to the extent that the underlying counterfactual causal model is applicable.


Applicability of a Counterfactual Approach to a Genetic Context

The counterfactual approach requires a thought experiment in which we hold everything constant and manipulate the exposure to see what the outcome would be under this new condition. The causal contrast—the index of the true effect of the exposure—is the difference between what was, given the exposure, and what would have been had the exposure been altered but everything else remained constant. Because this thought experiment requires the consideration of an alteration in the exposure and nothing else, the applicability of a counterfactual approach to nonmanipulable exposures has been questioned (e.g., Kaufman and Cooper, 1999). Since, currently, genes are not easily manipulable, this might open the question of the applicability of this approach to the consideration of genetic effects. In a similar vein, some have argued that personal characteristics, such as age, gender, ethnicity, and social class, should not be considered as causes because they are not manipulable.

However, others (Shadish et al., 2002; Susser and Schwartz, 2005) argue that the counterfactual can apply to nonmanipulable causes, although their detection is more difficult. Nonmanipulable causes cannot be randomly assigned to rule out the many potential sources of nonexchangeability between the exposed and unexposed group that cause confounding. Nonetheless, at the least, one can conduct the thought experiment and search for, or design, studies that approximate the thought experiment as closely as possible.

In addition, what is nonmanipulable today may, in the future, become manipulable. The use of animal “knock-out models” clearly indicates the possibility of genetic manipulation and, with increasing knowledge, even when the gene itself is not manipulable the active ingredients of the gene vis-à-vis the disease, the gene product, may be manipulable.

In the final analysis, it seems that in genetic studies in which people are compared who do and do not have a particular gene variant, or who do or do not have a proxy for a genetic predisposition (e.g., family history), the comparison only makes sense if there is some underlying notion of a causal contrast. The association may not reflect causation due to the nonexchangeability of the exposed and unexposed, but the logic of the methods assumes that, barring such methodological problems, the contrast would imply a causal contrast. Otherwise, why do we use such methods to try to detect causes? The counterfactual approach is merely the clear articulation of the framework that supports the logic that underlies all of our study designs.

Primacy of the Genetic Effect

The egalitarian assumptions regarding causation constitute another possible objection to the application of this approach to a genetic context. As discussed above, from a counterfactual perspective, genes, behaviors, and the external environment share equally in the appellation “cause.” There is no hierarchy of enabling factors and triggers versus the “real cause.” This view is in contrast to genetic approaches that see the gene as the central actor, with all other “causes” playing a supporting role. However, this should not be a significant impediment to the application of epidemiologic approaches to interaction. There are many possible approaches to its resolution. First, one can impose a hierarchy on this approach by declaring a genetic factor to be a necessary cause of the outcome and by defining the phenotype based on the genetic component. As discussed above, there is nothing in this approach that precludes a cause that is found in every causal pie (i.e., in every causal pathway to disease). Of course, if the genetic factor is known to be a necessary cause, the detection of interaction is simplified. In such an instance, one would look for the main effects of a hypothesized causal partner among those with the genetic factor. But even if the genetic factor is not necessary, one could give it prominence by referring to the causal pies that do not contain the genetic effects as phenocopies. Similarly, in discussing the interaction between a genetic and an environmental cause, one can refer to the environmental factor as triggering a genetic effect. These interpretational preferences would not be inconsistent with a counterfactual approach.

On the other hand, the counterfactual approach also suggests that there may be benefits, under some circumstances, to dismantling the hierarchy. That is, one can often describe a gene-environment interaction equally well as the interaction between an environmental factor and a genetic vulnerability that allows the environmental factor to be expressed or as an interaction between a genetic factor and an environmental context that allows the gene to be expressed. The counterfactual approach points out the symmetry of interaction.

Application to Study Designs Used to Detect Gene-Environment Interactions

Many of the study designs used to detect gene-environment interactions are indistinguishable from those used to detect interactions between environmental or other nongenetic factors. Cohort studies and case-control studies and their variants are prominent designs in general and genetic epidemiology (Hunter, 2005). Thus, the statistical models used to analyze the data—linear regression, logistic regression, Cox proportional hazards models, and Poisson regression—are used in both fields. The problems and arguments discussed above therefore apply directly.

There are other study designs, however, that have been developed specifically for the assessment of genetic exposures—for example, familial aggregation studies, twin studies, and the case-only design. The problem of the model dependence of interaction applies to these situations as well. In each instance, the data are analyzed using a model that makes some assumption about how independent effects influence risk and therefore about how interaction is indicated. Even case-only studies, which assess gene-environment interactions without the use of controls, make such an assumption. This design is predicated on a multiplicative model. Thus, case-only studies are also conservative if we think that synergy is best indicated by deviations from additivity (Gatto et al., 2004). Twin studies are perhaps the most problematic for assessing interaction, since the genetic and environmental factors are not measured. Their effects are derived from the pattern of results, which often have to assume the absence of interaction to be interpretable. To the best of my knowledge, the basic problem of the model dependence of measures of synergy is not solved by the use of specific genetic designs.
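
The case-only logic can be illustrated with a toy source population (all numbers invented): when the gene and the environment are independent in the population that gives rise to the cases, the exposure odds ratio computed among cases alone recovers the multiplicative interaction of the risks, which is why the design is predicated on a multiplicative model.

```python
# Toy source population (all numbers invented): gene G and environment E
# are independent, with P(G) = 0.3 and P(E) = 0.4.
p_g, p_e = 0.3, 0.4
risk = {  # risk of disease by (G, E) status
    (1, 1): 0.12, (1, 0): 0.02, (0, 1): 0.03, (0, 0): 0.01,
}

# Expected cases per unit of population in each exposure cell.
cases = {
    (g, e): (p_g if g else 1 - p_g) * (p_e if e else 1 - p_e) * risk[(g, e)]
    for g in (0, 1) for e in (0, 1)
}

# Case-only odds ratio: the G-E cross-product among cases, no controls used.
case_only_or = (cases[1, 1] * cases[0, 0]) / (cases[1, 0] * cases[0, 1])

# Multiplicative interaction of the risks in the full population.
mult_interaction = (risk[1, 1] * risk[0, 0]) / (risk[1, 0] * risk[0, 1])

print(round(case_only_or, 2), round(mult_interaction, 2))  # 2.0 2.0
```

If gene-environment independence fails, the two quantities diverge, which is the bias that Gatto et al. (2004) evaluate and adjust for.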


One of the advantages of the counterfactual approach is that it illuminates a central problem of causal inference: it is uncertain. Causality is an unobservable construct that leaves footprints in the real world that are open to misinterpretation. It is important to note that the counterfactual approach does not cause these problems, but rather articulates them and thus forces us to confront them. But forewarned is forearmed. Once we recognize the reality of the uncertainty and subjectivity of causal inference, we can think about the factors that exacerbate and mitigate these uncertainties and design our studies and analyses accordingly. This approach also should warn us against demanding more of our data than they can provide and against interpreting our data beyond their inherent limitations.

The assessment of synergy is no exception. Our data can provide us with evidence that is consistent or inconsistent with synergy, but they can never provide definitive evidence for or against it. Each study has its own strengths and weaknesses. The most productive approach is to consider all of the extant evidence, consider our uncertainties about the data, and then design new studies that confront those uncertainties directly. No one study will provide us with an answer, but carefully designed studies that directly confront alternative hypotheses will move us toward greater clarity.

All of the threats to validity that apply to the detection of single causes apply to the detection of synergy. Some become even more salient. I will, therefore, only briefly touch on some of the issues that were specifically raised in the mandate for this paper.


Power

Power, the ability to detect an association of a designated magnitude when it exists in the population, is a problem in all studies, but it is one that is particularly problematic for detecting synergy. Power is based on three factors: how well variables are measured, how large the true effect is in the population, and the sample size. It follows, therefore, that we can increase power by measuring our variables well, looking for effects that are large (or looking for them where they are large), and conducting studies with sufficient numbers of people.

The genomic revolution should improve power because the genetic effect is more clearly and closely measured. When family history of a disease, for example, is used as a proxy measure for a genetic effect, the bias toward the null that derives from measurement error is enormous. A true genetic effect of 50 can look like a genetic effect of 2 or less, depending on the prevalence of the outcome and other factors (Zimmerman, 2003). Thus, measuring actual genetic markers decreases measurement error and increases power.

In detecting gene-environment interactions, accurate measurement of the environmental factor is equally important. The more clearly articulated the hypothesis, the more carefully the measures can be chosen, and the more power there will be to detect an effect. Vague theories about gene-environment interactions are more likely to lead to poorly constructed measures and therefore to a decreased ability to detect synergistic effects. It should be noted, however, that measurement error can masquerade as interaction as well as mask it. Thus, poor measurement can produce false positive as well as false negative results.

Power also can be enhanced by looking for situations or populations in which the interaction is strong. As discussed in earlier sections of this paper, the effect of an exposure depends not only on its biologic effects, but on the prevalence of its causal partners and the number of sufficient causes in which it is not a partner. Therefore, the same biologic effect will be easier to detect in situations in which the other sufficient causes are rare and the causal partners are common. Along these lines, one suggestion for enhancing power regarding main effects is to look for the effect of an exposure in a group in which the outcome is rare (Rothman and Poole, 1988). This would enhance power because the base rate of disease in the unexposed group would be low. Similarly, to detect specific gene-environment interactions for a particular outcome, looking for populations in which the outcome is less common may help. In these situations, the same biologic effects will produce a larger risk ratio.
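
A toy calculation (invented numbers) illustrates the base-rate point: the same causal effect, here a 5-percentage-point excess of susceptibles activated by the exposure, produces a much larger risk ratio where the baseline risk is low.

```python
# Invented numbers: the exposure activates an extra 5 percent of the
# population (its susceptibles), regardless of the baseline risk.
excess = 0.05

risk_ratio = {}
for baseline in (0.01, 0.20):  # rare vs. common outcome among the unexposed
    risk_ratio[baseline] = (baseline + excess) / baseline

print(round(risk_ratio[0.01], 2), round(risk_ratio[0.20], 2))  # 6.0 1.25
```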

Power also should be a consideration in the choice of study design. For the same number of people, case-control studies will in general provide more power when the outcome is rare, and cohort studies will provide more power when the exposures are rare.

But whatever the choice, sample sizes need to be sufficient. Articulating hypotheses in advance has the added advantage of providing the basis for more accurate power estimates. However, methods for estimating power are less developed for synergy than they are for main effects, although some work has been done on proper power analyses for both additive and multiplicative interaction (e.g., De Gonzalez and Cox, 2005; Greenland, 1983). What is clear is that the detection of interaction requires considerably larger sample sizes than the detection of the exposures’ main effects.

Multiple Comparisons

Multiple comparisons may be a particular problem regarding interaction because researchers are less likely to hypothesize them in advance. This raises the concern that we will increase the number of Type I errors in our studies; we will frequently reject the null in error. Some have suggested an adjustment to our alpha levels (e.g., Bonferroni adjustments) to take multiple comparisons into account. To fully address this issue requires a detailed discussion of the meaning of p values and confidence intervals, which is beyond the scope of this paper. I will, however, touch on some issues to consider.

The use of adjustments to the alpha level to correct for multiple comparisons reifies the p value and potentially contributes to a misuse of null hypothesis testing. Null hypothesis testing tells us the probability of our data if the null is true. What we really want to know is the probability that the null is true, given our data. Unfortunately, these two probabilities are not the same. There is a tendency, however, to treat significant results as though they told us the latter rather than the former. In addition, p values do not strictly apply in the context of observational studies because the statistical premises on which they are constructed are often violated in nonexperimental settings. For both reasons, the use of confidence intervals rather than p values is preferred. Confidence intervals provide a rough estimate of the precision of our data. Wide confidence intervals tell us that our data do not provide much information about the effect. Narrow confidence intervals suggest that our data are more precise. Of course, there may be confounding and other biases reflected in our estimates, but the association is more trustworthy. The use of confidence intervals, with a statement of the number of comparisons made, provides more information for the reader to decide how seriously to take the results of a study. But nothing solves the problem of multiple comparisons. Data that are consistent with well-formulated hypotheses that are developed in advance of the study provide better evidence than data that result from studies for which the hypotheses are developed after the fact.
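
The scale of the multiple-comparisons problem is easy to illustrate. Under the textbook assumption of independent tests, each conducted at alpha = .05, the chance of at least one false positive grows quickly with the number of comparisons.

```python
# Family-wise error under the usual assumption of independent tests,
# each conducted at alpha = 0.05.
alpha = 0.05
fwe = {k: 1 - (1 - alpha) ** k for k in (1, 10, 20)}

# With 20 unplanned comparisons, the chance of at least one false
# positive is about 0.64; a Bonferroni adjustment would instead test
# each comparison at alpha / k.
for k, p in sorted(fwe.items()):
    print(k, round(p, 2))
```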

Population Stratification

From an epidemiologic perspective, population stratification is simply confounding; the exposed and unexposed may differ for reasons other than the exposure under study. One advantage of genetic epidemiology is that the confounders of genetic associations are limited, and the more carefully specified and measured the genetic factor, the more limited the potential sources of nonexchangeability. For example, if you measure a genetic effect by a family history of the outcome, the exposed and unexposed may differ on a large number of factors other than the exposure of interest, which is a genetic effect. However, if you measure the exposure as a particular genetic variant or marker, it becomes less likely that there will be nonexchangeability of the exposed and unexposed on other causes of disease beyond what would occur by chance.

Population stratification is simply confounding that arises because the groups with unequal distributions of a particular genetic variant also have unequal distributions of other risk factors for the disease. As is true of most problems of confounding, the problem is exacerbated in case-control studies, in which the selection of cases and controls can create population stratification even when it does not exist in the naturally occurring populations that gave rise to the cases. In a cohort study, population stratification would be more easily detected and controlled.

To the extent that population stratification is a problem in studies of single exposures, it will be a problem in studies of synergy. Just as single studies require the exposed and unexposed cohorts to be exchangeable regarding all causes of the disease other than the exposure of interest, so too the assessment of synergy requires that all four exposure cohorts (exposed to both factors, each of the two alone, and neither) be exchangeable regarding all causes of disease other than the two under investigation.


Epidemiologic approaches to biologic interaction have benefited from a full articulation of the underlying causal assumptions of risk factor epidemiology. The counterfactual or potential outcomes approach clarifies methodologic principles and provides a guide for methodologic choices. This starting point suggests that risks add in their effects. Therefore, synergy is best indicated by deviations from an additive rather than a multiplicative model, with a twist. I think that this model applies as well to genetic causes as it does to the environmental and behavioral causes that are more frequently examined in traditional epidemiologic contexts. It has the added advantage of clarifying and unifying other genetic constructs, providing a basis for understanding confounding in general and population stratification in particular, and providing a bridge between genetic and risk factor epidemiology. Although this approach has limitations, the transparency of its conceptual basis makes the limitations transparent as well. It brings the limitations inherent in all forms of causal inference to light, making them more amenable to amelioration.


Ann Madsen reviewed and provided helpful comments on this paper. Many of the ideas and examples in this paper derive from Schwartz and Susser (in press) “Causal Explanation Within a Risk Factor Framework,” Chapter 35, in: Susser, Schwartz, Morabia, and Bromet, Psychiatric Epidemiology: the Search for Causes of Mental Disorders, Oxford University Press.


  1. Assmann SF, Hosmer DW, Lemeshow S, Mundt KA. Confidence intervals for measures of interaction. Epidemiology. 1996;7:286–290. [PubMed: 8728443]
  2. Brown GW, Harris T. Social origins of depression: a reply. Psychological Medicine. 1978;8:577–588. [PubMed: 724871]
  3. Caspi A, Sugden K, Moffitt TE, et al. Influence of life stress on depression: moderation by a polymorphism in the 5-HTT gene. Science. 2003;301:386–389. [PubMed: 12869766]
  4. Darroch J. Biologic synergism and parallelism. American Journal of Epidemiology. 1997;145:661–668. [PubMed: 9098184]
  5. De Gonzalez AB, Cox DR. Additive and multiplicative models for the joint effect of two risk factors. Biostatistics. 2005;6:1–9. [PubMed: 15618523]
  6. Gatto NM, Campbell UB, Rundle AG, Ahsan H. Further development of the case-only design for assessing gene-environment interaction: evaluation of and adjustment for bias. International Journal of Epidemiology. 2004;33(5):1014–1024. [PubMed: 15358745]
  7. Greenland S. Test for interaction in epidemiologic studies: a review and a study of power. Statistics in Medicine. 1983;2:243–251. [PubMed: 6359318]
  8. Greenland S, Robins JM. Identifiability, exchangeability, and epidemiological confounding. International Journal of Epidemiology. 1986;15:413–419. [PubMed: 3771081]
  9. Hosmer DW, Lemeshow S. Confidence interval estimation of interaction. Epidemiology. 1992;3:452–456. [PubMed: 1391139]
  10. Hunter DJ. Gene-environment interactions in human diseases. Nature Reviews. 2005;6:287–298. [PubMed: 15803198]
  11. Kaufman JS, Cooper RS. Seeking causal explanations in social epidemiology. American Journal of Epidemiology. 1999;150:113–120. [PubMed: 10412955]
  12. Levins R, Lewontin R. The Dialectical Biologist. Cambridge, MA: Harvard University Press; 1985.
  13. Li Y, Millikan RC, Bell DA, Cui L, Tse CJ, Newman B, Conway K. Polychlorinated biphenyls, cytochrome P450 1A1 (CYP1A1) polymorphisms, and breast cancer risk among African American women and white women in North Carolina: a population-based case-control study. Breast Cancer Research. 2005;7:R12–R18. [PMC free article: PMC1064095] [PubMed: 15642161]
  14. Mackie JL. Cement of the Universe. Oxford, England: Clarendon Press; 1974.
  15. Maldonado G, Greenland S. Estimating causal effects. International Journal of Epidemiology. 2002;31:422–429. [PubMed: 11980807]
  16. Olshan AF, Weissler MC, Watson MA, Bell DA. Risk of head and neck cancer and the alcohol dehydrogenase 3 genotype. Carcinogenesis. 2001;22:57–61. [PubMed: 11159741]
  17. Rauscher G, Sandler DP, Poole C, Pankow J, Bloomfield CD, Olshan AF. Is family history of breast cancer a marker of susceptibility to exposures in the incidence of de novo adult acute leukemia? Cancer Epidemiology, Biomarkers and Prevention. 2003;12:289–294. [PubMed: 12692102]
  18. Rothman KJ. Reviews and commentary: causes. American Journal of Epidemiology. 1976;104:587–592. [PubMed: 998606]
  19. Rothman KJ, Greenland S. Modern Epidemiology. 2nd ed. Philadelphia, PA: Lippincott-Raven; 1998.
  20. Rothman KJ, Poole C. A strengthening programme for weak associations. International Journal of Epidemiology. 1988;17:955–959. [PubMed: 3225112]
  21. Rubin DB. Direct and indirect causal effects via potential outcomes. Scandinavian Journal of Statistics. 2004;31:161–170.
  22. Shadish WR, Cook TD, Campbell DT. Experimental and Quasi-Experimental Designs for Generalized Causal Inference. Boston, MA: Houghton Mifflin; 2002.
  23. Shen J, Gammon MD, Terry MB, Wang L, Wang Q, Zhang F, Teitelbaum SL, Eng SM, Sagive SK, Gaudet MM, Neugut AI, Santella RM. Polymorphisms in XRCC1 modify the association between polycyclic aromatic hydrocarbon-DNA adducts, cigarette smoking, dietary antioxidants, and breast cancer risk. Cancer Epidemiology, Biomarkers and Prevention. 2005;14:336–342. [PubMed: 15734955]
  24. Susser E, Schwartz S. Are social causes so different from all other causes? A comment on Sander Greenland. Emerging Themes in Epidemiology. 2005;2:4. [PMC free article: PMC1187907] [PubMed: 15913455]
  25. Tennant C, Bebbington P. The social causation of depression: a critique of the work of Brown and his colleagues. Psychological Medicine. 1978;8:565–575. [PubMed: 724870]
  26. Zimmerman R. Familial aggregation study designs: causes of discrepancies in case-control and reconstructed cohort effect estimates. 2003 Ph.D. Dissertation: Columbia University, AAT 3071403.



This paper uses the term “exposure” to mean any factor that is being examined to see if it is a cause of disease. The term applies to any factor under consideration—genetic or environmental.


The caveat to this is that an antecedent and a mediator cannot be considered simultaneously, because under that circumstance each component would not be necessary for the pie to form. The pies cannot contain redundant “slices.” The causal pie schema also has an affinity for individual-level variables, but it can accommodate levels below and above the individual.


In this paper I will, in general, use terminology from the original sources to allow easy translation when consulting the original texts. Sometimes the terminology is confusing or can be misinterpreted. In those instances, I will try to clarify the terms, but not invent new ones.


If we take the types that contribute to disease among those exposed to both A and B (the first box in Figure E-9), subtract from them those that contribute in the second box, subtract from them those that contribute in the third box, and then add those in the fourth box, we are left with (synergy − parallel). The Synergistic Types appear in only one box, so they cannot be canceled out, and the Parallel Types occur in three boxes, so their cancellation leaves the Parallel Type. All other types cancel out in this formula.


R12 − R1 − R2 + R = (R12 − R) − (R1 − R) − (R2 − R). In the absence of synergy, the risk difference for those with both factors (R12 − R) will simply equal the risk difference for factor A (R1 − R) + the risk difference for factor B (R2 − R). Thus in the absence of synergy and parallelism, or a balance of synergy and parallelism, (R12 − R) − (R1 − R) − (R2 − R) = 0.