Crit Rev Food Sci Nutr. 2010 Dec; 50(s1): 13–16.
Published online 2010 Dec 4. doi:  10.1080/10408398.2010.526842
PMCID: PMC3024843

Causation in the Presence of Weak Associations

Despite their observational nature, epidemiologic studies have been used to make inductive inferences about the causes of human diseases. In this context, I mainly consider the term “cause” in its cognitive (explanatory) meaning, that is, as it relates to detecting causal factors and identifying the mechanisms of disease.

The development of a theoretical framework for the establishment of causation in the absence of experimental evidence represents an important conceptual development in the interpretation and explanation of biological phenomena. The framework is based on a combination of convergent lines of evidence, none of which is sufficient per se to establish a cause-effect relationship. The criteria proposed by Hill (1965), which he derived from the list prepared in the US Surgeon General's report Smoking and Health (1964), to interpret the epidemiologic results on lung cancer risk from tobacco smoking, have been used as a paradigm for causation in observational epidemiology. Despite the fact that Hill emphasized the importance of other factors in the causal inferential process, these criteria continue to be used in the interpretation of epidemiologic studies (see Table 1 for a summary).

Table 1
Guidelines for causality in epidemiologic studies, according to Hill (1965)

The strength of the association is one of the original criteria that have been maintained in all subsequent formulations. The observation of a strong statistical association between a suspected risk (or protective) factor and a condition or disease, typically measured as the incidence (or prevalence) of the condition among the exposed relative to that among the unexposed (often loosely termed “relative risk”), adds credibility to its causal nature (Rothman et al., 2008a). The modern interpretation of this criterion, which has an instinctive appeal, is that chance, bias, and unmeasured confounding are less likely to explain (or at least to completely explain) a relative risk that is further away from the null. Although any measure of risk follows a continuous distribution and there are no predefined values that separate “strong” from “moderate” or “weak” associations, relative risks below 3 are generally considered moderate or weak (Wynder, 1987).
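To make the measure concrete, the following is a minimal sketch of how a relative risk and an approximate 95% confidence interval might be computed from cohort-style counts. The function name, the log-scale (Wald) interval, and the counts are illustrative assumptions and are not taken from any study cited here.

```python
import math


def relative_risk(cases_exposed, total_exposed, cases_unexposed, total_unexposed, z=1.96):
    """Return (RR, lower, upper) using a Wald interval on the log scale."""
    risk_exposed = cases_exposed / total_exposed
    risk_unexposed = cases_unexposed / total_unexposed
    rr = risk_exposed / risk_unexposed
    # Standard error of ln(RR) for two independent binomial proportions.
    se_log_rr = math.sqrt(
        1 / cases_exposed - 1 / total_exposed
        + 1 / cases_unexposed - 1 / total_unexposed
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Hypothetical counts: 30 cases among 1000 exposed, 24 among 1000 unexposed.
rr, lower, upper = relative_risk(30, 1000, 24, 1000)
print(f"RR = {rr:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

With these assumed counts the point estimate is a weak association whose interval includes the null value of 1, which illustrates why chance is harder to exclude for relative risks close to the null.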

Most of the carcinogens identified in the early decades of cancer epidemiologic research were characterized by strong associations with at least one type of cancer (Table 2). However, it has become clear in recent decades that known carcinogenic exposures explain only a proportion of human cancers (Boffetta et al., 2009), and it is unlikely that many strong carcinogens remain undiscovered. Biological agents might represent an exception, as illustrated by the human papilloma virus, whose strong carcinogenic effect on the uterine cervix and other genital organs was demonstrated in epidemiologic studies in the early 1990s (International Agency for Research on Cancer, 1995). It is therefore plausible to expect a relatively weak effect (if any) of suspected carcinogenic agents.

Table 2
Carcinogenic agents identified in an early report of the World Health Organization (1964)

The study of weak associations magnifies the three major methodological problems faced by observational research: chance, bias, and confounding.

Epidemiological research typically relies on the evaluation of the role of chance (random error) in generating the observed results (Rothman et al., 2008b), which is usually performed by applying appropriate statistical tests to assess whether the probability of obtaining the observed results, under the null hypothesis, exceeds a given value. The outcome of such tests depends, in addition to the level of significance chosen, on the magnitude of the observed effect, the total number of observations, and their distribution among the different groups (e.g., cases and controls, exposed and unexposed). For a given level of statistical significance, the number of subjects to be included in the study is inversely related to the magnitude of the effect to be detected.
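The inverse relation between study size and effect magnitude can be illustrated with a rough sketch based on the usual normal-approximation formula for comparing two proportions. The baseline risk of 1%, the 80% power, and the function name are assumptions chosen for illustration only.

```python
import math
from statistics import NormalDist


def subjects_per_group(baseline_risk, rr, alpha=0.05, power=0.80):
    """Approximate subjects needed in each group to detect a given relative risk."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_beta = NormalDist().inv_cdf(power)
    p0 = baseline_risk       # risk among the unexposed
    p1 = baseline_risk * rr  # risk among the exposed
    variance = p1 * (1 - p1) + p0 * (1 - p0)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p0) ** 2)


# Hypothetical baseline risk of 1% and three effect sizes.
for rr in (1.25, 2.0, 5.0):
    print(rr, subjects_per_group(0.01, rr))
```

Under these assumptions, a relative risk of 1.25 calls for tens of thousands of subjects per group, whereas a relative risk of 5 calls for only a few hundred.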

Bias (systematic error) is a violation of the internal validity of a study because of factors related to study design, data acquisition, data analysis, and results reporting (Rothman et al., 2008c). In epidemiology, typical examples of bias are the lack of comparability of study groups (e.g., cases and controls selected from two different populations), lack of comparability of exposure or outcome information (e.g., cases prone to overreporting past exposure as compared with controls), and selective publication of studies showing an effect (or the lack of an effect). The effect of bias on the risk estimate is expected in many instances to be modest because investigators or the reviewers of the manuscript would easily identify extreme sources of bias (e.g., a comparison between older African women and younger European men). Therefore, relative risks close to the null are more likely to be generated by bias than are more extreme relative risks.

Confounding is a specific type of bias that originates from an undetected causal relationship between the determinant, the outcome of interest, and a third (unmeasured) factor that is causally related to the outcome and associated with the determinant (Rothman et al., 2008b). For example, an association between drinking alcohol and the incidence of lung cancer can be explained by the fact that (in many populations) drinkers tend to smoke, which is a cause of lung cancer, more frequently and at higher doses than nondrinkers. The magnitude of the confounding effect depends on the strength of the association between the confounder and the determinant (e.g., how much more do drinkers smoke compared to nondrinkers?) and on the strength of the association between the confounder and the outcome (e.g., what is the risk of lung cancer among smokers compared to nonsmokers?). As in the general case of bias discussed above, confounding would more easily generate a weak (spurious) association than a strong one.
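The dependence on these two associations can be written down for the simplest case. The expression below is a standard external-adjustment formula for a single binary confounder with no true effect of the determinant; it is not given in the sources cited here, and the symbols are assumptions introduced for illustration: $p_1$ and $p_0$ are the prevalences of the confounder among the exposed and the unexposed, and $RR_{CD}$ is the relative risk of the outcome associated with the confounder.

```latex
\[
RR_{\text{spurious}} = \frac{1 + p_1\,(RR_{CD} - 1)}{1 + p_0\,(RR_{CD} - 1)}
\]
% Hypothetical values: 50% of drinkers and 30% of nondrinkers smoke,
% and smoking carries a 20-fold risk of lung cancer.
\[
RR_{\text{spurious}} = \frac{1 + 0.5 \times 19}{1 + 0.3 \times 19} = \frac{10.5}{6.7} \approx 1.57
\]
```

Even with a very strong confounder, a realistic imbalance in its prevalence yields a spurious relative risk well below 3 in this hypothetical case, which is consistent with the point that confounding more readily produces weak associations than strong ones.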

Epidemiology, therefore, faces the challenge of identifying weak causal associations, which requires large study populations and special care to exclude bias and confounding, and it is not surprising that the evidence on which these associations are based is often challenged. The case of lung cancer risk from exposure to second-hand smoke among nonsmokers is a good example of the difficulty in establishing the causal nature of a weak association. The earliest epidemiologic studies, dating from the early 1980s, showed an increased risk of lung cancer among nonsmoking women married to smokers as compared with nonsmoking women married to nonsmokers. Since then, a large body of evidence has accumulated that, by and large, consistently shows a weak overall association between various measures of exposure to second-hand smoke and lung cancer risk, with limited evidence of a dose-response relationship (see Boffetta, 2002 for a cumulative meta-analysis that shows how the evidence has accumulated over time). The excess risk among individuals who are exposed for a relatively long duration of time is in the order of 25%, as compared with individuals who are exposed at a background level (nobody is completely unexposed to second-hand smoke, although the situation might change in the future because of bans on smoking in public settings). Researchers have had to include several hundred cases and controls in studies on this issue because the risk they seek to detect is small; in addition, this is a difficult endeavor because nonsmokers represent only a small fraction of total lung cancer cases. Furthermore, given the strong association between active smoking and lung cancer (smokers’ risk is 20 or more times higher than that of nonsmokers), a relatively weak association between active smoking and reported exposure to second-hand smoke (e.g., a few self-reported nonsmokers exposed to second-hand smoke are in fact smokers) is sufficient to generate the observed effect. Epidemiologic studies have been required to demonstrate that such misclassification did not occur (e.g., via biological markers of smoking). Other sources of bias that have been invoked to explain the observed association include a higher propensity of nonsmokers with lung cancer to report past exposure as compared with healthy controls, the presence of other confounders (e.g., a healthier diet of nonsmokers living in smoke-free environments), and selective underreporting of results of studies that do not show an association.
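The misclassification argument can be explored numerically. The sketch below assumes that second-hand smoke has no true effect and that the only distortion comes from active smokers who report themselves as nonsmokers; it then asks what fraction of such hidden smokers among the exposed would be enough to produce an apparent relative risk of 1.25. The function names and the 1% fraction among the unexposed are hypothetical, and the 20-fold risk is the lower bound quoted above.

```python
def apparent_rr(smoker_rr, hidden_exposed, hidden_unexposed):
    """Apparent RR among self-reported nonsmokers when exposure has no true effect.

    hidden_exposed / hidden_unexposed are the fractions of true smokers among
    self-reported nonsmokers with and without second-hand smoke exposure.
    """
    excess = smoker_rr - 1
    return (1 + hidden_exposed * excess) / (1 + hidden_unexposed * excess)


def hidden_fraction_needed(target_rr, smoker_rr, hidden_unexposed):
    """Fraction of hidden smokers among the exposed that yields target_rr."""
    excess = smoker_rr - 1
    return (target_rr * (1 + hidden_unexposed * excess) - 1) / excess


# Assumed inputs: 20-fold risk in smokers, 1% hidden smokers among the unexposed.
needed = hidden_fraction_needed(1.25, 20, 0.01)
print(f"Hidden smokers needed among the exposed: {needed:.1%}")
print(f"Check: apparent RR = {apparent_rr(20, needed, 0.01):.2f}")
```

Under these assumptions, roughly 2.6% of misclassified smokers among the exposed, against 1% among the unexposed, would be enough to mimic the observed excess, which is why validation with biological markers has been required.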

Recent authoritative panels, including those assembled by the IARC Monographs and the US Surgeon General, have concluded that the causal nature of the association is established beyond a reasonable doubt, and the reader is referred to the original reports for a detailed discussion of the various sources of bias and confounding (IARC, 2004; United States Department of Health and Human Services, 2004). What is important here is to notice that this is one of the few weak associations that have been accepted by the majority of cancer epidemiologists, largely because of the vast literature along several lines of evidence.

Weak associations are important in all areas of epidemiology, and I address two of them here: genetics and diet. In the last decade, it has become clear that the search for high-penetrance genes, whose variants confer a very high excess risk to carriers, has not yielded important results as it did in previous years. The most likely reason for this phenomenon is the same as that described above for environmental epidemiology: (nearly) all of the high-risk genes have been identified, and the unexplained genetic susceptibility to cancer is likely due to a large number of variants, each with a modest effect on cancer risk. Bias and confounding are a smaller problem in genetic epidemiology, but the challenge of assembling and testing very large groups of cases and controls remains, and it has been successfully met in recent consortial efforts (e.g., see a recent work by Loos et al., 2008).

In the case of diet and cancer, the results of early studies, mainly of case-control design, pointed toward the existence of relatively strong associations between certain components of diet and cancer risk. However, during the last decade, the analysis of prospective studies, which are less prone to bias but may include populations with limited exposure contrast, has mainly resulted in weak (or null) associations. This is shown by comparing the evaluations of the evidence on the association between fruit and vegetable intake and cancer risk made by the World Cancer Research Fund in 1997 and in 2007; with a few notable exceptions, the strength of the evidence for these associations was judged to be weaker in the second report than in the first (World Cancer Research Fund, 1997, 2007). Although it could be argued that recent studies might have underestimated the effect of diet on cancer risk, it is also likely that most of the associations to be identified in relation to diet are of small magnitude.

In conclusion, cancer epidemiology is, to a large extent, the determination of small effects and weak associations, and poses major challenges that are easier to overcome in certain areas (e.g., genetic epidemiology) than in others (e.g., environmental or nutritional epidemiology). Identifying the causal nature of a weak association is not impossible, but requires large, well-planned, and well-conducted studies and supporting evidence from molecular and experimental studies.

REFERENCES

  • Boffetta P. Involuntary smoking and lung cancer. Scand. J. Work Environ. Health. 2002;28(Suppl 2):30–40.
  • Boffetta P., Tubiana M., Hill C., Boniol M., Aurengo A., Masse R., Valleron A. J., Monier R., de Thé G., Boyle P., Autier P. The causes of cancer in France. Ann. Oncol. 2009;20:550–555.
  • Hill B. The environment of disease: association or causation? Proc. R. Soc. Med. 1965;58:295–300.
  • International Agency for Research on Cancer (IARC) Human Papillomaviruses. Vol. 64. Lyon, France: IARC; 1995. IARC Monographs on the Evaluation of Carcinogenic Risks to Humans.
  • International Agency for Research on Cancer (IARC) Tobacco Smoke and Involuntary Smoking. Vol. 83. Lyon, France: IARC; 2004. IARC Monographs on the Evaluation of Carcinogenic Risks to Humans.
  • Loos R. J., Lindgren C. M., Li S., Wheeler E., Zhao J. H., Prokopenko I., Inouye M., Freathy R. M., Attwood A. P., Beckmann J. S., Berndt S. I., et al. Common variants near MC4R are associated with fat mass, weight and risk of obesity. Nat. Genet. 2008;40:768–775.
  • Rothman K. J., Greenland S., Poole C., Lash T. L. Causation and causal inference. In: Rothman K. J., editor; Greenland S., editor; Lash T. L., editor. Modern Epidemiology. 3rd edition. Philadelphia, PA: Lippincott Williams & Wilkins; 2008a. pp. 5–31.
  • Rothman K. J., Greenland S., Lash T. L. Precision and statistics in epidemiologic studies. In: Rothman K. J., editor; Greenland S., editor; Lash T. L., editor. Modern Epidemiology. 3rd edition. Philadelphia, PA: Lippincott Williams & Wilkins; 2008b. pp. 148–167.
  • Rothman K. J., Greenland S., Lash T. L. Validity in epidemiologic studies. In: Rothman K. J., editor; Greenland S., editor; Lash T. L., editor. Modern Epidemiology. 3rd edition. Philadelphia, PA: Lippincott Williams & Wilkins; 2008c. pp. 128–147.
  • United States Department of Health, Education and Welfare. Smoking and Health: Report of the Advisory Committee to the Surgeon General of the Public Health Service. Washington, DC: Government Printing Office; 1964.
  • United States Department of Health and Human Services. The Health Consequences of Smoking: A Report of the Surgeon General. Washington, DC: Government Printing Office; 2004.
  • World Cancer Research Fund. Food, Nutrition and the Prevention of Cancer: A Global Perspective. Washington, DC: American Institute for Cancer Research; 1997.
  • World Cancer Research Fund/American Institute for Cancer Research. Food, Nutrition, Physical Activity, and the Prevention of Cancer: A Global Perspective. Washington, DC: American Institute for Cancer Research; 2007.
  • World Health Organization. Prevention of Cancer. 1964. Report of a WHO Committee, Technical Report Series 276. WHO, Geneva, Switzerland.
  • Wynder E. L. Workshop on Guidelines to the Epidemiology of Weak Associations: Introduction. Prev. Med. 1987;16:139–141.