Cancer Causes Control. 2002 Nov;13(9):813-23.

Classification tree analysis: a statistical tool to investigate risk factor interactions with an example for colon cancer (United States).

Author information

Genetic Epidemiology, Department of Medical Informatics, University of Utah, and Genetic Research, Intermountain Health Care, Salt Lake City 84108, USA.



Classification tree analysis is a potentially powerful tool for investigating multilevel interactions. Within the context of colon cancer etiology it may help identify disease pathways and evaluate important interactions of risk factors.


We apply classification tree analysis as a statistical method to investigate interactions of risk factors for colon cancer. We use data collected from a population-based case-control study of newly diagnosed cases of colon cancer (N = 4403 cases and controls).
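To illustrate how a classification tree picks the factor to split on at each level, here is a minimal sketch. It assumes a Gini-impurity splitting criterion, which the abstract does not specify, and it uses a small hypothetical set of toy records (`toy_rows`) that is not drawn from the study data. The function names are illustrative only.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0 = control, 1 = case)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def split_impurity(rows, feature):
    """Weighted Gini impurity of the two groups formed by a binary feature."""
    absent = [r["case"] for r in rows if r[feature] == 0]
    present = [r["case"] for r in rows if r[feature] == 1]
    n = len(rows)
    return len(absent) / n * gini(absent) + len(present) / n * gini(present)

def best_split(rows, features):
    """Return the feature whose split leaves the lowest weighted impurity."""
    return min(features, key=lambda f: split_impurity(rows, f))

# Hypothetical toy records, NOT study data: 1 = exposed / case, 0 = not.
toy_rows = [
    {"nsaid": 1, "family_history": 0, "case": 0},
    {"nsaid": 1, "family_history": 1, "case": 0},
    {"nsaid": 1, "family_history": 0, "case": 0},
    {"nsaid": 1, "family_history": 1, "case": 1},
    {"nsaid": 0, "family_history": 1, "case": 0},
    {"nsaid": 0, "family_history": 1, "case": 1},
    {"nsaid": 0, "family_history": 0, "case": 1},
    {"nsaid": 0, "family_history": 1, "case": 1},
]

root = best_split(toy_rows, ["nsaid", "family_history"])
print(root)
```

A full tree repeats this choice within each resulting subgroup, which is how a different second-level factor can appear on each branch.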


Our results indicate that, as expected, many factors influence colon cancer risk and that they interact on many levels. We find that the most important factor is the use of aspirin and/or non-steroidal anti-inflammatory drugs (NSAIDs), with those taking these medications having lower risk. Family history appears as a second-level modifying factor when NSAIDs are not used, whereas Western diet is the second-level factor when NSAIDs are taken. The final tree has six levels, contains several modifying factors, and correctly classifies case or control status for 60.8% (95% CI 59.4-62.2) of all individuals.
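As a rough check on the reported interval, the sketch below computes a normal-approximation (Wald) confidence interval for a proportion. It assumes the 60.8% accuracy was computed over all 4403 individuals and that a Wald interval is appropriate; the abstract does not state which interval method the authors used. The function name `wald_interval` is illustrative.

```python
import math

def wald_interval(p_hat, n, z=1.96):
    """Normal-approximation confidence interval for a proportion."""
    se = math.sqrt(p_hat * (1.0 - p_hat) / n)  # standard error of p_hat
    return p_hat - z * se, p_hat + z * se

# Assumption: accuracy of 0.608 evaluated on N = 4403 cases and controls.
low, high = wald_interval(0.608, 4403)
print(f"{100 * low:.1f}-{100 * high:.1f}")  # prints "59.4-62.2"
```

Under these assumptions the result agrees with the interval given in the text.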


Our results suggest that risk factors work together to determine disease risk. By accounting for interactions between risk factors we become better able to dissect disease pathways and determine those risk factors that increase susceptibility to disease. Our results highlight the importance of designing studies so that interactions can be addressed.

[Indexed for MEDLINE]
