Am J Hum Genet. Mar 2006; 78(3): 464–479.
Published online Jan 31, 2006. doi:  10.1086/500848
PMCID: PMC1380289

Bladder Cancer Predisposition: A Multigenic Approach to DNA-Repair and Cell-Cycle–Control Genes


The candidate-gene approach in association studies of polygenic diseases has often yielded conflicting results. In this hospital-based case-control study with 696 white patients newly diagnosed with bladder cancer and 629 unaffected white controls, we applied a multigenic approach to examine the associations with bladder cancer risk of a comprehensive panel of 44 selected polymorphisms in two pathways, DNA repair and cell-cycle control, and to evaluate higher-order gene-gene interactions, using classification and regression tree (CART) analysis. Individually, only XPD Asp312Asn, RAG1 Lys820Arg, and a p53 intronic SNP exhibited statistically significant main effects. However, we found a significant gene-dosage effect for increasing numbers of potential high-risk alleles in DNA-repair and cell-cycle pathways separately and combined. For the nucleotide-excision repair pathway, compared with the referent group (fewer than four adverse alleles), individuals with four (odds ratio [OR] = 1.52, 95% CI 1.05–2.20), five to six (OR = 1.81, 95% CI 1.31–2.50), and seven or more adverse alleles (OR = 2.50, 95% CI 1.69–3.70) had increasingly elevated risks of bladder cancer (P for trend <.001). Each additional adverse allele was associated with a 1.21-fold increase in risk (95% CI 1.12–1.29). For the combined analysis of DNA-repair and cell-cycle SNPs, compared with the referent group (<13 adverse alleles), the ORs for individuals with 13–15, 16–17, and ≥18 adverse alleles were 1.22 (95% CI 0.84–1.76), 1.57 (95% CI 1.05–2.35), and 1.77 (95% CI 1.19–2.63), respectively (P for trend = .002). Each additional high-risk allele was associated with a significant 1.07-fold increase in risk. In addition, we found that smoking had a significant multiplicative interaction with SNPs in the combined DNA-repair and cell-cycle–control pathways (P<.01).
All genetic effects were evident only in “ever smokers” (persons who had smoked ≥100 cigarettes) and not in “never smokers.” A cross-validation statistical method developed in this study confirmed the above observations. CART analysis revealed potential higher-order gene-gene and gene-smoking interactions and categorized a few higher-risk subgroups for bladder cancer. Moreover, subgroups identified with higher cancer risk also exhibited higher levels of induced genetic damage than did subgroups with lower risk. There was a significant trend of higher numbers of bleomycin- and benzo[a]pyrene diol-epoxide (BPDE)–induced chromatid breaks (by mutagen-sensitivity assay) and DNA damage (by comet assay) for individuals in higher-risk subgroups among cases of bladder cancer in smokers. The P for the trend was .0348 for bleomycin-induced chromosome breaks, .0036 for BPDE-induced chromosome breaks, and .0397 for BPDE-induced DNA damage, indicating that these higher-order gene-gene and gene-smoking interactions included SNPs that modulated repair and resulted in diminished DNA-repair capacity. Thus, genotype/phenotype analyses support findings from CART analyses. This is the first comprehensive study to use a multigenic analysis for bladder cancer, and the data suggest that individuals with a higher number of genetic variations in DNA-repair and cell-cycle–control genes are at an increased risk for bladder cancer, confirming the importance of taking a multigenic pathway-based approach to risk assessment.

The candidate-gene approach is hypothesis driven, uses a priori knowledge of SNP and gene functions, and has yielded sometimes informative but often conflicting data in cancer association studies. In many studies where a significant association is reported, the odds ratios (ORs) for individual variants are <2 (Goode et al. 2002; Neumann et al. 2005). There are innumerable instances in which association studies have been unable to replicate an initial positive candidate-gene finding. Among the reasons for this lack of replication are small sample size, inadequate statistical methods, and failure to evaluate the effect of multiple pathophysiologically related genes (Horne et al. 2005). The low risk conferred by an individual polymorphism is not surprising, given that carcinogenesis is usually a multistep, multigenic process, and it is unlikely that any one single genetic polymorphism would have a dramatic effect on cancer risk. Therefore, single-gene studies are likely to provide limited value in predicting risk. A pathway-based genotyping approach, which assesses the combined effects of a panel of polymorphisms that interact in the same pathway, may amplify the effects of individual polymorphisms and enhance the predictive power. Several recent small-scale multigenic studies provide evidence of the promising potential of applying such a pathway-based multigenic approach in association studies (Han et al. 2004; Popanda et al. 2004; Cheng et al. 2005; Gu et al. 2005). In this article, we use bladder cancer as the cancer prototype and focus on two relevant physiologic pathways, DNA repair and cell-cycle control, to illustrate our theme.

Bladder cancer is the malignancy with the fifth highest incidence in the United States, with ~63,210 new cases in 2005 (Jemal et al. 2005). Cigarette smoking is the predominant risk factor and is estimated to be responsible for half the cases in men and for a third in women. Occupational exposure to carcinogens is the second major risk factor. Despite the overwhelming evidence that most bladder cancers are attributable in part to environmental carcinogenic exposures, only a fraction of exposed individuals actually develop bladder cancer, and the working hypothesis is that there are also predisposing genetic factors (Shields and Harris 2000; Wu et al. 2004).

DNA damage repair and cell-cycle checkpoints are two primary defense mechanisms against mutagenic exposures. There are four major DNA-repair pathways in human cells: mismatch repair, nucleotide-excision repair (NER), base-excision repair (BER), and double-strand–break (DSB) repair (Christmann et al. 2003). The NER pathway mainly removes bulky DNA adducts typically generated from exposure to polycyclic aromatic hydrocarbons in tobacco smoke. The BER pathway is responsible for removal of oxidized DNA bases that may arise endogenously or from exogenous agents. The DSB pathway is responsible for repairing double-strand breaks caused by a variety of exposures, including ionizing radiation, free radicals, and telomere dysfunction. There are two distinct and complementary pathways for DSB repair—namely, homologous recombination (HR) and nonhomologous end joining (NHEJ). Cell-cycle checkpoints are regulatory pathways that control the order and timing of cell-cycle transitions to ensure the fidelity of critical events such as DNA replication and chromosome segregation (Elledge 1996). Cells may be arrested at any of the checkpoints, temporarily halting the cell cycle and allowing DNA repair to be completed. Checkpoint loss and perturbation of cell-cycle control results in genomic instability and is a hallmark of cancer, as evidenced by the frequent inactivation of cell-cycle–control genes, including p53, p16, and Rb, in various cancers (Hanahan and Weinberg 2000).

There is considerable interindividual variation in DNA-repair capacity (DRC) and strong evidence that reduced DRC is associated with increased cancer risk (Berwick and Vineis 2000; Spitz et al. 2003; Wu et al. 2004). Polymorphisms in DNA-repair genes are hypothesized to be a contributor to this individual DRC variation (Mohrenweiser et al. 2003). There have been numerous studies, often with conflicting results, assessing the associations of polymorphisms in DNA-repair and cell-cycle genes with cancer risk on the basis of the hypothesis that individuals with “adverse” genotypes that result in reduced DRC or perturbed cell-cycle control are at a higher risk of developing cancer than the general population (Goode et al. 2002; Wu et al. 2004; Neumann et al. 2005).

In this study, we applied a pathway-based multigenic approach to examine the associations of a comprehensive panel of polymorphisms in DNA-repair and cell-cycle genes with bladder cancer risk. We selected 13 SNPs from the NER pathway, 8 SNPs from the BER pathway, 8 SNPs from the HR pathway, 5 SNPs from the NHEJ pathway, and 10 SNPs from the cell-cycle–control pathway. The majority of these SNPs were selected from published association studies, and a few were chosen from dbSNP on the basis of their location (promoter or coding regions) and minor-allele frequencies (>5%). To our knowledge, this is the largest multigenic cancer association study reported. We examined the combined effects of the minor alleles and evaluated higher-order gene-gene interactions, using several statistical models. In addition, we used two functional assays assessing genetic instability to determine genotype-phenotype correlations, in an attempt to validate our analytic approach. This pathway-based multigenic approach may provide a refinement of epidemiologic profiles associated with risk.

Material and Methods

Study Subjects

This study included patients with newly diagnosed bladder cancer and age-, gender-, and ethnicity-matched control subjects. The cases were enrolled at The University of Texas M. D. Anderson Cancer Center and the Scott Department of Urology at Baylor College of Medicine between 1999 and 2003. All patients had histopathologically confirmed bladder cancer, and none had received chemotherapy or radiation before enrollment. The control subjects were healthy individuals with no prior history of cancer (except nonmelanoma skin cancer) who were recruited from Kelsey Seybold, the largest multispecialty, managed-care physician group in the Houston metropolitan area. We also excluded control subjects with chronic urinary tract disease, obstructive airway disease, and diabetes. Control subjects were matched to the case patients by age (±5 years), sex, and ethnicity. The potential control subjects were first surveyed with a short questionnaire to elicit willingness to participate in the study and to provide preliminary demographic data for matching. A Kelsey-Seybold staff member provided the questionnaire to each potential control subject during clinical registration. The potential control subjects were contacted by telephone at a later date to confirm their willingness to participate and to schedule an interview appointment at a Kelsey-Seybold clinic convenient to the participant. Informed consent was obtained from all study participants before the collection of epidemiological data and blood samples by trained M. D. Anderson staff interviewers. The response rates were 75% for the controls and 92% for the cases.

Epidemiology Data Collection

In 45-min interviews, trained M. D. Anderson staff interviewers collected data on demographics, smoking history, and family history of cancer. At the end of the interviews, 40-mL blood samples were drawn into coded heparinized tubes. Individuals who had smoked at least 100 cigarettes in their lifetimes were defined as “ever smokers.” Participants who had quit smoking at least 1 year before the study were categorized as former smokers. All participants signed informed consent agreements, and the institutional review boards of M. D. Anderson, Baylor College of Medicine, and Kelsey-Seybold Clinic, in accordance with the U.S. Department of Health and Human Services, approved the study.


Genomic DNA was extracted from peripheral blood lymphocytes by proteinase K digestion, followed by isopropanol extraction and ethanol precipitation; DNA samples were stored at −80°C. Genotyping was performed using the TaqMan method with a 7900 HT sequence detector system (Applied Biosystems), except for the XPA, XPD Asp312Asn, and p53 intron 3 polymorphisms, which were genotyped using PCR-RFLP (Spitz et al. 2001; Wu et al. 2002, 2003). The primer and probe sequences for each SNP are available upon request. Amplification mixes (5 μl) contained sample DNA (5 ng), 1× TaqMan buffer A, deoxynucleotide triphosphate (200 μM), MgCl2 (5 mM), AmpliTaq Gold (0.65 units), each primer (900 nM), and each probe (200 nM). The thermal cycling conditions consisted of one cycle of 10 min at 95°C, followed by 40 cycles of 15 s at 95°C and 1 min at 60°C. SDS version 2.1 software (Applied Biosystems) was used to analyze end-point fluorescence, in accordance with the allelic discrimination technique. Water controls, internal controls, and previously genotyped samples were included in each plate to ensure accuracy of the genotyping. Positive and negative controls were used in each genotyping assay, and 5% of the samples were randomly selected and run in duplicate, with 100% concordance.

Genetic Instability Index

Genetic instability was measured by two in vitro assays. Mutagen sensitivity was assessed in vitro in lymphocyte cultures by counting the number of chromatid breaks induced by exposure to either bleomycin or benzo[a]pyrene diol-epoxide (BPDE), as described elsewhere (Hsu et al. 1989; Wu et al. 1998a). In brief, blood cultures were incubated for 3 d and then exposed to either bleomycin (0.03 U/mL) for 5 h or BPDE (2 μM) for 24 h. Cells were harvested, and chromatid breaks were counted in 50 metaphases per sample and were recorded as the mean number of breaks per cell. Laboratory personnel read the slides without knowledge of the individual’s cancer status. The comet assay under alkaline conditions was performed also as described elsewhere (Schabath et al. 2003). In brief, blood cultures were challenged with 2 μM BPDE. Then, the cell culture was mixed with agarose gel and was adhered to a microscope slide. The cells were lysed by submersing the slides in freshly prepared lysis buffer for 1 h at 4°C to remove all the cellular proteins. The slides were next placed in alkali buffer (pH >12.0) to denature and unwind the DNA and to expose the alkali-labile sites. The slides were subjected to electrophoresis and were stained with ethidium bromide, then were neutralized, were fixed in 100% methanol, and were stored in the dark at room temperature until ready for analysis. Consecutive comet cells (n=50 [25 cells from each end of the slide]) were manually selected and were automatically quantified using the Komet 4.0.2 (Kinetic Imaging) imaging software attached to a fluorescent microscope. The Olive tail moment was used as the parameter for DNA damage computed by the imaging system software. The averages of the Olive tail moment for each subject were calculated for the baseline and mutagen-induced comets. All assays were performed in a blinded manner, and a single lab technician performed the entire procedure for each assay to minimize interindividual variation. 
Detailed scanning criteria for the comet assay have also been well established to ensure consistency. There were 877 samples with mutagen-sensitivity data and 957 samples with comet assay data. The functional assays started during the second year of recruitment, and all the samples were measured consecutively. There were no significant differences in characteristics between subjects with functional data and those without functional data.

Statistical Analysis

The χ² and Fisher’s exact tests were used to assess patient characteristics by genotype and to compare cases with controls. Student’s t test was used to test for differences between the cases and controls for continuous variables. ORs were calculated as estimates of relative risk. Unconditional multivariable logistic regression was performed to control for possible confounding by age, sex, ethnicity, and smoking status, where appropriate. Since we found no definitive prior associations in our review of the literature between any of the variants studied and bladder cancer risk, our primary analysis treated the minor-variant allele at each locus as the “risk” allele. However, the assumption that the minor variant is associated with increased risk may not be accurate. We, therefore, also evaluated other analytic approaches. In particular, we developed a method for cross-validation. First, we evaluated the main effect associated with each minor allele in half of the data. We then reassigned the minor-allele risk status if the main effect of the minor variant was found to show a negative association with risk (i.e., OR <1), and we then re-evaluated the risk in the validation set. To increase the efficiency of this testing approach, we cross-validated 10 randomly selected samples from the original data set. The average risk of each SNP in the validation sets was used to obtain pooled estimates of risk for each allele and for sums of alleles. The SE ($\delta^2$) for the allele effect of each marker is expected to be identical and to show a covariance of 1/4 across randomly selected subsets. The mean estimate, $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$, of the allele effect across $n$ replicate samples is normally distributed with mean $\mu$ (the per-replicate estimate of the allele effect) and variance $\frac{1}{n^2}\left(\sum_{i=1}^{n}\sigma^2_{ii} + \sum_{i \neq j}\sigma_{ij}\right)$, where $\sigma^2_{ii}$ is the variance of the allele effect (here equal to $\delta^2$ for every replicate) and $\sigma_{ij}$ is the covariance, which is $\frac{1}{4}\sigma^2_{ii}$. After simplification, the SE of the average-risk estimate across cross-validation sets is $\left(\frac{1}{4} + \frac{3}{4n}\right)\delta^2$, where $n$ is the number of cross-validation replicates.
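The simplification to 1/4 + 3/(4n) can be checked numerically. The sketch below is only an illustration under the stated assumptions (equal variance in every replicate and a pairwise covariance of one quarter of that variance); the function names are hypothetical and are not part of the original analysis. It sums the n × n covariance matrix of the replicate estimates, divides by n², and compares the result with the closed form.

```python
from fractions import Fraction

def averaged_variance_factor(n, rho=Fraction(1, 4)):
    """Variance of the mean of n replicate estimates, in units of the
    single-replicate variance, when every pair of replicates has
    covariance rho times that variance (rho = 1/4 is the assumption
    stated in the text)."""
    # Sum all n*n entries of the covariance matrix, then divide by n^2.
    total = sum(Fraction(1) if i == j else rho
                for i in range(n) for j in range(n))
    return total / (n * n)

def closed_form(n):
    """Closed form quoted in the text: 1/4 + 3/(4n)."""
    return Fraction(1, 4) + Fraction(3, 4 * n)

# Factor for the 10 replicates used in the study design.
factor_10 = averaged_variance_factor(10)
```

With exact fractions the two expressions agree for any number of replicates, which shows that averaging over more replicates cannot push the factor below 1/4 under this covariance assumption.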

For higher-order gene-gene interactions, classification and regression tree (CART) analysis was performed using the HelixTree Genetics Analysis Software (version 4.1.0, Golden Helix). CART is a binary recursive-partitioning method that produces a decision tree to identify subgroups of subjects at higher risk. Specifically, the recursive-partitioning algorithm in HelixTree starts at the first node (with the entire data set) and uses a statistical hypothesis-testing method—formal inference-based recursive modeling—to determine the first locally optimal split and each subsequent split of the data set, with multiplicity-adjusted P values to control tree growth. This process continues until the terminal nodes have no subsequent statistically significant splits or the terminal nodes reach a prespecified minimum size (at least 10 subjects for each terminal node in our analysis). The data were divided randomly into a learning set (90% of the data) and a testing set (10% of the data). The learning set was used to construct the tree model, and the testing set was used to internally validate the resulting tree model.
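The idea of recursive partitioning can be illustrated with a deliberately simplified sketch. It is not the HelixTree algorithm: it assumes binary predictors, ranks candidate splits by an unadjusted Pearson χ² statistic, and stops at the 1-df 5% critical value (3.84) rather than using multiplicity-adjusted P values. The toy data and all names are hypothetical; only the minimum terminal-node size of 10 comes from the text.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square for the table [[a, b], [c, d]], no continuity correction."""
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if 0 in (row1, row2, col1, col2):
        return 0.0
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)

def best_split(subjects, variables, min_node_size=10):
    """Return (variable, statistic) for the strongest admissible split, or None."""
    best = None
    for var in variables:
        left = [s for s in subjects if s[var] == 0]
        right = [s for s in subjects if s[var] == 1]
        if len(left) < min_node_size or len(right) < min_node_size:
            continue  # both child nodes must reach the minimum size
        a = sum(s["case"] for s in left)
        c = sum(s["case"] for s in right)
        stat = chi_square_2x2(a, len(left) - a, c, len(right) - c)
        if best is None or stat > best[1]:
            best = (var, stat)
    return best

def grow_tree(subjects, variables, min_node_size=10, threshold=3.84):
    """Split recursively until no admissible split exceeds the threshold."""
    split = best_split(subjects, variables, min_node_size)
    if split is None or split[1] < threshold:
        return {"n": len(subjects), "cases": sum(s["case"] for s in subjects)}
    var, stat = split
    remaining = [v for v in variables if v != var]
    return {
        "split": var,
        "chi2": stat,
        "no": grow_tree([s for s in subjects if s[var] == 0],
                        remaining, min_node_size, threshold),
        "yes": grow_tree([s for s in subjects if s[var] == 1],
                         remaining, min_node_size, threshold),
    }

# Hypothetical toy data: 20 smokers (16 cases), 20 nonsmokers (6 cases),
# and a SNP that is unrelated to case status.
toy = [{"smoker": int(i < 20), "snp": i % 2,
        "case": int(i < 16 or i >= 34)} for i in range(40)]
tree = grow_tree(toy, ["smoker", "snp"])
```

In this toy example the tree splits once on smoking and then stops, because the uninformative SNP yields no further significant split in either branch.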


Characteristics of Subjects

There were a total of 1,484 study subjects recruited for this study. The population consisted of 1,325 whites (89.3%), 77 Mexican Americans (5.2%), 67 African Americans (4.5%), and 15 others (1.0%). Among whites, there were 696 patients with newly diagnosed, histologically confirmed bladder cancer and 629 unaffected controls. There was no significant age difference between the cases (63.91 ± 11.17 years) and the controls (62.77 ± 10.50 years) (P=.06). There were more males (78.45%) among the cases than among the controls (72.66%) (P=.014). As would be predicted, the cases had a significantly higher percentage of current (25.29%) and ever smokers (73.55%) than the controls (8.21% and 53.74%, respectively) (P<.0001). Among ever smokers, cases reported significantly higher levels of consumption of cigarettes than controls, as assessed by computing mean pack-years (42.95 vs. 28.28, P<.0001).

Risk Associated with Individual SNPs

Since ~90% of the study subjects were white, we limited all our analyses to this population. The distributions of the selected panel of SNPs in the major genes involved in DNA-repair and cell-cycle–control pathways are summarized in tables 1–4. The allele frequencies of each SNP in the cases and controls and the corresponding ORs for the heterozygous and homozygous variant genotypes and for the combined heterozygote and rare homozygote genotypes are listed. For main effects of the DNA-repair genes, individually, only carriers of the variant Asn allele of XPD Asp312Asn (OR=1.28, 95% CI 1.01–1.62) and the variant Arg allele of RAG1 Lys820Arg (OR=1.32, 95% CI 1.00–1.73) exhibited significantly increased risks for bladder cancer (tables 1–4). For the cell-cycle–control genes, a protective effect was found for carriers of at least one rare allele of the p53 intron 3 SNP (OR=0.72, 95% CI 0.55–0.94) (table 4). The other selected SNPs in the cell-cycle–control pathway did not show significant main effects for bladder cancer risk (table 4). With consideration of the borderline CIs and multiple comparisons performed, the impact of any individual SNP on bladder cancer risk, if it exists, would be minimal.

Table 1
NER Genotype Frequencies and Risks for Bladder Cancer
Table 2
BER Genotype Frequencies and Risks for Bladder Cancer
Table 3
HR and NHEJ Genotype Frequencies and Risks for Bladder Cancer
Table 4
Cell-Cycle SNPs Genotype Frequencies and Risks for Bladder Cancer in Whites

Combined Analysis of Multiple SNPs

To test our hypothesis that multiple SNPs in the same pathway may have an additive effect on bladder cancer risk, we estimated the combined effect of these SNPs (table 5). For those genes with multiple SNPs assayed, only a single SNP was included in this combined analysis and others were excluded because of linkage disequilibrium. The excluded polymorphisms were XPD (K751Q), XPC (intron 9 and K939Q), ERCC6 (R1230P), XRCC1 (R194W), XRCC2 (R188H), XRCC3 (T241M and 3′ UTR), p53 (intron 6 and R72P), and STK15 (I57V). For the primary analysis, we treated the minor allele at each locus as the “adverse” allele and tallied the total number of adverse alleles for each individual. For the NER pathway, the number of adverse alleles for each individual ranged from 0 to 10. We categorized individuals on the basis of the quartile distribution of the number of adverse alleles in controls, and we set individuals with fewer than four adverse alleles as the reference group. Compared with the reference group, individuals with four (OR=1.52, 95% CI 1.05–2.20), five to six (OR=1.81, 95% CI 1.31–2.50), or seven or more adverse alleles (OR=2.50, 95% CI 1.69–3.70) exhibited significantly higher risks of bladder cancer, with a significant trend of increasing risk with increasing numbers of high-risk alleles (P for trend <.001). Each additional high-risk allele of the NER pathway gene was associated with a 21% increase in risk (95% CI 1.12–1.29) (table 5).
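The tallying and grouping steps described above can be sketched as follows. This is a minimal illustration, not the authors' analysis: genotypes are assumed to be coded as the number of minor alleles (0, 1, or 2), the example genotypes and the cell counts used to exercise the function are hypothetical, and the OR is a crude (unadjusted) estimate with a Woolf-type confidence interval, whereas the study reports ORs adjusted by multivariable logistic regression. Only the NER category boundaries (<4, 4, 5–6, ≥7) come from the text.

```python
import math

def adverse_allele_count(genotypes):
    """Total number of minor ("adverse") alleles across SNPs.
    genotypes maps SNP name -> copies of the minor allele (0, 1, or 2)."""
    return sum(genotypes.values())

def ner_category(count):
    """NER categories used in the text: <4 (referent), 4, 5-6, >=7."""
    if count < 4:
        return "<4"
    if count == 4:
        return "4"
    if count <= 6:
        return "5-6"
    return ">=7"

def crude_odds_ratio(cases_grp, controls_grp, cases_ref, controls_ref, z=1.96):
    """Unadjusted OR of a category versus the referent, with a Woolf 95% CI."""
    odds_ratio = (cases_grp * controls_ref) / (controls_grp * cases_ref)
    se_log = math.sqrt(1 / cases_grp + 1 / controls_grp
                       + 1 / cases_ref + 1 / controls_ref)
    return (odds_ratio,
            odds_ratio * math.exp(-z * se_log),
            odds_ratio * math.exp(z * se_log))

# Hypothetical individual: minor-allele counts at four NER SNPs.
example = {"XPD D312N": 1, "CCNH V270A": 2, "ERCC6 M1097V": 1, "RAD23B A249V": 1}
example_category = ner_category(adverse_allele_count(example))
```

Each non-referent category would be compared with the referent in turn; a per-allele OR, such as the 1.21 reported for NER, would instead come from entering the count as a continuous term in the regression model.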

Table 5
Combined Effects of Minor Alleles in the NER, BER, HR, NHEJ, and Cell-Cycle Pathways

For the HR pathway, using individuals with fewer than two adverse alleles as the referent group, we noted a significantly elevated association with bladder cancer risk only for individuals with four or more adverse alleles, with a similar gene-dosage trend (OR=1.70, 95% CI 1.19–2.43; P for trend = .002) (table 5). Each additional HR adverse allele was associated with a 1.13-fold increase in risk (95% CI 1.03–1.24). No similar patterns were observed for the NHEJ, BER, or cell-cycle pathway polymorphisms (table 5). However, a significant gene-dosage trend was evident when all DNA-repair pathway genes were combined. Compared with the reference group (<10 adverse alleles), individuals with 10–11, 12–14, and ≥15 adverse alleles had ORs that increased to 1.59 (95% CI 1.07–2.37), 2.10 (95% CI 1.46–3.04), and 2.31 (95% CI 1.53–3.51), respectively (P for trend <.001) (table 5). Each additional high-risk allele carried a 1.11-fold increase in risk (95% CI 1.06–1.16). When the DNA-repair genes and cell-cycle–control genes were combined, compared with the reference group of <13 adverse alleles, individuals with 13–15 adverse alleles had a 1.22-fold increased risk (95% CI 0.84–1.76), individuals with 16–17 adverse alleles had a 1.57-fold increased risk (95% CI 1.05–2.35), and individuals with ≥18 adverse alleles had a 1.77-fold increased risk (95% CI 1.19–2.63). Each additional high-risk allele conferred a 1.07-fold increase in risk (95% CI 1.03–1.12) (table 5). In comparing the relative contributions of each pathway, we found the risk conferred by each additional adverse allele to be the highest for the NER pathway (OR=1.21, 95% CI 1.12–1.29), intermediate for the combined DNA-repair pathways (OR=1.11, 95% CI 1.06–1.16), and lowest for DNA-repair and cell-cycle pathways (OR=1.07, 95% CI 1.03–1.12) (table 5). These data reaffirm that the NER pathway appears to be the most significant modulator of bladder cancer risk, with other pathways playing less-prominent roles.

We also developed a statistical method for cross-validation. First, we evaluated the main effect associated with each minor allele in half of the data set. We then reassigned the minor-allele risk status if the main effect of the minor allele was found to show a negative association with risk (i.e., OR <1) and then re-evaluated the risk in the validation set. To increase the efficiency of this testing approach, we used 10 randomly selected half-sample sets from the original data set to perform the cross-validation 10 times. The average risk of each SNP in the validation sets was used to obtain pooled estimates of risk for each allele and for sums of alleles. This cross-validation scheme confirmed the significant trend of increasing risk with increasing numbers of high-risk alleles in the NER pathway, in the entire DNA-repair pathways, and in DNA-repair plus cell-cycle pathway genes (table 5, “Validation” column).
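The half-sample scheme can be outlined in code. The sketch below is an assumption-laden illustration rather than the authors' implementation: the training-half effect is estimated as a crude carrier-versus-noncarrier OR with 0.5 added to every cell (a choice made here only to avoid division by zero), genotypes are coded as minor-allele counts, and the final pooling of risk estimates across replicates is omitted. All function and field names are hypothetical.

```python
import random

def carrier_odds_ratio(subjects, snp):
    """Crude OR for carrying at least one minor allele (0.5 added to each cell)."""
    a = b = c = d = 0.5
    for s in subjects:
        carrier = s["geno"][snp] > 0
        if carrier and s["case"]:
            a += 1
        elif carrier:
            b += 1
        elif s["case"]:
            c += 1
        else:
            d += 1
    return (a * d) / (b * c)

def risk_allele_coding(train, snps):
    """True if the minor allele stays the adverse allele; False if the
    training-half OR is < 1 and the major allele is treated as adverse."""
    return {snp: carrier_odds_ratio(train, snp) >= 1.0 for snp in snps}

def adverse_count(subject, coding):
    """Adverse-allele tally under the coding learned in the training half.
    A subject with g minor alleles carries 2 - g major alleles."""
    return sum(subject["geno"][snp] if minor_is_adverse
               else 2 - subject["geno"][snp]
               for snp, minor_is_adverse in coding.items())

def half_sample_replicates(subjects, snps, n_rep=10, seed=0):
    """For each replicate, learn the coding in one random half and return
    (adverse count, case status) pairs for the other half."""
    rng = random.Random(seed)
    replicates = []
    for _ in range(n_rep):
        shuffled = subjects[:]
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        train, valid = shuffled[:half], shuffled[half:]
        coding = risk_allele_coding(train, snps)
        replicates.append([(adverse_count(s, coding), s["case"]) for s in valid])
    return replicates
```

Because the risk-allele assignment is learned only in the training half, the validation-half tallies are not biased by having chosen the allele direction on the same subjects.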

Interaction Between Smoking and Genetic Factors

Since smoking is the predominant risk factor for bladder cancer, we also stratified our analyses by smoking status (table 6). For the NER pathway, individuals with four, five to six, and seven or more adverse alleles were compared with the referent group (fewer than four adverse alleles). There was no evidence of statistically significantly increased risks for never smokers (OR=1.45, 95% CI 0.77–2.72; OR=1.27, 95% CI 0.75–2.13; and OR=1.40, 95% CI 0.72–2.73, respectively, for the three strata). However, for ever smokers, significantly increased risks were found (OR=1.53, 95% CI 0.99–2.37; OR=2.23, 95% CI 1.50–3.33; and OR=3.37, 95% CI 2.08–5.48, respectively) (P for trend <.0001). When we combined the DNA-repair and cell-cycle–control pathways, there was again evidence of gene-smoking interaction, with elevated risks evident only among ever smokers. When all DNA-repair pathways were combined, compared with individuals with <10 adverse alleles, those with 10–11, 12–14, and ≥15 adverse alleles all exhibited significantly increased risks (OR=2.03, 95% CI 1.23–3.36; OR=2.90, 95% CI 1.84–4.59; and OR=3.53, 95% CI 2.08–5.99, respectively), whereas the risk was not significant in any strata for never smokers (P for trend = .45). Similarly, when we combined the DNA-repair and cell-cycle genes, no significant association was found in never smokers, but a higher number of adverse alleles conferred progressively increased risks in ever smokers (P for trend <.0001). In the logistic model, the interaction between ever smoking and the combined variant alleles was statistically significant (P<.01). Cross-validation verified that the risks conferred by the adverse alleles were evident only among ever smokers (table 6, “Validation” column). A note of caution for the lack of associations in never smokers: we had approximately twice the number of smokers as nonsmokers, which limited our statistical power to detect significant associations in never smokers.
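A rough way to compare two stratum-specific ORs on the multiplicative scale is to take their ratio and test the difference of their logarithms, recovering each standard error from the reported 95% CI. The sketch below applies this to the NER ORs reported above for seven or more adverse alleles (3.37 in ever smokers, 1.40 in never smokers). It assumes the two strata are independent and the CIs are symmetric on the log scale, and it is not the interaction term from the authors' logistic model; the function names are hypothetical.

```python
import math

def se_from_ci(lower, upper, z=1.96):
    """Standard error of log(OR) recovered from a 95% confidence interval."""
    return (math.log(upper) - math.log(lower)) / (2 * z)

def compare_odds_ratios(or_a, ci_a, or_b, ci_b):
    """Ratio of two independent ORs, with z statistic and two-sided P value."""
    diff = math.log(or_a) - math.log(or_b)
    se = math.sqrt(se_from_ci(*ci_a) ** 2 + se_from_ci(*ci_b) ** 2)
    z = diff / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return math.exp(diff), z, p

# Reported NER ORs for >=7 adverse alleles: ever smokers vs. never smokers.
ratio, z_stat, p_value = compare_odds_ratios(3.37, (2.08, 5.48),
                                             1.40, (0.72, 2.73))
```

A ratio above 1 means the genetic effect is larger among ever smokers than among never smokers; a single-category comparison of this kind has less power than a model-based interaction term that uses all allele categories.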
Further studies with more never smokers are needed to confirm these observations.

Table 6
Combined Effects of Minor Alleles in the NER, DNA-Repair, and Cell-Cycle Pathways, Stratified by Smoking Status

CART Analysis

CART uses a binary recursive-partitioning method that identifies subgroups of high-risk subjects and detects higher-order interactions among a large number of variables. Initial CART analyses for each specific pathway identified the XPD Asp312Asn, RAG1 Lys820Arg, and p53 intron 3 polymorphisms as the initial splits in their respective pathways (NER, HR, and cell cycle, respectively) (data not shown), consistent with their main effects observed by logistic-regression analyses. To further explore gene-gene and gene-environment interactions, we performed CART analysis incorporating both the genetic and smoking-status variables. Figure 1 depicts the resultant tree structure generated. There was an initial split on smoking status, confirming that smoking is the most important risk factor for bladder cancer. Further inspection of the CART structure suggested distinct patterns for ever smokers and never smokers. As documented in previous analyses, the NER gene polymorphisms were relevant only in smokers. This was especially significant since four NER SNPs (CCNH V270A, XPD D312N, ERCC6 M1097V, and RAD23B A249V) all appeared at early splits during the recursive-partitioning process, indicating a biologically meaningful interaction between NER and smoking. The subgroups with the highest bladder cancer risk were those smokers with variant genotypes of CCNH V270A, ERCC6 M1097V, and RAD23B A249V SNPs (node 14 and node 16). In nonsmokers, the three important SNPs were ADPRT V762A, ATM D1853N, and RAG1 K820R. These three genes are involved in DSB and BER and play critical roles in maintaining genomic stability. Table 7 summarizes the risk estimates of all the terminal subgroups compared with the subgroups with the least percentage of cases (node 1). There was a clear separation of individuals with varying bladder cancer risks. 
However, the estimated ORs should be interpreted with caution, since they were derived from groups identified through a data-mining technique applied to the same data set, and, thus, the level of uncertainty was underestimated.

Figure 1
Classification and regression tree analysis of all the DNA-repair and cell-cycle gene polymorphisms. High-risk groups are identified by red-colored boxes, medium-risk groups by brown-colored boxes, and low-risk groups by blue-colored boxes.
Table 7
Risk Estimates of CART Terminal Nodes

To gain insight into the potential mechanism for the increased cancer risk conferred by these different genotype combinations and to attempt to validate our analytic approach, we next determined genotype-phenotype correlations in subjects in all the terminal nodes (table 8). We applied two widely used methods of detecting latent genomic instability: the mutagen-sensitivity assay and the comet assay (Hsu et al. 1989; Wu et al. 1998a, 1998b; Schabath et al. 2003). We categorized the terminal nodes into three groups based on the case percentage in each node: low-risk group (case percentage <55%), medium-risk group (case percentage 55%–75%), and high-risk group (case percentage >75%). In smokers, there was a significant trend of higher numbers of bleomycin- and BPDE-induced chromatid breaks (by mutagen-sensitivity assay) and DNA damage (by comet assay) for individuals in higher-risk subgroups among cases. The P for the trend was .0348 for bleomycin-induced chromatid breaks, .0036 for BPDE-induced chromatid breaks, and .0397 for BPDE-induced DNA damage, as measured by the comet assay. There were no similar patterns in the controls among these subgroups, suggesting that suboptimal DRC in cases contributes to their increased cancer risk. Interestingly, among cases, never smokers exhibited higher bleomycin- and BPDE-induced chromatid breaks (P=.039 and .031, respectively) than ever smokers (data not shown), suggesting that genetic instability plays a more prominent role in predisposing never smokers to bladder cancer, whereas, in ever smokers, gene-smoking interaction plays a central role in bladder cancer etiology.
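The grouping rule for terminal nodes is simple enough to state as a function. This is a minimal sketch using the thresholds given above; the function name and the node counts in the usage comment are hypothetical, and boundary values of exactly 55% and 75% are assigned to the medium-risk group, as the stated ranges imply.

```python
def risk_group(n_cases, n_total):
    """Classify a CART terminal node by its percentage of cases:
    low (<55%), medium (55%-75%), high (>75%)."""
    pct = 100.0 * n_cases / n_total
    if pct < 55:
        return "low"
    if pct <= 75:
        return "medium"
    return "high"

# Example with a hypothetical node of 40 subjects, 32 of them cases (80%):
# risk_group(32, 40) returns "high".
```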

Table 8
Chromosome Breaks and DNA Damage in Terminal Nodes at Smoking Side


In this study, we have used a multigenic approach to systematically examine the associations between a panel of polymorphisms in DNA-repair and cell-cycle genes and bladder cancer risk. To the best of our knowledge, this is the most comprehensive multigenic study evaluating the largest number of SNPs. The most significant finding in this study is that combined analyses of multiple SNPs in the same or relevant pathways may reveal otherwise undetectable associations of individual SNPs with cancer risk. Furthermore, we have shown that high-risk subgroups selected by the analysis exhibited high levels of latent genetic instability. Our results support the concept that genetic polymorphisms can be used as cancer risk predictors. A single polymorphism may have little predictive value in risk assessment, but a more comprehensive pathway-based multigenic approach combining multiple polymorphisms gives more-precise delineation of risk groups and may suggest the future direction of association studies.

There have been many studies reporting associations between genetic polymorphisms and susceptibility to common, complex diseases, including cancer. Most of these studies have used a candidate-gene approach, investigating one or only a few selected polymorphisms at a time. Since a complex disease like cancer typically occurs through a multifactorial interplay between many genetic and environmental factors, the effect of each individual SNP is unlikely to be substantial. For example, two of the most intensively studied DNA-repair gene SNPs, XPD Asp312Asn and Lys751Gln, were hypothesized to modify lung cancer risk, since several phenotypic studies suggested that individuals with the variant Asn or Gln alleles exhibit lower DRC (Spitz et al. 2001; Stern et al. 2002). Yet, two recent meta-analyses of XPD SNPs in lung cancer from nine case-control studies found that the variant alleles confer only an ~20% increased lung cancer risk for either SNP (Hu et al. 2004; Benhamou and Sarasin 2005). Given that the cancer risk is, at most, modestly altered by individual variants and that most single studies have used relatively small sample sizes, it has been estimated that only about a quarter of previously reported associations were real positive associations with common diseases (Ioannidis et al. 2001; Lohmueller et al. 2003).

In our main-effect analysis, the XPD Asp312Asn polymorphism was one of only two DNA-repair SNPs that exhibited a statistically significant association with bladder cancer risk (OR=1.28, 95% CI 1.01–1.62). This association is biologically plausible, since the Asn allele is associated with reduced DRC (Ioannidis et al. 2001), and a significant association of the Asn allele with lung cancer risk has been observed in a meta-analysis of lung cancer (Hu et al. 2004; Benhamou and Sarasin 2005). However, considering the borderline CI (95% CI 1.01–1.62) and the multiple comparison issue, we cannot rule out that this individual association was due to type I error. Moreover, the modest effects of individual variants would have limited practical value in predicting cancer risk, highlighting the need for a more comprehensive approach for association studies.
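
To make the type I error concern concrete, the reported OR and CI can be converted back to an approximate test statistic. The sketch below is illustrative arithmetic only, not part of the original analysis. It assumes the CI is a Wald interval symmetric on the log scale, it works from the rounded published values, and it uses a Bonferroni threshold for the 44 polymorphisms in the panel purely as a familiar yardstick (the authors do not report applying this correction). The function name `wald_from_ci` is hypothetical.

```python
import math


def wald_from_ci(or_point, ci_low, ci_high, z_crit=1.959964):
    """Approximate SE of log(OR), z statistic, and two-sided P from a 95% CI.

    Assumes a Wald interval: log(OR) +/- z_crit * SE.
    """
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = math.log(or_point) / se
    # Two-sided P value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return se, z, p


# Values reported in the text for XPD Asp312Asn.
se, z, p = wald_from_ci(1.28, 1.01, 1.62)

# Illustrative Bonferroni threshold for 44 tests (an assumption, see lead-in).
bonferroni_alpha = 0.05 / 44
survives_bonferroni = p < bonferroni_alpha
```

Under these assumptions the back-calculated P value sits just under .05 and well above the Bonferroni threshold, which is consistent with the statement that this single-SNP association could be a chance finding.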

Our data provide strong evidence that a multigenic approach improves on the single candidate-gene approach. We found a significant trend of increased risk with increasing numbers of adverse alleles in the NER pathway, in all DNA-repair pathways together, and in the DNA-repair and cell-cycle pathways combined. The most significant additive effect of multiple polymorphisms in a single pathway was observed in the NER pathway. Individuals with the highest number of adverse alleles (seven or more) exhibited a 2.50-fold increased bladder cancer risk (95% CI 1.69–3.70) compared with individuals carrying fewer than four high-risk alleles. The difference in the combined effects of multiple polymorphisms across the individual pathways may reflect the differences in the contribution of each DNA-repair pathway to bladder cancer risk. For instance, the NER pathway mainly removes bulky DNA adducts, which are typically generated from exposure to polycyclic aromatic hydrocarbons in tobacco smoke. Therefore, the NER pathway would be expected to play a more significant role in repairing tobacco carcinogen–induced DNA damage, whereas the other DNA-repair pathways play a less prominent role. Our CART analysis confirmed and strengthened this conclusion by showing that NER gene polymorphisms were selected only in smokers, whereas a few general maintenance-gene polymorphisms were relevant in nonsmokers. More interestingly, subsets of individuals with higher cancer risks were identified through CART analysis based on simple combinations of smoking and genotypes. The simple, intuitive nature of the CART algorithm may allow rapid identification of potential genetic and environmental interactions when dealing with large numbers of variables in complex diseases. The number of possible interactions grows exponentially as each additional variable is added.
Another strength of this study is the ability to evaluate changes in levels of induced genetic damage assessed by three in vitro assays among subjects in terminal nodes. However, the CART analysis is a post hoc data-mining tool, and the results should be interpreted with some degree of caution. External validation in an independent epidemiology study is warranted to confirm the potential high-risk groups.
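
The growth in the number of possible interactions can be illustrated with simple counting. The sketch below is a back-of-the-envelope illustration, not part of the original analysis; it counts sets of variables only (ignoring the three genotypes possible at each SNP), and the function names are hypothetical.

```python
from math import comb


def k_way_combinations(n_variables: int, k: int) -> int:
    """Number of distinct k-variable sets that could interact."""
    return comb(n_variables, k)


def all_interaction_sets(n_variables: int) -> int:
    """Number of variable sets of size two or more: 2^n - n - 1."""
    return 2 ** n_variables - n_variables - 1


n_snps = 44  # size of the polymorphism panel in this study
pairwise = k_way_combinations(n_snps, 2)    # 946 SNP pairs
three_way = k_way_combinations(n_snps, 3)   # 13,244 SNP triples
every_order = all_interaction_sets(n_snps)  # roughly 1.76e13 sets

# Adding one more variable (e.g., smoking status) roughly doubles the total.
with_smoking = all_interaction_sets(n_snps + 1)
```

Testing each of these sets explicitly is impractical at this sample size, which is why a recursive-partitioning method that searches the space greedily is attractive, and also why its selected subgroups require independent validation.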

Although our primary analysis treated the minor-variant allele at each locus as the risk allele, this assumption may not be accurate in the absence of knowledge of the functional consequences of the variants under study. We therefore evaluated other approaches to the designation of “at-risk” alleles. For example, if the main effects for the minor-variant allele exhibited an OR <1 in the initial multivariate analyses, we assigned the major allele as the adverse allele. We observed similarly significant trends of increasing risk with increasing numbers of high-risk alleles, and all associations were stronger (data not shown), which is unsurprising given how the variable was defined. We also performed the data analysis by reversing the minor allele when OR <1 and P<.05 or P<.1, and the results were similar to what we presented in tables 5 and 6, which was also expected, since these criteria affected only a single polymorphism (p53 intron 3) (data not shown). Choosing the at-risk allele on the basis of the data set under analysis may lead to data-driven findings. There is currently no standard way to assign at-risk alleles, and more studies are needed to develop an optimal way to do so.
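
The alternative designation rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: genotypes are assumed to be coded as the number of minor-allele copies (0, 1, or 2), and the SNP names, ORs, and genotypes are hypothetical.

```python
def adverse_allele(minor_allele_or: float) -> str:
    """Designate the adverse allele: the major allele if the minor-allele OR < 1."""
    return "major" if minor_allele_or < 1.0 else "minor"


def adverse_count(minor_copies: int, minor_allele_or: float) -> int:
    """Number of adverse-allele copies carried at one biallelic SNP."""
    if minor_copies not in (0, 1, 2):
        raise ValueError("minor_copies must be 0, 1, or 2")
    if adverse_allele(minor_allele_or) == "major":
        return 2 - minor_copies  # count major-allele copies instead
    return minor_copies


# Hypothetical main-effect ORs for the minor allele and one subject's genotypes.
snp_or = {"snp1": 1.28, "snp2": 0.85}
genotype = {"snp1": 1, "snp2": 0}

score = sum(adverse_count(genotype[s], snp_or[s]) for s in snp_or)
```

Because the ORs that drive the flipping are estimated from the same subjects whose scores are then tested, the resulting trend is biased away from the null by construction, which is the data-driven concern raised in the text.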

Matullo et al. (2003) previously found a dose-response relationship between the number of adverse alleles in DNA-repair gene SNPs and levels of DNA adducts in peripheral blood cells, suggesting a stepwise decrease in DRC as the number of adverse alleles increases and lending biological support for analyzing combined effects of multiple-variant alleles rather than investigating single SNPs in modulating cancer risk. Our phenotypic assays also demonstrated that higher-risk subgroups, as identified from the CART analysis, exhibited higher chromosome breaks and DNA damage (table 8), thereby providing additional biological plausibility and validity to our approach.
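
The dose-response idea can be made concrete with the allele-count categories and per-allele estimate reported for the NER pathway earlier in this article. The sketch below is illustrative only. It assumes a log-additive model in which each additional adverse allele multiplies the odds by the same factor, which is what a single per-allele OR implies; the function names are hypothetical.

```python
def ner_category(n_adverse: int) -> str:
    """Assign an NER adverse-allele count to the categories used in the article."""
    if n_adverse < 0:
        raise ValueError("allele count cannot be negative")
    if n_adverse < 4:
        return "<4 (referent)"
    if n_adverse == 4:
        return "4"
    if n_adverse <= 6:
        return "5-6"
    return ">=7"


def implied_or(per_allele_or: float, extra_alleles: int) -> float:
    """OR implied by a log-additive model for a given number of extra alleles."""
    return per_allele_or ** extra_alleles


# Per-allele OR reported for the NER pathway: 1.21 (95% CI 1.12-1.29).
or_three_more = implied_or(1.21, 3)  # about 1.77 for three additional alleles
```

The categorical ORs reported for the NER pathway (1.52, 1.81, and 2.50) need not match these implied values exactly, since the categories pool several allele counts and the referent group itself spans zero to three alleles.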

Although we tried to be as inclusive as possible in our selection of genes and SNPs, the selection criteria were based on potential functional SNPs in genes with higher possibilities of being related to cancer risk. A more comprehensive pathway-based approach (for example, selecting tagging SNPs in most key genes in a defined pathway) would offer more convincing evidence for the relevance of the evaluated pathway in cancer risk. Nevertheless, this is the most comprehensive study to date to use a multigenic approach to analyze multiple genetic polymorphisms in DNA-repair and cell-cycle genes in bladder cancer risk. Our results suggest that individuals with higher numbers of variants in DNA-repair genes are at an increased risk for bladder cancer, verifying the importance of taking a pathway-based approach to improve the resolution of the risk-assessment process. This study not only is important for bladder cancer risk assessment but is also applicable to risk assessment of many complex diseases incorporating low-penetrance genes. These data confirm the notion that the future of risk assessment for multigenic complex diseases needs to move beyond analysis of single polymorphisms.


The work reported here was supported by National Institutes of Health grants CA 74880 and CA 91846.


Benhamou S, Sarasin A (2005) ERCC2/XPD gene polymorphisms and lung cancer: a HuGE review. Am J Epidemiol 161:1–14. doi:10.1093/aje/kwi018
Berwick M, Vineis P (2000) Markers of DNA repair and susceptibility to cancer in humans: an epidemiologic review. J Natl Cancer Inst 92:874–897. doi:10.1093/jnci/92.11.874
Cheng TC, Chen ST, Huang CS, Fu YP, Yu JC, Cheng CW, Wu PE, Shen CY (2005) Breast cancer risk associated with genotype polymorphism of the catechol estrogen-metabolizing genes: a multigenic study on cancer susceptibility. Int J Cancer 113:345–353. doi:10.1002/ijc.20630
Christmann M, Tomicic MT, Roos WP, Kaina B (2003) Mechanisms of human DNA repair: an update. Toxicology 193:3–34. doi:10.1016/S0300-483X(03)00287-7
Elledge SJ (1996) Cell cycle checkpoints: preventing an identity crisis. Science 274:1664–1672. doi:10.1126/science.274.5293.1664
Goode EL, Ulrich CM, Potter JD (2002) Polymorphisms in DNA repair genes and associations with cancer risk. Cancer Epidemiol Biomarkers Prev 11:1513–1530
Gu J, Zhao H, Dinney CP, Zhu Y, Leibovici D, Bermejo CE, Grossman HB, Wu X (2005) Nucleotide excision repair gene polymorphisms and recurrence after treatment for superficial bladder cancer. Clin Cancer Res 11:1408–1415. doi:10.1158/1078-0432.CCR-04-1101
Han J, Colditz GA, Samson LD, Hunter DJ (2004) Polymorphisms in DNA double-strand break repair genes and skin cancer risk. Cancer Res 64:3009–3013. doi:10.1158/0008-5472.CAN-04-0246
Hanahan D, Weinberg RA (2000) The hallmarks of cancer. Cell 100:57–70. doi:10.1016/S0092-8674(00)81683-9
Horne BD, Anderson JL, Carlquist JF, Muhlestein JB, Renlund DG, Bair TL, Pearson RR, Camp NJ (2005) Generating genetic risk scores from intermediate phenotypes for use in association studies of clinically significant endpoints. Ann Hum Genet 69:176–186
Hsu TC, Johnston DA, Cherry LM, Ramkissoon D, Schantz SP, Jessup JM, Winn RJ, Shirley L, Furlong C (1989) Sensitivity to genotoxic effects of bleomycin in humans: possible relationship to environmental carcinogenesis. Int J Cancer 43:403–409
Hu Z, Wei Q, Wang X, Shen H (2004) DNA repair gene XPD polymorphism and lung cancer risk: a meta-analysis. Lung Cancer 46:1–10. doi:10.1016/j.lungcan.2004.03.016
Ioannidis JP, Ntzani EE, Trikalinos TA, Contopoulos-Ioannidis DG (2001) Replication validity of genetic association studies. Nat Genet 29:306–309. doi:10.1038/ng749
Jemal A, Murray T, Ward E, Samuels A, Tiwari RC, Ghafoor A, Feuer EJ, Thun MJ (2005) Cancer statistics, 2005. CA Cancer J Clin 55:10–30
Kuschel B, Auranen A, McBride S, Novik KL, Antoniou A, Lipscombe JM, Day NE, Easton DF, Ponder BA, Pharoah PD, Dunning A (2002) Variants in DNA double-strand break repair genes and breast cancer susceptibility. Hum Mol Genet 11:1399–1407. doi:10.1093/hmg/11.12.1399
Lohmueller KE, Pearce CL, Pike M, Lander ES, Hirschhorn JN (2003) Meta-analysis of genetic association studies supports a contribution of common variants to susceptibility to common disease. Nat Genet 33:177–182. doi:10.1038/ng1071
Matullo G, Peluso M, Polidoro S, Guarrera S, Munnia A, Krogh V, Masala G, Berrino F, Panico S, Tumino R, Vineis P, Palli D (2003) Combination of DNA repair gene single nucleotide polymorphisms and increased levels of DNA adducts in a population-based study. Cancer Epidemiol Biomarkers Prev 12:674–677
Mohrenweiser HW, Wilson DM 3rd, Jones IM (2003) Challenges and complexities in estimating both the functional impact and the disease risk associated with the extensive genetic variation in human DNA repair genes. Mutat Res 526:93–125
Neumann AS, Sturgis EM, Wei Q (2005) Nucleotide excision repair as a marker for susceptibility to tobacco-related cancers: a review of molecular epidemiological studies. Mol Carcinog 42:65–92. doi:10.1002/mc.20069
Popanda O, Schattenberg T, Phong CT, Butkiewicz D, Risch A, Edler L, Kayser K, Dienemann H, Schulz V, Drings P, Bartsch H, Schmezer P (2004) Specific combinations of DNA repair gene variants and increased risk for non-small cell lung cancer. Carcinogenesis 25:2433–2441. doi:10.1093/carcin/bgh264
Schabath MB, Spitz MR, Grossman HB, Zhang K, Dinney CP, Zheng PJ, Wu X (2003) Genetic instability in bladder cancer assessed by the comet assay. J Natl Cancer Inst 95:540–547
Shields PG, Harris CC (2000) Cancer risk and low-penetrance susceptibility genes in gene-environment interactions. J Clin Oncol 18:2309–2315
Spitz MR, Wei Q, Dong Q, Amos CI, Wu X (2003) Genetic susceptibility to lung cancer: the role of DNA damage and repair. Cancer Epidemiol Biomarkers Prev 12:689–698
Spitz MR, Wu X, Wang Y, Wang LE, Shete S, Amos CI, Guo Z, Lei L, Mohrenweiser H, Wei Q (2001) Modulation of nucleotide excision repair capacity by XPD polymorphisms in lung cancer patients. Cancer Res 61:1354–1357
Stern MC, Johnson LR, Bell DA, Taylor JA (2002) XPD codon 751 polymorphism, metabolism genes, smoking, and bladder cancer risk. Cancer Epidemiol Biomarkers Prev 11:1004–1011
Wu X, Gu J, Amos CI, Jiang H, Hong WK, Spitz MR (1998a) A parallel study of in vitro sensitivity to benzo[a]pyrene diol epoxide and bleomycin in lung cancer cases and controls. Cancer 83:1118–1127. doi:10.1002/(SICI)1097-0142(19980915)83:6<1118::AID-CNCR10>3.0.CO;2-8
Wu X, Gu J, Hong WK, Lee JJ, Amos CI, Jiang H, Winn RJ, Fu KK, Cooper J, Spitz MR (1998b) Benzo[a]pyrene diol epoxide and bleomycin sensitivity and susceptibility to cancer of upper aerodigestive tract. J Natl Cancer Inst 90:1393–1399. doi:10.1093/jnci/90.18.1393
Wu X, Zhao H, Amos CI, Shete S, Makan N, Hong WK, Kadlubar FF, Spitz MR (2002) p53 genotypes and haplotypes associated with lung cancer susceptibility and ethnicity. J Natl Cancer Inst 94:681–690
Wu X, Zhao H, Suk R, Christiani DC (2004) Genetic susceptibility to tobacco-related cancer. Oncogene 23:6500–6523. doi:10.1038/sj.onc.1207811
Wu X, Zhao H, Wei Q, Amos CI, Zhang K, Guo Z, Qiao Y, Hong WK, Spitz MR (2003) XPA polymorphism associated with reduced lung cancer risk and a modulating effect on nucleotide excision repair capacity. Carcinogenesis 24:505–509. doi:10.1093/carcin/24.3.505

Articles from American Journal of Human Genetics are provided here courtesy of American Society of Human Genetics