A specific spectrum of p53 mutations in lung cancer from smokers: review of mutations compiled in the IARC p53 database.

Mutations in the p53 gene are common in lung cancer. Using data from the International Agency for Research on Cancer p53 mutation database (R1), we have analyzed the distribution and nature of p53 mutations in 876 lung tumors described in the literature. These analyses confirm that G to T transversions are the predominant type of p53 mutation in lung cancer from smokers. The most frequently mutated codons include 157, 158, 179, 248, 249, and 273, and several of them (157, 248, and 273) have been shown to correspond to sites of in vitro DNA adduct formation by metabolites of polycyclic aromatic hydrocarbons (PAHs) such as benzo(a)pyrene. Furthermore, most of the base changes at codons 248, 249, and 273 in lung cancer differ from those commonly observed at these codons in other cancers reported in the database. Thus, lung cancer from smokers shows a distinct p53 mutation spectrum that is not observed in lung cancer from nonsmokers. These results further strengthen the association between active smoking, exposure to PAHs, and lung cancer. They also indicate that a different pattern of mutations occurs in nonsmokers, and this observation may help to identify other agents causally involved in lung cancer in nonsmokers.

Key words: benzo(a)pyrene, lung cancer, nonsmokers, p53 mutations, tobacco. Environ Health Perspect 106:385-391 (1998). [Online 10 June 1998] hatp://ehpnetl.nies.nih.gov/ldosa998/106p385-391h~aand&/abstract.html

Lung cancer is the leading cause of death in developed countries and is considered one of the most common cancers worldwide (1)(2)(3). Tobacco smoking has been identified as a major risk factor for the development of this cancer (4)(5)(6). Overall, recent cohort studies show that the risk of death from lung cancer in smokers of two or more packs of cigarettes per day is about 20 times that of nonsmokers (5,6). Tobacco smoke is a complex mixture that contains about 3,800 different potentially harmful chemicals (7). Several of these chemicals are proven carcinogens and occur at significant concentrations in tobacco smoke. These chemicals include benzo(a)pyrene [BaP, a polycyclic aromatic hydrocarbon (PAH)] at 20-40 ng/cigarette, N-nitroso compounds at up to 200-3,000 ng/cigarette, 4-aminobiphenyl (an aromatic amine) at 2.4-4.6 ng/cigarette, and vinyl chloride at 1.3-16 ng/cigarette (8,9). The exact contribution of each of these carcinogens to lung cancer induced by tobacco smoke is poorly understood.
Deletion and point mutations in the p53 tumor suppressor gene are common in most types of human cancers, including lung cancer. Missense mutations occur at about 300 distinct positions within the p53 coding sequence. The diversity of positions and chemical natures of these mutations allows the determination of tumor-specific mutation spectra that can provide clues to the nature of the mutagenic agents involved as causative agents (10)(11)(12). To facilitate the analysis and interpretation of these mutations, a database of p53 mutations in human tumors and cell lines is maintained at the International Agency for Research on Cancer (IARC). This database, initiated in 1991 by Hollstein et al. (13), is exclusively based on published material and contains about 8,000 somatic mutations in an electronic format (14).
To determine whether the spectrum of p53 mutations may aid in understanding the role of carcinogens associated with tobacco smoke, we have carried out detailed analyses of the mutations associated with lung cancer compiled in the IARC p53 mutation database (849 cases). We have reviewed these data in the light of recent progress in our understanding of the mechanisms of 1) selective adduct formation in the p53 coding sequence, 2) strand-specific and position-specific DNA repair, and 3) selection of mutant proteins with specific functional properties. Our analyses confirm and extend previous reports that G to T transversions are specifically found in tumors from smokers (11). Moreover, we have retrieved data on p53 mutations in 36 nonsmokers; analysis of these mutations shows a unique spectrum of mutations, different from that of lung cancer from smokers as well as from that of all other cancers.

Methods
Point mutations in the p53 gene of human tumors and cell lines were extracted from the IARC p53 mutation database. This database is updated twice a year, and for this analysis we used the January 1998 update (R1; 8,000 mutations). The database exists in different electronic formats available on the World Wide Web (http://www.iarc.fr/p53/homepage.htm) and is deposited at the European Bioinformatics Institute (EBI; http://www.ebi.ac.uk). For analysis of the database, we developed a program using FileMaker Pro 3.0 (Claris Corporation, Santa Clara, CA) that is described on our database web site and published elsewhere (14). The database contains information on 876 mutations in lung tumors. These mutations were compared with those found in breast tumors (729 cases) and colon tumors (900 cases) because these cancers occur at high frequencies in the general population, they frequently contain p53 mutations [approximately 50% in colon cancer (11), and 15-40% in breast cancer (15)], and they are well represented in the IARC p53 database.
The database is based on published records and does not contain information on tumors without p53 mutations. When information was provided on the smoking status of individual patients with lung cancer, the tumors were classified into two groups: ever smoked (236 cases) and never smoked (36 cases). Information on sex was available for 13% of the cases (81 males and 32 females). For the classification of the different lung cancer pathologies, we used the terminology given in each individual paper. The classification of the 876 lung tumors, as well as the availability of information on smoking, is given in Table 1. The χ² test was used for statistical analyses. When expected values in the χ² test were less than 5, Fisher's exact test was used.
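The rule stated above (χ² test by default, Fisher's exact test when any expected value is below 5) can be illustrated with a minimal Python sketch for a 2x2 table. The function names and the example counts are hypothetical and are not taken from the database or from the FileMaker program; the sketch uses no continuity correction and assumes all row and column totals are nonzero.

```python
from math import comb, erfc, sqrt

def expected_counts(table):
    """Expected cell counts of a 2x2 table under independence."""
    (a, b), (c, d) = table
    n = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    return [[rows[i] * cols[j] / n for j in range(2)] for i in range(2)]

def chi_square(table):
    """Pearson chi-square statistic (1 degree of freedom, uncorrected)."""
    exp = expected_counts(table)
    return sum((table[i][j] - exp[i][j]) ** 2 / exp[i][j]
               for i in range(2) for j in range(2))

def chi_square_p_value(stat):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return erfc(sqrt(stat / 2))

def fisher_exact_two_sided(table):
    """Two-sided Fisher's exact p-value with fixed margins: the sum of the
    probabilities of all tables no more likely than the observed one."""
    (a, b), (c, d) = table
    r1, r2, c1 = a + b, c + d, a + c
    n = r1 + r2

    def prob(x):
        # Hypergeometric probability of x counts in the top-left cell.
        return comb(r1, x) * comb(r2, c1 - x) / comb(n, c1)

    p_obs = prob(a)
    lo, hi = max(0, c1 - r2), min(r1, c1)
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))

def choose_test(table):
    """Return (test name, p-value), switching to Fisher's exact test
    when any expected count is below 5."""
    exp = expected_counts(table)
    if min(min(row) for row in exp) < 5:
        return "fisher", fisher_exact_two_sided(table)
    return "chi2", chi_square_p_value(chi_square(table))

# Hypothetical counts, for illustration only.
print(choose_test([[10, 20], [20, 10]]))  # large expected counts: chi-square
print(choose_test([[3, 1], [1, 3]]))      # small expected counts: Fisher
```

The threshold of 5 is applied to expected counts, not observed counts, which is how the rule is worded in the text.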

Results
High frequency of G to T transversions. In lung cancer, p53 missense mutations are detected in about 60% of the tumors, and about one-third of these mutations have been reported as G to T transversions (11). Figure 1 compares the spectrum of mutations in lung cancer with that in all other cancer types. Lung cancer shows a significantly higher proportion of G to T transversions (33% in lung and 13% in other cancers; χ² = 228.74; p<0.001) and a lower proportion of C to T transitions at CpG dinucleotides (10% and 26%, respectively; χ² = 113.10; p<0.001). Of all the cancers listed in the IARC p53 database, lung cancer shows the lowest frequency of C to T transitions at CpG dinucleotides; in breast and colon cancers, these transitions represent 21% and 45%, respectively. As C to T transitions at CpG dinucleotides are thought to result from methylation and spontaneous deamination of cytosine, they are commonly taken as an indicator of the proportion of mutations due to endogenous mechanisms (16).
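The two categories compared above rest on simple definitions: a transition exchanges a purine for a purine or a pyrimidine for a pyrimidine, a transversion exchanges one class for the other, and a CpG site is a C immediately followed by a G on the same strand. The following Python sketch illustrates these definitions; the function names are hypothetical and the code is not the program used for the analysis.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES

def substitution_class(ref, alt):
    """Classify a single-base substitution as a transition or a transversion."""
    if ref not in BASES or alt not in BASES or ref == alt:
        raise ValueError("ref and alt must be two different bases from A, C, G, T")
    # Same chemical class on both sides means a transition.
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"

def is_cpg_site(seq, pos):
    """True if the base at 0-based index pos is the C or the G of a CpG
    dinucleotide in the given (single-strand) sequence."""
    base = seq[pos]
    if base == "C":
        return pos + 1 < len(seq) and seq[pos + 1] == "G"
    if base == "G":
        return pos > 0 and seq[pos - 1] == "C"
    return False

print(substitution_class("G", "T"))  # transversion
print(substitution_class("C", "T"))  # transition
print(is_cpg_site("CGG", 1))         # middle G of codon 248 (CGG): True
print(is_cpg_site("AGG", 1))         # middle G of codon 249 (AGG): False
```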
Of the 287 G to T transversions found in lung cancer, 93% occur on the coding, nontranscribed strand, a percentage higher than initially reported by Greenblatt et al. (11). In breast and colon cancers, this strand bias is less marked, with only 80% and 77%, respectively, of G to T transversions on the nontranscribed strand. In contrast with G to T transversions, most other types of mutations are more equally distributed on both strands in lung cancer, with the exception of A to G transitions, of which 88% occur on the noncoding strand. The significance of these strand biases is not clearly understood. In agreement with the concept of transcription-coupled repair, it has been postulated that the nontranscribed strand is repaired less efficiently than the transcribed strand. However, it cannot be ruled out that this phenomenon reflects the distribution of mutable sites over the two strands (17). In experimental systems, Palombo et al. (18) and McGregor et al. (19) have shown that GC to AT transitions induced by alkylating agents were preferentially located on the nontranscribed strand, independently of the transcriptional activity of the gene. Recently it has been shown that adducts formed by BaP on the nontranscribed strand of the p53 gene also show slower repair kinetics than those formed on the transcribed strand (20,21). Repair efficiency could thus contribute strongly to the p53 mutation spectra in lung cancer. The high prevalence of G to T transversions, as well as the strand bias in the distribution of these mutations, is highly suggestive of the involvement of exogenous carcinogens as direct mutagens (11,22). In experimental systems and in several cancers other than lung, G to T transversions have been shown to result from the mutagenic effect of several distinct classes of agents including PAHs, aromatic amines, mycotoxins such as aflatoxin B1, ionizing radiation, and oxidants (11,23,24). G to T transversions are also minor mutations caused by some N-nitroso compounds such as 4-(methylnitrosamino)-1-(3-pyridyl)-1-butanone (NNK) (25), which cause predominantly G to A transitions. With the exception of aflatoxin B1 and ionizing radiation, most of these classes of agents are present at significant levels in tobacco smoke.
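The strand assignment discussed above follows from base pairing: a G to T change on the transcribed strand reads as C to A on the coding (nontranscribed) strand. The Python sketch below shows one way such a strand bias could be tallied. It assumes, as an illustration only, that each mutation is recorded as a (reference, variant) pair read on the coding strand; the function names and the example list are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def on_other_strand(ref, alt):
    """Return the same substitution as read on the complementary strand."""
    return COMPLEMENT[ref], COMPLEMENT[alt]

def g_to_t_strand(ref, alt):
    """Given a change read on the coding (nontranscribed) strand, return the
    strand on which it reads as G to T, or None if it is not a G to T event."""
    if (ref, alt) == ("G", "T"):
        return "nontranscribed"
    if on_other_strand(ref, alt) == ("G", "T"):
        return "transcribed"
    return None

def nontranscribed_fraction(changes):
    """Fraction of G to T events that lie on the nontranscribed strand."""
    strands = [g_to_t_strand(ref, alt) for ref, alt in changes]
    hits = [s for s in strands if s is not None]
    if not hits:
        raise ValueError("no G to T transversions in the input")
    return hits.count("nontranscribed") / len(hits)

# Hypothetical records: three G>T, one C>A, and one unrelated C>T.
example = [("G", "T")] * 3 + [("C", "A"), ("C", "T")]
print(nontranscribed_fraction(example))  # 0.75
```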
Differences between histological types of lung cancers. The various histological types of lung cancer differ in their spectrum of p53 mutations. Although all types have a higher than average proportion of G to T transversions, this proportion is significantly lower in adenocarcinomas (ADC; 27%) than in squamous cell carcinomas (SCC; 33%) or in small cell lung carcinomas and large cell carcinomas (SCLC and LCC, respectively; 41%) [Fig. 2; in this figure we have not taken into account the tumors identified in the database as non-small cell lung carcinomas (nSCLC), as they may include ADC and SCC]. Interestingly, SCLC and LCC also show a low proportion of transitions at CpG sites compared with ADC and SCC. These percentages are somewhat different from those reported in previous smaller studies. Based on 15 ADC and 26 SCC, Kure et al. (26) reported frequencies of G to T transversions of 34% in ADC and 31% in SCC. In a series of 52 SCC and 43 ADC from Finland, Husgafvel-Pursiainen et al. (27) found 29% of G to T transversions in SCC and 46% in ADC. Both studies suggested that G to T transversions were more common in ADC than in SCC. Analysis of the IARC p53 mutation database, however, reveals a reverse trend (Fig. 2). ADC is considered to be less strongly associated with smoking than SCC or SCLC, and ADC has a higher prevalence than SCC or SCLC among nonsmokers (28,29).
p53 Mutations occur at binding sites for benzo(a)pyrene. The most frequently mutated p53 codons in lung cancer are 157, 158, 179, 248, 249, 273, and 282 (Figure 3). Three of these codons (248, 249, and 282) are major mutational hot spots in most cancer types. In contrast, codon 175, which is frequently mutated in most cancers, is not a hot spot in lung cancers. Recently, Denissenko et al. (30,31) have shown that experimental exposure of HeLa cells or bronchial epithelial cells in primary culture to BaP resulted in strong and selective adduct formation on guanines at p53 codons 157 (GTC), 248 (CGG), and 273 (CGT). Most of the mutations at codon 158 are also G to T on the nontranscribed strand. However, codon 179 does not contain a guanine on the nontranscribed strand, and the most frequent mutation at this codon is an A to G transition at the second position. This position does not correspond to a potential [...] between lung cancer and colon or breast cancers. However, examination of the type of mutations in lung cancer at these common hot spots shows a unique mutation spectrum unlike that found in other cancers (Figs. 4 and 5). In breast and colon cancers, codons 248 and 273 are almost exclusively mutated by C to T transitions [...] (Fig. 4). Taken together, these observations and those in Figure 2 indicate that G to T transversions in lung cancer result from the selective occurrence of BaP diol epoxide (BPDE)-DNA adducts at specific G positions within CpG sites. It is believed that this selective occurrence is favored by the presence of 5-methylcytosine adjacent to the target G position (31). Codon 249 also demonstrates a distinct mutation profile in lung cancer. Codon 249 (AGG, arginine) is not a CpG site, and 90% of all p53 mutations at this codon are G to T transversions. Mutation of the third G position (AGG to AGT, arginine to serine) is considered a specific fingerprint of aflatoxin B1 in hepatocellular carcinomas from areas of the world where this mycotoxin is a major food contaminant (32). However, in lung cancer, over 45% of the G to T transversions occur at the second G position (AGG to ATG, arginine to methionine). This particular mutation is very specific to lung cancer and rarely occurs in colon and breast cancers (χ² = 15.23; p<0.001) (Fig. 5). Experimentally, codon 249 does not appear to be a major target of BPDE-DNA adduct formation (30). This observation suggests that the second G position of codon 249 may be a specific target for a carcinogen from tobacco smoke other than BaP. Furthermore, about one-third of the codon 249 mutations in lung cancer originate in uranium miners (33) and were reported as a radon hot spot. To date, this hot spot has not been confirmed by two other studies of uranium workers (34,35); therefore, the etiology of mutations at codon 249 in lung cancer is still unclear.

p53 Mutations in nonsmokers. In a study of 53 lung cancer patients, Kondo et al. (1) reported a statistically significant dose relationship between the quantity of cigarettes consumed and the frequency of p53 mutations as determined by RT-PCR-SSCP (reverse transcription-polymerase chain reaction single-strand conformation polymorphism). In another study, Husgafvel-Pursiainen and Kannio (36) found a mutation frequency of 28% in nonsmokers, 38% in ex-smokers, and 56% in current smokers. However, the exact type of mutation in these patients has not been determined by sequencing.
Information on smoking status is available for 236 lung cancer patients in the p53 mutation database, of whom 36 are nonsmokers. In nonsmokers, mutations occur at different codons than in smokers. In particular, the major sites of BPDE-DNA adduct formation in vitro (codons 157, 248, and 273) are not frequently mutated in nonsmokers. However, the most frequently mutated codons in nonsmokers (codons 179 and 249) are also hot spots in smokers. Figure 6 compares the mutation spectrum in smokers and nonsmokers. Nonsmokers differ significantly from smokers by the lower proportion of G to T transversions (χ² = 7.83; p<0.002) and the higher proportions of G to C transversions (χ² = 15.08; p<0.001) and G to A transitions at CpG dinucleotides (although the latter difference is not statistically significant). Therefore, both the spectrum and codon distribution of p53 mutations in nonsmokers do not correspond to the BaP fingerprint identified in smokers. Furthermore, the proportion of G to A transitions at non-CpG sites is equally high among the two classes. In addition, there is a much higher proportion of G to C transversions. Together, these observations are indicative of the possible involvement of exogenous mutagens. Little information is available on individual risk factors in connection with the 36 nonsmoking lung cancer patients included in the database. However, it is noteworthy that 17 of these nonsmokers are of Asian origin, including 13 Japanese (6 of them atomic-bomb survivors) and 4 Chinese, with over 48% of the mutations in nonsmokers falling in exon 5 of the p53 gene (37).

Guinee et al. (38) and Kure et al. (26) have reported a higher frequency of G to T transversions in women than in men; both of these observations were based on a small number of cases, and the differences were not statistically significant. In the IARC p53 mutation database, information exists on 81 men and 32 women with p53 mutations in lung tumors. Figure 7 shows that in our analyses we also find that women have a higher frequency of G to T mutations and men a higher frequency of G to A mutations at non-CpG sites, although these spectra are not significantly different. This is consistent with the evidence that women are more susceptible to DNA damage from cigarette smoke and other carcinogens (39). However, the numbers analyzed are still relatively small, and larger study groups are needed to draw any definite conclusions about p53 mutations in men and women.

Conclusions
Carcinogens damage DNA in specific ways, and in various circumstances their fingerprints can be found in the mutations found in DNA of cancer patients. DNA repair and bioselection of mutants with particular properties can also contribute to the final spectrum of mutations observed in any particular type of cancer. The extent of DNA damage is influenced by genetic susceptibility factors, including the capacity to metabolize carcinogens in exposed individuals (40). The analysis of data on p53 mutations in lung cancer from the IARC p53 database allows a better definition of the contribution of these three types of factors to the mutation spectrum in tumors associated with tobacco consumption.
This analysis confirms that G to T transversions are frequent in lung cancer from smokers, but not in nonsmokers. This suggests that G to T transversions are a molecular signature of mutagenesis by tobacco smoke.
Although many carcinogens from tobacco smoke may induce such mutations, the evidence suggests that one major class of causative agents is PAHs, in particular BaP. First, G to T transversions are consistently found in a number of experimental model systems after exposure to either tobacco smoke or BaP, such as in Salmonella, where 80% of the mutations induced by cigarette smoke are G to T transversions (41), in human, hamster, and mouse diploid cells (42,43), and in mouse skin (22). Second, many of these mutations occur at bases in codons 157, 248, and 273, which are strong and specific sites for adduct formation by metabolites of BaP in cultured cells (25,30). These codons contain CpG repeats. Recent evidence indicates that the presence of 5-methylcytosine in CpG sequences may strongly enhance the formation of BPDE-DNA adducts at the G position (31).
Upon absorption by target cells, BaP requires metabolic activation by CYP2E1 to generate compounds that form promutagenic DNA adducts. The most significant metabolite is BPDE, which preferentially binds covalently to the N2 position of guanine (44). Promutagenic BPDE-DNA adducts are detectable in nontumorous and tumorous lung tissue from smokers (7,26,...). Furthermore, an association between the genotype deficient in glutathione S-transferase M1 enzyme (GSTM1)-mediated detoxification and the presence of G to T transversions has been observed in lung cancer patients (39). In persons who lack the GSTM1 gene, activation of BaP appears to be increased and the efficacy of detoxification is limited (40). Together, these data strongly suggest that BaP is implicated as a causative agent in human lung carcinogenesis.

[Figure: mutation types shown are C to T at CpG*, C to T at non-CpG, G to T*, G to C, A to C, A to G, and A to T.]

The mutation spectrum of p53 in lung cancer from smokers is distinct from that of any other cancer in both the nature and distribution of mutations. Even at codons that are common hot spots, such as 248, 249, and 273, the mutations in lung cancer differ from those in other cancers. In most cancers, cytosines in CpG sites at codons 248 and 273 are frequently mutated through spontaneous methylation and deamination. In contrast, in lung cancer these codons often show mutations at the G position adjacent to the methylated cytosine. This results in amino acid substitutions that are very specific to lung cancer, for example, arginine to leucine at codons 248 and 273. However, G to T mutations also occur at positions that are not in CpG sites, such as at codon 157 (nucleotide position 469) and 249. For both of these codons, the mutations observed in lung cancer also differ from other cancers. Codon 157 is rarely mutated in other cancers and may represent a specific hot spot associated with tobacco-induced lung cancers. Codon 249 is a well-defined target for mutations induced by aflatoxin B1 in hepatocellular carcinoma (HCC) (32). In HCC, mutations occur almost exclusively at the third base (AGG to AGT), whereas in lung cancer the mutations occur essentially on the second base (AGG to ATG).
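The amino acid consequences cited above can be checked against the standard genetic code. The following Python sketch is illustrative only: the helper names are hypothetical, the codon table is the standard nuclear code written in the conventional TCAG order, and the codon sequences are those given in the text (157 GTC, 248 CGG, 249 AGG, 273 CGT).

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def mutate_codon(codon, position, new_base):
    """Replace the base at a 1-based position (1, 2, or 3) of a codon."""
    if position not in (1, 2, 3):
        raise ValueError("position must be 1, 2, or 3")
    return codon[:position - 1] + new_base + codon[position:]

def amino_acid_change(codon, position, new_base):
    """Return (wild-type, mutant) one-letter amino acids for a substitution."""
    mutant = mutate_codon(codon, position, new_base)
    return CODON_TABLE[codon], CODON_TABLE[mutant]

def codon_start_nucleotide(codon_number):
    """1-based coding-sequence position of the first base of a codon."""
    return 3 * (codon_number - 1) + 1

print(amino_acid_change("AGG", 3, "T"))  # codon 249, third base:  ('R', 'S')
print(amino_acid_change("AGG", 2, "T"))  # codon 249, second base: ('R', 'M')
print(amino_acid_change("CGG", 2, "T"))  # codon 248: ('R', 'L')
print(amino_acid_change("CGT", 2, "T"))  # codon 273: ('R', 'L')
print(codon_start_nucleotide(157))       # 469
```

Under these assumptions, the G to T change at the first base of codon 157 (GTC to TTC) would replace valine with phenylalanine.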
These observations further substantiate the experimental data of Denissenko et al. in bronchial cells (30,31). In addition, Cherpillod and Amstad (48) showed BaP-induced G to T transversions at the middle position of codon 248 in an HCC cell line. These data establish a clear molecular link between a carcinogen present in tobacco and specific p53 mutations in lung cancer. However, this does not rule out that other carcinogens from tobacco smoke may also play a role as p53 mutagens, in particular at codons such as 179 and 249, which are not demonstrated sites of strong adduct formation by BPDE in vitro (30,35). Furthermore, several agents present in tobacco smoke may increase the mutagenic effect of BaP by enhancing the formation of BPDE-DNA adducts (31) or by altering the efficiency of DNA repair mechanisms.
Selective targeting alone does not account for the complexity of the mutation spectrum in lung cancer. Indeed, G to T transversions in lung cancer occur almost exclusively on the nontranscribed strand (94%). This may result from a strand-specific difference in the efficiency of nucleotide excision repair, in agreement with the hypothesis of transcription-coupled repair. However, experimental demonstration of the role of transcription in this process is still lacking.
Bioselection of mutants with particular properties may also play a role. Detailed analysis of mutations at codon 175 provides an example of such functional selection. Codon 175 contains a CpG site and is frequently mutated in all cancers except lung cancer. Experimental evidence reveals that G to T mutations at this codon (CGC to CTC, arginine to leucine) result in a mutant protein that retains wild-type properties, such as the capacity to bind DNA and to transactivate p53-dependent reporter genes (49). Thus, absence of mutations at codon 175 in lung cancer may reflect a negative selection. In contrast, the high frequency of mutations at codons 157 and 158 may reflect a positive selection mechanism. The corresponding residues are not part of the p53-DNA contact surface and are not confirmed p53 mutational hot spots in other cancers. More mechanistic studies are needed to determine whether these mutations have specific functional properties that may explain their selection.
Previously published information on p53 mutations in nonsmokers is limited. By compiling published information on p53 mutations in nonsmokers, we present a mutation spectrum including 36 individual cases. These data show that the mutation spectrum in nonsmokers differs from that in smokers. Moreover, the nature of mutations in nonsmokers is consistent with mutational mechanisms involving exogenous carcinogens. These results should be interpreted cautiously, as confounding factors cannot be ruled out due to the limited number of cases. Moreover, half of the nonsmokers are of Asian descent, and this population may have a different genetic susceptibility than the Western population.
One critical question is whether the type of lung cancer in nonsmokers influences the observed spectrum. Over 50% of the nonsmokers in this review are ADC patients; however, comparison between smokers and nonsmokers within ADC patients reveals the same differences as shown in Figure 6.
In a recent paper, Gao et al. (50) analyzed p53 mutations in Chinese lung cancer patients (10 smokers and 17 nonsmokers). They found a total of 107 mutations, with 59 in the 17 nonsmokers. Individual tumors were found to contain up to 14 mutations, an unusual phenomenon that has no equivalent in the IARC p53 mutation database. On the basis of their results, the authors concluded that there is no difference in the frequency of G to T transversions between smokers and nonsmokers. We consider that these data should be interpreted cautiously, as they may either reflect a very peculiar population or be the result of laboratory artifacts.
The molecular basis of the mutation spectrum in nonsmokers is unknown. Epidemiological studies have pointed to roles of passive smoking, environmental radiation, and occupational exposure to metals (29,51,52). Further studies are required to indicate whether p53 mutations in nonsmokers carry specific fingerprints of any of these factors. However, our observations cannot be interpreted as ruling out the role of passive smoking in the genesis of this cancer. Indeed, tobacco carcinogens other than BaP may play a significant role in tumors associated with passive smoking. In this respect, cancers other than lung cancer that are associated with tobacco consumption (such as oral cavity, bladder, and esophageal cancers in Western countries) do not show as high a frequency of G to T transversions in p53 as seen in lung cancer (11).
In summary, these analyses show that p53 mutations in lung cancer from smokers carry highly significant fingerprints of exposure to tobacco smoke components, in particular BaP. These fingerprints are not found in nonsmokers, which therefore leaves open the possibility that distinct tobacco or environmental carcinogens may also play an important role in the etiopathology of lung cancer.