NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.

Griffiths AJF, Gelbart WM, Miller JH, et al. Modern Genetic Analysis. New York: W. H. Freeman; 1999.

  • By agreement with the publisher, this book is accessible by the search feature, but cannot be browsed.

The Molecular Basis of Mutation

In gene mutation, one allele of a gene changes into a different allele. Because such a change takes place within a single gene and maps to one chromosomal locus (“point”), a gene mutation is sometimes called a point mutation. This terminology originated before the advent of DNA sequencing and therefore before it was routinely possible to discover the molecular basis for a mutational event. Nowadays, point mutations typically refer to alterations of single base pairs of DNA or of a small number of adjacent base pairs. In this chapter, we focus on such simple point mutations.

The constellation of possible ways in which point mutations could change a wild-type gene is very large and varies according to the particular structure and sequence of the gene. However, it is always true that mutations that reduce or eliminate gene function (loss-of-function mutations) are the most abundant class. The reason is simple: it is much easier to break a machine than to alter the way that it works by randomly changing or removing one of its components. For the same reason, mutations that increase or alter the type of activity of the gene or where it is expressed (gain-of-function mutations) are much rarer.

Changes at the DNA Level

Point mutations are classified in molecular terms in Table 7-1, which shows the main types of DNA changes and their functional effects at the protein level.

Table 7-1. Gene Mutations at the Molecular Level.


At the DNA level, there are two main types of point mutational changes: base substitutions and base additions or deletions. Base substitutions are those mutations in which one base pair is replaced by another. Base substitutions can in turn be divided into two subtypes: transitions and transversions. To describe these subtypes, we consider how a mutation alters the sequence on one DNA strand (the complementary change will take place on the other strand). A transition is the replacement of a base by the other base of the same chemical category (purine replaced by purine: either A to G or G to A; pyrimidine replaced by pyrimidine: either C to T or T to C). A transversion is the opposite: the replacement of a base of one chemical category by a base of the other (pyrimidine replaced by purine: C to A, C to G, T to A, T to G; purine replaced by pyrimidine: A to C, A to T, G to C, G to T). In describing the same changes at the double-stranded level of DNA, we must state both members of a base pair: an example of a transition would be G·C → A·T; that of a transversion would be G·C → T·A.
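The transition/transversion rule above can be written as a few lines of Python. This is only an illustrative sketch: the function names `classify_substitution` and `base_pair_change` are invented here and do not come from the text.

```python
from itertools import permutations

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def classify_substitution(ref: str, alt: str) -> str:
    """Classify a single-base change on one strand as a transition or transversion."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid:
        raise ValueError("bases must be one of A, C, G, T")
    if ref == alt:
        raise ValueError("identical bases are not a substitution")
    # Same chemical category (purine/purine or pyrimidine/pyrimidine) = transition.
    same_category = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_category else "transversion"


def base_pair_change(ref: str, alt: str) -> str:
    """Write the same change at the double-stranded level, e.g. 'G·C → A·T'."""
    ref, alt = ref.upper(), alt.upper()
    return f"{ref}·{COMPLEMENT[ref]} → {alt}·{COMPLEMENT[alt]}"


if __name__ == "__main__":
    # List all 12 possible single-base substitutions.
    for ref, alt in permutations("ACGT", 2):
        print(base_pair_change(ref, alt), classify_substitution(ref, alt))
```

Of the 12 possible single-base changes, 4 are transitions and 8 are transversions, matching the lists given in the paragraph above.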

Addition or deletion mutations are actually of nucleotide pairs; nevertheless, the convention is to call them base-pair additions or deletions. The simplest of these mutations are single-base-pair additions or single-base-pair deletions. There are also examples in which mutations arise through the simultaneous addition or deletion of multiple base pairs. As we shall see later in this chapter, mechanisms that selectively produce certain kinds of multiple-base-pair additions or deletions are the cause of certain human genetic diseases.

What are the functional consequences of these different types of point mutations? First, consider what happens when a mutation arises in a polypeptide-coding part of a gene. For single-base substitutions, there are several possible outcomes, which are direct consequences of two aspects of the genetic code: degeneracy of the code and the existence of translation termination codons.


Silent substitutions: the mutation changes one codon for an amino acid into another codon for that same amino acid.


Missense mutations: the codon for one amino acid is replaced by a codon for another amino acid.


Nonsense mutations: the codon for one amino acid is replaced by a translation termination (stop) codon.

Silent substitutions never alter the amino acid sequence of the polypeptide chain. The severity of the effect of missense and nonsense mutations on the polypeptide will differ on a case-by-case basis. For example, if a missense mutation causes the substitution of a chemically similar amino acid, referred to as a conservative substitution, then it is likely that the alteration will have a less-severe effect on the protein’s structure and function. Alternatively, chemically different amino acid substitutions, called nonconservative substitutions, are more likely to produce severe changes in protein structure and function. Nonsense mutations will lead to the premature termination of translation. Thus, they have a considerable effect on protein function. Typically, nonsense mutations will produce completely inactive protein products unless they occur very close to the 3′ end of the open reading frame, in which case a partly functional truncated polypeptide may be produced.
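The three outcomes of a single-base substitution in a codon can be sketched in Python using the standard genetic code. The helper name `classify_codon_change` is hypothetical, and codons are written as DNA (T rather than U) on the coding strand.

```python
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each position; "*" = stop.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_codon_change(ref_codon: str, alt_codon: str) -> str:
    """Classify a single-base substitution within one sense codon."""
    ref_codon, alt_codon = ref_codon.upper(), alt_codon.upper()
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    if sum(a != b for a, b in zip(ref_codon, alt_codon)) != 1:
        raise ValueError("codons must differ at exactly one position")
    if ref_aa == "*":
        raise ValueError("reference codon is a stop codon")
    if alt_aa == ref_aa:
        return "silent"      # degeneracy: same amino acid
    if alt_aa == "*":
        return "nonsense"    # sense codon replaced by a stop codon
    return "missense"        # different amino acid


if __name__ == "__main__":
    print(classify_codon_change("CTT", "CTC"))  # Leu -> Leu
    print(classify_codon_change("GAG", "GTG"))  # Glu -> Val
    print(classify_codon_change("TGG", "TAG"))  # Trp -> stop
```

The sketch deliberately ignores how severe a missense change is; judging chemical similarity between amino acids would need an additional scoring scheme that the text does not specify.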

Like nonsense mutations, single-base additions or deletions have consequences on polypeptide sequence that extend far beyond the site of the mutation itself. Because the sequence of mRNA is “read” by the translational apparatus in groups of three bases (codons), the addition or deletion of a single base pair of DNA will change the reading frame starting from the location of the addition or deletion and extending through to the carboxyl terminus of the protein. Hence, these lesions are called frameshift mutations. These mutations cause the entire amino acid sequence translationally downstream of the mutant site to bear no relation to the original amino acid sequence. Thus, frameshift mutations typically exhibit complete loss of normal protein structure and function.
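A short Python sketch makes the frameshift effect concrete. The open reading frame used here is an invented example, not a sequence from the text, and the helpers `translate`, `delete_base`, and `insert_base` are hypothetical names.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate from the first base, stopping at a stop codon or the last full codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def delete_base(dna: str, position: int) -> str:
    """Remove the base at a 0-based position."""
    return dna[:position] + dna[position + 1:]


def insert_base(dna: str, position: int, base: str) -> str:
    """Insert one base before a 0-based position."""
    return dna[:position] + base + dna[position:]


if __name__ == "__main__":
    wild_type = "ATGGCCAAGTTCGGTCTGTAA"  # invented ORF for illustration
    print(translate(wild_type))                       # MAKFGL
    print(translate(delete_base(wild_type, 4)))       # MASSVC
    print(translate(insert_base(wild_type, 3, "T")))  # MCQVRSV
```

In both mutants every codon downstream of the change is read in a new frame, so the downstream amino acids bear no relation to the wild-type sequence, and the original stop codon is no longer read in frame.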

Now let’s turn to those mutations that occur in regulatory and other noncoding sequences. Those parts of a gene that are not protein coding contain a variety of crucial functional sites. At the DNA level, there are sites to which specific transcription-regulating proteins must bind. At the RNA level, there are also important functional sequences such as the ribosome-binding sites of bacterial mRNAs and the self-ligating sites for intron excision in eukaryote mRNAs.

The ramifications of mutations in parts of a gene other than the polypeptide-coding segments are much harder to predict. In general, the functional consequences of any point mutation (substitution or addition or deletion) in such a region depend on its location and on whether it disrupts a functional site. Mutations that disrupt these sites have the potential to change the expression pattern of a gene in terms of the amount of product expressed at a certain time or in response to certain environmental cues or in certain tissues. We shall see numerous additional examples of such target sites as we explore mechanisms of gene regulation later on (Chapters 14–17). It is important to realize that such regulatory mutations will affect the amount of the protein product of a gene, but they will not alter the structure of the protein. Alternatively, some mutations might completely inactivate function (such as polymerase binding or intron excision) and be lethal.

It appears that genes also contain noncoding sequences that cannot be “point mutated” to produce detectable phenotypes. These sequences are interspersed with the mutable sites. These sequences are either functionally irrelevant or protected from mutational damage in some way.

New mutations are categorized as induced or spontaneous. Induced mutations are defined as those that arise after purposeful treatment with mutagens, environmental agents that are known to increase the rate of mutations. Spontaneous mutations are those that arise in the absence of known mutagen treatment. They account for the “background rate” of mutation and are presumably the ultimate source of natural genetic variation that is seen in populations.

The frequency at which spontaneous mutations occur is low, generally in the range of one cell in 10⁵ to 10⁸. Therefore, if a large number of mutants is required for genetic analysis, mutations must be induced. The induction of mutations is accomplished by treating cells with mutagens. The mutagens most commonly used are high-energy radiation or specific chemicals; examples of these mutagens and their efficacy are given in Table 7-2. The greater the dose of mutagen, the greater the number of mutations induced, as shown in Figure 7-1. Note that Figure 7-1 shows a linear dose response, which is often observed in the induction of point mutations. The molecular mechanisms whereby mutagens act will be covered in subsequent sections.

Table 7-2. Forward Mutation Frequencies Obtained with Various Mutagens in Neurospora.


Figure 7-1. Linear relation between X-ray dose to which Drosophila melanogaster were exposed and the percentage of mutations (mainly sex-linked recessive lethals).


Recognize that the distinction between induced and spontaneous is purely operational. If we are aware that an organism was mutagenized, then we infer that any mutations that arise after this mutagenesis were induced. However, this is not true in an absolute sense. The mechanisms that give rise to spontaneous mutations also are in action in this mutagenized organism. In reality, there will always be a subset of mutations recovered after mutagenesis that are independent of the action of the mutagen. The proportion of mutations that fall into this subset depends on how potent a mutagen is. The higher the rate of induced mutations, the lower the proportion of recovered mutations that are actually “spontaneous” in origin.
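The last point can be illustrated with a small calculation. The sketch below assumes, purely for illustration, that spontaneous and induced mutations arise independently and that their frequencies simply add; the function name and the numerical rates are hypothetical and are not taken from Table 7-2.

```python
def spontaneous_fraction(spontaneous_rate: float, induced_rate: float) -> float:
    """Fraction of recovered mutations that are spontaneous in origin.

    Assumes independent, additive mutation frequencies (an illustrative
    simplification, not a claim made in the text).
    """
    if spontaneous_rate < 0 or induced_rate < 0:
        raise ValueError("rates must be non-negative")
    total = spontaneous_rate + induced_rate
    if total == 0:
        raise ValueError("at least one rate must be positive")
    return spontaneous_rate / total


if __name__ == "__main__":
    background = 1e-6  # hypothetical spontaneous frequency per cell
    for induced in (1e-6, 1e-5, 1e-4):  # increasingly potent hypothetical mutagens
        share = spontaneous_fraction(background, induced)
        print(f"induced={induced:.0e}  spontaneous share={share:.3f}")
```

Under these assumptions, the more potent the mutagen, the smaller the share of recovered mutants that are actually spontaneous, although that share never reaches zero.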

Induced and spontaneous mutations arise by generally different mechanisms, so they will be covered separately. After considering these mechanisms, we shall explore the subject of biological mutation repair. Without these repair mechanisms, the rate of mutation would be so high that cells would accumulate too many mutations to remain viable and capable of reproduction. Thus, the mutational events that do occur are those rare events that have somehow been overlooked or bypassed by the repair processes.

Mechanisms of Mutation Induction

When we examine the array of mutations induced by different mutagens, we see a distinct specificity that is characteristic of each mutagen. Such mutational specificity was first noted at the rII locus of the bacteriophage T4. Specificity arises from a given mutagen’s “preference” both for a certain type of mutation (for example, G·C → A·T transitions) and for certain mutational sites called hot spots.

Mutagens act through at least three different mechanisms. They can replace a base in the DNA, alter a base so that it specifically mispairs with another base, or damage a base so that it can no longer pair with any base under normal conditions.

Base replacement

Some chemical compounds are sufficiently similar to the normal nitrogen bases of DNA that they are occasionally incorporated into DNA in place of normal bases; such compounds are called base analogs. Many of these analogs have pairing properties unlike those of the normal bases; thus they can produce mutations by causing incorrect nucleotides to be inserted during replication. To understand the action of base analogs, we must first consider the natural tendency of bases to assume different forms.

All of the bases in DNA can exist in one of several forms, called tautomers, which are isomers that differ in the positions of their atoms and in the bonds between the atoms. The forms are in equilibrium. The keto form of each base is normally found in DNA (Figure 7-2), whereas the enol forms of the bases are rare. The complementary base pairing of the enols is different from that of the keto forms. Figure 7-3 demonstrates the possible mispairs resulting from tautomeric shifts. Because of such mispairing, the enols are one source of rare spontaneous mutations. For example, assume that a guanine in DNA changes into its enol form at the moment at which it is copied in the course of replication (it changes back into its keto form soon after). The enol form will bind to an incoming thymine. Hence we can represent the mutagenic process as follows, in which G* is the enol form of guanine:

G·C → G*·T → A·T

Figure 7-2. Pairing between the normal (keto) forms of the bases.


Figure 7-3. Mismatched bases. (a) Mispairs resulting from rare tautomeric forms of the pyrimidines; (b) mispairs resulting from rare tautomeric forms of the purines.

The appearance of the A·T pair represents a G·C → A·T base-pair transition. The same process in the double helix is diagrammed in Figure 7-4.
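The two rounds of replication that fix this transition can be traced with a small Python sketch. It is a simplification: strands are copied position by position (strand polarity is ignored), only the guanine enol mispair described in the text is modeled, and the function name `copy_strand` and the three-base template are invented for illustration.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def copy_strand(template: str, enol_g_positions=()) -> str:
    """Return the newly synthesized strand, position by position.

    At positions listed in enol_g_positions the template guanine is assumed
    to be in its rare enol form (G*) at the moment of copying, so it pairs
    with thymine instead of cytosine.
    """
    new_strand = []
    for i, base in enumerate(template):
        if i in enol_g_positions:
            if base != "G":
                raise ValueError(f"position {i} is {base}, not G")
            new_strand.append("T")  # G* mispairs with T
        else:
            new_strand.append(COMPLEMENT[base])
    return "".join(new_strand)


if __name__ == "__main__":
    template = "AGC"                                       # invented example
    normal = copy_strand(template)                         # TCG
    mispaired = copy_strand(template, enol_g_positions={1})  # TTG
    # Next round: the mispaired daughter strand is copied normally.
    fixed = copy_strand(mispaired)                         # AAC
    print(normal, mispaired, fixed)
```

In the lineage descending from the mispaired daughter strand, the middle position is now A·T where the original duplex had G·C.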

Figure 7-4. Mutation by tautomeric shifts in the bases of DNA. (a) In the example diagrammed, a guanine undergoes a tautomeric shift to its rare enol form (G*) at the time of replication. (b) In its enol form, it pairs with thymine. (c) and (d) In the next replication, ...

If, in the course of replication, an incoming free base (say, guanine) temporarily enolizes, it will pair with an in situ thymine in the DNA, and the consequence is an A·T → G·C transition as follows:

A·T → G*·T → G·C

Mispairs can also occur when bases become spontaneously ionized. The mutagen 5-bromouracil (5-BU) is an analog of thymine that has bromine at the carbon-5 position in place of the CH₃ group found in thymine. Its mutagenic action is based on enolization and ionization. In 5-BU, the bromine atom is not in a position in which it can hydrogen-bond during base pairing, so the keto form of 5-BU pairs with adenine, as would thymine, and this pairing is shown in Figure 7-5a. However, the presence of the bromine atom significantly alters the distribution of electrons in the base ring, so 5-BU can frequently change to either the enol form or an ionized form, the latter of which pairs with guanine (Figure 7-5b). 5-BU causes G·C → A·T or A·T → G·C transitions in the course of replication, depending on whether 5-BU has been enolized or ionized in situ or as an incoming base. Hence the action of 5-BU as a mutagen is due to the fact that the molecule spends more of its time in the enol or ionized form than thymine does.

Figure 7-5. Alternative pairing possibilities for 5-bromouracil (5-BU). 5-BU is an analog of thymine that can be mistakenly incorporated into DNA as a base. It has a bromine atom in place of the methyl group. (a) In its normal keto state, 5-BU mimics the pairing behavior ...

Another base analog widely used in research is 2-aminopurine (2-AP), which is an analog of adenine that can pair with thymine but, when protonated, can also mispair with cytosine, as shown in Figure 7-6. Therefore, when 2-AP is incorporated into DNA by pairing with thymine, it can generate A·T → G·C transitions by mispairing with cytosine in subsequent replications. Or, if 2-AP is incorporated by mispairing with cytosine, then G·C → A·T transitions will result when it pairs with thymine. Genetic studies have shown that 2-AP, like 5-BU, is highly specific for transitions.

Figure 7-6. Alternative pairing possibilities for 2-aminopurine (2-AP), an analog of adenine. Normally, 2-AP pairs with thymine (a), but, in its protonated state, it can pair with cytosine (b).

Base alteration

Some mutagens are not incorporated into the DNA but instead alter a base, causing specific mispairing. Certain alkylating agents, such as ethyl methanesulfonate (EMS) and the widely used nitrosoguanidine (NG), operate by this pathway:

Image ch7e3.jpg

Although such agents add alkyl groups (an ethyl group in the case of EMS and a methyl group in the case of NG) to many positions on all four bases, mutagenicity is best correlated with an addition to the oxygen at the 6 position of guanine to create an O-6-alkylguanine. This alkylation leads to direct mispairing with thymine, as shown in Figure 7-7, and results in G·C → A·T transitions in the next round of replication. Alkylating agents can also modify the bases of incoming nucleotides in the course of DNA synthesis.

Figure 7-7. Alkylation-induced specific mispairing. The alkylation (in this case, EMS-generated ethylation) of the O-6 position of guanine, as well as the O-4 position of thymine, can lead to direct mispairing with thymine and guanine, respectively, as shown here. ...

The intercalating agents are another important class of DNA modifiers. This group of compounds includes proflavin, acridine orange, and a class of chemicals termed ICR compounds (Figure 7-8a). These agents are flat planar molecules that mimic base pairs and are able to slip themselves in (intercalate) between the stacked nitrogen bases at the core of the DNA double helix (Figure 7-8b). In this intercalated position, an agent can cause single-nucleotide-pair insertions or deletions. Intercalating agents may also stack between bases in single-stranded DNA; in so doing, they may stabilize bases that are looped out during frameshift formation, as depicted in the Streisinger model (Figure 7-9).

Figure 7-8. Intercalating agents. (a) Structures of the common agents proflavin, acridine orange, and ICR-191. (b) An intercalating agent slips between the nitrogenous bases stacked at the center of the DNA molecule. This occurrence can lead to single-nucleotide-pair ...

Figure 7-9. A simplified version of the Streisinger model for frameshift formation. (a) to (c) In DNA synthesis, the newly synthesized strand slips, looping out one or several bases. This loop is stabilized by the pairing afforded by the repetitive-sequence unit ...

Base damage

A large number of mutagens damage one or more bases, so no specific base pairing is possible. The result is a replication block, because DNA synthesis will not proceed past a base that cannot specify its complementary partner by hydrogen bonding. In bacterial cells, such replication blocks can be bypassed by inserting nonspecific bases. The process requires the activation of a special system, the SOS system (Figure 7-10). The name SOS comes from the idea that this system is induced as an emergency response to prevent cell death in the presence of significant DNA damage. SOS induction is a last resort, allowing the cell to trade death for a certain level of mutagenesis.

Figure 7-10. DNA polymerase III, shown in blue, stops at a noncoding lesion, such as the T·C photodimer shown here, generating single-stranded regions that attract the Ssb protein (dark purple) and RecA (light purple), which forms filaments. The presence ...

Exactly how the SOS bypass system functions is not clear, although in E. coli it is known to be dependent on at least three genes, recA (which also has a role in general recombination), umuC, and umuD. Current models for SOS bypass suggest that the UmuC and UmuD proteins combine with the polymerase III DNA replication complex to loosen its otherwise strict specificity and permit replication past noncoding lesions.

Figure 7-10 shows a model for the bypass system operating after DNA polymerase III stalls at a type of damage called a TC photodimer. Because replication can restart downstream from the dimer, a single-stranded region of DNA is generated. This attracts the stabilizing protein, called single-stranded-binding protein (Ssb), as well as the RecA protein, which forms filaments and signals the cell to synthesize the UmuC and UmuD proteins. The UmuD protein binds to the filaments and is cleaved by the RecA protein to yield a shortened version termed UmuD′, which then recruits the UmuC protein to form a complex that allows DNA polymerase III to continue past the dimer, adding bases across from the dimer with a high error frequency (see Figure 7-10).

Therefore mutagens that damage specific base-pairing sites are dependent on the SOS system for their action. The category of SOS-dependent mutagens is important, because it includes most cancer-causing agents (carcinogens), such as ultraviolet (UV) light and aflatoxin B1.

How does the SOS system take part in the recovery of mutations after mutagenesis? Does the SOS system lower the fidelity of DNA replication so much (to permit the bypass of noncoding lesions) that many replication errors occur, even for undamaged DNA? If this hypothesis were correct, most mutations generated by different SOS-dependent mutagens would be similar, rather than specific to each mutagen. Most mutations would result from the action of the SOS system itself on undamaged DNA. The mutagen, then, would play the indirect role of inducing the SOS system. Studies of mutational specificity, however, have shown that this is not the case. Instead, a series of different SOS-dependent mutagens have markedly different specificities. Each mutagen induces a unique distribution of mutations. Therefore, the mutations must be generated in response to specific damaged base pairs. The type of lesion differs in many cases. Some of the most widely studied lesions include UV photoproducts and apurinic sites.

Ultraviolet (UV) light generates a number of photoproducts in DNA. Two different lesions that unite adjacent pyrimidines in the same strand have been most strongly correlated with mutagenesis. These lesions are the cyclobutane pyrimidine photodimer and the 6-4 photoproduct (Figure 7-11). These lesions interfere with normal base pairing; hence, induction of the SOS system is required for mutagenesis. Incorrect bases are inserted mainly across from the 3′ position of the dimer, and more frequently at 5′-CC-3′ and 5′-TC-3′ dimers. The C → T transition is the most frequent mutation, but other base substitutions (transversions) and frameshifts also are induced by UV light, as are larger duplications and deletions.

Figure 7-11. (a) Structure of a cyclobutane pyrimidine dimer. Ultraviolet light stimulates the formation of a four-membered cyclobutyl ring (green) between two adjacent pyrimidines on the same DNA strand by acting on the 5,6 double bonds. (b) Structure of the 6-4 ...

Aflatoxin B1 (AFB1) is a powerful carcinogen originally isolated from fungal-infected peanuts. Aflatoxin forms an addition product at the N-7 position of guanine (Figure 7-12). This product leads to the breakage of the bond between the base and the sugar, thereby liberating the base and resulting in an apurinic site (Figure 7-13). Studies with apurinic sites generated in vitro have demonstrated that the SOS bypass of these sites leads to the preferential insertion of an adenine across from an apurinic site. This predicts that agents that cause depurination at guanine residues should preferentially induce G·C → T·A transversions. For example, with 0 (zero) representing an apurinic site,

G·C → 0·C → 0·A → T·A
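The route from depurination at a guanine to a G·C → T·A transversion can be followed step by step in Python. This is a simplified sketch: strands are copied position by position (polarity is ignored), "0" marks the apurinic site as in the text, an adenine is always inserted across from it, and the names `depurinate` and `sos_bypass_copy` as well as the three-base strand are invented for illustration.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
AP_SITE = "0"  # apurinic site, following the notation in the text


def depurinate(strand: str, position: int) -> str:
    """Remove the purine at a 0-based position, leaving an apurinic site."""
    if strand[position] not in "AG":
        raise ValueError("only purines (A or G) can be depurinated")
    return strand[:position] + AP_SITE + strand[position + 1:]


def sos_bypass_copy(template: str) -> str:
    """Copy a strand; across from an apurinic site, insert adenine.

    Always inserting A is a simplification of the 'preferential insertion'
    described in the text.
    """
    return "".join("A" if b == AP_SITE else COMPLEMENT[b] for b in template)


if __name__ == "__main__":
    original = "CGT"                       # invented strand; partner is GCA
    damaged = depurinate(original, 1)      # C0T
    daughter = sos_bypass_copy(damaged)    # GAA
    next_round = sos_bypass_copy(daughter) # CTT
    print(original, damaged, daughter, next_round)
```

The middle position starts as G (paired with C) and ends as T (paired with A), which is the predicted G·C → T·A transversion.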

Figure 7-12. The binding of metabolically activated aflatoxin B1 to DNA.


Figure 7-13. The loss of a purine residue (guanine) from a single strand of DNA. The sugar-phosphate backbone is left intact.


Mutagens induce mutations by a variety of mechanisms. Some mutagens mimic normal bases and are incorporated into DNA, where they can mispair. Others damage bases and either cause specific mispairing or destroy pairing by causing nonrecognition of bases.

Mechanisms of Spontaneous Mutation

The origin of spontaneous hereditary change has always been a topic of considerable interest (see Genetics in Process 7-1). It is known now that spontaneous mutations arise from a variety of sources, including errors in DNA replication, spontaneous lesions, and transposable genetic elements. The first two are discussed in this section; the third is examined in Chapter 13.

Genetics in Process 7-1: Salvador Luria and Max Delbrück show that bacterial mutations are random.

Spontaneous mutations are very rare, making it difficult to determine the underlying mechanisms. How then do we have insight into the processes governing spontaneous mutation? Even though they are rare, some selective systems allow numerous spontaneous mutations to be obtained and then characterized at the molecular level—for example, their DNA sequences can be determined. From the nature of the sequence changes, inferences can be made about the processes that have led to the spontaneous mutations.

Errors in DNA replication

Mispairing in the course of replication is a source of spontaneous base substitution. (Mispairing was covered earlier in the discussion of 5-BU.) Most mispairing mutations are transitions. This is likely to be because an A·C or G·T mispair does not distort the DNA double helix as much as an A·G or C·T mispair does. However, transversions also can occur through mispairing. Replication errors can also lead to frameshift mutations. The nucleotide sequence surrounding frameshift mutation hot spots was determined in the lysozyme gene of phage T4. These mutations often occur at repeated bases. The Streisinger model (Figure 7-9) proposes that frameshifts arise when loops in single-stranded regions are stabilized by the “slipped mispairing” of repeated sequences during replication. Additionally, in the E. coli lacI gene, certain hot spots result from repeated sequences, just as predicted by the Streisinger model. Figure 7-14 depicts the distribution of spontaneous mutations in the lacI gene. Note how one or two mutational sites dominate the distribution. In lacI, a four-base-pair sequence (CTGG) repeated three times in tandem in the wild type is the cause of the hot spots (for simplicity, only one strand of the DNA is shown):

Image ch7e5.jpg

Figure 7-14. The distribution of 140 spontaneous mutations in lacI. Each occurrence of a point mutation is indicated by a box. Red boxes designate fast-reverting mutations. Deletions (gold) are represented below. The I map is given in terms of the amino acid number ...

The major hot spot, represented here by the mutations FS5, FS25, FS45, and FS65, results from the addition of one extra set of the four bases CTGG. The minor hot spot, represented here by the mutations FS2 and FS84, results from the loss of one set of the four bases CTGG.

How can we explain these observations? The Streisinger model predicts that the frequency of a particular frameshift depends on the number of base pairs that can form during the slipped mispairing of repeated sequences. The wild-type sequence shown for the lacI gene can slip out one CTGG sequence and stabilize this structure by forming nine base pairs (apply the model in Figure 7-9 to the sequence shown for lacI). Whether a deletion or an addition is generated depends on whether the slippage is on the template or on the newly synthesized strand, respectively.
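The outcome of slippage at the CTGG repeat (gain or loss of one repeat unit) can be sketched in Python. The flanking bases in the example are hypothetical placeholders, since only the repeated CTGG unit is given in the text, and the function name `slip_repeat` is invented. The sketch models only the resulting sequence, not the base pairing that stabilizes the loop.

```python
import re


def slip_repeat(sequence: str, unit: str, delta: int) -> str:
    """Add (delta > 0) or remove (delta < 0) copies of a tandemly repeated unit.

    The first run of two or more tandem copies of `unit` is located and its
    copy number is changed by `delta`.
    """
    match = re.search(f"(?:{re.escape(unit)}){{2,}}", sequence)
    if match is None:
        raise ValueError("no tandem repeat of the unit found")
    copies = (match.end() - match.start()) // len(unit)
    if copies + delta < 0:
        raise ValueError("cannot remove more copies than are present")
    return sequence[:match.start()] + unit * (copies + delta) + sequence[match.end():]


if __name__ == "__main__":
    # Three tandem CTGG units as in lacI; the flanking "GT" and "CA" are hypothetical.
    wild_type = "GT" + "CTGG" * 3 + "CA"
    addition = slip_repeat(wild_type, "CTGG", +1)  # slippage on the new strand
    deletion = slip_repeat(wild_type, "CTGG", -1)  # slippage on the template
    for name, seq in (("wild type", wild_type), ("addition", addition), ("deletion", deletion)):
        shift = (len(seq) - len(wild_type)) % 3
        print(f"{name:9s} {seq}  frame shift (mod 3): {shift}")
```

Because the repeat unit is four bases long, gaining or losing one unit changes the length by a number that is not a multiple of three, so both events are frameshifts.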

Larger deletions (more than a few base pairs) constitute a sizable fraction of spontaneous mutations, as shown in Figure 7-14. Most, although not all, of the deletions are of repeated sequences. Figure 7-15 shows the first 12 deletions analyzed at the DNA sequence level. Further studies have shown that the longer repeats constitute hot spots for deletions. Duplications of segments of DNA have been observed in many organisms. Like deletions, they often occur at sequence repeats.

Figure 7-15. Deletions in lacI. Deletions occurring in S74 and S112 are shown at the top of the diagram. As indicated by the gold bars, one of the sequence repeats (green) and all the intervening DNA has been deleted, leaving one copy of the repeated sequence. All mutations ...

How do these deletions and duplications form? Several mechanisms could account for their formation. Deletions may be generated as replication errors. For example, an extension of the Streisinger model of slipped mispairing (Figure 7-9) could explain why deletions predominate at short repeated sequences. Alternatively, deletions and duplications could be generated by recombination between the repeats.

Spontaneous lesions

Naturally occurring damage to the DNA, called spontaneous lesions, also can generate mutations. Two of the most frequent spontaneous lesions are depurination and deamination, the former being more common.

We learned earlier that aflatoxin induces depurination; however, depurination also occurs spontaneously. A mammalian cell spontaneously loses about 10,000 purines from its DNA during a 20-hour cell-generation period at 37°C. If these lesions were to persist, they would result in significant genetic damage because, during replication, the apurinic sites cannot specify any kind of base, let alone the correct one. However, as mentioned earlier in the chapter, under certain conditions, a base can be inserted across from an apurinic site, frequently resulting in a mutation.

The deamination of cytosine yields uracil (Figure 7-16a). Unrepaired uracil residues will pair with adenine in the course of replication, resulting in the conversion of a G·C pair into an A·T pair (a G·C → A·T transition). Deaminations at certain cytosine positions have been found to be one type of mutational hot spot. DNA sequence analysis of hot spots for G·C → A·T transitions in the lacI gene has shown that 5-methylcytosine residues are present at the position of each hot spot. (Certain bases in prokaryotes and eukaryotes are normally methylated.) Some of the data from this lacI study are shown in Figure 7-17. The height of each bar on the graph represents the frequency of mutations at each of a number of sites. It can be seen that the positions of 5-methylcytosine residues correlate nicely with the most mutable sites.

Figure 7-16. Deamination of (a) cytosine and (b) 5-methylcytosine.


Figure 7-17. 5-Methylcytosine hot spots in E. coli. Nonsense mutations occurring at 15 different sites in lacI were scored. All result from the G·C → A·T transition. The asterisks (*) mark the positions of 5-methylcytosines. ...

How can 5-methylcytosines lead to mutations? One of the repair enzymes in the cell, uracil-DNA glycosylase, recognizes the uracil residues in the DNA that arise from deaminations and excises them, leaving a gap that is subsequently filled in. However, the deamination of 5-methylcytosine (Figure 7-16b) generates thymine (5-methyluracil), which is not recognized by the enzyme uracil-DNA glycosylase and is thus not repaired. Therefore, C → T transitions generated by deamination are seen more frequently at 5-methylcytosine sites, because they escape this repair system.
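The difference between the two deamination products can be captured in a small Python sketch. It follows the account given above and models only uracil-DNA glycosylase; the single-letter code "M" for 5-methylcytosine and all function names are hypothetical choices made for this illustration.

```python
# "M" stands for 5-methylcytosine (a notation chosen for this sketch).
DEAMINATION_PRODUCT = {"C": "U", "M": "T"}
# Base inserted opposite each template base at the next replication.
PAIRS_WITH = {"A": "T", "T": "A", "G": "C", "C": "G", "U": "A", "M": "G"}


def deaminate(base: str) -> str:
    """Cytosine becomes uracil; 5-methylcytosine becomes thymine."""
    return DEAMINATION_PRODUCT.get(base, base)


def uracil_dna_glycosylase(base: str) -> str:
    """Excise uracil; the gap is refilled with cytosine opposite the guanine.

    Thymine is a normal DNA base, so the enzyme leaves it untouched.
    """
    return "C" if base == "U" else base


def outcome_after_deamination(base: str, repair: bool = True) -> str:
    """Fate of a C·G or M·G pair after deamination and one round of replication."""
    damaged = deaminate(base)
    if repair:
        damaged = uracil_dna_glycosylase(damaged)
    new_partner = PAIRS_WITH[damaged]
    return "G·C → A·T transition" if new_partner == "A" else "no mutation"


if __name__ == "__main__":
    print("C, repair on: ", outcome_after_deamination("C", repair=True))
    print("C, repair off:", outcome_after_deamination("C", repair=False))
    print("M, repair on: ", outcome_after_deamination("M", repair=True))
```

With repair active, deaminated cytosine is restored, but deaminated 5-methylcytosine still yields a transition, which is why methylated sites stand out as hot spots in this model.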

Oxidatively damaged bases constitute a third type of spontaneous lesion implicated in mutagenesis. Active oxygen species, such as superoxide radicals (O2·−), hydrogen peroxide (H2O2), and hydroxyl radicals (·OH), are produced as by-products of normal aerobic metabolism. These oxygen species can cause oxidative damage to DNA, as well as to precursors of DNA (such as GTP), resulting in mutation. Such mutations have been implicated in a number of human diseases. Figure 7-18 shows two products of oxidative damage. The 8-oxo-7-hydrodeoxyguanosine (8-oxodG, or “GO”) product frequently mispairs with A, resulting in a high level of G → T transversions.

Figure 7-18. DNA damage products formed after attack by oxygen radicals. dR = deoxyribose.


Spontaneous mutations can be generated by several different processes. Replication errors and spontaneous lesions generate most of the base-substitution and frameshift mutations.

Spontaneous mutations and human diseases

DNA sequence analysis has revealed the mutational lesions responsible for a number of human hereditary diseases. Many are of the expected simple base-substitution or deletion or addition type. However, some are more complex but reminiscent of previously discussed bacterial mutations, allowing us to suggest mechanisms that cause these human disorders.

A number of these disorders are due to deletions or duplications of repeated sequences. For example, mitochondrial encephalomyopathies are a group of disorders affecting the central nervous system or the muscles (for example, Kearns-Sayre syndrome). They are characterized by dysfunction of mitochondrial oxidative phosphorylation and by changes in mitochondrial DNA structure. These disorders have been shown to result from deletions of DNA sequences that lie between repeated sequences. Figure 7-19 depicts one of these deletions. Note how similar it is in form to the spontaneous E. coli deletions shown in Figure 7-15. A second example is Fabry disease. This inborn error of metabolism results from mutations in the X-linked gene encoding the enzyme α-galactosidase A. Many of these mutations are gene rearrangements, resulting from either deletions or duplications between short direct repeats. All these deletions occur either by a slipped mispairing mechanism, such as that pictured in Figure 7-9, or by recombination between the repeated sequences.
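The outcome of a deletion between direct repeats can be shown with a short Python sketch. The sequence and the four-base repeat are invented for illustration, and the function name is hypothetical; the sketch models only the end product (one repeat copy retained, the other copy and the intervening DNA lost), not the slipped mispairing or recombination mechanism itself.

```python
def delete_between_direct_repeats(seq, repeat):
    """Remove one copy of `repeat` plus everything between the first two copies.

    Returns `seq` unchanged if fewer than two copies are present.
    """
    first = seq.find(repeat)
    if first == -1:
        return seq
    second = seq.find(repeat, first + len(repeat))
    if second == -1:
        return seq
    # Keep the flank before the first copy, then resume at the second copy.
    return seq[:first] + seq[second:]

wild_type = "AAGCTTCCCCGAGCTTGG"    # direct repeat GCTT at indexes 2 and 12
deleted = delete_between_direct_repeats(wild_type, "GCTT")
```

The deleted product retains a single copy of the repeat at the breakpoint, as in the boxed sequence of a figure such as Figure 7-19.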

Figure 7-19. Sequences of wild-type (WT) mitochondrial DNA and of deleted DNA (KS) from a patient with Kearns-Sayre/chronic external ophthalmoplegia plus syndrome. The 13-base boxed sequence is identical in both WT and KS and serves as a breakpoint for the DNA deletion.

A common mechanism responsible for a number of genetic diseases is the expansion of a three-base-pair repeat, as in the fragile X syndrome (Figure 7-20). For this reason, such diseases are termed trinucleotide repeat diseases. Fragile X syndrome is the most common form of inherited mental retardation, occurring in close to 1 of 1500 males and 1 of 2500 females. It is manifested cytologically by a fragile site in the X chromosome that results in breaks in vitro. Fragile X syndrome results from changes in the number of a (CGG)n repeat in the coding sequence of the FMR-1 gene. How does repeat number correlate with the disease phenotype? Humans normally show a considerable variation in the number of CGG repeats in the FMR-1 gene, ranging from 6 to 54, with the most frequent allele containing 29 repeats. The variation in the number of CGG repeats produces a corresponding variation in the number of arginine residues (CGG is an arginine codon) in the FMR-1-encoded protein. Sometimes, unaffected parents and grandparents give rise to several offspring with fragile X syndrome. Such unaffected ancestors in a pedigree have been found to contain increased copy numbers of the repeat, ranging from 50 to 200. For this reason, these ancestors have been said to carry premutations. The repeats in these premutation alleles are not sufficient to cause the disease phenotype, but they are much more unstable (that is, readily expanded) than normal alleles, and so they lead to even greater expansion in their offspring. (In general, it appears that the more expanded the repeat number, the greater the instability.) The people with the symptoms of the disease have enormous repeat numbers, ranging from 200 to 1300.
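The relation between repeat number and allele class can be expressed as a small Python sketch. The function names are hypothetical. The ranges quoted above overlap at 50 to 54 repeats and meet at 200, so the cutoffs used here (premutation above 54 repeats, full expansion above 200) are an assumption made only to give each count a single label.

```python
import re

def longest_cgg_run(seq):
    """Return the number of CGG units in the longest uninterrupted tandem run."""
    runs = re.findall(r"(?:CGG)+", seq)
    return max((len(run) // 3 for run in runs), default=0)

def classify_fmr1_allele(repeat_count):
    """Label a repeat count using illustrative cutoffs (see the assumption above)."""
    if repeat_count > 200:
        return "full expansion"
    if repeat_count > 54:
        return "premutation"
    return "normal"

# A made-up sequence carrying the most frequent allele size, 29 repeats.
allele = "AT" + "CGG" * 29 + "TTCGGCGGA"
label = classify_fmr1_allele(longest_cgg_run(allele))
```

Because the longest run is counted, the short second run of two CGG units in the example does not affect the result.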

Figure 7-20. Expansion of the CGG triplet in the FMR-1 gene seen in the fragile X syndrome. Normal persons have from 6 to 54 copies of the CGG repeat, whereas those from susceptible families display an increase (premutation) in the number of repeats: normally transmitting ...

The proposed mechanism for these repeats is a slipped mispairing during DNA synthesis, just as shown previously for the lacI hot spot involving a one-step expansion of the four-base-pair sequence CTGG. However, the extraordinarily high frequency of mutation at the trinucleotide repeats in fragile X syndrome suggests that in human cells, after a threshold level of about 50 repeats, the replication machinery cannot faithfully replicate the correct sequence and large variations in repeat numbers result.

Other diseases also have been associated with expansion of trinucleotide repeats. There are several general themes to these diseases. The wild-type gene includes a repeated sequence within its protein-coding region, and mutation correlates with this repeat region’s undergoing a considerable expansion. The severity of the disease correlates with the number of repeat copies. Taken together, these observations suggest that the expanded repeats are indeed parts of polypeptides and that the abnormal polypeptides containing large repeats of a single amino acid somehow contribute to the disease state.

X-linked spinal and bulbar muscular atrophy (known as Kennedy disease) results from the amplification of a three-base-pair repeat—in this case, a repeat of CAG. Kennedy disease, which is characterized by progressive muscle weakness and atrophy, results from mutations in the gene that codes for the androgen receptor. Normal persons have an average of 21 CAG repeats in this gene, whereas affected patients have repeats ranging from 40 to 52.

Myotonic dystrophy, the most common form of adult muscular dystrophy, is yet another example of sequence expansion causing a human disease. Susceptible families display an increase in severity of the disease in successive generations; this increased severity is caused by the progressive amplification of a CTG triplet at the 3′ end of a transcript. Normal people possess, on average, five copies of the CTG repeat; mildly affected people have approximately 50 copies; and severely affected people have more than 1000 repeats of the CTG triplet. Additional examples of triplet expansion are still appearing—for instance, Huntington disease, which has recently been added to the list.

Biological Repair Mechanisms

Living cells have evolved a series of enzymatic systems that repair DNA damage in a variety of ways. The low spontaneous mutation rate is indicative of the efficiency of these repair systems. Failure of these systems can lead to a higher mutation rate. A number of human diseases can be attributed to defects in DNA repair, as we shall see later. Let’s first examine some of the characterized repair pathways and then consider how the cell integrates these systems into an overall strategy for repair.

We can divide repair pathways into several categories: prevention of errors, reversal of damage, excision repair, and postreplication repair.

Prevention of errors

Some enzymatic systems neutralize potentially damaging compounds before they even react with DNA. One such system detoxifies superoxide radicals, which are produced as by-products of aerobic metabolism and can cause oxidative damage to DNA. The enzyme superoxide dismutase catalyzes the conversion of the superoxide radicals into hydrogen peroxide, and the enzyme catalase, in turn, converts the hydrogen peroxide into water and oxygen.

Direct reversal of damage

The most straightforward way to repair a lesion is to reverse it directly, thereby regenerating the normal base. Reversal is not always possible, because some types of damage are essentially irreversible. In a few cases, however, lesions can be repaired in this way. One case is a mutagenic photodimer caused by UV light (see Figure 7-11). The cyclobutane pyrimidine photodimer can be repaired by a photolyase that has been found in bacteria and lower eukaryotes but not in humans. The enzyme binds to the photodimer and splits it, in the presence of certain wavelengths of visible light, to generate the original bases (Figure 7-21). This enzyme cannot operate in the dark, so other repair pathways are required to remove UV damage. A photolyase that reverses the 6-4 photoproducts has also been detected in plants and Drosophila.

Figure 7-21. Repair of a UV-induced pyrimidine photodimer by a photoreactivating enzyme, or photolyase. The enzyme recognizes the photodimer (here, a thymine dimer) and binds to it. When light is present, the photolyase uses its energy to split the dimer into the ...

Alkyltransferases also are enzymes that directly reverse lesions. They remove certain alkyl groups that have been added to the O-6 positions of guanine (Figure 7-7) by such mutagens as nitrosoguanidine and ethyl methanesulfonate. The methyltransferase from E. coli has been well studied. This enzyme transfers the methyl group from O-6-methylguanine to a cysteine residue on the protein. When this happens, the enzyme is inactivated, so this repair system can be saturated if the level of alkylation is high enough.

Excision-repair pathways

The general excision-repair system breaks a phosphodiester bond on either side of the lesion, on the same strand, resulting in the excision of an oligonucleotide. This leaves a gap that is filled by repair synthesis, and a ligase seals the breaks. In prokaryotes, 12 or 13 nucleotides are removed, whereas, in eukaryotes, from 27 to 29 nucleotides are eliminated. Figure 7-22 depicts the excision pattern in each case.
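The cut, remove, and refill logic of general excision repair can be sketched in Python. This is a simplified illustration: the lesion is marked with the letter "X", the function name is hypothetical, and the patch is assumed to be centered on the lesion, whereas the actual incision positions are those given in Figure 7-22. The default patch length of 12 follows the prokaryotic figure quoted above.

```python
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}

def excision_repair(damaged, template, lesion_pos, patch=12):
    """Excise `patch` nucleotides around the lesion and resynthesize them.

    `template` is the undamaged partner strand, aligned position by position.
    """
    start = max(0, lesion_pos - patch // 2)     # incision on one side
    end = min(len(damaged), start + patch)      # incision on the other side
    # Repair synthesis copies the gap from the partner strand; ligation
    # is represented by joining the three pieces.
    resynthesized = "".join(PAIR[b] for b in template[start:end])
    return damaged[:start] + resynthesized + damaged[end:]

original = "ATGCATGCATGCATGCATGC"
template = "".join(PAIR[b] for b in original)
damaged = original[:9] + "X" + original[10:]    # lesion at index 9
repaired = excision_repair(damaged, template, 9)
```

Passing `patch=27` would mimic the longer oligonucleotide removed in eukaryotes, under the same centering assumption.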

Figure 7-22. Excinuclease incision patterns by E. coli (left) and human enzymes. The red points indicate the incision patterns of a lesion, in this case a thymine dimer, which is shown in orange. (Courtesy of J. E. Hearst in A. Sancar, Science 266, 1994, 1954.)

Certain lesions are too subtle to cause a distortion large enough to be recognized by the general excision-repair system and its counterparts in higher cells. Thus, additional specific excision pathways are necessary. Base-excision repair is carried out by DNA glycosylases that cleave N-glycosidic (base–sugar) bonds, thereby liberating the altered bases and generating apurinic or apyrimidinic sites (AP sites; see Figure 7-13). The initial step in this process is shown in Figure 7-23. The resulting site is then repaired by an AP site-specific endonuclease repair pathway.

Figure 7-23. Action of DNA glycosylases. Glycosylase removes an altered base and leaves an AP site. The AP site is subsequently excised by the AP endonucleases diagrammed in Figure 7-24. (After B. Lewin, Genes. Copyright © 1983 by John Wiley.)

Numerous DNA glycosylases exist. One, uracil-DNA glycosylase, removes uracil from DNA. Uracil residues, which result from the spontaneous deamination of cytosine (Figure 7-16), can lead to a C → T transition if unrepaired. It is possible that the natural pairing partner of adenine in DNA is thymine (5-methyluracil) rather than uracil so as to allow the recognition and excision of these uracil residues. If uracil were a normal constituent of DNA, such repair would not be possible.

All cells have endonucleases that attack the sites left after the spontaneous loss of single purine or pyrimidine residues. The AP endonucleases are vital to the cell, because, as noted earlier, spontaneous depurination is a relatively frequent event. These enzymes introduce chain breaks by cleaving the phosphodiester bonds at AP sites. This initiates an excision-repair process mediated by three further enzymes—an exonuclease, DNA polymerase I, and DNA ligase (Figure 7-24).

Figure 7-24. Repair of AP (apurinic or apyrimidinic) sites. AP endonucleases recognize AP sites and cut the phosphodiester bond. A stretch of DNA is removed by an exonuclease, and the resulting gap is filled in by DNA polymerase I and DNA ligase. (After B. Lewin, ...)

Owing to the efficiency of the AP endonuclease repair pathway, it can be the final step of other repair pathways. Thus, if damaged base pairs can be excised, leaving an AP site, the AP endonucleases can complete the restoration to the wild type. This is what happens in the DNA glycosylase repair pathway.

Postreplication repair

Some repair pathways are capable of recognizing errors even after DNA has already undergone replication. One example, termed the mismatch-repair system, can detect mismatched base pairs left behind by replication errors. Mismatch-repair systems have to do at least three things:

1. Recognize mismatched base pairs.

2. Determine which base in the mismatch is the incorrect one.

3. Excise the incorrect base and carry out repair synthesis.

The second property is the crucial one of such a system. Unless it is capable of discriminating between the correct and the incorrect bases, the mismatch-repair system cannot determine which base to excise to prevent a mutation from arising. If, for example, a G·T mismatch occurs as a replication error, how can the system determine whether G or T is incorrect? Both are normal bases in DNA. But replication errors produce mismatches on the newly synthesized strand, so it is the base on this strand that must be recognized and excised.

To distinguish the old, template strand from the newly synthesized strand, the mismatch-repair system, best characterized in bacteria, takes advantage of a delay in the methylation of the following sequence, which normally occurs after replication:

5′-GATC-3′
3′-CTAG-5′

The methylating enzyme is adenine methylase, which creates 6-methyladenine on each strand. However, it takes the adenine methylase several minutes to recognize and modify the newly synthesized GATC stretches. During that interval, the mismatch-repair system can operate because it can now distinguish the old strand from the new one by the methylation pattern. Methylating the 6 position of adenine does not affect base pairing, and it provides a convenient tag that can be detected by other enzyme systems. Figure 7-25 shows the replication fork during mismatch correction. Note that only the old strand is methylated at GATC sequences right after replication. Once the mismatched site has been identified, the mismatch-repair system corrects the error.
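The strand-discrimination step can be illustrated with a minimal Python sketch. The `Strand` class and function names are hypothetical, a single flag stands in for the methylation state of all GATC sites on a strand, and the strands are aligned position by position.

```python
from dataclasses import dataclass

PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}

@dataclass
class Strand:
    seq: str
    gatc_methylated: bool   # True for the old (template) strand

def find_mismatches(old_seq, new_seq):
    """Step 1: recognize positions where the two bases do not pair."""
    return [i for i, (o, n) in enumerate(zip(old_seq, new_seq)) if PAIR[o] != n]

def mismatch_repair(a, b):
    """Return the corrected sequence of the newly synthesized strand."""
    # Step 2: the unmethylated strand is taken to be the new, error-bearing one.
    if a.gatc_methylated == b.gatc_methylated:
        raise ValueError("methylation pattern does not distinguish the strands")
    old, new = (a, b) if a.gatc_methylated else (b, a)
    # Step 3: excise each incorrect base and resynthesize from the old strand.
    corrected = list(new.seq)
    for i in find_mismatches(old.seq, new.seq):
        corrected[i] = PAIR[old.seq[i]]
    return "".join(corrected)

old = Strand("GATCGGA", gatc_methylated=True)
new = Strand("CTAGTCT", gatc_methylated=False)   # T opposite G at index 4
fixed = mismatch_repair(old, new)
```

Once both strands are methylated, the sketch raises an error, which mirrors the point that repair must act during the interval before adenine methylase modifies the new strand.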

Figure 7-25. Model for mismatch repair in E. coli. Because DNA is methylated by enzymatic reactions that recognize the A in a GATC sequence, directly after DNA replication the newly synthesized strand will not be methylated. The “hemimethylated” DNA ...

The E. coli recA gene, one of the genes of the SOS bypass system (Figure 7-10), also takes part in postreplication repair. Here the DNA replication system stalls at a UV photodimer or other blocking lesions and then restarts past the block, leaving a single-stranded gap. The RecA product takes part in recombinational repair, a process in which the gap is patched by DNA cut from the sister molecule (Figure 7-26a). This process is not error-prone.

Figure 7-26. Schemes for postreplication repair. (a) In recombinational repair, replication jumps across a blocking lesion, leaving a gap in the new strand. A recA-directed protein then fills in the gap, using a piece from the opposite parental strand (because of ...

Strategy for repair

We can now assess the overall repair system strategy used by the cell. The many different repair systems available to the cell are summarized in Table 7-3. It would be convenient if enzymes could be used to directly reverse each specific lesion. However, sometimes that is not chemically possible, and not every possible type of DNA damage can be anticipated. Therefore, a general excision-repair system is used to remove any type of damaged base that causes a recognizable distortion in the double helix. When lesions are too subtle to cause such a distortion, specific excision systems, such as glycosylases, or removal systems are used. To eliminate replication errors, a postreplication mismatch-repair system operates; finally, postreplication recombinational systems eliminate gaps across from blocking lesions that have escaped the other repair systems.

Table 7-3. Repair Systems in E. coli.


Repair enzymes play a crucial role in reducing genetic damage in living cells. The cell has many different repair pathways at its command to eliminate potentially mutagenic errors.

Repair defects and human diseases

Several recessive human genetic diseases are known or suspected to be caused by defective genes in repair systems. These defects often lead to an increased incidence of cancer because, as part of the general increased level of mutation, more mutations are produced in genes that can cause a cell to become cancerous (see Chapter 15).

One cancer-prone disease, xeroderma pigmentosum (XP), results from a defect in any one of eight genes involved in nucleotide excision repair. People suffering from this disorder are extremely prone to UV-induced skin cancers (Figure 7-27) as a result of exposure to sunlight and have frequent neurological abnormalities. The difference in UV photosensitivity between normal and XP cells is evident from the survival curves in Figure 7-28.

Figure 7-27. Skin cancer in xeroderma pigmentosum. This recessive hereditary disease is caused by a deficiency in one of the excision-repair enzymes, which leads to the formation of skin cancers on exposure of the skin to the UV rays in sunlight. (Photograph courtesy ...)

Figure 7-28. Hypersensitivity to UV radiation of XP cells in culture. Here the cells from a number of complementation groups are shown. There is a variation between complementation groups, but all are more sensitive to UV radiation than normal cells. (Adapted from ...)


Copyright © 1999, W. H. Freeman and Company.
Bookshelf ID: NBK21322

