As with many processes in molecular biology, we conventionally look on genome replication as being made up of three phases: initiation, elongation and termination.
As well as these three stages in replication, one additional topic demands attention. This relates to a limitation in the replication process that, if uncorrected, would lead to linear double-stranded DNA molecules getting shorter each time they are replicated. The solution to this problem, which concerns the structure and synthesis of the telomeres at the ends of chromosomes (
13.2.1. Initiation of genome replication
Figure 13.7. Bidirectional DNA replication of (A) a circular bacterial chromosome and (B) a linear eukaryotic chromosome
Initiation of replication is not a random process and always begins at the same position or positions on a DNA molecule, these points being called the origins of replication. Once initiated, two replication forks can emerge from the origin and progress in opposite directions along the DNA: replication is therefore bidirectional with most genomes (Figure 13.7). A circular bacterial genome has a single origin of replication, meaning that several thousand kb of DNA are copied by each replication fork. This situation differs from that seen with eukaryotic chromosomes, which have multiple origins and whose replication forks progress for shorter distances. The yeast Saccharomyces cerevisiae, for example, has about 300 origins, corresponding to 1 per 40 kb of DNA, and humans have some 20 000 origins, or 1 for every 150 kb.
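The origin densities quoted above can be checked against each other with simple arithmetic: multiplying the number of origins by the average spacing gives the genome size that the two figures imply. The short sketch below uses only the numbers given in the text; the function name is illustrative.

```python
def implied_genome_size_kb(n_origins: int, kb_per_origin: float) -> float:
    """Genome size implied by an origin count and an average origin spacing."""
    return n_origins * kb_per_origin

# Figures quoted in the text
yeast_kb = implied_genome_size_kb(300, 40)        # S. cerevisiae
human_kb = implied_genome_size_kb(20_000, 150)    # human

print(f"yeast: {yeast_kb / 1000:.0f} Mb, human: {human_kb / 1000:.0f} Mb")
```

The implied sizes, roughly 12 Mb for yeast and 3000 Mb for humans, are of the order expected for these genomes, which suggests the quoted origin counts and spacings are mutually consistent.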
Initiation at the E. coli origin of replication
Figure 13.8. The Escherichia coli origin of replication
(A) The E. coli origin of replication is called oriC and is approximately 245 bp in length. It contains three copies of a 13-nucleotide repeat motif, consensus sequence 5′-GATCTNTTNTTTT-3′ where ‘N’ is any nucleotide, and five copies of a nine-nucleotide repeat, consensus
. The 13-nucleotide sequences form a tandem array of direct repeats at one end of
oriC. The nine-nucleotide sequences are distributed through
oriC, three units forming a series of direct repeats and two units in the inverted configuration, as indicated by the arrows. Three of the nine-nucleotide repeats - numbers 1, 3 and 5 when counted from the left-hand end of
oriC as drawn here - are regarded as major sites for DnaA attachment; the other two repeats are minor sites. The overall structure of the origin is similar in all bacteria and the sequences of the repeats do not vary greatly. (B) Model for the attachment of DnaA proteins to
oriC, resulting in melting of the helix within the AT-rich 13-nucleotide sequences.
We know substantially more about initiation of replication in bacteria than in eukaryotes. The E. coli origin of replication is referred to as oriC. By transferring segments of DNA from the oriC region into plasmids that lack their own origins, it has been estimated that the E. coli origin of replication spans approximately 245 bp of DNA. Sequence analysis of this segment shows that it contains two short repeat motifs, one of nine nucleotides and the other of 13 nucleotides (Figure 13.8A). The nine-nucleotide repeat, five copies of which are dispersed throughout oriC, is the binding site for a protein called DnaA. With five copies of the binding sequence, it might be imagined that five copies of DnaA attach to the origin, but in fact bound DnaA proteins cooperate with unbound molecules until some 30 are associated with the origin. Attachment occurs only when the DNA is negatively supercoiled, as is the normal situation for the E. coli chromosome (Section 2.3.1).
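Repeat motifs of this kind are usually located by scanning a sequence for matches to a consensus. The sketch below shows one way this could be done for the 13-nucleotide consensus given in the legend to Figure 13.8, treating ‘N’ as any nucleotide. The function names and the test sequence are made up for illustration; the test sequence is not the real oriC sequence, and because natural repeats often deviate from the consensus, an exact-match search such as this one would be expected to miss some genuine copies.

```python
import re

# 13-nucleotide consensus from the legend to Figure 13.8; N = any nucleotide
CONSENSUS_13 = "GATCTNTTNTTTT"

def consensus_to_regex(consensus: str) -> "re.Pattern[str]":
    """Turn a consensus containing N wildcards into a compiled regex."""
    body = "".join("[ACGT]" if base == "N" else base for base in consensus)
    # Wrapping the motif in a lookahead lets overlapping matches be reported
    return re.compile(f"(?=({body}))")

def find_motif(sequence: str, consensus: str = CONSENSUS_13) -> list[tuple[int, str]]:
    """Return (0-based start, matched text) for every exact consensus match."""
    pattern = consensus_to_regex(consensus)
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]

# Hypothetical test sequence with one embedded copy of the motif
toy = "AAGATCTATTCTTTTGG"
print(find_motif(toy))  # [(2, 'GATCTATTCTTTT')]
```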
The result of DnaA binding is that the double helix opens up (‘melts’) within the tandem array of three AT-rich, 13-nucleotide repeats located at one end of the oriC sequence (Figure 13.8B). The exact mechanism is unknown but DnaA does not appear to possess the enzymatic activity needed to break base pairs, and it is therefore assumed that the helix is melted by torsional stresses introduced by attachment of the DnaA proteins. An attractive model imagines that the DnaA proteins form a barrel-like structure around which the helix is wound. Melting the helix is promoted by HU, the most abundant of the DNA packaging proteins of E. coli (Section 2.3.1).
Melting of the helix initiates a series of events that culminates in the start of the elongation phase of replication. The first step is attachment of a complex of two proteins, DnaBC, forming the prepriming complex. DnaC has a transitory role and is released from the complex soon after it is formed, its function probably being simply to aid the attachment of DnaB. The latter is a helicase, an enzyme which can break base pairs (see Section 13.2.2). DnaB begins to increase the single-stranded region within the origin, enabling the enzymes involved in the elongation phase of genome replication to attach. This represents the end of the initiation phase of replication in E. coli as the replication forks now start to progress away from the origin and DNA copying begins.
Origins of replication in yeast have been clearly defined
Figure 13.9. Structure of a yeast origin of replication
(A) Structure of ARS1, a typical autonomously replicating sequence (ARS) that acts as an origin of replication in Saccharomyces cerevisiae. The relative positions of the functional sequences A, B1, B2 and B3 are shown. For more details see Bielinsky and Gerbi (1998). (B) Melting of the helix occurs within subdomain B2, induced by attachment of the ARS binding factor 1 (ABF1) to subdomain B3. The proteins of the origin recognition complex (ORC) are permanently attached to subdomains A and B1.
The technique used to delineate the E. coli oriC sequence, involving transfer of DNA segments into a non-replicating plasmid, has also proved valuable in identifying origins of replication in the yeast Saccharomyces cerevisiae. Origins identified in this way are called autonomously replicating sequences or ARSs. A typical yeast ARS is shorter than the E. coli origin, usually less than 200 bp in length, but, like the E. coli origin, contains discrete segments with different functional roles, these ‘subdomains’ having similar sequences in different ARSs (Figure 13.9A). Four subdomains are recognized. Two of these - subdomains A and B1 - make up the origin recognition sequence, a stretch of some 40 bp in total that is the binding site for the origin recognition complex (ORC), a set of six proteins that attach to the ARS (Figure 13.9B). ORCs have been described as yeast versions of the E. coli DnaA proteins (Kelman, 2000), but this interpretation is probably not strictly correct because ORCs appear to remain attached to yeast origins throughout the cell cycle (Bell and Stillman, 1992; Diffley and Cocker, 1992). Rather than being genuine initiator proteins, it is more likely that ORCs are involved in the regulation of genome replication, acting as mediators between replication origins and the regulatory signals that coordinate the initiation of DNA replication with the cell cycle (Section 13.3; Stillman, 1996).
We must therefore look elsewhere in yeast ARSs for sequences with functions strictly equivalent to that of oriC of E. coli. This leads us to the two other conserved sequences in the typical yeast ARS, subdomains B2 and B3 (see Figure 13.9A). Our current understanding suggests that these two subdomains function in a manner similar to the E. coli origin. Subdomain B2 appears to correspond to the 13-nucleotide repeat array of the E. coli origin, being the position at which the two strands of the helix are first separated. This melting is induced by torsional stress introduced by attachment of a DNA-binding protein, ARS binding factor 1 (ABF1), which attaches to subdomain B3 (see Figure 13.9B). As in E. coli, melting of the helix within a yeast replication origin is followed by attachment of the helicase and other replication enzymes to the DNA, completing the initiation process and enabling the replication forks to begin their progress along the DNA, as described in Section 13.2.2.
Replication origins in higher eukaryotes have been less easy to identify
Attempts to identify replication origins in humans and other higher eukaryotes have, until recently, been less successful (Gilbert, 2001). Initiation regions (parts of the chromosomal DNA where replication initiates) were delineated by various biochemical methods, for example by allowing replication to initiate in the presence of labeled nucleotides, then arresting the process, purifying the newly synthesized DNA, and determining the positions of these nascent strands in the genome. These experiments suggested that there are specific regions in mammalian chromosomes where replication begins, but some researchers were doubtful whether these regions contained replication origins equivalent to yeast ARSs. One alternative hypothesis was that replication is initiated by protein structures that have specific positions in the nucleus, the chromosome initiation regions simply being those DNA segments located close to these protein structures in the three-dimensional organization of the nucleus.
Doubts about mammalian replication origins were increased by the failure of mammalian initiation regions to confer replicative ability on replication-deficient plasmids, although these experiments were not considered conclusive because it was recognized that a mammalian origin might be too long to be cloned in a plasmid or might function only when activated by distant sites in the chromosomal DNA. The breakthrough eventually came when an 8 kb segment of a human initiation region was transferred to the monkey genome, where it still directed replication despite being removed from any hypothetical protein structure in the human nucleus (Aladjem et al., 1998). Analysis of this transferred initiation region showed that there are primary sites within the region where initiation occurs at high frequency, surrounded by secondary sites, spanning the entire 8 kb region, at which replication initiates with lower frequency. The presence of discrete functional domains within the initiation region could also be demonstrated by examining the effects of deletions of parts of the region on the efficiency of replication initiation.
The demonstration that the human genome contains replication origins equivalent to those in yeast raises the question of whether mammals possess an equivalent of the yeast ORC. The answer appears to be yes, as several genes whose protein products have similar sequences to the yeast ORC proteins have been identified in higher eukaryotes and some of these have been shown to be able to replace the equivalent yeast protein in the yeast ORC (Carpenter et al., 1996). These results indicate that initiation of replication in yeast is a good model for events occurring in mammals, a conclusion that is very relevant to studies of the control of replication initiation, as we will see in Section 13.3.
13.2.2. The elongation phase of replication
Figure 13.10. Template-dependent synthesis of DNA
Compare this reaction with template-dependent synthesis of RNA, shown in Figure 3.5.
Figure 13.11. Complications with DNA replication
Two complications have to be solved when double-stranded DNA is replicated. First, only the leading strand can be continuously replicated by 5′→3′ DNA synthesis; replication of the lagging strand has to be carried out discontinuously. Second, initiation of DNA synthesis requires a primer. This is true both of cellular DNA synthesis, as shown here, and DNA synthesis reactions that are carried out in the test tube (Section 4.1.1).
Once replication has been initiated, the replication forks progress along the DNA and participate in the central activity of genome replication - the synthesis of new strands of DNA that are complementary to the parent polynucleotides. At the chemical level the template-dependent synthesis of DNA (Figure 13.10) is very similar to the template-dependent synthesis of RNA that occurs during transcription (compare with Figure 3.5). However, this similarity should not mislead us into making an extensive analogy between transcription and replication. The mechanics of the two processes are quite different, replication being complicated by two factors that do not apply to transcription:
- During DNA replication both strands of the double helix must be copied. This is an important complication because, as noted in Section 1.1.2, DNA polymerase enzymes are only able to synthesize DNA in the 5′→3′ direction. This means that one strand of the parent double helix, called the leading strand, can be copied in a continuous manner, but replication of the lagging strand has to be carried out in a discontinuous fashion, resulting in a series of short segments that must be ligated together to produce the intact daughter strand (Figure 13.11).
- The second complication arises because template-dependent DNA polymerases cannot initiate DNA synthesis on a molecule that is entirely single-stranded: there must be a short double-stranded region to provide a 3′ end onto which the enzyme can add new nucleotides. This means that primers are needed, one to initiate complementary strand synthesis on the leading polynucleotide, and one for every segment of discontinuous DNA synthesized on the lagging strand (Figure 13.11).
Before dealing with these two complications we will first examine the DNA polymerase enzymes themselves.
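The two constraints just described, synthesis only in the 5′→3′ direction and the need for a base-paired primer, can be captured in a toy model. The sketch below is illustrative only: the function name and sequences are invented, the template is written 3′→5′ so that the new strand reads 5′→3′ from left to right, and the primer is written with DNA letters for simplicity even though natural primers are made of RNA.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def extend_primer(template_3to5: str, primer_5to3: str) -> str:
    """Extend a primer in the 5'->3' direction along a template written 3'->5'.

    Raises ValueError if there is no primer, or if the primer is not fully
    base-paired to the start of the template, mirroring the requirement for
    a base-paired 3' end.
    """
    if not primer_5to3:
        raise ValueError("no primer: synthesis cannot start on a bare template")
    n = len(primer_5to3)
    if n > len(template_3to5):
        raise ValueError("primer is longer than the template")
    for primer_base, template_base in zip(primer_5to3, template_3to5):
        if COMPLEMENT[template_base] != primer_base:
            raise ValueError("primer is not base-paired to the template")
    # Nucleotides are only ever added to the 3' end of the growing strand
    extension = "".join(COMPLEMENT[base] for base in template_3to5[n:])
    return primer_5to3 + extension

print(extend_primer("TACGGATC", "AT"))  # ATGCCTAG
```

Calling `extend_primer("TACGGATC", "")` raises an error, which is the model's way of expressing that an entirely single-stranded template cannot be copied.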
The DNA polymerases of bacteria and eukaryotes
Table 13.2. DNA polymerases involved in replication of bacterial and eukaryotic genomes
Figure 13.17. The series of events involved in joining up adjacent Okazaki fragments during DNA replication in Escherichia coli
DNA polymerase III lacks a 5′→3′ exonuclease activity and so stops making DNA when it reaches the RNA primer of the next Okazaki fragment. At this point DNA synthesis is continued by DNA polymerase I, which does have a 5′→3′ exonuclease activity, and which works in conjunction with RNase H to remove the RNA primer and replace it with DNA. DNA polymerase I usually also replaces some of the DNA from the Okazaki fragment before detaching from the template. This leaves a single missing phosphodiester bond, which is synthesized by DNA ligase, completing this step in the replication process.
The principal chemical reaction catalyzed by a DNA polymerase is the 5′→3′ synthesis of a DNA polynucleotide, as shown in Figure 13.10. We learnt in Section 4.1.1 that some DNA polymerases combine this function with at least one exonuclease activity, which means that these enzymes can degrade polynucleotides as well as synthesize them (see Figure 4.7):
- A 3′→5′ exonuclease is possessed by many bacterial and eukaryotic template-dependent DNA polymerases (Table 13.2). This activity enables the enzyme to remove nucleotides from the 3′ end of the strand that it has just synthesized. It is looked on as a proofreading activity whose function is to correct the occasional base-pairing error that might occur during strand synthesis (see Section 14.1.1).
- A 5′→3′ exonuclease activity is less common but is possessed by some polymerases whose function in replication requires that they must be able to remove at least part of a polynucleotide that is already attached to the template strand that the polymerase is copying. This activity is utilized during the process that joins together the discontinuous DNA fragments synthesized on the lagging strand during bacterial DNA replication (see Figure 13.17).
The search for DNA polymerases began in the mid-1950s, as soon as it was realized that DNA synthesis was the key to replication of genes. It was thought that bacteria would probably have just a single DNA polymerase, and when the enzyme now called DNA polymerase I was isolated by Arthur Kornberg in 1957 there was a widespread assumption that this was the main replicating enzyme. The discovery that inactivation of the E. coli polA gene, which codes for DNA polymerase I, was not lethal (cells were still able to replicate their genomes) therefore came as something of a surprise, especially when a similar result was obtained with inactivation of polB, coding for a second enzyme, DNA polymerase II, which we now know is mainly involved in repair of damaged DNA rather than genome replication (Section 14.2.5). It was not until 1972 that the main replicating polymerase of E. coli, DNA polymerase III, was eventually isolated. Both DNA polymerases I and III are involved in genome replication, as we will see in the next section.
The properties of the two E. coli DNA polymerases involved in genome replication are described in Table 13.2. DNA polymerases I and II are single polypeptides but DNA polymerase III, befitting its role as the main replicating enzyme, is multi-subunit, with a molecular mass of approximately 900 kDa. The three main subunits, which form the core enzyme, are called α, ε and θ, with the polymerase activity specified by the α subunit and the 3′→5′ exonuclease by ε. The function of θ is not clear: it may have a purely structural role in bringing together the other two core subunits and in assembling the various accessory subunits. The latter include τ and γ, both coded by the same gene, with synthesis of γ involving translational frameshifting (Section 11.2.3), β, which acts as a ‘sliding clamp’ and holds the polymerase complex tightly to the template, δ, δ′, χ and ψ.
Figure 13.12. Priming of DNA synthesis in (A) bacteria and (B) eukaryotes
In eukaryotes the primase forms a complex with DNA polymerase α, which is shown synthesizing the RNA primer followed by the first few nucleotides of DNA.
Eukaryotes have at least nine DNA polymerases (Hübscher et al., 2000), which in mammals are distinguished by Greek suffixes (α, β, γ, δ, etc.), an unfortunate choice of nomenclature as it invites confusion with the identically named subunits of E. coli DNA polymerase III. The main replicating enzyme is DNA polymerase δ (Table 13.2), which has two subunits (three according to some researchers) and works in conjunction with an accessory protein called the proliferating cell nuclear antigen (PCNA). PCNA is the functional equivalent of the β subunit of E. coli DNA polymerase III and holds the enzyme tightly to the template. DNA polymerase α also has an important function in DNA synthesis, being the enzyme that primes eukaryotic replication (see Figure 13.12B). DNA polymerase γ, although coded by a nuclear gene, is responsible for replicating the mitochondrial genome.
Discontinuous strand synthesis and the priming problem
The limitation that DNA polymerases can synthesize polynucleotides only in the 5′→3′ direction means that the lagging strand of the parent molecule must be copied in a discontinuous fashion, as shown in Figure 13.11. The implication of this model - that the initial products of lagging-strand replication are short segments of polynucleotide - was confirmed in 1969 when Okazaki fragments, as these segments are now called, were first isolated from E. coli (Okazaki and Okazaki, 1969). In bacteria, Okazaki fragments are 1000–2000 nucleotides in length, but in eukaryotes the equivalent fragments appear to be much shorter, perhaps less than 200 nucleotides. This is an interesting observation that might indicate that each round of discontinuous synthesis replicates the DNA associated with a single nucleosome (140–150 bp wound around the core particle plus 50–70 bp of linker DNA: Section 2.2.1).
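The suggested correspondence between eukaryotic Okazaki fragments and nucleosomes rests on a simple sum. As a quick check, using only the ranges quoted above (the variable names are illustrative):

```python
core_bp = (140, 150)    # DNA wound around the core particle (min, max)
linker_bp = (50, 70)    # linker DNA (min, max)

repeat_min = core_bp[0] + linker_bp[0]
repeat_max = core_bp[1] + linker_bp[1]

print(f"DNA per nucleosome: {repeat_min}-{repeat_max} bp")  # 190-220 bp
```

The total of about 190–220 bp is close to the fragment length of roughly 200 nucleotides, so the comparison is suggestive but only approximate.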
The second difficulty illustrated in Figure 13.11 is the need for a primer to initiate synthesis of each new polynucleotide. It is not known for certain why DNA polymerases cannot begin synthesis on an entirely single-stranded template, but it may relate to the proofreading activity of these enzymes, which is essential for the accuracy of replication. As described in Section 14.1.1, a nucleotide that has been inserted incorrectly at the extreme 3′ end of a growing DNA strand, and hence is not base-paired to the template polynucleotide, can be removed by the 3′→5′ exonuclease activity of a DNA polymerase. This means that the 3′→5′ exonuclease activity must be more effective than the 5′→3′ polymerase activity when the 3′ nucleotide is not base-paired to the template. The implication is that the polymerase can extend a polynucleotide efficiently only if its 3′ nucleotide is base-paired, which in turn could be the reason why an entirely single-stranded template, which by definition lacks a base-paired 3′ nucleotide, cannot be used by a DNA polymerase.
Whatever the reason, priming is a necessity in DNA replication but does not present too much of a problem. Although DNA polymerases cannot deal with an entirely single-stranded template, RNA polymerases have no difficulty in this respect, so the primers for DNA replication are made of RNA. In bacteria, primers are synthesized by primase, a special RNA polymerase unrelated to the transcribing enzyme, with each primer 4–15 nucleotides in length (Frick and Richardson, 2001). Once the primer has been completed, strand synthesis is continued by DNA polymerase III (Figure 13.12A). In eukaryotes the situation is slightly more complex because the primase is tightly bound to DNA polymerase α, and cooperates with this enzyme in synthesis of the first few nucleotides of a new polynucleotide. This primase synthesizes an RNA primer of 8–12 nucleotides, and then hands over to DNA polymerase α, which extends the RNA primer by adding about 20 nucleotides of DNA. This DNA stretch often has a few ribonucleotides mixed in, but it is not clear if these are incorporated by DNA polymerase α or by intermittent activity of the primase. After completion of the RNA-DNA primer, DNA synthesis is continued by the main replicative enzyme, DNA polymerase δ (Figure 13.12B).
Priming needs to occur just once on the leading strand, within the replication origin, because once primed, the leading-strand copy is synthesized continuously until replication is completed. On the lagging strand, priming is a repeated process that must occur every time a new Okazaki fragment is initiated. In E. coli, which makes Okazaki fragments of 1000–2000 nucleotides in length, approximately 4000 priming events are needed every time the genome is replicated. In eukaryotes the Okazaki fragments are much shorter and priming is a highly repetitive event.
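The estimate of about 4000 priming events can be reproduced with a rough calculation. With bidirectional replication each fork copies half the genome, and at each fork one of the two templates is copied discontinuously, so the total length of lagging-strand copy is roughly one genome length. The sketch below assumes a genome of about 4 639 000 bp for E. coli; this value is not given in the text and is used here only as an approximate figure.

```python
# Assumption: approximate size of the E. coli genome in bp (not stated in the text)
ECOLI_GENOME_BP = 4_639_000

def lagging_strand_primers(genome_bp: int, fragment_nt: int) -> int:
    """Rough number of Okazaki fragments, and hence lagging-strand primers.

    Total lagging-strand synthesis is taken to be about one genome length.
    """
    return genome_bp // fragment_nt

for fragment_nt in (1000, 2000):
    n = lagging_strand_primers(ECOLI_GENOME_BP, fragment_nt)
    print(f"{fragment_nt}-nucleotide fragments: about {n} priming events")
```

Under these assumptions the answer lies between about 2300 and 4600, so the figure of approximately 4000 corresponds to fragments towards the shorter end of the 1000–2000 nucleotide range.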
Events at the bacterial replication fork
Now that we have considered the complications introduced by discontinuous strand synthesis and the priming problem, we can move on to study the combination of events occurring at the replication fork during the elongation phase of genome replication.
In Section 13.2.1 we identified attachment of the DnaB helicase, followed by extension of the melted region of the replication origin, as representing the end of the initiation phase of replication in E. coli. To a large extent, the division between initiation and elongation is artificial, the two processes running seamlessly one into the other. After the helicase has bound to the origin to form the prepriming complex, the primase is recruited, resulting in the primosome, which initiates replication of the leading strand. It does this by synthesizing the RNA primer that DNA polymerase III needs in order to begin copying the template.
Figure 13.13. The role of the DnaB helicase during DNA replication in Escherichia coli
DnaB is a 5′→3′ helicase and so migrates along the lagging strand, breaking base pairs as it goes. It works in conjunction with a DNA topoisomerase to unwind the helix. To avoid confusion, the primase enzyme normally associated with the DnaB helicase is not shown in this drawing.
DnaB is the main helicase involved in genome replication in E. coli, but it is by no means the only helicase that this bacterium possesses: in fact there were eleven at the last count (van Brabant et al., 2000). The size of the collection reflects the fact that DNA unwinding is required not only during replication but also during diverse processes such as transcription, recombination and DNA repair. The mode of action of a typical helicase has not been precisely defined, but it is thought that these enzymes bind to single-stranded rather than double-stranded DNA, and migrate along the polynucleotide in either the 5′→3′ or 3′→5′ direction, depending on the specificity of the helicase. Breakage of base pairs in advance of the helicase requires energy, which is generated by hydrolysis of ATP. According to this model, a single DnaB helicase could migrate along the lagging strand (DnaB is a 5′→3′ helicase), unzipping the helix and generating the replication fork, the torsional stress generated by the unwinding activity being relieved by DNA topoisomerase action (Figure 13.13). This model is probably a good approximation of what actually happens, although it does not provide a function for the two other E. coli helicases thought to be involved in genome replication. Both of these, PriA and Rep, are 3′→5′ helicases and so could conceivably complement DnaB activity by migrating along the leading strand, but they may have lesser roles. The involvement of Rep in DNA replication might in fact be limited to participation in the rolling circle process used by λ and a few other E. coli bacteriophages (Section 13.1.3).
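The link between the polarity of a helicase and the strand along which it travels follows from the antiparallel arrangement of the helix. In the direction of fork movement the leading-strand template runs 3′→5′ and the lagging-strand template runs 5′→3′, so a helicase moving towards the fork must track along the strand whose polarity matches its own. A minimal sketch of this reasoning, with illustrative names and the polarities quoted in the text:

```python
# Polarities as given in the text
HELICASE_POLARITY = {"DnaB": "5'->3'", "PriA": "3'->5'", "Rep": "3'->5'"}

def template_strand_for(polarity: str) -> str:
    """Template strand a helicase must follow in order to move towards the fork."""
    if polarity == "5'->3'":
        return "lagging"   # lagging-strand template runs 5'->3' towards the fork
    if polarity == "3'->5'":
        return "leading"   # leading-strand template runs 3'->5' towards the fork
    raise ValueError(f"unknown polarity: {polarity}")

for name, polarity in HELICASE_POLARITY.items():
    print(f"{name} ({polarity}) tracks along the {template_strand_for(polarity)} strand")
```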
Figure 13.14. The role of single-strand binding proteins (SSBs) during DNA replication
(A) SSBs attach to the unpaired polynucleotides produced by helicase action and prevent the strands from base-pairing with one another or being degraded by single-strand-specific nucleases. (B) Structure of the eukaryotic SSB called RPA. The protein contains a β-sheet structure that forms a channel in which the DNA (shown in orange, viewed from the end) is bound. Reproduced with permission from Bochkarev et al., Nature 385, 176–181. Copyright 1997 Macmillan Magazines Limited. Image supplied courtesy of Dr Lori Frappier, Department of Medical Genetics and Microbiology at the University of Toronto, Canada.
Single-stranded DNA is naturally ‘sticky’ and the two separated polynucleotides produced by helicase action would immediately reform base pairs after the enzyme has passed, if allowed to. The single strands are also highly susceptible to nuclease attack and are likely to be degraded if not protected in some way. To avoid these unwanted outcomes, single-strand binding proteins (SSBs) attach to the polynucleotides and prevent them from reassociating or being degraded (Figure 13.14A). The E. coli SSB is made up of four identical subunits and probably works in a similar way to the major eukaryotic SSB, called replication protein A (RPA), by enclosing the polynucleotide in a channel formed by a series of SSBs attached side by side on the strand (Figure 13.14B; Bochkarev et al., 1997). Detachment of the SSBs, which must occur when the replication complex arrives to copy the single strands, is brought about by a second set of proteins called replication mediator proteins (RMPs; Beernink and Morrical, 1999). As with helicases, SSBs have diverse roles in different processes involving DNA unwinding.
Figure 13.15. Priming and synthesis of the lagging-strand copy during DNA replication in Escherichia coli
Figure 13.16. A model for parallel synthesis of the leading- and lagging-strand copies by a dimer of DNA polymerase III enzymes
It is thought that the lagging strand loops through its copy of the DNA polymerase III enzyme, in the manner shown, so that both the leading and lagging strands can be copied as the dimer moves along the molecule being replicated. The two components of the DNA polymerase III dimer are not identical because there is only one copy of the γ complex.
After 1000–2000 nucleotides of the leading strand have been replicated, the first round of discontinuous strand synthesis on the lagging strand can begin. The primase, which is still associated with the DnaB helicase in the primosome, makes an RNA primer which is then extended by DNA polymerase III (Figure 13.15). This is the same DNA polymerase III complex that is synthesizing the leading-strand copy, the complex comprising, in effect, two copies of the polymerase. It is not two complete copies because there is only a single γ complex, containing subunit γ in association with δ, δ′, χ and ψ. The main function of the γ complex is to interact with the β subunit (the ‘sliding clamp’) and hence control the attachment and removal of the enzyme from the template, a function that is required primarily during lagging-strand replication when the enzyme has to attach and detach repeatedly at the start and end of each Okazaki fragment. Some models of the DNA polymerase III complex place the two enzymes in opposite orientations to reflect the different directions in which DNA synthesis occurs, towards the replication fork on the leading strand and away from it on the lagging strand. It is more likely, however, that the pair of enzymes face the same direction and the lagging strand forms a loop, so that DNA synthesis can proceed in parallel as the polymerase complex moves forwards in pace with the progress of the replication fork (Figure 13.16).
The combination of the DNA polymerase III dimer and the primosome, migrating along the parent DNA and carrying out most of the replicative functions, is called the replisome. After its passage, the replication process must be completed by joining up the individual Okazaki fragments. This is not a trivial event because one member of each pair of adjacent Okazaki fragments still has its RNA primer attached at the point where ligation should take place (Figure 13.17). Table 13.2 shows us that this primer cannot be removed by DNA polymerase III, because this enzyme lacks the required 5′→3′ exonuclease activity. At this point, DNA polymerase III releases the lagging strand and its place is taken by DNA polymerase I, which does have a 5′→3′ exonuclease and so removes the primer, and usually the start of the DNA component of the Okazaki fragment as well, extending the 3′ end of the adjacent fragment into the region of the template that is exposed. The two Okazaki fragments now abut, with the terminal regions of both composed entirely of DNA. All that remains is for the missing phosphodiester bond to be put in place by a DNA ligase, linking the two fragments and completing replication of this region of the lagging strand.
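The sequence of events in fragment joining can be summarized in a toy model. In the sketch below, which is a simplification with invented names and sequences, each Okazaki fragment is written 5′→3′ with its RNA primer in lower case and its DNA in upper case. Each internal primer is replaced with DNA (the DNA polymerase I step) and the fragments are then concatenated (the DNA ligase step). The model ignores the additional DNA that DNA polymerase I usually replaces, and it leaves the primer at the extreme 5′ end of the strand untouched.

```python
def join_okazaki_fragments(fragments: list[str]) -> str:
    """Join fragments written 5'->3'; lowercase = RNA primer, uppercase = DNA."""
    if not fragments:
        return ""
    joined = fragments[0]  # the 5'-most primer is not processed in this sketch
    for fragment in fragments[1:]:
        # Length of the leading run of ribonucleotides (the primer)
        primer_len = len(fragment) - len(fragment.lstrip("acgu"))
        # 'DNA polymerase I' step: ribonucleotides replaced by deoxyribonucleotides
        as_dna = fragment[:primer_len].upper().replace("U", "T")
        # 'DNA ligase' step: seal the nick between adjacent fragments
        joined += as_dna + fragment[primer_len:]
    return joined

fragments = ["augcGGTA", "cuaTTGC", "gcaACGT"]
print(join_okazaki_fragments(fragments))  # augcGGTACTATTGCGCAACGT
```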
The eukaryotic replication fork: variations on the bacterial theme
The elongation phase of genome replication is similar in bacteria and eukaryotes, although the details differ. The progress of the replication fork in eukaryotes is maintained by helicase activity, although which of the several eukaryotic helicases that have been identified are primarily responsible for DNA unwinding during replication has not been established. The separated polynucleotides are prevented from reattaching by single-strand binding proteins, the main one of these in eukaryotes being RPA.
We begin to encounter unique features of the eukaryotic replication process when we examine the method used to prime DNA synthesis. As described earlier in this section, the eukaryotic DNA polymerase α cooperates with the primase enzyme to put in place the RNA-DNA primers at the start of the leading-strand copy and at the beginning of each Okazaki fragment. However, DNA polymerase α is not capable of lengthy DNA synthesis, presumably because it lacks the stabilizing effect of a sliding clamp equivalent to the β subunit of E. coli DNA polymerase III or the PCNA accessory protein that aids the eukaryotic DNA polymerase δ. This means that although DNA polymerase α can extend the initial RNA primer with about 20 nucleotides of DNA, it must then be replaced by the main replicative enzyme, DNA polymerase δ (see Figure 13.12B).
The DNA polymerase enzymes that copy the leading and lagging strands in eukaryotes do not associate into a dimeric complex equivalent to the one formed by DNA polymerase III during replication in E. coli. Instead, the two copies of the polymerase remain separate. The function performed by the γ complex of the E. coli polymerase - controlling attachment and detachment of the enzyme from the lagging strand - appears to be carried out by a multi-subunit accessory protein called replication factor C (RFC).
Figure 13.18. The ‘flap endonuclease’ FEN1 cannot initiate primer degradation because its activity is blocked by the triphosphate group present at the 5′ end of the primer
Figure 13.19. Two models for completion of lagging strand replication in eukaryotes
See the text for details. The new DNA (blue strand) is synthesized by DNA polymerase δ but this enzyme is not shown in order to increase the clarity of the diagrams.
As in E. coli, completion of lagging-strand synthesis requires removal of the RNA primer from each Okazaki fragment. There appears to be no eukaryotic DNA polymerase with the 5′→3′ exonuclease needed for this purpose, and the process is therefore very different from that described for bacterial cells. The central player is the ‘flap endonuclease’, FEN1 (previously called MF1), which associates with the DNA polymerase δ complex at the 3′ end of an Okazaki fragment in order to degrade the primer from the 5′ end of the adjacent fragment. Understanding exactly how this occurs is complicated by the inability of FEN1 to initiate primer degradation: the ribonucleotide at the extreme 5′ end of the primer carries a 5′-triphosphate group, which blocks FEN1 activity (Figure 13.18). Two alternative models to circumvent this problem have been proposed (Waga and Stillman, 1998):
- The first possibility is that a helicase breaks the base pairs holding the primer to the template strand, enabling the primer to be pushed aside by DNA polymerase δ as it extends the adjacent Okazaki fragment into the region thus exposed (Figure 13.19). The flap that results can be cut off by FEN1, whose endonuclease activity can cleave the phosphodiester bond at the branch point where the displaced region attaches to the part of the fragment that is still base-paired.
- Alternatively, most of the RNA component of the primer could be removed by RNase H, which can degrade the RNA part of a base-paired RNA-DNA hybrid, but cannot cleave the phosphodiester bond between the last ribonucleotide and the first deoxyribonucleotide. However, this ribonucleotide will carry a 5′-monophosphate rather than triphosphate and so can be removed by FEN1 (Figure 13.19).
Both schemes are made attractive by the possibility that both the RNA primer and all of the DNA originally synthesized by DNA polymerase α are removed. This is because DNA polymerase α has no 3′→5′ proofreading activity (see Table 13.2) and therefore synthesizes DNA in a relatively error-prone manner. To prevent these errors from becoming permanent features of the daughter double helix, this region of DNA might be degraded and resynthesized by DNA polymerase δ, which does have a proofreading activity and so makes a highly accurate copy of the template. At present this possibility remains speculative.
Figure 13.20. Replication factories in a eukaryotic nucleus
Equivalent transcription factories are responsible for RNA synthesis. Reproduced with permission from Nakamura H et al. (1986) Exp. Cell Res., 165, 291–297, Academic Press, Inc., Orlando, FL.
The final difference between replication in bacteria and eukaryotes is that in eukaryotes there is no equivalent of the bacterial replisome. Instead, the enzymes and proteins involved in replication form sizeable structures within the nucleus, each containing hundreds or thousands of individual replication complexes. These structures are immobile because of attachments to the nuclear matrix, so DNA molecules are threaded through the complexes as they are replicated. The structures are referred to as replication factories (Figure 13.20) and may in fact also be features of the replication process in at least some bacteria (Lemon and Grossman, 1998; Cook, 1999).
13.2.3. Termination of replication
Replication forks proceed along linear genomes, or around circular ones, generally unimpeded except when a region that is being transcribed is encountered. DNA synthesis occurs at approximately five times the rate of RNA synthesis, so the replication complex can easily overtake an RNA polymerase, but this probably does not happen: instead it is thought that the replication fork pauses behind the RNA polymerase, proceeding only when the transcript has been completed (Deshpande and Newlon, 1996).
Eventually the replication fork reaches the end of the molecule or meets a second replication fork moving in the opposite direction. What happens next is one of the least understood aspects of genome replication.
Replication of the E. coli genome terminates within a defined region
Figure 13.21. A situation that is not allowed to occur during replication of the circular Escherichia coli genome
One of the replication forks has proceeded some distance past the halfway point. This does not happen during DNA replication in E. coli because of the action of the Tus proteins (see Figure 13.22).
Figure 13.22. The role of terminator sequences during DNA replication in Escherichia coli
(A) The positions of the six terminator sequences on the E. coli genome are shown, with the arrowheads indicating the direction that each terminator sequence can be passed by a replication fork. (B) Bound Tus proteins allow a replication fork to pass when the fork approaches from one direction but not when it approaches from the other direction. The diagram shows a replication fork passing by the left-hand Tus, because the DnaB helicase that is moving the fork forwards can disrupt the Tus when it approaches it from this direction. The fork is then blocked by the second Tus, because this one has its impenetrable wall of β-strands facing towards the fork.
Bacterial genomes are replicated bidirectionally from a single point (see Figure 13.7A), which means that the two replication forks should meet at a position diametrically opposite the origin of replication on the genome map. However, if one fork is delayed, possibly because it has to replicate extensive regions where transcription is occurring, then it might be possible for the other fork to overshoot the halfway point and continue replication on the ‘other side’ of the genome (Figure 13.21). It is not immediately apparent why this should be undesirable, the daughter molecules presumably being unaffected, but it is not allowed to happen because of the presence of terminator sequences. Seven of these have been identified in the E. coli genome (Figure 13.22A), each one acting as the recognition site for a sequence-specific DNA-binding protein called Tus.
The mode of action of Tus is quite unusual. When bound to a terminator sequence, a Tus protein allows a replication fork to pass if the fork is moving in one direction, but blocks progress if the fork is moving in the opposite direction around the genome. The directionality is set by the orientation of the Tus protein on the double helix. When approached from one direction, Tus blocks the passage of the DnaB helicase, which is responsible for progression of the replication fork, because the helicase is faced with a ‘wall’ of β-strands which it is unable to penetrate. But when approaching from the other direction, DnaB is able to disrupt the structure of the Tus protein, probably because of the effect that unwinding of the double helix has on Tus, and so is able to pass by (Figure 13.22B; Kamada et al., 1996).
The orientation of the terminator sequences, and hence of the bound Tus proteins, in the E. coli genome is such that both replication forks become trapped within a relatively short region on the opposite side of the genome to the origin (see Figure 13.22A). This ensures that termination always occurs at or near the same position. Exactly what happens when the two replication forks meet is unknown, but the event is followed by disassembly of the replisomes, either spontaneously or in a controlled fashion. The result is two interlinked daughter molecules, which are separated by topoisomerase IV.
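The way direction-specific terminators confine the meeting point of the two forks can be illustrated with a toy calculation. This is a minimal sketch, not a model of the real E. coli chromosome: the genome length, fork speeds and terminator positions are invented for illustration, Tus is treated as an absolute block, and the names `Terminator` and `meeting_point` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Terminator:
    position: float  # map position in kb, measured clockwise from the origin
    blocks: str      # "cw" or "ccw": the fork direction that cannot pass


def meeting_point(genome_kb, cw_speed, ccw_speed, terminators=()):
    """Return the map position (kb) at which the two forks meet.

    The clockwise fork starts at 0 and moves to higher positions; the
    counter-clockwise fork starts at genome_kb and moves to lower ones.
    A blocked fork simply waits at its terminator for the other fork.
    """
    # Where the forks would meet if nothing stood in their way.
    free_meet = genome_kb * cw_speed / (cw_speed + ccw_speed)
    # First terminator that stops each fork (none: the fork is never stopped).
    cw_stop = min((t.position for t in terminators if t.blocks == "cw"),
                  default=genome_kb)
    ccw_stop = max((t.position for t in terminators if t.blocks == "ccw"),
                   default=0.0)
    if ccw_stop > cw_stop:
        raise ValueError("terminators do not form a trap in this orientation")
    # The meeting point is the free meeting point, confined to the trap.
    return min(max(free_meet, ccw_stop), cw_stop)


# Hypothetical layout: each fork passes two permissive sites, then is blocked.
TRAP = (
    Terminator(2000, "ccw"), Terminator(2200, "ccw"),
    Terminator(2400, "cw"), Terminator(2600, "cw"),
)

print(meeting_point(4600, 1.0, 1.0, TRAP))  # forks of equal speed
print(meeting_point(4600, 2.0, 1.0, TRAP))  # counter-clockwise fork delayed
print(meeting_point(4600, 2.0, 1.0))        # same delay, no terminators
```

In this sketch a delayed fork shifts the meeting point only as far as the first terminator that blocks the faster fork, whereas without terminators the faster fork overshoots the halfway point.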
Little is known about termination of replication in eukaryotes
No sequences equivalent to bacterial terminators are known in eukaryotes, and proteins similar to Tus have not been identified. Quite possibly, replication forks meet at random positions and termination simply involves ligation of the ends of the new polynucleotides. We do know that the replication complexes do not break down, because these factories are permanent features of the nucleus (see Figure 13.20).
Figure 13.23. Cohesins
Cohesin proteins attach immediately after passage of the replication fork and hold the daughter molecules together until anaphase. During anaphase, the cohesins are cleaved, enabling the replicated chromosomes to separate prior to their distribution into daughter nuclei (see Figure 5.14).
Rather than concentrating on the molecular events occurring when replication forks meet, attention has been focused on the difficult question of how the daughter DNA molecules produced in a eukaryotic nucleus avoid becoming impossibly tangled. Although DNA topoisomerases have the ability to untangle DNA molecules, it is generally assumed that tangling is kept to a minimum so that extensive breakage-and-reunion reactions, as catalyzed by topoisomerases, can be avoided. Various models have been proposed to solve this problem (Falaschi, 2000). One of these (Cook, 1998; Wei et al., 1998) suggests that a eukaryotic genome is not randomly packed into the nucleus, but is ordered around the replication factories, which appear to be present in only limited numbers. It is envisaged that each factory replicates a single region of the DNA, maintaining the daughter molecules in a specific arrangement that avoids their entanglement. Initially, the two daughter molecules are held together by cohesin proteins, which are attached immediately after passage of the replication fork by a process that appears to involve DNA polymerase κ (Takahashi and Yanagida, 2000), an enigmatic enzyme that is essential for replication but whose only known role does not obviously require a DNA polymerase activity. The cohesins maintain the alignment of the sister chromatids until the anaphase stage of nuclear division, when they are cleaved by cutting proteins, enabling the daughter chromosomes to separate (Figure 13.23; Murray, 1999).
13.2.4. Maintaining the ends of a linear DNA molecule
There is one final problem that we must consider before leaving the replication process. This concerns the steps that have to be taken to prevent the ends of a linear double-stranded molecule from gradually getting shorter during successive rounds of chromosomal DNA replication. There are two ways in which this shortening might occur:
- The extreme 3′ end of the lagging strand might not be copied because the final Okazaki fragment cannot be primed, the natural position for the priming site being beyond the end of the template. The absence of this Okazaki fragment means that the lagging-strand copy is shorter than it should be. If the copy remains this length then, when it acts as a parental polynucleotide in the next round of replication, the resulting daughter molecule will be shorter than its grandparent.
- If the primer for the last Okazaki fragment is placed at the extreme 3′ end of the lagging strand, then shortening will still occur, although to a lesser extent, because this terminal RNA primer cannot be converted into DNA by the standard processes for primer removal. This is because the methods for primer removal (as described above for bacteria and for eukaryotes) require extension of the 3′ end of an adjacent Okazaki fragment, which cannot occur at the very end of the molecule.
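The cumulative effect of either mechanism can be followed with a small bookkeeping exercise. The sketch below is an illustration under simplifying assumptions: every newly made strand is assumed to fall short by a fixed number of nucleotides (`loss`) at its own 5′ end, the molecule length and the value of `loss` are arbitrary, and the function names are hypothetical. Strands are tracked only as coordinate ranges on the original molecule.

```python
def replicate(duplex, loss):
    """Return the two daughter duplexes produced by one round of replication.

    A duplex is ((top_start, top_end), (bottom_start, bottom_end)) in the
    coordinates of the original molecule. The top strand has its 5' end at
    its start coordinate, the bottom strand has its 5' end at its end
    coordinate. Each new strand fails to copy the last `loss` nucleotides
    at the 3' end of its template, so it is short at its own 5' end.
    """
    (top_start, top_end), (bottom_start, bottom_end) = duplex
    new_bottom = (top_start, top_end - loss)      # copy of the old top strand
    new_top = (bottom_start + loss, bottom_end)   # copy of the old bottom strand
    daughter_1 = ((top_start, top_end), new_bottom)
    daughter_2 = (new_top, (bottom_start, bottom_end))
    return daughter_1, daughter_2


def shortest_strand_after(rounds, length, loss):
    """Length of the shortest strand in the population after `rounds` rounds."""
    population = [((0, length), (0, length))]
    for _ in range(rounds):
        population = [d for duplex in population for d in replicate(duplex, loss)]
    return min(end - start for duplex in population for (start, end) in duplex)


# Illustrative numbers only: a 1000-nucleotide molecule losing 10 per round.
for n in range(4):
    print(n, shortest_strand_after(n, length=1000, loss=10))
```

Under these assumptions the shortest strand in the population loses `loss` nucleotides per round, because a shortened copy becomes the template for an even shorter copy in the next round, which is the grandparent-to-granddaughter shortening described above.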
Once this problem had been recognized, attention was directed at the telomeres, the unusual DNA sequences at the ends of eukaryotic chromosomes. We noted in Section 2.2.1 that telomeric DNA is made up of a type of minisatellite sequence, consisting of multiple copies of a short repeat motif, 5′-TTAGGG-3′ in most higher eukaryotes, with a few hundred copies of this sequence occurring in tandem at each end of every chromosome. The solution to the end-shortening problem lies with the way in which this telomeric DNA is synthesized.
Telomeric DNA is synthesized by the telomerase enzyme
Figure 13.25. Extension of the end of a human chromosome by telomerase
The 3′ end of a human chromosomal DNA molecule is shown. The sequence comprises repeats of the human telomere motif 5′-TTAGGG-3′. The telomerase RNA base-pairs to the end of the DNA molecule which is extended a short distance, the length of this extension possibly determined by the presence of a stem-loop structure in the telomerase RNA (Tzfati et al., 2000). The telomerase RNA then translocates to a new base-pairing position slightly further along the DNA polynucleotide and the molecule is extended by a few more nucleotides. The process can be repeated until the chromosome end has been extended by a sufficient amount.
Table 13.3. Sequences of telomere repeats and telomerase RNAs in various organisms
| Organism | Telomere repeat sequence | Telomerase RNA template sequence |
|---|---|---|
| Human | 5′-TTAGGG-3′ | 5′-CUAACCCUAAC-3′ |
| Oxytricha | 5′-TTTTGGGG-3′ | 5′-CAAAACCCCAAAACC-3′ |
| Tetrahymena | 5′-TTGGGG-3′ | 5′-CAACCCCAA-3′ |
Most of the telomeric DNA is copied in the normal fashion during DNA replication but this is not the only way in which it can be synthesized. To compensate for the limitations of the replication process, telomeres can be extended by an independent mechanism catalyzed by the enzyme telomerase. This is an unusual enzyme in that it consists of both protein and RNA. In the human enzyme the RNA component is 450 nucleotides in length and contains near its 5′ end the sequence 5′-CUAACCCUAAC-3′, whose central region is the reverse complement of the human telomere repeat sequence 5′-TTAGGG-3′ (Feng et al., 1995). This enables telomerase to extend the telomeric DNA at the 3′ end of a polynucleotide by the copying mechanism shown in Figure 13.25, in which the telomerase RNA is used as a template for each extension step, the DNA synthesis being carried out by the protein component of the enzyme, which is a reverse transcriptase (Lingner et al., 1997). The correctness of this model is indicated by comparisons between telomere repeat sequences and the telomerase RNAs of other species (Table 13.3): in all organisms that have been looked at, the telomerase RNA contains a sequence that enables it to make copies of the repeat motif present at the organism's telomeres. An interesting feature is that in all organisms the strand synthesized by telomerase has a preponderance of G nucleotides; it is therefore referred to as the G-rich strand.
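The relationship between the two sequence columns of Table 13.3 can be checked directly. The short sketch below uses only the sequences given in the table; the function names are hypothetical. It tests whether each telomerase RNA template contains the reverse complement of the corresponding telomere repeat.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(dna):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(dna))


def template_encodes_repeat(repeat_dna, template_rna):
    """True if the RNA template contains the reverse complement of the repeat."""
    template_as_dna = template_rna.replace("U", "T")
    return reverse_complement(repeat_dna) in template_as_dna


# Sequences copied from Table 13.3 (telomere repeat, telomerase RNA template).
TABLE_13_3 = {
    "Human": ("TTAGGG", "CUAACCCUAAC"),
    "Oxytricha": ("TTTTGGGG", "CAAAACCCCAAAACC"),
    "Tetrahymena": ("TTGGGG", "CAACCCCAA"),
}

for organism, (repeat, template) in TABLE_13_3.items():
    print(organism, template_encodes_repeat(repeat, template))
```

For all three organisms in the table the check succeeds, which is the pattern expected if the RNA acts as the template for repeat synthesis.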
Figure 13.26. Completion of the extension process at the end of a chromosome
It is believed that after telomerase has extended the 3′ end by a sufficient amount, as shown in Figure 13.25, a new Okazaki fragment is primed and synthesized, converting the 3′ extension into a completely double-stranded end.
Telomerase can only synthesize this G-rich strand. It is not clear how the other polynucleotide - the C-rich strand - is extended, but it is presumed that when the G-rich strand is long enough, the primase-DNA polymerase α complex attaches at its end and initiates synthesis of complementary DNA in the normal way (Figure 13.26). This requires the use of a new RNA primer, so the C-rich strand will still be shorter than the G-rich one, but the important point is that the overall length of the chromosomal DNA has not been reduced.
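The extension and fill-in steps can be mimicked with simple string operations. The following is a toy sketch, not a description of the enzyme's actual kinetics: the starting sequence, the minimum number of base pairs required for alignment (`min_pairing`), the number of cycles and the primer length are all assumed values, and the function names are hypothetical. Only the human template sequence is taken from the text.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(dna):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(dna))


def telomerase_cycle(g_strand, template_rna, min_pairing=3):
    """One align-and-extend cycle on the 3' end of the G-rich strand.

    The DNA that can be copied from the template is the reverse complement
    of the template. The 3' end of the strand must pair with the start of
    that product; the rest of the product is then added.
    """
    product = reverse_complement(template_rna.replace("U", "T"))
    for paired in range(min_pairing, len(product)):
        if g_strand.endswith(product[:paired]):
            return g_strand + product[paired:]
    raise ValueError("the 3' end cannot base-pair with the template")


def fill_in(g_strand, primer_len):
    """C-rich strand (5' to 3') copied from the G-rich strand.

    The RNA primer at the 5' end of the new strand is assumed to be removed
    and not replaced, leaving `primer_len` nucleotides of the G-rich strand
    as a single-stranded 3' overhang.
    """
    return reverse_complement(g_strand[:len(g_strand) - primer_len])


HUMAN_TEMPLATE = "CUAACCCUAAC"            # from the text
start = "ACGT" + "TTAGGG" * 3             # invented chromosome end

g_strand = start
for _ in range(3):                        # three align-and-extend cycles
    g_strand = telomerase_cycle(g_strand, HUMAN_TEMPLATE)

c_strand = fill_in(g_strand, primer_len=10)
print(g_strand)
print(c_strand)
print(len(start), len(g_strand), len(c_strand))
```

With these assumed values the C-rich strand ends up shorter than the extended G-rich strand but longer than the G-rich strand was before extension, which is the sense in which overall length is preserved.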
Telomere length is implicated in senescence and cancer
Perhaps surprisingly, telomerase is not active in all mammalian cells. The enzyme is functional in the early embryo, but after birth is active only in the reproductive and stem cells. The latter are progenitor cells that divide continually throughout the lifetime of an organism, producing new cells to maintain organs and tissues in a functioning state. The best-studied examples are the hemopoietic stem cells of the bone marrow, which generate new blood cells.
Cells that lack telomerase activity undergo chromosome shortening every time they divide. Eventually, after many cell divisions, the chromosome ends could become so truncated that essential genes are lost, but this is unlikely to be a major cause of the defects that can occur in cells lacking telomerase activity. Instead, the critical factor is the need to maintain a protein ‘cap’ on each chromosome end, to protect these ends from the effects of the DNA repair enzymes that join together the uncapped ends that are produced by accidental breakage of a chromosome (Section 2.2.1). The proteins that form this protective cap, such as TRF2 in humans, recognize the telomere repeats as their binding sequences, and so have no attachment points after the telomeres have been deleted. If these proteins are absent then the repair enzymes can make inappropriate linkages between the ends of intact, although shortened, chromosomes; it is this that is probably the underlying cause of the disruption to the cell cycle that results from telomere shortening.
Figure 13.27. Cultured cells become senescent after multiple cell divisions
Telomere shortening will therefore lead to the termination of a cell lineage. For several years biologists have attempted to link this process with cell senescence, a phenomenon originally observed in cell cultures. All normal cell cultures have a limited lifetime: after a certain number of divisions the cells enter a senescent state in which they remain alive but cannot divide (Figure 13.27). With some mammalian cell lines, notably fibroblast cultures (connective tissue cells), senescence can be delayed by engineering the cells so that they synthesize active telomerase (Reddel, 1998). These experiments suggest a clear relationship between telomere shortening and senescence, but the exactness of the link has been questioned (Blackburn, 2000), and any extrapolation from cell senescence to aging of the organism is fraught with difficulties (Kipling and Faragher, 1999).
Not all cell lines display senescence. Cancerous cells are able to divide continuously in culture, their immortality being looked upon as analogous to tumor growth in an intact organism. With several types of cancer, this absence of senescence is associated with activation of telomerase, sometimes to the extent that telomere length is maintained through multiple cell divisions, but often in such a way that the telomeres become longer than normal because the telomerase is overactive. It is not clear if telomerase activation is a cause or an effect of cancer, although the former seems more likely because at least one type of cancer, dyskeratosis congenita, appears to result from a mutation in the gene specifying the RNA component of human telomerase (Marciniak and Guarente, 2001). The question is critical to understanding the etiology of the cancer but is less relevant to the therapeutic issue, which centers on whether telomerase could be a target for drugs designed to combat the cancer. Such a therapy could be successful even if telomerase activation is an effect of the cancer, because inactivation by drugs would induce senescence of the cancer cells and hence prevent their proliferation.