The ordered assembly of deoxyribonucleotides into DNA and of ribonucleotides into RNA
involves somewhat simpler cellular mechanisms than the correct assembly of the amino
acids in a protein chain. Here we consider a few general principles governing the
formation of polynucleotide chains in cells and briefly discuss some properties of
the enzymes that carry out such synthesis. We also describe the steps in the
production of mRNA and examine how and why this process differs in bacteria and
eukaryotes. Later chapters cover the mechanism of DNA replication and its control
during cell growth and division, and the mechanism and the control of the synthesis
of specific mRNAs during differentiation (Chapters 10 and 12).
Both DNA and RNA Chains Are Produced by Copying of Template DNA Strands
The regular pairing of bases in the double-helical DNA structure suggested to
Watson and Crick a mechanism of DNA synthesis. Their proposal that new strands
of DNA are synthesized by copying of parental strands of DNA has proved to be
correct.
The DNA strand that is copied to form a new strand is called a template. The information in the
template is preserved: although the first copy has a complementary sequence, not
an identical one, a copy of the copy produces the original (template) sequence
again. In the replication of a double-stranded, or duplex, DNA
molecule, both original (parental) DNA strands are copied. When copying is
finished, the two new duplexes, each consisting of one of the two original
strands plus its copy, separate from each other. In some viruses,
single-stranded RNA molecules function as templates for synthesis of
complementary RNA or DNA chains (Chapter 7). However, the vast majority of RNA
and DNA in cells is synthesized from preexisting duplex DNA.
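The preservation of template information can be checked with a minimal Python sketch. The function name `copy_strand` and the example sequence are hypothetical, chosen only for illustration; the sketch assumes both strands are written 5′ → 3′, so that each copy is the reverse complement of its template.

```python
# Watson-Crick pairing rules for DNA.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def copy_strand(template: str) -> str:
    """Return the strand produced by copying `template`.

    Input and output are both written 5'->3'. Because the two strands of a
    duplex are antiparallel, the copy is the reverse complement.
    """
    return "".join(PAIR[base] for base in reversed(template))


parental = "ATGGCTTAC"                   # hypothetical sequence, 5'->3'
first_copy = copy_strand(parental)       # complementary, not identical
second_copy = copy_strand(first_copy)    # a copy of the copy

print(first_copy)    # GTAAGCCAT
print(second_copy)   # ATGGCTTAC, the original template sequence again
```

The first copy differs from the template, but copying it a second time regenerates the original sequence, which is the sense in which the information is preserved.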
Nucleic Acid Strands Grow in the 5′ → 3′ Direction
All RNA and DNA synthesis, both cellular and viral, proceeds in the same chemical
direction: from the 5′ (phosphate) end to the 3′ (hydroxyl)
end (see Figure 4-13). Nucleic acid
chains are assembled from 5′ triphosphates of ribonucleosides or
deoxyribonucleosides. Strand growth is energetically unfavorable but is driven
by the energy available in the triphosphates. The α phosphate of the
incoming nucleotide attaches to the 3′ hydroxyl of the ribose (or
deoxyribose) of the preceding residue to form a phosphodiester bond, releasing a
pyrophosphate (PPi). The equilibrium of the reaction is driven
further toward chain elongation by pyrophosphatase, which catalyzes the cleavage
of PPi into two molecules of inorganic phosphate (see Table 2-7).
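The bookkeeping of chain growth can be sketched in a few lines of Python. This is a toy model, not a description of the enzyme mechanism: the function name `elongate` is hypothetical, and the sketch assumes only what is stated above, namely that each added nucleotide joins the 3′ end, releases one PPi, and that each PPi is later cleaved into two inorganic phosphates.

```python
def elongate(chain: str, incoming: str) -> tuple:
    """Add nucleotides one at a time to the 3' end of an existing chain.

    Returns (new_chain, ppi_released, pi_after_hydrolysis).
    The chain is written 5'->3', so new residues are appended on the right.
    """
    ppi_released = 0
    for base in incoming:
        # The alpha phosphate of the incoming nucleotide attaches to the
        # 3' hydroxyl of the preceding residue; one PPi is released.
        chain = chain + base
        ppi_released += 1
    # Pyrophosphatase cleaves each PPi into two inorganic phosphates.
    return chain, ppi_released, 2 * ppi_released


new_chain, ppi, pi = elongate("AUG", "GCU")
print(new_chain, ppi, pi)   # AUGGCU 3 6
```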
RNA Polymerases Can Initiate Strand Growth but DNA Polymerases Cannot
The enzymes that copy (replicate) DNA to make more DNA are DNA polymerases; those that copy
(transcribe) DNA to form RNA are RNA polymerases. Because the two DNA strands are complementary, rather
than identical, transcription of a particular DNA segment theoretically could
yield two mRNAs with different sequences and hence different protein-coding
potentials. Generally, only one strand of the duplex in a particular DNA segment
gives rise to usable information when transcribed into mRNA. In unusual cases,
though, limited sections of DNA encode proteins on both strands.
Figure 4-15. Transcription of DNA into RNA is catalyzed by RNA polymerase,
which can initiate the synthesis of strands de novo on DNA templates
The nucleotide at the 5′ end of an RNA strand retains
all three of its phosphate groups; all subsequent nucleotides
release pyrophosphate (PPi) when added to the chain
and retain only their α phosphate (red). The released
PPi is subsequently hydrolyzed by pyrophosphatase
to Pi, driving the equilibrium of the overall
reaction toward chain elongation. In most cases, only one DNA
strand is transcribed into RNA.
An RNA polymerase can find an appropriate initiation site on duplex DNA; bind the
DNA; temporarily “melt,” or separate, the two strands in that region; and begin
generating a new RNA strand (Figure 4-15). As discussed in Chapter 10, the location
and regulated use of transcription start sites to produce mRNA require many dozens
of proteins in eukaryotes and several proteins even in bacteria. The nucleotide at
the 5′ terminus of a growing RNA strand is chemically distinct from the
nucleotides within the strand in that it retains all three phosphate groups.
When an additional nucleotide is added to the 3′ end of the growing
strand, only the α phosphate is retained; the β and
γ phosphates are lost as pyrophosphate, which is subsequently
hydrolyzed to yield two molecules of inorganic phosphate.
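A short Python sketch illustrates two points made above: transcribing the two complementary strands of one DNA segment gives different RNAs, and the 5′ nucleotide of an RNA chain carries three phosphates while every other residue carries one. The function names and the example sequence are hypothetical; the template is assumed to be written 3′ → 5′ so that the RNA reads 5′ → 3′.

```python
# Pairing of each DNA template base with the RNA base added opposite it.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5: str) -> str:
    """Return the RNA (written 5'->3') copied from a template written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5)


def phosphate_count(rna: str) -> int:
    """Phosphates in a primary RNA chain made de novo.

    The 5' nucleotide retains all three phosphates; each later nucleotide
    retains only its alpha phosphate.
    """
    if not rna:
        return 0
    return 3 + (len(rna) - 1)


strand_1 = "TACCGAATG"   # hypothetical template, written 3'->5'
strand_2 = "CATTCGGTA"   # its partner strand, also written 3'->5'

print(transcribe(strand_1))                    # AUGGCUUAC
print(transcribe(strand_2))                    # GUAAGCCAU
print(phosphate_count(transcribe(strand_1)))   # 11
```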
Unlike RNA polymerases, DNA polymerases cannot initiate chain synthesis de novo;
instead, they require a short, preexisting RNA or DNA strand, called a primer,
to begin chain growth. With a primer base-paired to the template strand, a DNA
polymerase adds nucleotides to the free hydroxyl group at the 3′ end
of the primer.
If RNA is the primer, the polynucleotide copied from the template is RNA at the
5′ end and DNA at the 3′ end.
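The primer requirement can be made concrete with a minimal Python sketch. The function `extend_primer` and the sequences are hypothetical, and the model is deliberately simple: it refuses to start without a primer, checks that the primer is base-paired to the template, and then adds deoxyribonucleotides to the primer's 3′ end.

```python
DNA_PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}   # template base -> DNA base
RNA_PAIR = {"A": "U", "T": "A", "G": "C", "C": "G"}   # template base -> RNA base


def extend_primer(template_3_to_5: str, primer: str) -> str:
    """Extend a base-paired primer (5'->3') along a template written 3'->5'."""
    if not primer:
        raise ValueError("DNA polymerase cannot start a chain without a primer")
    # The primer may be RNA or DNA, but it must pair with the template.
    for t_base, p_base in zip(template_3_to_5, primer):
        if p_base not in (DNA_PAIR[t_base], RNA_PAIR[t_base]):
            raise ValueError("primer is not base-paired to the template")
    # DNA is added to the free 3' end of the primer.
    extension = "".join(DNA_PAIR[b] for b in template_3_to_5[len(primer):])
    return primer + extension


product = extend_primer("TACCGAATG", "AUGG")   # RNA primer (contains U)
print(product)   # AUGGCTTAC: RNA at the 5' end, DNA at the 3' end
```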
Both prokaryotic and eukaryotic cells have several different types of DNA
polymerases. Some polymerases participate in making new DNA to prepare for cell
division; other polymerases serve in the repair and recombination of DNA
molecules. The structure, mechanism, and physiological role of these enzymes are
described in Chapter 12.
Replication of Duplex DNA Requires Assembly of Many Proteins at a Growing Fork
Because duplex DNA consists of two intertwined strands, the base-pair copying of
each strand requires unwinding of the original duplex, which is accomplished by
specific “unwinding proteins” called helicases. As noted earlier, local
unwinding of duplex DNA produces torsional stress, leading to formation of
supercoils, which are removed by topoisomerases. The action of all these
proteins produces a moving, highly specialized region of the DNA called the
growing fork, at which DNA polymerase carries out nucleotide addition. In order for DNA polymerase to move
along and copy a duplex DNA, helicase must sequentially unwind the duplex and
topoisomerase must remove the supercoils that form.
Figure 4-16. Schematic diagram of DNA replication at a growing fork
Nucleotides are added by DNA polymerase to each daughter strand in
the 5′ → 3′ direction (indicated by
arrowheads). Synthesis of the leading strand occurs continuously
from a single RNA primer at its 5′ end (not shown).
Synthesis of the other new strand, the lagging strand, proceeds discontinuously,
initially forming Okazaki fragments, from multiple RNA primers that
are formed on the parental strand as each new region of DNA is
exposed at the growing fork. The RNA primers are elongated by DNA
polymerase. As each growing fragment approaches the previous primer,
that primer is removed by another enzyme and the fragments are
joined by DNA ligase to form a continuous DNA strand. By repetition
of this process, the entire lagging strand eventually is
completed.
DNA replication begins with creation of a growing fork by a protein or proteins
that have helicase activity and unwind a short section of parental DNA. A
specialized RNA polymerase then forms short RNA primers complementary to the
unwound template strands. Each such primer, still bound to its complementary DNA
strand, is then elongated by DNA polymerase, thereby forming a new daughter
strand. One final major complication in the operation of a DNA growing fork is
that although the two strands of the parental duplex are antiparallel,
nucleotides can be added to the growing new strands only in the 5′
→ 3′ direction. As diagrammed in Figure 4-16, synthesis of one daughter
strand, called the leading strand, proceeds continuously from a single RNA
primer in the 5′ → 3′ direction, the same direction as movement of the
growing fork. Because growth of the other daughter strand, called the
lagging strand, also must occur in the 5′ → 3′ direction, copying of its
template strand must somehow occur in the opposite direction from the
movement of the growing fork. A cell accomplishes this feat by producing
additional short RNA primers every 1000 bases or so on the second parental
strand, as more of the strand is exposed by unwinding. Each of these
primers, base-paired to its template strand, is elongated in the 5′
→ 3′ direction, forming discontinuous segments called
Okazaki fragments after their discoverer Reiji Okazaki. The RNA
primer of each Okazaki fragment is removed and
replaced by DNA chain growth from the neighboring Okazaki fragment; finally an
enzyme called DNA ligase joins the adjacent fragments. At least 30 proteins
participate in the formation and operation of a DNA growing fork;
this DNA-replication machine is discussed in detail in Chapter 12.
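Discontinuous synthesis of the lagging strand can be sketched as simple bookkeeping in Python. The function names are hypothetical, and the primer length of 10 nucleotides is an assumption made only for this illustration (the text specifies the spacing of roughly 1000 bases, not the primer length). The sketch also ignores the special handling of the primer at the far end of the strand.

```python
def make_fragments(template_length: int, spacing: int = 1000,
                   primer_len: int = 10) -> list:
    """List the Okazaki fragments formed as the template is exposed.

    Each fragment starts with an RNA primer (assumed length `primer_len`)
    that DNA polymerase elongates with DNA.
    """
    fragments = []
    exposed = 0
    while exposed < template_length:
        size = min(spacing, template_length - exposed)
        rna = min(primer_len, size)
        fragments.append({"rna": rna, "dna": size - rna})
        exposed += size
    return fragments


def mature_strand(fragments: list) -> tuple:
    """Return (total DNA length, number of ligations) for the finished strand.

    Each RNA primer is removed and replaced by DNA, then DNA ligase joins
    each pair of adjacent fragments.
    """
    dna_total = sum(f["rna"] + f["dna"] for f in fragments)
    ligations = max(len(fragments) - 1, 0)
    return dna_total, ligations


frags = make_fragments(3500)
print(len(frags))             # 4
print(mature_strand(frags))   # (3500, 3)
```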
Organization of Genes in DNA Differs in Prokaryotes and Eukaryotes
Having outlined the principles governing the stepwise assembly of
polynucleotides, we now focus briefly on the large-scale arrangement of
information in DNA and how this arrangement dictates the requirements for RNA
manufacture so that information transfer goes smoothly. The simplest definition
of a gene is a “unit of DNA that contains the information to specify
synthesis of a single polypeptide chain.” The number of genes in cells
varies widely, with the simpler non-nucleated prokaryotic cells having far fewer
genes than eukaryotic cells. The vast majority of genes carry information to
build protein molecules, and it is the RNA copies of such protein-coding
genes that are the mRNA molecules of cells. In recent years, the
entire sequence of the DNA genome of several organisms has been determined,
providing direct evidence for large differences in their protein-coding capacity
(Chapter 7).
Figure 4-17. Comparison of gene organization, transcription, and translation
in prokaryotes and eukaryotes
(a) The tryptophan (trp) operon is a continuous segment of the
E. coli chromosome, containing five genes (blue) that encode the
enzymes necessary for the stepwise synthesis of tryptophan. The entire
operon is transcribed from one start site (blue arrow) into one long
continuous trp mRNA (red). Translation of this mRNA begins
at five different start sites, yielding five proteins (green).
Proteins E and D associate to form the first enzyme in the
tryptophan biosynthetic pathway; protein C catalyzes the
intermediate step; and proteins A and B form tryptophan synthetase,
the final enzyme. Thus the order of the genes in the bacterial
genome parallels the sequential function of the encoded proteins in
the tryptophan pathway. (b) The five genes encoding the enzymes
required for tryptophan synthesis in yeast (Saccharomyces
cerevisiae) are carried on four different chromosomes.
Each gene is transcribed from its own start site to yield a primary
transcript that is processed into a functional mRNA encoding a
single protein. The length of the yeast chromosomes is given in
kilobases (10³ bases), with all drawn to the same length.
The most common arrangement of protein-coding genes in all prokaryotes has a
powerful and appealing logic: genes devoted to a single metabolic goal, say, the
synthesis of the amino acid tryptophan, are most often found in a contiguous
array in the DNA. This gene order makes it possible to produce a continuous
strand of mRNA that carries the message for a related series of enzymes devoted
to making tryptophan (Figure 4-17a). Each section of the mRNA represents the unit
(or gene) that instructs the protein-synthesizing apparatus to make a particular
protein. Such an arrangement of genes in a functional group is called an
operon, because it operates as a unit from a single
transcription start site. In prokaryotic DNA the genes are closely packed with
very few noncoding gaps, and the DNA is transcribed directly into colinear mRNA,
which then is translated into protein, even while stretches of the mRNA closer
to the 3′ end are still being produced.
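The idea that one operon mRNA carries several protein-coding units can be sketched in Python. This is a deliberately crude model under stated assumptions: the function `coding_regions` and the toy mRNA are hypothetical, and a translation start is taken to be simply the next AUG after the previous stop codon, which ignores the sequence signals that position ribosomes on a real mRNA.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def coding_regions(mrna: str) -> list:
    """Return successive AUG...stop segments found along one mRNA (5'->3')."""
    regions = []
    position = 0
    while True:
        start = mrna.find("AUG", position)
        if start == -1:
            break
        # Read codon by codon from the start site until a stop codon.
        end = start
        while end + 3 <= len(mrna) and mrna[end:end + 3] not in STOP_CODONS:
            end += 3
        if end + 3 > len(mrna):
            break   # ran off the 3' end without meeting a stop codon
        regions.append(mrna[start:end + 3])
        position = end + 3
    return regions


# Toy polycistronic mRNA with three coding units and short spacers.
operon_mrna = "GG" + "AUGAAAUAA" + "CC" + "AUGCCCGGGUAG" + "A" + "AUGUUUUGA"
print(coding_regions(operon_mrna))
# ['AUGAAAUAA', 'AUGCCCGGGUAG', 'AUGUUUUGA']
```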
This economic clustering of genes devoted to a single metabolic function does not
occur in eukaryotes, even simple ones like yeasts that can be metabolically
similar to bacteria. Rather, eukaryotic genes, even those devoted to a single
pathway, are most often physically separated in the DNA, sometimes even being
located on different chromosomes. Each gene is transcribed from its own start
site, producing one mRNA, which generally is translated to yield a single
protein (Figure 4-17b). Moreover, when researchers first compared the
nucleotide sequences of eukaryotic mRNAs with the
DNAs encoding them, they were astounded to find that the uninterrupted
protein-coding sequence of a given mRNA was broken up (discontinuous) in its
corresponding section of DNA. They concluded that the eukaryotic gene existed in
pieces of coding sequence, the exons, separated by non-protein-coding segments,
the introns. This astonishing finding, first made in viruses that infect
eukaryotic cells, implied that the long initial RNA copy of the entire
transcribed DNA sequence, called the primary transcript, had to be
clipped apart to remove the introns and then carefully stitched back together to
produce many of the mRNAs of eukaryotic cells.
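Removal of introns can be illustrated with a minimal Python sketch. The function `splice`, the exon coordinates, and the sequences are all hypothetical; the sketch assumes the exon boundaries are already known and says nothing about how the cell recognizes them.

```python
def splice(primary_transcript: str, exons: list) -> str:
    """Join the exons of a primary transcript, discarding the introns.

    `exons` is a list of (start, end) positions, end-exclusive, given in
    5'->3' order along the primary transcript.
    """
    return "".join(primary_transcript[start:end] for start, end in exons)


exon_1 = "AUGGCU"           # hypothetical coding segment
intron = "GUAAGUUUUCAG"     # hypothetical noncoding segment
exon_2 = "GAAUAA"           # hypothetical coding segment

primary = exon_1 + intron + exon_2
mrna = splice(primary, [(0, 6), (18, 24)])

print(primary)   # AUGGCUGUAAGUUUUCAGGAAUAA
print(mrna)      # AUGGCUGAAUAA
```

The coding sequence is discontinuous in the primary transcript, as in the DNA, and becomes uninterrupted only after the exons are joined.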
Eukaryotic Primary RNA Transcripts Are Processed to Form Functional mRNAs
In prokaryotic cells, which have no nuclei, translation of an mRNA into protein
can begin from the 5′ end of the mRNA even while the 3′ end
is still being copied from DNA. Thus, transcription and translation can occur
concurrently. In eukaryotic cells, however, not only is the nucleus separated
from the cytoplasm where protein synthesis occurs, but the primary RNA
transcript of a protein-coding gene must undergo several modifications,
collectively termed RNA processing,
that yield a functional mRNA. This mRNA then must be transported to the
cytoplasm before it can be translated into protein. Thus, transcription and
translation cannot occur concurrently in eukaryotic cells.
Figure 4-18. Structure of the 5′ methylated cap of eukaryotic mRNA
The distinguishing chemical features are the 5′
→ 5′ linkage of 7-methylguanylate to the
initial nucleotide of the mRNA molecule and the methyl group on the
2′ hydroxyl of the ribose of the first nucleotide (base
1). Both these features occur in all animal cells and in cells of
higher plants; yeasts lack the methyl group on base 1. The ribose of
the second nucleotide (base 2) also is methylated in vertebrates.
[See A. J. Shatkin, 1976, Cell
9:645.]
The initial steps in processing of all eukaryotic primary RNA transcripts occur
at the two ends, and these modifications are retained in mRNAs. To the
initiating (5′) nucleotide of the primary transcript is added the
5′ cap, which may serve to protect mRNA from enzymatic degradation
(Figure 4-18). This modification occurs before transcription is complete, so
the 5′ cap is present in the primary transcript. Processing at the
3′ end of the primary transcript involves cleavage by an endonuclease
to yield a free 3′-hydroxyl group to which a string of adenylic
acid residues is added by an enzyme called poly(A) polymerase. The
resulting poly(A) tail contains 100–250 bases,
being shorter in yeasts and invertebrates than in vertebrates. Poly(A)
polymerase is part of a complex of proteins that adds the poly(A) tail. This
complex does not require a template and can determine the correct number of A
residues to add in each species.
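The two end modifications can be sketched in Python as simple string operations. The function `process_ends`, the example transcript, and the chosen poly(A) site are hypothetical. The cap is represented by the text label m7Gppp rather than by a sequence, and the tail length is restricted to the 100–250 range stated above.

```python
CAP = "m7Gppp"   # text label for the 5' 7-methylguanylate cap, not a sequence


def process_ends(primary_transcript: str, poly_a_site: int,
                 tail_length: int = 200) -> str:
    """Add a 5' cap, cleave at the poly(A) site, and add a poly(A) tail."""
    if not 0 < poly_a_site <= len(primary_transcript):
        raise ValueError("poly(A) site must lie within the primary transcript")
    if not 100 <= tail_length <= 250:
        raise ValueError("tail length outside the 100-250 range given in the text")
    # Endonuclease cleavage discards the RNA beyond the poly(A) site and
    # leaves a free 3' hydroxyl.
    cleaved = primary_transcript[:poly_a_site]
    # A residues are added without any template.
    return CAP + cleaved + "A" * tail_length


primary = "AUGGCUGAAUAAGGCC"   # hypothetical; extends beyond the poly(A) site
processed = process_ends(primary, poly_a_site=12, tail_length=100)

print(processed[:18])   # m7GpppAUGGCUGAAUAA
print(len(processed))   # 118
```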
Figure 4-19. Overview of RNA processing in eukaryotes using β-globin
gene as an example
The β-globin gene contains three protein-coding exons (red)
and two intervening noncoding introns (blue). The introns interrupt
the protein-coding sequence between the codons for amino acids 31
and 32 and 105 and 106. Transcription of this and many other genes
starts slightly upstream of the 5′ exon and extends
downstream of the 3′ exon, resulting in noncoding regions
(gray) at the ends of the primary transcript. These regions,
referred to as untranslated regions (UTRs), are
retained during processing. The 5′ 7-methylguanylate cap
(m7Gppp; green dot) is added during formation of the
primary RNA transcript, which extends beyond the poly(A) site. After
cleavage at the poly(A) site and addition of multiple A residues to
the 3′ end, splicing removes the introns and joins the
exons. The small numbers refer to positions in the 147-aa sequence
of β-globin.
The final step in the processing of many different eukaryotic mRNA molecules is
splicing: the internal cleavage of the RNA transcript to excise
the introns, followed by ligation of the coding exons. Many eukaryotic mRNAs
also contain noncoding regions at each end; these are referred to as the
5′ and 3′ untranslated regions (UTRs). Figure 4-19 summarizes the basic
steps in RNA processing. We examine the cellular machinery for carrying out
processing of mRNA, as well as tRNA and rRNA, in Chapter 11.
SUMMARY
- The transfer of information from genes to proteins is assisted by proteins
  that participate in the synthesis of DNA and RNA.
- Polynucleotide and polypeptide chains are assembled from a limited number of
  monomeric units that are added one at a time, beginning at the 5′ end in
  nucleic acids and the amino-terminal end in proteins (see Figure 4-13). In
  both cases, the initial polymeric product generally is modified in some
  fashion to produce a functional molecule.
- A polynucleotide chain is synthesized by copying of a complementary template
  strand (usually DNA). In this process, the duplex DNA is locally unwound,
  revealing the unpaired template strand, and nucleotides are added to the
  3′-hydroxyl end of the growing strand by RNA or DNA polymerase.
- RNA polymerase can initiate transcription of DNA into RNA by binding to a
  specific start site and unwinding the duplex. As the enzyme moves along the
  DNA, it unwinds sequential segments of the DNA and adds nucleotides to the
  growing RNA strand (see Figure 4-15). Most commonly, only one DNA strand in
  any one locus is transcribed into RNA.
- Replication of DNA requires the assistance of helicase to unwind the duplex,
  topoisomerase to remove supercoils, and a specialized RNA polymerase to form
  RNA primers because DNA polymerase cannot start chains. Nucleotide addition
  at the growing fork, a moving region of strand separation produced by
  sequential unwinding of the duplex, is catalyzed by one type of DNA
  polymerase.
- During DNA replication, the two new daughter strands are assembled somewhat
  differently because DNA polymerase can add nucleotides only in the 5′ →
  3′ direction (see Figure 4-16). One new strand, the leading strand, is
  elongated continuously from a single primer. The other new strand, the
  lagging strand, is synthesized discontinuously as a series of short segments,
  called Okazaki fragments, initiated from multiple RNA primers. After removal
  of the intervening primer, adjacent Okazaki fragments are joined by DNA
  ligase.
- In prokaryotic DNA, related protein-coding genes are clustered into a
  functional region, an operon, which is transcribed from a single start site
  into one mRNA encoding multiple proteins (see Figure 4-17). Translation of an
  mRNA can begin before synthesis of the mRNA is complete.
- In eukaryotic DNA, each protein-coding gene is transcribed from its own start
  site, and very often the coding regions (exons) are separated by noncoding
  regions (introns). The primary RNA transcript produced from such a gene must
  undergo processing to yield a functional mRNA. During processing, the ends of
  all primary transcripts are modified by addition of a 5′ cap and
  3′ poly(A) tail; many transcripts also undergo splicing, the removal of
  the introns and joining of the exons (see Figure 4-19).