NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.

Berg JM, Tymoczko JL, Stryer L. Biochemistry. 5th edition. New York: W H Freeman; 2002.

  • By agreement with the publisher, this book is accessible by the search feature, but cannot be browsed.
Biochemistry. 5th edition.

A Nucleic Acid Consists of Four Kinds of Bases Linked to a Sugar-Phosphate Backbone

DNA and RNA are linear polymers of a limited number of monomers. In DNA, the repeating units are nucleotides, with the sugar being a deoxyribose and the bases being adenine (A), thymine (T), guanine (G), and cytosine (C). In RNA, the sugar is a ribose and the base uracil (U) is used in place of thymine. DNA is the molecule of heredity in all prokaryotic and eukaryotic organisms. In viruses, the genetic material is either DNA or RNA.

A Pair of Nucleic Acid Chains with Complementary Sequences Can Form a Double-Helical Structure

All cellular DNA consists of two very long, helical polynucleotide chains coiled around a common axis. The sugar-phosphate backbone of each strand is on the outside of the double helix, whereas the purine and pyrimidine bases are on the inside. The two chains are held together by hydrogen bonds between pairs of bases: adenine is always paired with thymine, and guanine is always paired with cytosine. Hence, one strand of a double helix is the complement of the other. The two strands of the double helix run in opposite directions. Genetic information is encoded in the precise sequence of bases along a strand. Most RNA molecules are single stranded, but many contain extensive double-helical regions that arise from the folding of the chain into hairpins.
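The pairing rules and the antiparallel arrangement described above can be illustrated with a short sketch. The function name `reverse_complement` and the example sequence are illustrative assumptions, not part of the source text.

```python
# Watson-Crick pairing as stated in the text: A pairs with T, G pairs with C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(strand: str) -> str:
    """Return the partner strand, written 5'->3'.

    The input is read 5'->3'. Because the two strands run in opposite
    directions, the partner is built from the input read backwards.
    """
    return "".join(PAIRS[base] for base in reversed(strand))


top = "ATGGCA"                    # hypothetical sequence, 5'->3'
bottom = reverse_complement(top)  # its complement, also written 5'->3'
print(top, bottom)
```

Applying the function twice returns the original sequence, which reflects the statement that each strand of the helix is the complement of the other.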

DNA Is Replicated by Polymerases That Take Instructions from Templates

In the replication of DNA, the two strands of a double helix unwind and separate as new chains are synthesized. Each parent strand acts as a template for the formation of a new complementary strand. Thus, the replication of DNA is semiconservative: each daughter molecule receives one strand from the parent DNA molecule. The replication of DNA is a complex process carried out by many proteins, including several DNA polymerases. The activated precursors in the synthesis of DNA are the four deoxyribonucleoside 5′-triphosphates. The new strand is synthesized in the 5′ → 3′ direction by a nucleophilic attack by the 3′-hydroxyl terminus of the primer strand on the innermost phosphorus atom of the incoming deoxyribonucleoside triphosphate. Most important, DNA polymerases catalyze the formation of a phosphodiester bond only if the base on the incoming nucleotide is complementary to the base on the template strand. In other words, DNA polymerases are template-directed enzymes.

The genes of some viruses, such as tobacco mosaic virus, are made of single-stranded RNA. An RNA-directed RNA polymerase mediates the replication of this viral RNA. Retroviruses, exemplified by HIV-1, have a single-stranded RNA genome that is transcribed into double-stranded DNA by reverse transcriptase, an RNA-directed DNA polymerase.
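Template-directed synthesis by a DNA polymerase can be sketched as a toy model. The function `extend_primer` and the sequences are hypothetical. The model ignores the chemistry, proofreading, and the many accessory proteins mentioned above. It captures only two rules from the text: the new strand grows 5′ → 3′ from a primer, and each added base must be complementary to the template.

```python
# Watson-Crick pairing: A with T, G with C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def extend_primer(template_3to5: str, primer_5to3: str) -> str:
    """Extend a primer along a template, one complementary base at a time.

    template_3to5 is the template strand read 3'->5'.
    primer_5to3 is the primer written 5'->3', paired to the template start.
    The returned strand is written 5'->3'.
    """
    # A polymerase needs a primer that is base-paired with the template.
    for t, p in zip(template_3to5, primer_5to3):
        if PAIRS[t] != p:
            raise ValueError("primer is not complementary to the template")

    new_strand = list(primer_5to3)
    # Each step adds to the 3' end the base that pairs with the template.
    for t in template_3to5[len(primer_5to3):]:
        new_strand.append(PAIRS[t])
    return "".join(new_strand)


print(extend_primer("TACGGT", "AT"))
```

A mismatched primer raises an error in this sketch, a rough stand-in for the requirement that the polymerase forms a phosphodiester bond only when the bases are complementary.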

Gene Expression Is the Transformation of DNA Information into Functional Molecules

The flow of genetic information in normal cells is from DNA to RNA to protein. The synthesis of RNA from a DNA template is called transcription, whereas the synthesis of a protein from an RNA template is termed translation. Cells contain several kinds of RNA: messenger RNA (mRNA), transfer RNA (tRNA), and ribosomal RNA (rRNA), which vary in size from 75 to more than 5000 nucleotides. All cellular RNA is synthesized by RNA polymerase according to instructions given by DNA templates. The activated intermediates are ribonucleoside triphosphates, and the direction of synthesis, like that of DNA, is 5′ → 3′. Unlike DNA polymerase, RNA polymerase does not require a primer.
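Transcription can be sketched in the same simplified way. The function `transcribe` and the template sequence are illustrative assumptions. The mapping follows the text: RNA uses uracil in place of thymine, synthesis runs 5′ → 3′, and no primer is needed.

```python
# Base added to the RNA chain for each base on the DNA template.
# Uracil (U) pairs with adenine, taking the place of thymine.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3to5: str) -> str:
    """Return the RNA transcript (5'->3') of a DNA template read 3'->5'.

    Unlike the DNA polymerase case, synthesis starts directly at the
    first template base because RNA polymerase does not need a primer.
    """
    return "".join(DNA_TO_RNA[base] for base in template_3to5)


print(transcribe("TACGGT"))
```

The transcript has the same sequence as the non-template DNA strand, except that U appears wherever that strand has T.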

Amino Acids Are Encoded by Groups of Three Bases Starting from a Fixed Point

The genetic code is the relation between the sequence of bases in DNA (or its RNA transcript) and the sequence of amino acids in proteins. Amino acids are encoded by groups of three bases (called codons) starting from a fixed point. Sixty-one of the 64 codons specify particular amino acids, whereas the other 3 codons (UAA, UAG, and UGA) are signals for chain termination. Because 61 codons specify only 20 amino acids, most amino acids have more than one code word. In other words, the code is degenerate. The genetic code is nearly the same in all organisms. Natural mRNAs contain start and stop signals for translation, just as genes contain signals that direct where transcription begins and ends.
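The codon counts above can be checked with a small sketch that builds the standard genetic code and reads an mRNA in triplets. The names `CODON_TABLE` and `translate` and the example mRNA are hypothetical. Treating the first AUG as the fixed starting point is a simplifying assumption, since real start-site selection involves additional signals.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code in one-letter amino acid symbols, ordered with the
# third base varying fastest (UUU, UUC, UUA, UUG, UCU, ...). "*" marks stop.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

STOP_CODONS = {c for c, aa in CODON_TABLE.items() if aa == "*"}
# Number of codons assigned to each amino acid (stops excluded).
DEGENERACY = Counter(aa for aa in CODON_TABLE.values() if aa != "*")


def translate(mrna: str) -> str:
    """Translate from the first AUG, codon by codon, until a stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


print(len(CODON_TABLE), sorted(STOP_CODONS), len(DEGENERACY))
print(translate("GGAUGGCCUAAUU"))
```

In this table only methionine (M) and tryptophan (W) have a single codon, which is why the text says "most" rather than "all" amino acids have more than one code word.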

Most Eukaryotic Genes Are Mosaics of Introns and Exons

Most genes in higher eukaryotes are discontinuous. Coding sequences (exons) in these split genes are separated by intervening sequences (introns), which are removed in the conversion of the primary transcript into mRNA and other functional mature RNA molecules. Split genes, like continuous genes, are colinear with their polypeptide products. A striking feature of many exons is that they encode functional domains in proteins. New proteins probably arose in the course of evolution by the shuffling of exons. Introns may have been present in primordial genes but were lost in the evolution of such fast-growing organisms as bacteria and yeast.
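Removal of introns from a primary transcript can be sketched as joining exon intervals in order. The segment sequences, the lowercase convention for introns, and the names `exon_intervals` and `splice` are illustrative assumptions. Real splicing depends on splice-site signals and cellular machinery that this sketch does not model.

```python
def exon_intervals(segments):
    """Return (start, end) half-open intervals of the exons in a transcript.

    segments is an ordered list of (kind, sequence) pairs, where kind is
    "exon" or "intron".
    """
    intervals, position = [], 0
    for kind, sequence in segments:
        if kind == "exon":
            intervals.append((position, position + len(sequence)))
        position += len(sequence)
    return intervals


def splice(primary_transcript: str, exons) -> str:
    """Join the given exon intervals in order, dropping everything else."""
    return "".join(primary_transcript[start:end] for start, end in exons)


# Hypothetical split gene: exons in uppercase, introns in lowercase.
segments = [
    ("exon", "AUGGCC"),
    ("intron", "guaaguuucag"),
    ("exon", "GAAGGU"),
    ("intron", "guaagcccag"),
    ("exon", "UUUUAA"),
]
primary = "".join(sequence for _, sequence in segments)
exons = exon_intervals(segments)

mature = splice(primary, exons)                     # all three exons
skipped = splice(primary, [exons[0], exons[2]])     # middle exon left out
print(mature, skipped)
```

Because the exons are joined in their original order, the mature RNA stays colinear with the gene. Leaving out the middle exon shows how a single primary transcript could yield a second product.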

Key Terms

deoxyribonucleic acid (DNA)

ribonucleic acid

double helix

semiconservative replication

DNA polymerase

reverse transcriptase

messenger RNA (mRNA)

transfer RNA (tRNA)

ribosomal RNA (rRNA)

small nuclear RNA (snRNA)

RNA polymerase

promoter site

genetic code

exon shuffling

alternative splicing

Copyright © 2002, W. H. Freeman and Company.
Bookshelf ID: NBK22486