Figure 4.1. Examples of the manipulations that can be carried out with DNA molecules
The toolkit of techniques used by molecular biologists to study DNA molecules was assembled during the 1970s and 1980s. Before then, the only way in which individual genes could be studied was by classical genetics, using the procedures that we will examine in Chapter 5. Classical genetics is a powerful approach to gene analysis and many of the fundamental discoveries in molecular biology were made in this way. The operon theory proposed by Jacob and Monod in 1961 (Section 9.3.1), which describes how the expression of some bacterial genes is regulated, was perhaps the most heroic achievement of this era of genetics. But the classical approach is limited because it does not involve the direct examination of genes: information on gene structure and activity is inferred from the biological characteristics of the organism being studied. By the late 1960s these indirect methods had become insufficient for answering the more detailed questions that molecular biologists had begun to ask about the expression pathways of individual genes. These questions could only be addressed by examining directly the segments of DNA containing the genes of interest. This was not possible with the technology available at the time, so a new set of techniques had to be invented.
In this example, the fragment of DNA to be cloned is inserted into a plasmid vector which is subsequently replicated inside a bacterial host.
In this example, a single gene is copied.
Recombinant DNA technology was one of the main factors that contributed to the rapid advance in knowledge concerning gene expression that occurred during the 1970s and 1980s. The basis of recombinant DNA technology is the ability to manipulate DNA molecules in the test tube. This, in turn, depends on the availability of purified enzymes whose activities are known and can be controlled, and which can therefore be used to make specified changes to the DNA molecules that are being manipulated. The enzymes available to the molecular biologist fall into four broad categories:
In (A), the activity of a DNA-dependent DNA polymerase is shown on the left and that of an RNA-dependent DNA polymerase on the right. In (B), the activities of endonucleases and exonucleases are shown. In (C) the red DNA molecule is ligated to itself on the left, and to a second molecule on the right.
End-modification enzymes (Section 4.1.4), which make changes to the ends of DNA molecules, adding an important dimension to the design of ligation experiments, and providing one means of labeling DNA molecules with radioactive and other markers (Technical Note 4.1).
New nucleotides are added on to the 3′ end of the growing polynucleotide, the sequence of this new polynucleotide being determined by the sequence of the template DNA.
(A) A DNA polymerase requires a primer in order to initiate the synthesis of a new polynucleotide. (B) The sequence of the primer determines the position at which it attaches to the template DNA and hence specifies the region of the template that will be copied. When a DNA polymerase is used to make new DNA in vitro, the primer is usually a short oligonucleotide made by chemical synthesis. For details of how DNA synthesis is primed in vivo, see Figure 13.12.
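The role of the primer can be illustrated with a minimal Python sketch. It assumes perfect base-pairing and a single annealing site, and the sequences and function names are invented for the example rather than taken from any real experiment.

```python
# Illustrative sketch only: the sequences below are invented.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written in the 5'->3' direction."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def extend_primer(template: str, primer: str) -> str:
    """Return the strand made by extending `primer` on `template` (both 5'->3').

    The primer anneals where its reverse complement occurs in the template.
    Nucleotides are then added to the 3' end of the primer until the 5' end
    of the template is reached.
    """
    site = template.find(reverse_complement(primer))
    if site == -1:
        raise ValueError("primer does not anneal to this template")
    # The added nucleotides are dictated by the template upstream of the site.
    return primer + reverse_complement(template[:site])


template = "ATGGCCATTGTAATGGGCCGC"
primer = "GCGGCCC"  # complementary to the 3' end of the template
product = extend_primer(template, primer)
```

Moving the primer to a different position on the template changes which region is copied, which is the point made in part (B) of the legend.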
All DNA polymerases can make DNA and many also have one or both of the exonuclease activities.
A 3′→5′ exonuclease activity enables the enzyme to remove nucleotides from the 3′ end of the strand that it has just synthesized. This is called the proofreading activity because it allows the polymerase to correct errors by removing a nucleotide that has been inserted incorrectly.
A 5′→3′ exonuclease activity is less common, but is possessed by some DNA polymerases whose natural function in genome replication requires that they must be able to remove at least part of a polynucleotide that is already attached to the template strand that the polymerase is copying.
| Polymerase | Description | Main use | Cross reference |
|---|---|---|---|
| DNA polymerase I | Unmodified E. coli enzyme | DNA labeling | Technical Note 4.1 |
| Klenow polymerase | Modified version of E. coli DNA polymerase I | DNA labeling | Technical Note 4.1 |
| Sequenase | Modified version of phage T7 DNA polymerase | DNA sequencing | Box 6.1 |
| Taq polymerase | Thermus aquaticus DNA polymerase I | PCR | Section 4.3 |
| Reverse transcriptase | RNA-dependent DNA polymerase, obtained from various retroviruses | cDNA synthesis | Section 5.3.3 |
Of the two exonuclease activities, it is the 5′→3′ version that causes most problems when a DNA polymerase is used to manipulate molecules in the test tube. This is because an enzyme that possesses this activity is able to remove nucleotides from the 5′ ends of polynucleotides that have just been synthesized (Figure 4.8).
The E. coli DNA polymerase I enzyme has an optimum reaction temperature of 37 °C, this being the usual temperature of the natural environment of the bacterium, inside the intestines of mammals such as humans. Test-tube reactions with either the Kornberg or Klenow polymerases, and with Sequenase, are therefore incubated at 37 °C, and terminated by raising the temperature to 75 °C or above, which causes the protein to unfold or denature, destroying its activity. This regimen is perfectly adequate for most molecular biology techniques but, for reasons that will become clear in Section 4.3, PCR requires a thermostable DNA polymerase - one that is able to function at temperatures much higher than 37 °C. Suitable enzymes can be obtained from bacteria such as Thermus aquaticus, which live in hot springs at temperatures up to 95 °C, and whose DNA polymerase I enzyme has an optimum working temperature of 72 °C. The biochemical basis of protein thermostability is not fully understood, but probably centers on structural features that reduce the amount of protein unfolding that occurs at elevated temperatures.
One additional type of DNA polymerase is important in molecular biology research. This is reverse transcriptase, which is an RNA-dependent DNA polymerase and so makes DNA copies of RNA rather than DNA templates. Reverse transcriptases are involved in the replication cycles of retroviruses (Section 2.4.2), including the human immunodeficiency viruses that cause AIDS, these having RNA genomes that are copied into DNA after infection of the host. In the test tube, a reverse transcriptase can be used to make DNA copies of mRNA molecules. These copies are called complementary DNAs (cDNAs). Their synthesis is important in some types of gene cloning and in techniques used to map the regions of a genome that specify particular mRNAs (Section 7.1.2).
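The relationship between an mRNA and its cDNA can be sketched in a few lines of Python. This is a simplified illustration: it ignores the primer that a reverse transcriptase needs in practice, and the mRNA sequence and function name are invented for the example.

```python
# Illustrative sketch only: base-pairing rules for copying RNA into DNA.
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}


def first_strand_cdna(mrna: str) -> str:
    """Return the first-strand cDNA (5'->3') complementary to `mrna` (5'->3')."""
    return "".join(RNA_TO_DNA[base] for base in reversed(mrna))


mrna = "AUGGCUUAA"  # invented example sequence
cdna = first_strand_cdna(mrna)
```

The cDNA is written 5′→3′, so it reads as the reverse complement of the mRNA with T in place of U.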
| Nuclease | Description | Main use | Cross reference |
|---|---|---|---|
| Restriction endonucleases | Sequence-specific DNA endonucleases, from many sources | Many applications | Section 4.1.2 |
| S1 nuclease | Endonuclease specific for single-stranded DNA and RNA, from the fungus Aspergillus oryzae | Transcript mapping | Section 7.1.2 |
| Deoxyribonuclease I | Endonuclease specific for double-stranded DNA and RNA, from Escherichia coli | Nuclease footprinting | Section 9.1.1 |
A restriction endonuclease is an enzyme that binds to a DNA molecule at a specific sequence and makes a double-stranded cut at or near that sequence. Because of the sequence specificity, the positions of cuts within a DNA molecule can be predicted, assuming that the DNA sequence is known, enabling defined segments to be excised from a larger molecule. This ability underlies gene cloning and all other aspects of recombinant DNA technology in which DNA fragments of known sequence are required.
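The predictability of restriction cuts can be shown with a short Python sketch. It handles only a linear molecule and the top strand, and the DNA sequence is invented for the example. The site and cut position used are those of EcoRI (5′-G^AATTC-3′).

```python
# Illustrative sketch only: top-strand digestion of a linear molecule.
def digest(dna: str, site: str, cut_offset: int) -> list[str]:
    """Cut `dna` at every occurrence of `site` and return the fragments.

    `cut_offset` is the number of nucleotides of the recognition sequence
    that lie to the left of the cut (1 for EcoRI, which cuts G^AATTC).
    """
    fragments = []
    start = 0
    pos = dna.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(dna[start:cut])
        start = cut
        pos = dna.find(site, pos + 1)
    fragments.append(dna[start:])  # remainder after the last cut
    return fragments


dna = "TTGAATTCAAGGGGAATTCTT"  # invented sequence with two EcoRI sites
fragments = digest(dna, "GAATTC", 1)
```

Because the sequence is known, the number and lengths of the fragments can be worked out before the enzyme is added to the tube.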
In the top part of the diagram, the DNA is cut by a Type I or Type III restriction endonuclease. The cuts are made in slightly different positions relative to the recognition sequence, so the resulting fragments have different lengths. In the lower part, a Type II enzyme is used. Each molecule is cut at exactly the same position to give exactly the same pair of fragments.
| Enzyme | Recognition sequence | Type of ends | End sequences |
|---|---|---|---|
| AluI | 5′-AGCT-3′ | Blunt | 5′-AG CT-3′ |
| | 3′-TCGA-5′ | | 3′-TC GA-5′ |
| Sau3AI | 5′-GATC-3′ | Sticky, 5′ overhang | 5′- GATC-3′ |
| | 3′-CTAG-5′ | | 3′-CTAG -5′ |
| HinfI | 5′-GANTC-3′ | Sticky, 5′ overhang | 5′-G ANTC-3′ |
| | 3′-CTNAG-5′ | | 3′-CTNA G-5′ |
| BamHI | 5′-GGATCC-3′ | Sticky, 5′ overhang | 5′-G GATCC-3′ |
| | 3′-CCTAGG-5′ | | 3′-CCTAG G-5′ |
| BsrBI | 5′-CCGCTC-3′ | Blunt | 5′- NNNCCGCTC-3′ |
| | 3′-GGCGAG-5′ | | 3′- NNNGGCGAG-5′ |
| EcoRI | 5′-GAATTC-3′ | Sticky, 5′ overhang | 5′-G AATTC-3′ |
| | 3′-CTTAAG-5′ | | 3′-CTTAA G-5′ |
| PstI | 5′-CTGCAG-3′ | Sticky, 3′ overhang | 5′-CTGCA G-3′ |
| | 3′-GACGTC-5′ | | 3′-G ACGTC-5′ |
| NotI | 5′-GCGGCCGC-3′ | Sticky, 5′ overhang | 5′-GC GGCCGC-3′ |
| | 3′-CGCCGGCG-5′ | | 3′-CGCCGG CG-5′ |
| BglI | 5′-GCCNNNNNGGC-3′ | Sticky, 3′ overhang | 5′-GCCNNNN NGGC-3′ |
| | 3′-CGGNNNNNCCG-5′ | | 3′-CGGN NNNNCCG-5′ |
N = any nucleotide.
Note that most, but not all, recognition sequences have inverted symmetry: when read in the 5′→3′ direction, the sequence is the same in both strands.
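Inverted symmetry can be checked by comparing each recognition sequence with its own reverse complement. The Python sketch below does this for the enzymes in the table. Treating N as its own complement is a simplifying assumption, and the function name is illustrative.

```python
# Illustrative sketch only: N (any nucleotide) is treated as self-complementary.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G", "N": "N"}


def has_inverted_symmetry(site: str) -> bool:
    """True if `site` reads the same 5'->3' on both strands."""
    reverse_complement = "".join(COMPLEMENT[base] for base in reversed(site))
    return reverse_complement == site


# Top-strand recognition sequences from the table above.
sites = {
    "AluI": "AGCT",
    "Sau3AI": "GATC",
    "HinfI": "GANTC",
    "BamHI": "GGATCC",
    "BsrBI": "CCGCTC",
    "EcoRI": "GAATTC",
    "PstI": "CTGCAG",
    "NotI": "GCGGCCGC",
    "BglI": "GCCNNNNNGGC",
}
symmetry = {name: has_inverted_symmetry(seq) for name, seq in sites.items()}
```

Of the nine enzymes listed, only BsrBI fails the test: the reverse complement of 5′-CCGCTC-3′ is 5′-GAGCGG-3′.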
(A) Blunt ends and sticky ends. (B) Different types of sticky end: the 5′ overhangs produced by BamHI and the 3′ overhangs produced by PstI. (C) The same sticky ends produced by two different restriction endonucleases: a 5′ overhang with the sequence 5′-GATC-3′ is produced by both BamHI (recognizes 5′-GGATCC-3′) and Sau3AI (recognizes 5′-GATC-3′).
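For a symmetrical recognition sequence, the type of end follows directly from where the top strand is cut. The sketch below is a hedged illustration in Python. It assumes inverted symmetry, so the bottom strand is cut at the mirror-image position, and the function name and cut-offset convention are invented for the example.

```python
# Illustrative sketch only: valid for recognition sequences with inverted symmetry.
def overhang(site: str, cut_offset: int) -> tuple[str, str]:
    """Return (type of end, single-stranded overhang written 5'->3').

    `cut_offset` is the number of site nucleotides to the left of the cut in
    the top strand. The bottom strand is cut at the mirror position,
    len(site) - cut_offset.
    """
    mirror = len(site) - cut_offset
    if cut_offset == mirror:
        return ("blunt", "")
    if cut_offset < mirror:
        return ("5' overhang", site[cut_offset:mirror])
    return ("3' overhang", site[mirror:cut_offset])


bamhi = overhang("GGATCC", 1)   # G^GATCC
sau3ai = overhang("GATC", 0)    # ^GATC
psti = overhang("CTGCAG", 5)    # CTGCA^G
alui = overhang("AGCT", 2)      # AG^CT
```

BamHI and Sau3AI give the same 5′-GATC-3′ overhang even though their recognition sequences differ, which is why fragments produced by one enzyme can be ligated to fragments produced by the other.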
The range of fragment sizes that can be resolved depends on the concentration of agarose in the gel. Electrophoresis has been performed with three different concentrations of agarose. The labels indicate the sizes of bands in the left and right lanes. Photo courtesy of BioWhittaker Molecular Applications.
(A) Transfer of DNA from the gel to the membrane. (B) The membrane is probed with a radioactively labeled DNA molecule. On the resulting autoradiograph, one hybridizing band is seen in lane 2, and two in lane 3.
DNA fragments that have been generated by treatment with a restriction endonuclease can be joined back together again, or attached to a new partner, by a DNA ligase. The reaction requires energy, which is provided by adding either ATP or NAD to the reaction mixture, depending on the type of ligase that is being used.
(A) In living cells, DNA ligase synthesizes a missing phosphodiester bond in one strand of a double-stranded DNA molecule. (B) To link two DNA molecules in vitro, DNA ligase must make two phosphodiester bonds, one in each strand. (C) Ligation in vitro is more efficient when the molecules have compatible sticky ends, because transient base-pairing between these ends holds the molecules together and so increases the opportunity for DNA ligase to attach and synthesize the new phosphodiester bonds. For the role of DNA ligase during DNA replication in vivo, see Figures 13.17 and 13.19.
In this example, each linker contains the recognition sequence for the BamHI restriction endonuclease. DNA ligase attaches the linkers to the ends of the blunt-ended molecule in a reaction that is made relatively efficient because the linkers are present at a high concentration. The restriction enzyme is then added to cleave the linkers and produce the sticky ends. Note that during the ligation the linkers ligate to one another, so a series of linkers (a concatamer) is attached to each end of the blunt molecule. When the restriction enzyme is added, these linker concatamers are cut into segments, with half of the innermost linker left attached to the DNA molecule. Adaptors are similar to linkers but each one has one blunt end and one sticky end. The blunt-ended DNA is therefore given sticky ends simply by ligating it to the adaptors; there is no need to carry out the restriction step. For more details, see Brown (2001).
In this example, a poly(G) tail is synthesized at each end of a blunt-ended DNA molecule. Tails comprising other nucleotides are synthesized by including the appropriate dNTP in the reaction mixture.
Plasmids are uncommon in eukaryotes, although Saccharomyces cerevisiae possesses one that is sometimes used for cloning purposes; most eukaryotic vectors are therefore based on virus genomes. Alternatively, with a eukaryotic host the replication requirement can be bypassed by performing the experiment in such a way that the DNA to be cloned becomes inserted into one of the host chromosomes. These approaches to cloning in eukaryotic cells are described later in the chapter.
The easiest way to understand how a cloning vector is used is to start with the simplest E. coli plasmid vectors, which illustrate all of the basic principles of DNA cloning. We will then be able to turn our attention to the special features of phage vectors and vectors used with eukaryotes.
The map shows the positions of the ampicillin-resistance gene (ampR), the tetracycline-resistance gene (tetR), the origin of replication (ori) and the recognition sequences for seven restriction endonucleases.
The manipulations shown in Figure 4.16
See text for details. The inset shows how replica plating is performed.
Compare with the lytic infection cycle of T2 bacteriophage, shown in Figure 1.4B. The special feature of the lysogenic cycle is the insertion of the phage genome into the bacterium's chromosomal DNA, where it can remain quiescent for many generations.
(A) In the λ genome, the genes are arranged into functional groups. For example, the region marked as ‘protein coat’ comprises 21 genes coding for proteins that are either components of the phage capsid or are required for capsid assembly, and ‘cell lysis’ comprises four genes involved in lysis of the bacterium at the end of the lytic phase of the infection cycle. The regions of the genome that can be deleted without impairing the ability of the phage to follow the lytic cycle are indicated in green. (B) The differences between a λ insertion vector and a λ replacement vector.
Insertion vectors, in which part or all of the optional DNA has been removed and a unique restriction site introduced at some position within the trimmed down genome;
Replacement vectors, in which the optional DNA is contained within a stuffer fragment, flanked by a pair of restriction sites, that is replaced when the DNA to be cloned is ligated into the vector.
The linear form of the vector is shown at the top of the diagram. Treatment with the appropriate restriction endonuclease produces the left and right arms, both of which have one blunt end and one end with the 12-nucleotide overhang of the cos site. The DNA to be cloned is blunt ended and so is inserted between the two arms during the ligation step. These arms also ligate to one another via their cos sites, forming a concatamer. Some parts of the concatamer comprise left arm-insert DNA-right arm and, assuming this combination is 37–52 kb in length, will be enclosed inside the capsid by the in vitro packaging mix. Parts of the concatamer made up of left arm ligated directly to right arm, without new DNA, are too short to be packaged.
After infection, the cells are spread onto an agar plate. The objective is not to obtain individual colonies but to produce an even layer of bacteria across the entire surface of the agar. Bacteria that were infected with the packaged cloning vector die within about 20 minutes because the λ genes contained in the arms of the vector direct replication of the DNA and synthesis of new phages by the lytic cycle, each of these new phages containing its own copy of the vector plus cloned DNA. Death and lysis of the bacterium releases these phages into the surrounding medium, where they infect new cells and begin another round of phage replication and lysis. The end result is a zone of clearing, called a plaque, which is visible on the lawn of bacteria that grows on the agar plate (Figure 4.23).
| Type of vector | Insert size (kb) | Number of clones* (P = 95%) | Number of clones* (P = 99%) |
|---|---|---|---|
| λ replacement | 18 | 532 500 | 820 000 |
| Cosmid, fosmid | 40 | 240 000 | 370 000 |
| P1 | 125 | 77 000 | 118 000 |
| BAC, PAC | 300 | 32 000 | 50 000 |
| YAC | 600 | 16 000 | 24 500 |
| Mega-YAC | 1400 | 6850 | 10 500 |
*Calculated from the equation:

$$N = \frac{\ln(1 - P)}{\ln\left(1 - \dfrac{a}{b}\right)}$$

where N is the number of clones required, P is the probability that any given segment of the genome is present in the library, a is the average size of the DNA fragments inserted into the vector, and b is the size of the genome.
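The library sizes in the table can be checked with a short Python sketch based on the standard relationship N = ln(1 − P) / ln(1 − a/b). The genome size used here, 3 200 000 kb, is an assumption chosen because it reproduces the tabulated values to within rounding. The function and constant names are illustrative.

```python
import math

# Assumption: a genome of 3 200 000 kb, which matches the tabulated values.
GENOME_KB = 3_200_000


def clones_required(p: float, insert_kb: float, genome_kb: float = GENOME_KB) -> int:
    """Number of clones N needed so that any given segment is present with probability p."""
    return math.ceil(math.log(1 - p) / math.log(1 - insert_kb / genome_kb))


# Insert sizes (kb) taken from the table above.
vectors = {"lambda replacement": 18, "cosmid": 40, "P1": 125,
           "BAC": 300, "YAC": 600, "mega-YAC": 1400}
library_sizes = {
    name: (clones_required(0.95, size), clones_required(0.99, size))
    for name, size in vectors.items()
}
```

The results show why high-capacity vectors matter: moving from an 18 kb insert to a 600 kb insert reduces the number of clones needed by a factor of about 33.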
pJB8 is 5.4 kb in size and carries the ampicillin-resistance gene (ampR), a segment of λ DNA containing the cos site, and an Escherichia coli origin of replication (ori).
The first major breakthrough in attempts to clone DNA fragments much longer than 50 kb came with the invention of yeast artificial chromosomes or YACs (Burke et al., 1987). These vectors are propagated in S. cerevisiae rather than in E. coli and are based on chromosomes, rather than on plasmids or viruses. The first YACs were constructed after studies of natural chromosomes had shown that, in addition to the genes that it carries, each chromosome has three important components:
The centromere, which plays a critical role during nuclear division (see Figure 2.7);
The telomeres, the special sequences which mark the ends of chromosomal DNA molecules (see Figure 2.10);
One or more origins of replication, which initiate synthesis of new DNA when the chromosome divides (Section 13.2.1).
(A) The cloning vector pYAC3. (B) To clone with pYAC3, the circular vector is digested with BamHI and SnaBI. BamHI restriction removes the stuffer fragment held between the two telomeres in the circular molecule. SnaBI cuts within the SUP4 gene and provides the site into which new DNA will be inserted. Ligation of the two vector arms with new DNA produces the structure shown at the bottom. This structure carries functional copies of the TRP1 and URA3 selectable markers. The host strain has inactivated copies of these genes, which means that it requires tryptophan and uracil as nutrients. After transformation, cells are plated onto a minimal medium, lacking tryptophan and uracil. Only cells that contain the vector, and so can synthesize tryptophan and uracil, are able to survive on this medium and produce colonies. Note that if a vector comprises two right arms, or two left arms, then it will not give rise to colonies because the transformed cells will still require one of the nutrients. The presence of insert DNA in the cloned vector molecules is checked by testing for inactivation of SUP4. This is done by a color test: on the appropriate medium, colonies containing recombinant vectors (i.e. with an insert) are white; non-recombinants (vector but no insert) are red.
Bacteriophage P1 vectors (Sternberg, 1990) are very similar to λ vectors, being based on a deleted version of a natural phage genome, the capacity of the cloning vector being determined by the size of the deletion and the space within the phage particle. The P1 genome is larger than the λ genome, and the phage particle is bigger, so a P1 vector can clone larger fragments of DNA than a λ vector, up to 125 kb using current technology.
Bacterial artificial chromosomes or BACs (Shizuya et al., 1992) are based on the naturally occurring F plasmid of E. coli. Unlike the plasmids used to construct the early cloning vectors, the F plasmid is relatively large and vectors based on it have a higher capacity for accepting inserted DNA. BACs can be used to clone fragments of 300 kb and longer.
P1-derived artificial chromosomes or PACs (Ioannou et al., 1994) combine features of P1 vectors and BACs and have a capacity of up to 300 kb.
Fosmids (Kim et al., 1992) contain the F plasmid origin of replication and a λ cos site. They are similar to cosmids but have a lower copy number in E. coli, which means that they are less prone to instability problems.
Cloning is not merely an aid to DNA sequencing: it also provides a means of studying the mode of expression of a gene and the way in which expression is regulated, of carrying out genetic engineering experiments aimed at modifying the biological characteristics of the host organism, and of synthesizing important animal proteins, such as pharmaceuticals, in a new host cell from which the proteins can be obtained in larger quantities than is possible by conventional purification from animal tissue. These multifarious applications demand that genes must frequently be cloned in organisms other than E. coli.
(A) YIp5, a typical yeast integrative plasmid. The plasmid contains the ampicillin-resistance gene (ampR), the tetracycline-resistance gene (tetR), the yeast gene URA3, and an Escherichia coli origin of replication (ori). The presence of the E. coli ori means that recombinant YIp5 molecules can be constructed in E. coli before their transfer into yeast cells. YIp5 is therefore a shuttle vector - it can be shuttled between two species. (B) YIp5 has no origin of replication that can function inside yeast cells, but can survive if it integrates into the yeast chromosomal DNA by homologous recombination between the plasmid and chromosomal copies of the URA3 gene. The chromosomal gene carries a small mutation which means that it is non-functional and the host cells are ura3⁻. One of the pair of URA3 genes that are formed after integration of the plasmid DNA is mutated, but the other is not. Recombinant cells are therefore ura⁺ and can be selected by plating on to minimal medium, which does not contain uracil.
In essence, DNA cloning results in the purification of a single fragment of DNA from a complex mixture of DNA molecules. Cloning is a powerful technique and its impact on our understanding of genes and genomes has been immeasurable. Cloning does, however, have one major disadvantage: it is a time-consuming and, in parts, difficult procedure. It takes several days to perform the manipulations needed to insert DNA fragments into a cloning vector and then introduce the ligated molecules into the host cells and select recombinants. If the experimental strategy involves generation of a large clone library, followed by screening of the library to identify a clone that contains a gene of interest (see Technical Note 4.3), then several more weeks or even months might be needed to complete the project.
PCR complements DNA cloning in that it enables the same result to be achieved - purification of a specified DNA fragment - but in a much shorter time, perhaps just a few hours (Saiki et al., 1988). PCR is complementary to, not a replacement for, cloning because it has its own limitations, the most important of which is the need to know the sequence of at least part of the fragment that is to be purified. Despite this constraint, PCR has acquired central importance in many areas of molecular biology research. We will examine the technique first and then survey its applications.
The PCR has been carried out in a microfuge tube. A sample is loaded into lane 2 of an agarose gel. Lane 1 contains DNA size markers, and lane 3 contains a sample of a PCR done by a colleague. After electrophoresis, the gel is stained with ethidium bromide (see Technical Note 2.1). Lane 2 contains a single band of the expected size, showing that the PCR has been successful. In lane 3 there is no band - this PCR has not worked.
PCR is such a straightforward procedure that it is sometimes difficult to understand how it can have become so important in modern research. First we will deal with its limitations. In order to synthesize primers that will anneal at the correct positions, the sequences of the boundary regions of the DNA to be amplified must be known. This means that PCR cannot be used to purify fragments of genes or other parts of a genome that have never been studied before. A second constraint is the length of DNA that can be copied. Regions of up to 5 kb can be amplified without too much difficulty, and longer amplifications - up to 40 kb - are possible using modifications of the standard technique. However, the >100 kb fragments that are needed for genome sequencing projects are unattainable by PCR.
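The need to know the boundary sequences can be made concrete with a minimal in-silico PCR sketch in Python. It assumes perfect primer matches and a single annealing site for each primer, and it ignores annealing temperature. The sequences and function names are invented for the example.

```python
from typing import Optional

# Illustrative sketch only: exact-match primer annealing on the top strand.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written in the 5'->3' direction."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def amplicon(template: str, forward: str, reverse: str) -> Optional[str]:
    """Return the top strand of the expected product, or None if a primer fails to anneal.

    `forward` matches the top strand directly. `reverse` anneals to the top
    strand, so its reverse complement is searched for downstream.
    """
    start = template.find(forward)
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse)
    end = template.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return template[start:end + len(reverse_site)]


template = "CCCATGGCCATTGTAATGGGCCGCTTT"  # invented sequence
product = amplicon(template, "ATGGCC", "GCGGCCC")
failed = amplicon(template, "AAAAAA", "GCGGCCC")
```

The product is delimited entirely by the two primers. If either boundary sequence is unknown, no suitable primer can be designed and no product is obtained.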
What are the strengths of PCR? Primary among these is the ease with which products representing a single segment of the genome can be obtained from a number of different DNA samples. We will encounter one important example of this in the next chapter when we look at how DNA markers are typed in genetic mapping projects (Section 5.2.2). PCR is used in a similar way to screen human DNA samples for mutations associated with genetic diseases such as thalassemia and cystic fibrosis. It also forms the basis of genetic profiling, in which variations in microsatellite length are typed (see Figure 2.25).
A second important feature of PCR is its ability to work with minuscule amounts of starting DNA. This means that PCR can be used to obtain sequences from the trace amounts of DNA that are present in hairs, bloodstains and other forensic specimens, and from bones and other remains preserved at archaeological sites. In clinical diagnosis, PCR is able to detect the presence of viral DNA well before the virus has reached the levels needed to initiate a disease response. This is particularly important in the early identification of viral-induced cancers because it means that treatment programs can be initiated before the cancer becomes established.
The above are just a few of the applications of PCR. The technique is now a major component of the molecular biologist's toolkit and we will discover many more examples of its use in the study of genomes as we progress through the remaining chapters of this book.
Give short definitions of the following terms:
2 μm circle
Bacterial artificial chromosome (BAC)
Complementary DNA (cDNA)
cos site
Denaturation of protein
Hybridization probe
In vitro packaging
Knockout mice
Linker
P1-derived artificial chromosome (PAC)
Polymerase chain reaction (PCR)
Yeast artificial chromosome (YAC)
Draw diagrams that outline the events that occur during (a) DNA cloning, and (b) PCR. What are the limitations of each of these two techniques?
List the types of enzyme used in recombinant DNA research.
Distinguish between the two types of exonuclease activity that can be possessed by a DNA polymerase, and explain how these activities influence the potential applications of individual DNA polymerases in recombinant DNA research.
Using examples, describe the various types of end produced after digestion of DNA with a restriction endonuclease.
How are agarose gel electrophoresis and Southern hybridization used to examine the results of a restriction digest?
Explain why the efficiency of blunt-end ligation is less than that of sticky-end ligation. What steps can be taken to improve the efficiency of blunt-end ligation?
Draw diagrams of (a) pBR322, and (b) pUC8. Explain how the differences between these two vectors influence the ways in which they are used to clone DNA fragments.
Distinguish between the lytic and lysogenic infection cycles for a bacteriophage.
Write a short description of the way in which a bacteriophage λ vector is used to clone DNA. How does a cosmid differ from a standard λ vector?
Draw a diagram showing a typical YAC. Indicate the key features and explain how a YAC is used to clone DNA.
What problems might arise when a YAC is used to clone a large fragment of DNA? To what extent can these problems be solved by the use of other types of high-capacity cloning vector?
How is DNA cloned in organisms other than Escherichia coli?
Describe how a PCR is carried out, paying particular attention to the role of the primers and the temperatures used during the thermal cycling.
Soon after the first gene cloning experiments were carried out in the early 1970s, a number of scientists argued that there should be a temporary moratorium on this type of research. What was the basis of these scientists' fears and to what extent were these fears justified?
What would be the features of an ideal cloning vector? To what extent are these requirements met by any of the existing cloning vectors?
The specificity of the primers is a critical feature of a successful PCR. If the primers anneal at more than one position in the target DNA then products additional to the one being sought will be synthesized. Explore the factors that determine primer specificity and evaluate the influence of the annealing temperature on the outcome of a PCR.