6.1. Basic features of PCR
The polymerase chain reaction (PCR) has revolutionized molecular genetics by
permitting rapid cloning and analysis of DNA. Since the first reports describing
this new technology in the mid 1980s, there have been numerous applications in both
basic and clinical research. Two other fundamental technologies are DNA sequencing
and in vitro mutagenesis, both of which can be accomplished using
PCR-based and non-PCR-based methods.
6.1.1. PCR is a cell-free method of DNA cloning
The standard PCR reaction: selective DNA amplification
Figure 6.1. PCR is an in vitro method for amplifying DNA sequences using
defined oligonucleotide primers
Oligonucleotide primers A and B are complementary to DNA
sequences located on opposite DNA strands and flanking the
region to be amplified. Annealed primers are incorporated into
the newly synthesized DNA strands. The first cycle will result
in two new DNA strands whose 5′ end is fixed by the
position of the oligonucleotide primer but whose 3′
end is variable (‘ragged’ 3′
ends). The two new strands can serve in turn as templates for
synthesis of complementary strands of the desired length (the
5′ ends are defined by the primer and the 3′
ends are fixed because synthesis cannot proceed past the
terminus of the opposing primer). After a few cycles, the
desired fixed length product begins to predominate.
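The bookkeeping described in this legend can be followed with a short
simulation. The sketch below is illustrative only: it assumes a single
double-stranded template molecule at the start and that every strand is copied
exactly once per cycle, neither of which is stated in the text, and the
function name is hypothetical.

```python
def strand_counts(cycles):
    """Count strand classes after a number of idealized PCR cycles.

    Assumptions (not from the text): one double-stranded template
    molecule at the start, and every strand is copied once per cycle.
    """
    original = 2   # the two strands of the starting template
    ragged = 0     # 5' end fixed by a primer, variable ('ragged') 3' end
    fixed = 0      # both ends defined: the desired fixed-length product
    for _ in range(cycles):
        new_ragged = original        # copies made on the original strands
        new_fixed = ragged + fixed   # copies made on primer-bounded strands
        ragged += new_ragged
        fixed += new_fixed
    return original, ragged, fixed


for n in (1, 2, 3, 10):
    print(n, strand_counts(n))
```

Under these assumptions the ragged-end strands increase by only two per cycle,
whereas the fixed-length strands roughly double, which is why the desired
product predominates after a few cycles.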
PCR is a rapid and versatile in vitro method for amplifying defined target DNA
sequences present within a source of DNA. Usually, the method is designed to
permit selective amplification of a specific target DNA sequence(s) within a
heterogeneous collection of DNA sequences (e.g. total genomic DNA or a complex
cDNA population). To permit such selective amplification, some prior DNA
sequence information from the target sequences is required. This information is
used to design two oligonucleotide primers (amplimers) which are specific for
the target sequence and which are often about 15–25 nucleotides long. After the
primers are added to denatured template DNA, they bind specifically to
complementary DNA sequences at the target site. In the presence of a suitably
heat-stable DNA polymerase and DNA precursors (the four deoxynucleoside
triphosphates, dATP, dCTP, dGTP and dTTP), they initiate the synthesis of new
DNA strands which are complementary to the individual DNA strands of the target
DNA segment, and which will overlap each other (Figure 6.1).
The PCR is a chain reaction because newly synthesized DNA strands will act as
templates for further DNA synthesis in subsequent cycles. After about 25
cycles of DNA synthesis, the products of the PCR will include, in addition
to the starting DNA, about 10^5 copies of the specific target
sequence, an amount which is easily visualized as a discrete band of a
specific size when submitted to agarose gel electrophoresis. A heat-stable
DNA polymerase is used because the reaction involves sequential cycles
composed of three steps:
- Denaturation, typically at about 93–95°C for human genomic DNA.
- Reannealing at temperatures usually from about 50°C to 70°C depending on the
  Tm (see Section 5.2.1) of the expected duplex (the annealing temperature is
  typically about 5°C below the calculated Tm).
- DNA synthesis, typically at about 70–75°C.
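The relationship between Tm and annealing temperature in the reannealing step
can be illustrated with a rough calculation. The sketch below uses the Wallace
rule (2°C per A or T, 4°C per G or C), a common rule of thumb for short
oligonucleotides; it is an assumption here and not necessarily the method of
Section 5.2.1. The primer sequence and function names are made up for
illustration.

```python
def wallace_tm(primer):
    """Approximate Tm (deg C) of a short oligonucleotide (Wallace rule)."""
    seq = primer.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if at + gc != len(seq):
        raise ValueError("primer must contain only A, C, G and T")
    return 2 * at + 4 * gc


def annealing_temperature(primer, offset=5):
    """Annealing temperature set `offset` deg C below the estimated Tm."""
    return wallace_tm(primer) - offset


primer = "AGCTTGACCTGAAGTCCATG"  # hypothetical 20-mer, 50% G+C
print(wallace_tm(primer), annealing_temperature(primer))
```

The rule is crude and ignores salt concentration and sequence context, so
calculated values are best treated as a starting point for empirical
optimization.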
Suitably heat-stable DNA polymerases have been obtained from microorganisms
whose natural habitat is hot springs. For example, the widely used
Taq DNA polymerase is obtained from Thermus
aquaticus and is thermostable up to 94°C, with an
optimum working temperature of 80°C.
Specificity of amplification and primer design
Figure 6.2. PCR primer design
The specificity of amplification depends on the extent to which the primers
can recognize and bind to sequences other than the intended target DNA
sequences. For complex DNA sources, such as total genomic DNA from a mammalian
cell, it is often sufficient to design two primers about 20 nucleotides long.
This is because the chance of an accidental perfect match elsewhere in the
genome for either one of the primers is extremely low, and the chance that both
sequences occur in close proximity in the specified direction is normally
exceedingly low. Although conditions are usually chosen to ensure that only
strongly matched primer-target duplexes are stable, spurious amplification
products can nevertheless be observed. This can happen if one or both chosen
primer sequences contain part of a repetitive DNA sequence, and primers are
usually designed to avoid matching to known repetitive DNA sequences, including
large runs of a single nucleotide (Figure 6.2).
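The argument about chance matches can be made concrete with a
back-of-the-envelope estimate. The sketch assumes a random-sequence genome with
equal base frequencies and a haploid genome size of 3 × 10^9 bp. Both are
simplifying assumptions that are not taken from the text, and real genomes
(with repeats and biased base composition) depart from them.

```python
def expected_chance_matches(primer_length, genome_bp=3_000_000_000):
    """Expected number of perfect matches to one primer by chance alone.

    Assumes a random sequence with equal base frequencies. Both strands
    are searched, hence the factor of 2. genome_bp is an assumed value.
    """
    return 2 * genome_bp / 4 ** primer_length


for n in (10, 15, 20, 25):
    print(n, expected_chance_matches(n))
```

On these assumptions a 10-mer is expected to match thousands of sites, whereas
a 20-mer is expected to match by chance far less than once per genome.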
Accidental matching at the 3′ end of the primer is critically
important: spurious products may derive from substantially mismatched
primer-target duplexes unless the 3′ end of the primer shows
perfect matching. Several strategies can be adopted to optimize reaction
specificity:
- Nested primers. The products of an initial amplification reaction are
  diluted and used as the target DNA source for a second reaction in which a
  different set of primers is used, corresponding to sequences located close,
  but internal, to those used in the first reaction.
- Hot-start PCR. Mixing of all PCR reagents prior to an initial heat
  denaturation step allows more opportunity for nonspecific binding of primer
  sequences. To reduce this possibility, one or more components of the PCR are
  physically separated until the first denaturation step. A popular approach
  is to use a specially formulated wax bead designed to fit snugly within a
  PCR reaction tube. The reaction components minus the enzyme and reaction
  buffer are added to the tube followed by the molten wax bead which floats on
  top and then solidifies on cooling. The thermostable polymerase is then
  added with buffer. At the initial denaturation step the wax melts again and
  rises to the surface causing all the reaction components to come into
  contact with each other.
- Touch-down PCR. Most thermal cyclers can be programmed to perform runs in
  which the annealing temperature is lowered incrementally during the PCR
  cycling from an initial value above the expected Tm to a value below the Tm.
  By keeping the stringency of hybridization initially very high, the
  formation of spurious products is discouraged, allowing the expected
  sequence to predominate.
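The touch-down strategy amounts to a simple descending schedule of annealing
temperatures. The following sketch generates one such schedule. The starting
point (5°C above Tm), end point (5°C below Tm) and 1°C step are arbitrary
values assumed for illustration, not recommendations from the text, and the
function name is hypothetical.

```python
def touchdown_schedule(tm, start_above=5.0, end_below=5.0, step=1.0):
    """Annealing temperatures (deg C), one per cycle, stepping downward
    from tm + start_above to tm - end_below. All defaults are assumed."""
    if step <= 0:
        raise ValueError("step must be positive")
    n_steps = int((start_above + end_below) / step)
    return [tm + start_above - i * step for i in range(n_steps + 1)]


print(touchdown_schedule(60))
```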
DNA labeling by PCR
The standard PCR reaction can be modified to permit incorporation of labeled
nucleotides. Two methods are commonly used:
Figure 6.18. Cycle sequencing involves linear amplification using a single
primer to initiate DNA synthesis
Cycle sequencing using the dideoxynucleotide method involves setting
up four parallel DNA sequencing reactions in which DNA synthesis
occurs, using a mix of all four dNTPs plus one of the four ddNTPs.
The reactions resemble PCR reactions because they involve the same
thermocycling format as PCR. Since only a single primer is used, the
product accumulates in a linear fashion, rather than exponentially
as in PCR. In this example, label is introduced at the 3′
terminal end of a DNA strand when a labeled ddNTP is introduced.
However, an alternative is to use a primer carrying a labeled group
at its 5′ end (primer-mediated 5′
end-labeling).
Figure 6.20. PCR mutagenesis
(A) 5′ add-on mutagenesis. Primers can be
modified at the 5′ end to introduce, for example, a
labeled group (Figure
10.24), a sequence containing a suitable restriction site
(Figure 20.12) or a
phage promoter to drive gene expression. (B)
Site-specific mutagenesis. The mutagenesis shown can result in an
amplified product with a specific pre-determined mutation located in
a central segment. PCR reactions A and B are envisaged as amplifying
overlapping segments of DNA containing an introduced mutation (by
deliberate base mismatching using a mutant primer - 1M or 2M). After
the two products are combined, denatured and allowed to reanneal,
the DNA polymerase can extend the 3′ end of heteroduplexes
with recessed 3′ ends. Thereafter, a full length product
with the introduced mutation in a central segment can be amplified
by using the outer primers 1 and 2 only.
6.1.2. The major advantages of PCR as a cloning method are its rapidity, sensitivity
and robustness
Because of its simplicity, PCR is a popular technique with a wide range of
applications which depend on essentially three major advantages of the
method.
Speed and ease of use
DNA cloning by PCR can be performed in a few hours, using relatively
unsophisticated equipment. Typically, a PCR reaction consists of 30 cycles,
each containing a denaturation, reannealing and synthesis step, with an
individual cycle typically taking 3–5 min in an automated thermal
cycler. This compares favorably with the time required for cell-based DNA
cloning, which may take weeks. Clearly, some time is also required for
designing and synthesizing oligonucleotide primers, but this has been
simplified by the availability of computer software for primer design and
rapid commercial synthesis of custom oligonucleotides. Once the conditions
for a reaction have been tested, the reaction can then be repeated
simply.
Sensitivity
PCR is capable of amplifying sequences from minute amounts of target DNA,
even the DNA from a single cell (Li et al., 1988). Such exquisite sensitivity
has afforded new methods of studying molecular pathogenesis and has found
numerous applications in forensic science, in diagnosis, in genetic linkage
analysis using single-sperm typing and in molecular paleontology studies,
where samples may contain minute numbers of cells. However, the extreme
sensitivity of the method means that great care has to be taken to avoid
contamination of the sample under investigation by external DNA, such as
from minute amounts of cells from the operator.
Robustness
PCR can permit amplification of specific sequences from material in which the
DNA is badly degraded or embedded in a medium from which conventional DNA
isolation is problematic. As a result, it is again very suitable for
molecular anthropology and paleontology studies, for example the analysis of
DNA recovered from archaeological remains. It has also been used
successfully to amplify DNA from formalin-fixed tissue samples, which has
important applications in molecular pathology and, in some cases, genetic
linkage studies.
6.1.3. The major disadvantages of PCR are the general requirement for prior target
sequence information, short size and limiting amounts of product, and infidelity
of DNA replication
Despite its huge popularity, PCR has certain limitations as a method for
selectively cloning specific DNA sequences.
Need for target DNA sequence information
In order to construct specific oligonucleotide primers that permit selective
amplification of a particular DNA sequence, some prior sequence information
is necessary. This normally means that the DNA region of interest has been
partly characterized previously, often following cell-based DNA cloning.
However, a variety of techniques have been developed that reduce or even
exclude the need for prior DNA sequence information concerning the target
DNA, when certain aims are to be met. For example, previously
uncharacterized DNA sequences can sometimes be cloned using PCR with
degenerate oligonucleotides if they are members of a gene or repetitive DNA
family at least one of whose members has previously been characterized. In
some cases, PCR can be used effectively without any prior sequence
information concerning the target DNA to permit indiscriminate
amplification of DNA sequences from a source of DNA that is
present in extremely limited quantities (Section 6.2.4). Therefore, although
PCR can be applied to ensure
whole genome amplification, it does not have the advantage of cell-based DNA
cloning in offering a way of separating the individual DNA clones comprising
a genomic DNA library.
Short size and limiting amounts of PCR product
A clear disadvantage of PCR as a DNA cloning method has been the size range
of the DNA sequences that can be cloned. Unlike cell-based DNA cloning where
the size of cloned DNA sequences can approach 2 Mb (Section 4.3.4), reported DNA
sequences cloned by PCR have typically been in the 0.1–5 kb size range, often at
the lower end of this scale. Although small segments of DNA can usually be amplified
easily by PCR, it becomes increasingly more difficult to obtain efficient
amplification as the desired product length increases. Recently, however,
conditions have been identified for effective amplification of longer
targets, including a 42-kb product from the bacteriophage λ
genome.
Often, the conditions for long range PCR involve a combination of
modifications to standard conditions with a two-polymerase system. This
provides optimal levels of DNA polymerase and 3′ → 5′ exonuclease activity
which serves as a proofreading mechanism (see Box 6.1).
The amount of PCR product obtained in a single reaction is also much more
limited than the amount that can be obtained using cell-based cloning where
scale-up of the volumes of cell cultures is possible. The efficiency of a
PCR reaction will vary from template to template and according to various
factors that are required to optimize the reaction but typically only
comparatively small amounts of product are achieved.
Infidelity of DNA replication
Cell-based DNA cloning involves DNA replication in vivo, which is associated
with a very high fidelity of copying because of proofreading mechanisms (see
Box 6.1). However, when DNA is replicated in vitro the copying error rate is
considerably greater. Of the heat-stable DNA polymerases required for PCR, the
most widely used is Taq DNA polymerase derived from T. aquaticus. This DNA
polymerase, however, has no associated 3′ → 5′ exonuclease to confer a
proofreading function, and the error rate due to base misincorporation during
DNA replication is rather high: for a 1 kb
sequence that has undergone 20 effective cycles of duplication,
approximately 40% of the new DNA strands synthesized by PCR using this
enzyme will contain an incorrect nucleotide resulting from a copying error.
This means that, even if the PCR reaction involves amplification of a single
DNA sequence, the final product will be a mixture of extremely similar, but
not identical DNA sequences.
Despite the errors due to replication in vitro, DNA
sequencing of the total PCR product may give the correct sequence. This is
because, although individual DNA strands in the PCR product often contain
incorrect bases, the incorporation of incorrect bases is essentially random.
As a result, for each base position, the contribution of
one incorrect base on one or more strands is overwhelmed by the
contributions from the huge majority of strands which will have the correct
sequence. What it does mean, however, is that further analysis of the
product may be difficult. If the PCR product is to be cloned in cells (e.g.
to facilitate DNA sequencing or to permit functional studies in a cell-based
expression system), transformation selects for a single molecule, and the
cell clones chosen to be amplified will contain identical molecules, each
the same as a single starting molecule which may well have the incorrect DNA
sequence because of a copying error during PCR amplification. As a result,
several individual clones may need to be sequenced in order to determine the
correct (consensus) sequence, before selecting one with the authentic
sequence for subsequent experiments.
More recently, the problem of infidelity of DNA replication during the PCR
reaction has been considerably reduced by using alternative heat-stable DNA
polymerases which have associated 3′–5′
exonuclease activity. For example, the Pyrococcus furiosus
DNA polymerase is becoming more widely used because of the proofreading
conferred by its associated 3′–5′ exonuclease
activity (Cline et al.,
1996). The resulting PCR product has a much lower level of
mutations introduced by copying errors: for a 1 kb segment of DNA that has
undergone 20 effective cycles of duplication, about 3.5% of the DNA strands
in the product carry an altered base.
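The percentages quoted for Taq and Pfu polymerases can be related to a per-base
error rate with a simple linear approximation: fraction of strands altered ≈
error rate × target length × number of duplications. This approximation is an
assumption introduced here (it ignores strands carrying more than one error),
and the rates printed by the sketch are back-calculated from the percentages in
the text, not independently measured values. The function names are
hypothetical.

```python
def fraction_with_error(error_rate, length_bp, duplications):
    """Approximate fraction of product strands carrying a copying error."""
    return error_rate * length_bp * duplications


def implied_error_rate(fraction, length_bp, duplications):
    """Per-base, per-duplication error rate implied by a mutant fraction."""
    return fraction / (length_bp * duplications)


# Percentages quoted in the text for a 1 kb target after 20 duplications.
for enzyme, fraction in (("Taq", 0.40), ("Pfu", 0.035)):
    rate = implied_error_rate(fraction, 1000, 20)
    print(f"{enzyme}: about {rate:.2e} errors per base per duplication")
```

On this approximation the quoted figures correspond to roughly a tenfold lower
misincorporation rate for the proofreading enzyme.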
6.1.4. Cell-based cloning of PCR amplification products is often required to permit
subsequent structural and functional studies
The amount of material that can be cloned in a single PCR reaction is limited,
and it is time-consuming and expensive to repeat the same PCR reaction many
times to achieve large quantities of the desired DNA. In addition, the PCR
product may not be in a suitable form that will permit some subsequent studies.
As a result, it is often convenient to clone the PCR product in a cell-based
cloning system in order to obtain large quantities of the desired DNA and to
permit a variety of analyses. As described in the previous section, it is
important to verify that the sequence of the cloned product is representative of
the original PCR product.
Figure 6.3. Cloning of PCR products in bacterial cells
PCR products frequently have an overhanging adenosine at their
3′ ends (see text). The T-A cloning system has a
polylinker system with complementary thymine overhangs to
facilitate cloning. An alternative is to trim back the adenine
overhangs using a suitable ‘polishing’
enzyme, which leaves the fragment blunt-ended.
Various plasmid cloning systems are used to propagate PCR-cloned DNA in bacterial
cells. Once cloned, the insert can be cut out using suitable restriction
nucleases and transferred into other plasmids which may have specialized usages
in permitting expression to give an RNA product, or to provide large quantities
of a protein, etc. Several thermostable polymerases including
Taq DNA polymerase have a terminal deoxynucleotidyl
transferase activity which selectively modifies PCR-generated fragments by
adding a single nucleotide, generally adenine, to the 3′ ends of
amplified DNA fragments. The resulting overhangs can make it difficult to clone
PCR products and a variety of approaches are commonly used to facilitate
cloning, including the use of vectors with overhanging T residues in their
cloning site polylinker and the use of ‘polishing’ enzymes such as T4
polymerase or Pfu polymerase which can remove the overhanging single
nucleotides (Figure 6.3).
6.2. Applications of PCR
Figure 6.4. PCR has numerous general applications
The figure illustrates general applications. Specific applications are
described in separate chapters. Genome walking means accessing
uncharacterized DNA starting from a neighboring characterized
sequence.
Although PCR was first developed only a decade and a half ago, the simplicity and the
versatility of the technique have ensured that it is among the most ubiquitous of
molecular genetic methodologies, with a wide range of general applications
(Figure 6.4).
6.2.1. PCR enables rapid amplification of template DNA for screening of
uncharacterized mutations
Because of its rapidity and simplicity, PCR is ideally suited to providing
numerous DNA templates for mutation screening. Partial DNA sequences, at the
genomic or the cDNA level, from a gene associated with disease, or some other
interesting phenotype, immediately enable gene-specific PCR reactions to be
designed. Amplification of the appropriate gene segment then enables rapid
testing for the presence of associated mutations in large numbers of
individuals. By contrast, cell-based DNA cloning of the gene from numerous
different individuals is far too slow and labor-intensive to be considered as a
serious alternative.
Figure 6.5. PCR products for gene mutation screening are obtained from
genomic DNA using intron-specific primers flanking exons or by RT-PCR
(A) Genomic DNA. Exons 1–4 can be amplified
separately from genomic DNA using pairs of intron-specific primers
1F + 1R, 2F + 2R, etc. (B) RT-PCR. This relies on at
least some mRNA being present in easily accessible cells such as
blood cells, permitting conversion to cDNA. The cDNA can then be
used as a template for pairs of exon-specific primers (1F+1R, 2F+2R,
etc.) to generate overlapping DNA fragments.
Typically, the identification of exon-intron boundaries and sequencing of the
ends of introns of a gene of interest offers the possibility of genomic mutation
screening. Individual exon-specific amplification reactions are developed by
designing primers which recognize intronic sequences located close to the
exon-intron boundary (Figure 6.5A). The resulting PCR products are then analyzed
by rapid mutation-screening methods, in which the optimal size for mutation
screening is usually about 200 bp (see Section 15.5.1). Conveniently, the
average size of a human exon is about 180 bp but, in the case of very large
exons, it is usual to design a series of primers to generate overlapping exonic
products. PCR can also quickly provide amplified cDNA sequences for mutation
screening. Such cDNA mutation screening may be the only way in which mutations
can be screened if the exon-intron organization of a gene has not been
established. To do this, mRNA is isolated from a convenient source of tissue,
such as blood cells, converted into cDNA using reverse transcriptase and the
cDNA is used as a template for a PCR reaction. This version of the standard
genomic PCR reaction is consequently often referred to as RT-PCR (reverse
transcriptase-PCR; Figure 6.5B). Clearly, the method is ideally suited to genes
expressed at high levels in easily accessible cells, such as blood cells.
However, as a result of low level ectopic transcription of genes in all
tissues, it has also been applied to transcript analysis of genes which are not
significantly expressed in blood cells, such as the dystrophin (DMD) gene
(Chelly et al., 1989).
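For a very large exon, the series of overlapping products can be planned by
simple tiling. The sketch below is illustrative: the 200 bp amplicon size
follows the optimum mentioned above, while the 30 bp overlap, the exon length
and the function name are assumed for the example. Primer binding sites and
real sequence constraints are ignored.

```python
def tile_amplicons(exon_length, amplicon=200, overlap=30):
    """Return (start, end) coordinates (0-based, end exclusive) of
    overlapping amplicons that together cover an exon."""
    if not 0 <= overlap < amplicon:
        raise ValueError("overlap must be smaller than the amplicon size")
    tiles = []
    start = 0
    while True:
        end = min(start + amplicon, exon_length)
        tiles.append((start, end))
        if end == exon_length:
            break
        start = end - overlap   # step back so adjacent products overlap
    return tiles


print(tile_amplicons(500))   # hypothetical 500 bp exon
print(tile_amplicons(180))   # an average-sized exon fits in one product
```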
6.2.2. PCR permits rapid genotyping for polymorphic markers
Figure 6.6. Restriction site polymorphisms can easily be typed by PCR as an
alternative to laborious RFLP assays
Alleles 1 and 2 are distinguished by a polymorphism which alters the
nucleotide sequence of a specific restriction site for restriction
nuclease R: allele 1 possesses the site, but allele 2 has an altered
nucleotide(s) X, X' and so lacks it. PCR primers can be designed
simply from sequences flanking the restriction site to produce a
short product. Digestion of the PCR product with enzyme R and
size-fractionation can result in simple typing for the two
alleles.
Restriction site polymorphisms (RSPs) result in alleles possessing or lacking a
specific restriction site. Such polymorphisms can be typed using Southern blot
hybridization. A DNA probe representing the locus is hybridized against genomic
DNA samples that have been digested with the appropriate restriction enzyme and
size-fractionated by agarose gel electrophoresis. The resulting RFLPs have two
alleles corresponding to the presence or absence of the restriction site
(Section 5.3.3). As a convenient alternative to RFLPs, PCR can type RSPs by
simply designing primers using sequences which flank the polymorphic
restriction site, amplifying from genomic DNA, then cutting the PCR product
with the appropriate restriction enzyme and separating the fragments by agarose
gel electrophoresis (Figure 6.6).
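The expected banding pattern for each allele can be worked out in silico. The
sketch below is a minimal illustration: the 60 bp product sequences are
invented, and EcoRI (recognition site GAATTC, cutting after the first G) is
used only as a familiar example enzyme; neither comes from the text. Only the
top-strand cut position is modelled.

```python
def digest(product, site, cut_offset):
    """Cut a linear PCR product at every occurrence of a recognition
    site and return the fragment lengths in 5'->3' order.

    cut_offset is the number of bases of the site lying 5' of the cut.
    """
    fragments = []
    last = 0
    pos = product.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(cut - last)
        last = cut
        pos = product.find(site, pos + 1)
    fragments.append(len(product) - last)
    return fragments


# Hypothetical flanking sequences (20 nt and 34 nt) around the variable site.
flank5 = "ACGTTAGCCTAGGCATTCAG"
flank3 = "TCCGATAGGCTTACGATCCGTAGCTAGGCATCGA"
allele1 = flank5 + "GAATTC" + flank3   # site present
allele2 = flank5 + "GAGTTC" + flank3   # one base altered, site absent

print("allele 1:", digest(allele1, "GAATTC", 1))
print("allele 2:", digest(allele2, "GAATTC", 1))
```

A heterozygote would be expected to show both the uncut product and the two
smaller fragments after size-fractionation.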
Figure 6.7. PCR can be used to type short tandem repeat polymorphisms (STRPs)
The example illustrates typing of a (CA)/(TG) dinucleotide repeat
polymorphism which has three alleles as a result of variation in the
number of the (CA) repeats. On the autoradiograph each allele is represented by
a major upper band and two minor ‘shadow bands’ (see Figure 6.8). Individuals A
and B have genotypes (in brackets) as follows: A (1,3); B (2,2).
Figure 6.8. Example of typing for a CA repeat
The example illustrated shows typing of members of a large family
with the (CA)/(TG) marker D17S800. Arrows to the
left mark the top (main) band seen in different alleles
1–7. Note that individual alleles show a
strong upper band followed by two lower ‘shadow
bands’, one of intermediate intensity immediately
underneath the strong upper band, and one that is very faint and is
located immediately below the first shadow band. For the indicated
individuals, the genotypes (in brackets) are as follows: 1 (3,6); 2
(1,5); 3 (3,5); 4 (2,5); 5 (3,6); 6 (2,5); 7 (3,5); 8 (3,6); 9
(3,5); 10 (5,7); 11 (3,3); 12 (2,4); 13 (3,3); 14 (3,6); 15 (3,3);
16 (3,4). Note that in the latter case, the middle
band is particularly intense because it contains both the main band
for allele 4 plus the major shadow band for allele 3. Slipped strand
mispairing (see Section
9.3.1) is thought to be the major mechanism responsible
for producing shadow bands at tandem dinucleotide repeats (Hauge and Litt, 1993).
Short tandem repeat polymorphisms (STRPs), also called microsatellite markers,
consist of a short sequence, typically from one to four nucleotides long, that
is tandemly repeated several times, and are often characterized by many
alleles. For example, (CA)n/(TG)n repeats are often polymorphic when n exceeds
12, and have been widely used as polymorphic markers in the human genome (see
below). Increasingly, however, trinucleotide and tetranucleotide marker
polymorphisms are being typed. In each case the STRPs can be typed conveniently
by PCR. Primers are designed from sequences known to flank a specific STRP
locus, permitting PCR amplification of alleles whose sizes differ by integral
repeat units (Figure 6.7). The PCR products can then be size-fractionated by
polyacrylamide gel electrophoresis. The PCR normally includes a radioactive or
fluorescent nucleotide precursor which becomes incorporated into the small PCR
products and facilitates their detection. To ensure adequate size fractionation
of alleles, the PCR products are denatured prior to electrophoresis. An example
of the use of a CA repeat marker is shown in Figure 6.8.
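Because alleles differ by whole repeat units, the expected product size follows
directly from the repeat number. The sketch below counts the repeat units in
the longest uninterrupted (CA)n run and computes product sizes. The flanking
sequences, the repeat numbers and the 100 bp of total flanking sequence are all
invented for illustration.

```python
import re


def ca_repeat_number(seq):
    """Number of repeat units in the longest uninterrupted (CA)n run."""
    runs = re.findall(r"(?:CA)+", seq.upper())
    return max((len(run) // 2 for run in runs), default=0)


def product_size(flank_bp, repeats, unit=2):
    """Expected PCR product length: flanking sequence plus repeat array."""
    return flank_bp + repeats * unit


# Three hypothetical alleles with 14, 16 and 19 CA units.
alleles = {name: "GGTCT" + "CA" * n + "GTTAG"
           for name, n in (("1", 14), ("2", 16), ("3", 19))}

for name, seq in alleles.items():
    n = ca_repeat_number(seq)
    print(f"allele {name}: (CA){n}, product {product_size(100, n)} bp")
```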
6.2.3. A wide variety of PCR-based methods can be used to assay for known
mutations
PCR is a very rapid and valuable tool for detecting pathogenic mutations and
other mutations of interest. The examples below illustrate some popular
methods.
Allelic discrimination by size or susceptibility to restriction
enzyme
Small insertions or deletions (such as the three nucleotide deletion in the
common cystic fibrosis (CFTR) allele, F508del) can be simply detected by
designing primers from regions closely flanking the mutation site and
distinguishing the normal and mutant alleles by size on polyacrylamide or
agarose gels. If the mutation changes a restriction site, mutant and normal
alleles can be distinguished by amplifying across the mutant site and
digesting the PCR product with the relevant restriction endonuclease, exactly
as in Figure 6.6.
Allelic discrimination by susceptibility to an artificially introduced
restriction site
Even if the mutation does not result in a restriction site difference, it may
be possible to exploit the difference between normal and mutant alleles by
amplification-created restriction site PCR. This is a form of mismatched
primer mutagenesis (see Section
6.4.2) in which a primer is deliberately designed from sequence
immediately adjacent to, but not encompassing, the mutation site. The
primer is deliberately designed to have a mismatched nucleotide which
together with the sequence of the mutant site creates a restriction site not
present in normal alleles (see Figure
17.2 for a specific example).
Allele-specific PCR (ARMS test)
Figure 6.9. Correct base-pairing at the 3′ end of PCR primers is the basis of
allele-specific PCR
The allele-specific oligonucleotide primers ASP1 and ASP2 are
designed to be identical to the sequence of the two alleles over
a region preceding the position of the variant nucleotide,
up to and terminating in the variant nucleotide
itself. ASP1 will bind perfectly to the
complementary strand of the allele 1 sequence, permitting
amplification with the conserved primer. However, the
3′-terminal C of the ASP2 primer mismatches with the T
of the allele 1 sequence, making amplification impossible.
Similarly ASP2 can bind perfectly to allele 2 and initiate
amplification, unlike ASP1.
Oligonucleotide primers can be designed so as to discriminate between target
DNA sequences that differ by a single nucleotide in the region of interest.
This is a form of allele-specific PCR, the PCR equivalent of the
allele-specific hybridization which is possible with ASO probes (Section
5.3.1). In the case of allele-specific hybridization, alternative ASO probes
are designed to have differences in a central segment of the sequence (to
maximize thermodynamic instability of mismatched duplexes). However, in the
case of allele-specific PCR, ASO primers are designed to differ at the
nucleotide that occurs at the extreme 3′ terminus. This is so because the DNA
synthesis step in a PCR reaction is crucially dependent on correct base-pairing
at the 3′ end (Figure 6.9). This method can be used to type specific alleles at
a polymorphic locus, but has found particular use as a method for detecting a
specific pathogenic mutation, the so-called amplification refractory mutation
system (ARMS; Newton et al., 1989).
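The design rule for allele-specific primers can be sketched in a few lines.
The example is deliberately idealized: it treats a 3′-terminal mismatch as an
absolute block to extension, the sequences and the A/G variant are invented,
and the function names are hypothetical. In practice discrimination depends on
reaction conditions.

```python
def design_asp(sense_strand, variant_index, allele_base, length=20):
    """Allele-specific primer: identical to the sense strand upstream of
    the variant and terminating (3' end) in the allele's own base."""
    start = variant_index - length + 1
    if start < 0:
        raise ValueError("not enough upstream sequence for this length")
    return sense_strand[start:variant_index] + allele_base


def can_prime(primer, sense_strand, variant_index):
    """Idealized rule: extension requires the primer's 3'-terminal base
    to match the template allele at the variant position. The primer is
    written in the sense orientation and anneals to the complementary
    strand, so a match to the sense base means a correct base pair."""
    return primer[-1] == sense_strand[variant_index]


upstream = "ATGGCTAGCTTGACCGTAAGCTAGG"    # hypothetical sequence
downstream = "TTCAGGCATC"
allele1 = upstream + "A" + downstream     # hypothetical allele 1
allele2 = upstream + "G" + downstream     # hypothetical allele 2
idx = len(upstream)                       # position of the variant base

asp1 = design_asp(allele1, idx, "A")
asp2 = design_asp(allele2, idx, "G")
for name, primer in (("ASP1", asp1), ("ASP2", asp2)):
    for label, template in (("allele 1", allele1), ("allele 2", allele2)):
        print(name, "on", label, "->", can_prime(primer, template, idx))
```

The two primers are identical except for their final base, so each is expected
to support amplification only from its matching allele.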
Mutation detection using the 5′ → 3′
exonuclease activity of Taq DNA polymerase (TaqMan™
assay)
Figure 6.10. The TaqMan™ 5′ exonuclease assay
In addition to two conventional PCR primers, P1 and P2, which are
specific for the target sequence, a third primer, P3, is
designed to bind specifically to a site on the target sequence
downstream of the P1 binding site. P3 is labeled with two
fluorophores, a reporter dye (R) is attached at the 5′
end, and a quencher dye (D), which has a different emission
wavelength to the reporter dye, is attached at its 3′
end. Because its 3′ end is blocked, primer P3 cannot
by itself prime any new DNA synthesis. During the PCR reaction,
Taq DNA polymerase synthesizes a new DNA
strand primed by P1 and as the enzyme approaches P3, its
5′ → 3′ exonuclease activity
processively degrades the P3 primer from its 5′ end.
The end result is that the nascent DNA strand extends beyond the
P3 binding site and the reporter and quencher dyes are no longer
bound to the same molecule. As the reporter dye is no longer in
close proximity to the quencher, the resulting increase in
reporter emission intensity is easily detected.
Taq polymerase does not possess a proofreading 3′ → 5′ exonuclease activity
but does possess a 5′ → 3′ exonuclease activity. This property can be
exploited to facilitate detection of specific alleles (Holland et al., 1991;
Lee et al., 1993). Such an assay involves hybridization of three primers, the
third primer being intended to bind just downstream of one of the conventional
primers which should be allele-specific. The additional primer carries a
blocking group at the 3′ terminal nucleotide so that it cannot prime new DNA
synthesis and at its 5′ end carries a labeled group. In modern versions of the
assay, the label is a fluorogenic group and the third primer also carries a
quencher group (see Figure 6.10). If the upstream primer which is bound to the
same strand is able to prime successfully, Taq DNA polymerase will extend a
new DNA strand until it encounters the third primer, in which case its 5′ → 3′
exonuclease will degrade the primer, causing release of separate nucleotides
containing the dye and the quencher, and an observable increase in
fluorescence.
6.2.4. Degenerate oligonucleotide primers and primers specific for ligated linker
sequences permit co-amplification of sequence families, or even indiscriminate
amplification
Figure 6.11. DOP-PCR can permit cDNA cloning using degenerate oligonucleotides
The figure illustrates cloning of a cDNA for porcine urate oxidase
using degenerate oligonucleotides corresponding to a known amino
acid sequence. The sense primer was constructed to correspond to the
codons 7–11 plus the first two bases of codon 12, and the
antisense primer corresponded to codons 34–38 (Lee et al.,
1988). The amino acid sequences chosen for constructing
primers were selected on the basis of their high content of amino
acids which were specified by only two codons (Asp, Tyr, Lys, Asn,
His, see Figure 1.22). The
primers have 5′ extensions containing recognition
sequences for restriction nucleases, in order to facilitate
subsequent cell-based cloning.
DOP-PCR (degenerate oligonucleotide-primed PCR) is a form of PCR
which is deliberately designed to permit possible amplification of several
products. The two primers may be
partially degenerate
oligonucleotides, composed of panels of oligonucleotide sequences that have the
same base at certain nucleotide positions, but are different at others. As a
result, there may be comparatively many
primer binding sites in the source DNA.
This provides a means of searching for a new or uncharacterized DNA sequence
that belongs to a family of related sequences either within or between species.
The use of such primers also provides a way of cloning a gene when only a
limited portion of amino acid sequence is known for the product (Figure 6.11).
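As a rough illustration of what a panel of partially degenerate
oligonucleotides means in practice, the following Python sketch expands a
degenerate primer written with the standard IUPAC ambiguity codes and counts
the individual sequences in the panel. The peptide used (Asp-Tyr-Lys-Asn-His)
and the function names are hypothetical; the peptide was chosen only because
each of these amino acids is specified by two codons, and it is not the primer
sequence used in the urate oxidase example.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer):
    """Return the number of distinct oligonucleotides in the primer panel."""
    count = 1
    for code in primer:
        count *= len(IUPAC[code])
    return count


def expand(primer):
    """List every individual oligonucleotide represented by the primer."""
    return ["".join(bases) for bases in product(*(IUPAC[c] for c in primer))]


# Hypothetical peptide Asp-Tyr-Lys-Asn-His, back-translated codon by codon:
# Asp = GAY, Tyr = TAY, Lys = AAR, Asn = AAY, His = CAY
primer = "GAYTAYAARAAYCAY"
panel = expand(primer)
print(degeneracy(primer), "sequences in the panel")
```

Because each of the five codons has a single twofold-degenerate position, the
panel stays small; amino acids specified by four or six codons would multiply
the panel size much faster.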
DOP-PCR can also be used to permit comparatively indiscriminate amplification of
target DNA. Primers containing random sequences can bind to numerous
locations in the template DNA and permit a form of whole-genome amplification
(Zhang et al., 1992; Cheung and Nelson, 1996). This can be advantageous where
the amount of starting DNA may be limiting (as in the case of extracts from
ancient DNA samples, microdissected chromosome bands, single cell typing, etc.),
and PCR amplification of essentially all sequences increases the amount of DNA
for study.
Linker-primed PCR (ligation adaptor PCR)
Figure 6.12
.
Linker-primed PCR permits indiscriminate amplification of DNA
sequences in a complex target DNA
The linker (adaptor) molecule is a double-stranded
oligonucleotide formed by annealing two single-stranded
oligonucleotides which are complementary in sequence except that
one possesses a 5′ overhang compatible with a
restriction nuclease overhang (in this case, the 5′
GATC overhang produced by MboI). After ligation
of the linker to the target restriction fragments, a
linker-specific primer can result in amplification of all
fragments by binding to two flanking linker molecules.
Another way of enabling amplification of essentially all DNA sequences in a
complex DNA mixture involves first ligating a known sequence to all
fragments. To do this, the target DNA population is digested with a suitable
restriction endonuclease, and double-stranded oligonucleotide linkers (also
called adaptors) with a suitable overhanging end are ligated to the ends of
the target DNA fragments. Amplification is then performed using
oligonucleotide primers which are specific for the linker sequences. In this
way, all fragments of the DNA source which are flanked by linker
oligonucleotides can be amplified (Figure 6.12).
6.2.5. Anchored PCR uses a target-specific primer and a universal primer for
amplifying sequences adjacent to a known sequence
Figure 6.13
.
‘Genome walking’ by anchored PCR
The target may be a complex source of DNA comprised of many fragments
to which an anchor sequence is attached, for example a
double-stranded oligonucleotide linker. The idea is to use a primer
specific for the anchor sequence and one specific for a known
sequence X to be able to rescue fragments containing sequence X and
so gain access to previously unidentified sequences adjacent to X.
In this example the anchored sequence is shown only on the left hand
side for clarity and permits amplification of the previously
uncharacterized N1 sequence adjacent to known sequence X. A variety of
derivative methods have been devised, such as bubble-linker PCR
(Figure 10.16).
It is often desirable to be able to amplify previously uncharacterized DNA
sequences that neighbor a known DNA sequence, either at the genomic or cDNA
level. To do this, a form of anchored PCR is used (see Figure 6.13). One of
the primers is specific for the target sequence and the second primer is
specific for a common sequence that can be introduced in different ways, such
as by using a linker-primer method as described in the previous section, or by
using primers that are modified at the 5′ end so as to introduce a
novel sequence.
6.3. DNA sequencing
6.3.1. DNA sequencing usually involves enzymatic DNA synthesis in the presence of
base-specific dideoxynucleotide chain terminators
Figure 6.14
.
A universal sequencing primer can be used to sequence many
different template DNAs
DNA templates for DNA sequencing are often single-stranded
recombinant DNA molecules. Different clones will often contain
different inserts within the same vector molecule. As a result, a
universal sequencing primer (P) can be designed to be complementary
to a short vector sequence located next to the cloning site(s),
allowing sequencing of different insert DNAs.
Formerly, chemical DNA sequencing methods were often employed, using
base-specific chemical modification and subsequent cleavage of the DNA.
Currently, however, the vast majority of DNA sequencing is carried out using an
enzymatic method: the DNA to be sequenced is provided in a single-stranded form
from which DNA polymerase synthesizes new complementary DNA strands. Usually,
the single-stranded DNA template is obtained using a cloning system which
permits recovery of single-stranded recombinant DNA, as with M13 or phagemid
cloning systems (Section 4.4.1 and Figure 4.17). The subsequent DNA sequencing
reactions involve DNA synthesis using one or more labeled nucleotides and a
universal sequencing primer that is complementary to the vector sequence
flanking the cloning site (Figure 6.14).
Figure 6.15
.
Structure of a dideoxynucleotide, 2′, 3′
dideoxy CTP
Note that the hydroxyl group which is attached to
carbon 3′ in normal nucleotides (see Figure 1.2) is replaced by a hydrogen atom.
In addition to the normal nucleotide precursors, DNA synthesis is carried out in
the presence of base-specific dideoxynucleotides (ddNTPs). The latter are
analogs of the normal dNTPs but differ in that they lack a hydroxyl group at
the 3′ carbon position as well as the 2′ carbon (Figure 6.15). A
dideoxynucleotide can be
incorporated into the growing DNA chain by forming a phosphodiester bond between
its 5′ carbon atom and the 3′ carbon of the previously
incorporated nucleotide. However, since ddNTPs lack a 3′ hydroxyl
group, any ddNTP that is incorporated into a growing DNA chain cannot
participate in phosphodiester bonding at its 3′ carbon atom, thereby
causing abrupt termination of chain synthesis.
Figure 6.16
.
Dideoxy DNA sequencing relies on synthesizing new DNA strands
from a single-stranded DNA template and random incorporation of a
base-specific dideoxynucleotide to terminate chain synthesis
(A) Principle of dideoxy sequencing. The sequencing
primer binds specifically to a region 3′ of the desired
DNA sequence and primes synthesis of a complementary DNA strand in
the indicated direction. Four parallel base-specific reactions are
carried out, each with all four dNTPs and with one ddNTP.
Competition for incorporation into the growing DNA chain between a
ddNTP and its normal dNTP analog results in a population of
fragments of different lengths. The fragments will have a common
5′ end (defined by the sequencing primer) but variable
3′ ends, depending on where a dideoxynucleotide (shown
with a filled circle above) has been inserted. For example, in the
A-specific reaction chain, extension occurs until a ddA nucleotide
(shown as A with a filled black circle above) is incorporated. This
will lead to a population of DNA fragments of lengths
n + 2, n + 5,
n + 13, n + 16 nucleotides,
etc. (B) Conventional DNA sequencing. This generally
involves using a radioactively labeled nucleotide and
size-fractionation of the products of the four reactions in separate
wells of a polyacrylamide gel. The dried gel is submitted to
autoradiography, allowing the sequence of the complementary strand
to be read (from bottom to top). The bottom panel illustrates a
practical example, in this case a sequence within the gene for type
II neurofibromatosis.
Four parallel base-specific reactions are conducted using a mix of all four dNTPs
and also a small proportion of one of the four ddNTPs. By setting the
concentration of the ddNTP to be very much lower than that of its normal dNTP
analog, chain termination will occur randomly at one of the many positions
containing the base in question. Each reaction is therefore a partial
reaction: chain termination occurs randomly at one of the possible bases in
any one DNA strand. However, the DNA to be sequenced in a DNA sequencing
reaction is a population of (usually) identical molecules. As a result, each
one of the four base-specific reactions will generate a collection of labeled
DNA fragments of different sizes, with a common 5′ end but variable
3′ ends (Figure 6.16). The common 5′ end is defined by the sequencing
primer. The 3′ ends, which terminate with the chosen ddNTP, are variable
because the insertion of the dideoxynucleotide occurs randomly at one of the
many different positions that will accept that specific base.
Fragments that differ in size by even a single nucleotide can be separated on a
denaturing polyacrylamide gel. The differently sized fragments can be detected
by incorporating labeled groups into the reaction products, either by
incorporating labeled nucleotides or by using a
primer with a labeled group. The
sequence can then be read from the bottom of the gel to the top,
a direction that gives the 5′ → 3′ sequence of the
complementary strand of the provided DNA template (see Figure 6.16).
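The logic of the four base-specific reactions and of reading the gel can be
sketched in a few lines of Python. This is a simplified model under stated
assumptions: the template is a short hypothetical sequence, every possible
termination product is assumed to be present, fragment lengths are counted
from the first nucleotide added to the primer, and the function names are
invented for the illustration.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def new_strand(template_3_to_5):
    """Strand synthesized 5'->3' against a template written 3'->5'."""
    return "".join(COMPLEMENT[base] for base in template_3_to_5)


def termination_lengths(synthesized, dd_base):
    """Lengths of the fragments ending in the given dideoxynucleotide."""
    return [i + 1 for i, base in enumerate(synthesized) if base == dd_base]


def read_gel(lanes):
    """Read bands from the shortest (bottom) to the longest (top)."""
    bands = sorted(
        (length, base) for base, lengths in lanes.items() for length in lengths
    )
    return "".join(base for _, base in bands)


# Hypothetical template, written 3'->5' so that synthesis runs left to right.
template = "TACGGATCCA"
synthesized = new_strand(template)

# One lane per base-specific reaction (ddA, ddC, ddG, ddT).
lanes = {base: termination_lengths(synthesized, base) for base in "ACGT"}
print(lanes)
print(read_gel(lanes))
```

Reading the bands in order of increasing fragment length recovers the newly
synthesized strand, which is the complement of the template that was provided.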
6.3.2. DNA sequencing is increasingly being conducted using fluorescent labeling
systems and automated detection systems
Traditional dideoxy sequencing methods have employed radioisotope labeling: the
dNTP mix contains a proportion of radiolabeled nucleotides which are
incorporated within the growing DNA chains. Following electrophoresis, the gel
is dried and an autoradiographic film is placed in contact with the dried gel.
After a suitable exposure time, the film is developed, giving a characteristic
pattern of dark bands (Figure 6.16B).
32P-labeled nucleotides are not very suitable for this purpose:
the high energy β-radiation causes considerable scattering of the
signal, leading to diffuse bands. Instead,
35S- or
33P-labeled nucleotides have been used.
Figure 6.17
.
Automated DNA sequencing using fluorescent primers
(A) Principles of automated DNA sequencing. Automated
DNA sequencing involves loading all four reaction products into
single lanes of the electrophoresis gel and capture of sequence data
during the electrophoresis run. Four separate fluorescent dyes are
used as labels for the base-specific reactions (the label can be
incorporated by being attached to a base-specific ddNTP, or by being
attached to the primer and having four sets of primers corresponding
to the four reactions). During the electrophoresis run, a laser beam
is focused at a specific constant position on the gel. As the
individual DNA fragments migrate past this position, the laser
causes the dyes to fluoresce. Maximum fluorescence occurs at
different wavelengths for the four dyes, and the information is
recorded electronically and the interpreted sequence is stored in a
computer database. (B) Example of DNA sequence output.
This shows a typical output of sequence data from an ABI 377
automated DNA sequencer as a succession of dye-specific (and
therefore base-specific) intensity profiles. The example illustrated
represents sequencing of the end of a BAC clone from chromosome
3q26.3. Data provided by Dr Emma Tonkin, University of Newcastle
upon Tyne. Figure kindly sponsored by PE Biosystems, a PE
Corporation Business.
Large-scale DNA sequencing efforts are dependent on improving efficiency by
partial automation of the technologies involved. One major improvement in recent
years has been the development of automated procedures for fluorescent DNA
sequencing (Wilson et al., 1990). These procedures generally use primers or
dideoxynucleotides to which fluorophores are attached (chemical groups capable
of fluorescing; see Section 5.1.2). During electrophoresis, a monitor detects
and records the fluorescence signal as the DNA passes through a fixed point in
the gel (Figure 6.17A). The use of different fluorophores in the four
base-specific reactions means that, unlike conventional DNA sequencing, all
four reactions can be loaded into a single lane. The output is in the form of
intensity profiles for each of the differently colored fluorophores
(Figure 6.17B), but the information is simultaneously stored electronically.
This avoids the transcription errors that can arise when an interpreted
sequence is typed by hand into a computer file. Recent advances in technology
mean that the accuracy of DNA sequencing using automated methods is acceptably
high.
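The step from dye-specific intensity profiles to an interpreted sequence can
be illustrated with a deliberately naive Python sketch: at each peak position
the base is assigned from whichever dye gives the strongest signal. The dye
names, the dye-to-base assignment and the intensity values are all hypothetical,
and real base-calling software must also handle peak detection, spacing, dye
mobility differences and signal overlap, none of which is modeled here.

```python
# Hypothetical dye-to-base assignment; actual chemistries define their own.
DYE_TO_BASE = {"blue": "C", "green": "A", "yellow": "G", "red": "T"}


def call_bases(peaks, dye_to_base=DYE_TO_BASE):
    """Call one base per peak from the dye with the strongest signal."""
    calls = []
    for intensities in peaks:
        strongest = max(intensities, key=intensities.get)
        calls.append(dye_to_base[strongest])
    return "".join(calls)


# Hypothetical intensities recorded at four successive peak positions.
peaks = [
    {"blue": 12, "green": 950, "yellow": 30, "red": 8},
    {"blue": 20, "green": 15, "yellow": 25, "red": 870},
    {"blue": 10, "green": 22, "yellow": 910, "red": 14},
    {"blue": 880, "green": 18, "yellow": 9, "red": 27},
]
sequence = call_bases(peaks)
print(sequence)
```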
6.3.3. PCR-amplified products are often used for DNA sequencing
Cycle sequencing
Double-stranded DNA templates can be used in standard dideoxy sequencing by
denaturing the DNA prior to binding the oligonucleotide
primer. However, the
quality of sequences from initially double-stranded DNA templates is often
poor. Cycle sequencing, also called linear amplification sequencing, is a
PCR-based sequencing approach which overcomes this problem. Like the
standard PCR reaction, it uses a thermostable DNA polymerase and a
temperature cycling format of denaturation, annealing and DNA synthesis.
The difference is that cycle sequencing employs only one primer and
includes a ddNTP chain terminator in the reaction. Because only a single
primer is used, the product accumulates linearly, unlike the exponential
increase in product during standard PCR reactions (see ). Product
accumulates over the cycles, the sequencing reactions are carried out at
high temperature, and there are multiple heat denaturation steps. As a
result, small amounts of double-stranded plasmids, cosmids, λDNA and PCR
products may be sequenced reliably without a separate heat denaturation
step.
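The contrast between linear and exponential accumulation is easy to see with
idealized numbers. The sketch below assumes perfect efficiency in every cycle
and a single starting template, which real reactions do not achieve; the
function names are invented for the illustration.

```python
def pcr_product(cycles, templates=1):
    """Idealized standard PCR: two primers, product doubles every cycle."""
    return templates * 2 ** cycles


def cycle_sequencing_product(cycles, templates=1):
    """Idealized cycle sequencing: one primer, so each template gives rise
    to one new strand per cycle and new strands are not themselves copied."""
    return templates * cycles


for n in (1, 5, 25):
    print(n, pcr_product(n), cycle_sequencing_product(n))
```

Under these assumptions, 25 cycles give about 3.4 × 10^7 copies per template
in a standard PCR but only 25 terminated strands per template in cycle
sequencing.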
6.3.4. DNA microarray technology permits an alternative approach to DNA
sequencing
DNA sequencing can be accomplished by hybridization of the target DNA to a series
of oligonucleotides of known sequence, usually about 7–8 nucleotides
long. If the hybridization conditions are specific, it is possible to check
which oligonucleotides are positive by hybridization, feed the results into a
computer and use a program to look for sequence overlaps in order to establish
the required DNA sequence. DNA microarrays have permitted sequencing by
hybridization to oligonucleotides on a large scale (Southern, 1996) and, in a
test system, the sequence of human mtDNA, first determined in 1981, was
recently re-sequenced by DNA microarray hybridization. This type of technology
is increasing in importance
for assessing sequence variation over at least modest lengths of DNA and
diagnostic applications in mutation analysis are proliferating (Hacia, 1999; Section 17.1.4).
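The idea of looking for sequence overlaps among the positive oligonucleotides
can be sketched as follows. This is a minimal illustration, not the algorithm
used in any particular study: it assumes error-free hybridization, a target in
which no oligonucleotide occurs more than once, and an unambiguous chain of
overlaps. Oligonucleotides of length 4 are used only to keep the example
short, and the target sequence and function names are hypothetical.

```python
def spectrum(sequence, k):
    """Set of all length-k oligonucleotides that the target would hybridize to."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def reconstruct(positive_oligos):
    """Rebuild the target by chaining oligonucleotides that overlap by k-1."""
    oligos = set(positive_oligos)
    k = len(next(iter(oligos)))
    suffixes = {oligo[1:] for oligo in oligos}
    # The first oligonucleotide is the one whose prefix is nobody's suffix.
    starts = [oligo for oligo in oligos if oligo[:-1] not in suffixes]
    if len(starts) != 1:
        raise ValueError("no unique starting oligonucleotide")
    sequence = starts[0]
    remaining = oligos - {sequence}
    while remaining:
        tail = sequence[-(k - 1):]
        candidates = [oligo for oligo in remaining if oligo[:-1] == tail]
        if len(candidates) != 1:
            raise ValueError("ambiguous or broken chain of overlaps")
        sequence += candidates[0][-1]
        remaining.remove(candidates[0])
    return sequence


target = "ATGCGTACCTGA"  # hypothetical target sequence
positives = spectrum(target, 4)
print(reconstruct(positives))
```

Repeated sequences in the target break the uniqueness assumption, which is
one reason why longer oligonucleotides are needed for longer targets.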
6.4. In vitro site-specific mutagenesis
Mutagenesis is a fundamentally important DNA technology which seeks to change the
base sequence of DNA and test its effect on gene or DNA function. The mutagenesis
can be conducted in vivo (in studies of model organisms, or
cultured cells) or in vitro and the mutagenesis can be directed to
a specific site in a pre-determined way (site-directed mutagenesis), or can be random. In the case of in
vivo mutagenesis, for example, gene targeting offers exquisite
site-directed mutagenesis within living cells (Section 21.3.1) while exposure of male mice to high levels of a powerful
mutagen such as ethylnitrosourea (ENU) and subsequent mating of the mice offers a
form of random mutagenesis which can be important in generating new mutants (Section 21.4.1).
In vitro mutagenesis can involve essentially random approaches to
mutagenesis, which may be valuable in producing libraries of new mutants. In
addition, if a gene has been cloned and a functional assay of the product is
available, it is also very useful to be able to employ a form of in
vitro mutagenesis which results in alteration of a specific amino acid
or small component of the gene product in a predetermined way.
6.4.1. Oligonucleotide mismatch mutagenesis is a popular method of introducing a
predetermined single nucleotide change into a cloned gene
Figure 6.19
.
Oligonucleotide mismatch mutagenesis can create a desired point
mutation at a unique predetermined site within a cloned DNA
molecule
The figure illustrates only one of many different methods of
cell-based oligonucleotide mismatch mutagenesis (for alternative
PCR-based
site-directed mutagenesis, see
Section 6.4.2). The example illustrates the
use of a mutagenic oligonucleotide to direct a single nucleotide
substitution in a gene. The gene is cloned into M13 in order to
generate a single-stranded
recombinant DNA (
Section 4.4.1). An oligonucleotide
primer is
designed to be complementary in sequence to a portion of the gene
sequence encompassing the nucleotide to be mutated (A) and
containing the desired noncomplementary base at that position (C,
not T). Despite the internal mismatch,
annealing of the mutagenic
primer is possible, and second strand synthesis can be extended by
DNA polymerase and the gap sealed by
DNA ligase. The resulting
heteroduplex can be transformed into
E. coli,
whereupon two populations of recombinants can be recovered:
wild-type and mutant homoduplexes. The latter can be identified by
molecular hybridization (by using the mutagenic
primer as an
allele-specific oligonucleotide
probe; see
Figure 5.11) or by PCR-based
allele-specific
amplification methods (see ).
Many in vitro assays of gene function are designed to gain information
on the importance of individual amino acids in the encoded polypeptide. This
may be relevant when attempting to assess whether a particular missense
mutation found in a known disease gene is pathogenic, or more generally when
trying to evaluate the contribution of a specific amino acid to the biological
function of a protein. A popular general approach involves cloning the gene or
cDNA into an M13 or phagemid vector which permits recovery of single-stranded
recombinant DNA (Section 4.4.1). A mutagenic oligonucleotide primer is then
designed whose sequence is perfectly complementary to the gene sequence in the
region to be mutated, but with a single difference: at the intended mutation
site it bears a base that is complementary to the desired mutant nucleotide
rather than the original. The mutagenic oligonucleotide is then allowed to
prime new DNA synthesis to create a complementary full-length sequence
containing the desired mutation. The newly formed heteroduplex is used to
transform cells, and the desired mutant genes can be identified by screening
for the mutation (see Figure 6.19).
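The design rule for the mutagenic oligonucleotide can be written out as a
short Python sketch. It assumes that the single-stranded template is given
5′ → 3′, that the mutation is specified as the base wanted at that position in
the template-sense strand, and that a fixed number of flanking nucleotides on
each side is enough for specific annealing. The template sequence, the flank
length and the function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(sequence):
    """Return the reverse complement, written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(sequence))


def mutagenic_oligo(template, position, new_base, flank=8):
    """Oligonucleotide (5'->3') complementary to the template around
    `position` (0-based), except that it pairs with `new_base` there."""
    if template[position] == new_base:
        raise ValueError("new base is identical to the original base")
    start = max(0, position - flank)
    end = min(len(template), position + flank + 1)
    mutant_region = template[start:position] + new_base + template[position + 1:end]
    return reverse_complement(mutant_region)


# Hypothetical single-stranded template; change T at index 10 to C.
template = "GATTACAGGCTTAGCATCGGA"
oligo = mutagenic_oligo(template, 10, "C")
wild_type_oligo = reverse_complement(template[2:19])
mismatches = sum(a != b for a, b in zip(oligo, wild_type_oligo))
print(oligo, mismatches)
```

The oligonucleotide differs from a perfectly matched one at exactly one
central position, so the flanking complementary sequence on both sides still
allows it to anneal to the wild-type template.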
Other small-scale mutations can also be introduced in addition to single
nucleotide substitutions. For example, it is possible to introduce a
three-nucleotide deletion that will result in removal of a single amino acid
from the encoded polypeptide, or an insertion that adds a new amino acid.
Provided the mutagenic oligonucleotide is long enough, it will be able to bind
specifically to the gene template even if there is a considerable central
mismatch. Still larger mutations can be introduced by using cassette
mutagenesis, in which a specific region of the original gene sequence is
deleted and replaced by oligonucleotide cassettes (Bedwell et al., 1989).
6.4.2. PCR can be used to couple desired sequences or chemical groups to a target
sequence and to produce specific pre-determined mutations in DNA
sequences
In addition to long-established non-PCR-based methods, site-directed mutagenesis
by PCR has become increasingly popular and various strategies have been devised
to enable base substitutions, deletions and insertions (see below and Newton and Graham, 1997). In addition to
producing specific predetermined mutations in a target DNA, a form of
mutagenesis known as 5′ add-on mutagenesis permits addition of a
desired sequence or chemical group in much the same way as can be achieved using
ligation of oligonucleotide linkers (see Box 4.2).
5′ Add-on mutagenesis
This is a commonly used practice in which a new sequence or chemical group is
added to the 5′ end of a PCR product by designing primers which
have the desired specific sequence for the 3′ part of the
primer
while the 5′ part of the
primer contains the novel sequence or a
sequence with an attached chemical group. The extra 5′ sequence
does not participate in the first
annealing step of the PCR reaction (only
the 3′ part of the
primer is specific for the target sequence),
but it subsequently becomes incorporated into the amplified product, thereby
generating a
recombinant product (). Various popular alternatives for the extra 5′
sequence include: (i) a suitable restriction site, which may facilitate
subsequent cell-based DNA cloning; (ii) a functional component, e.g. a
promoter sequence for driving expression (see Figure 17.9 for an example);
and (iii) a modified nucleotide containing a reporter group or labeled
group, such as a biotinylated nucleotide (see Figure 10.24 for an example)
or fluorophore.
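How a 5′ extension ends up in the amplified product can be sketched in
Python. The model below only does string bookkeeping: it locates the two
target-specific primer sequences in a template and returns the top strand
of the final product with the extensions attached. The template, the primer
sequences and the function names are hypothetical; the extensions used are
the EcoRI (GAATTC) and BamHI (GGATCC) recognition sequences, chosen simply
as familiar examples of restriction sites.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(sequence):
    """Return the reverse complement, written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(sequence))


def add_on_product(template, fwd_specific, rev_specific, fwd_ext="", rev_ext=""):
    """Top strand (5'->3') of the product made with primers whose 3' parts
    match the target and whose 5' parts carry the extra sequences."""
    start = template.find(fwd_specific)
    rev_site = reverse_complement(rev_specific)
    end = template.find(rev_site, start)
    if start < 0 or end < 0:
        raise ValueError("primer binding site not found in template")
    core = template[start:end + len(rev_site)]
    return fwd_ext + core + reverse_complement(rev_ext)


template = "TTTTACGTACGTAGCTAGCTAGGCTAACGGTTTT"  # hypothetical target region
product = add_on_product(
    template,
    fwd_specific="ACGTACGTAG",  # 3' part of the forward primer
    rev_specific="CCGTTAGC",    # 3' part of the reverse primer
    fwd_ext="GAATTC",           # EcoRI site added at the 5' end
    rev_ext="GGATCC",           # BamHI site added at the 5' end
)
print(product)
```

Only the target-specific 3′ parts are searched for in the template, which
mirrors the point that the 5′ extensions play no part in the first annealing
step but are present in the finished product.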
Mismatched primer mutagenesis
The
primer is designed to be only partially complementary to the target site
but in such a way that it will still bind specifically to the target.
Inevitably this means that the mutation is introduced close to the extreme
end of the PCR product. As described in
Section 6.2.3 this approach may be exploited to introduce an
artificial diagnostic
restriction site that permits screening for a known
mutation. Mutations can also be introduced at any point within a chosen
sequence using mismatched primers. Two mutagenic reactions are designed in
which the two separate PCR products have partially overlapping sequences
containing the mutation. The denatured products are combined to generate a
larger product with the mutation in a more central location (
Higuchi, 1990; ).