Am J Pathol. Nov 1999; 155(5): 1467–1471.
PMCID: PMC1866966

A High Frequency of Sequence Alterations Is Due to Formalin Fixation of Archival Specimens


Genomic analysis of archival tissues fixed in formalin is of fundamental importance in biomedical research, and numerous studies have used such material. Although the possibility of polymerase chain reaction (PCR)-introduced artifacts is known, the use of direct sequencing has been thought to overcome such problems. Here we report the results from a controlled study, performed in parallel on frozen and formalin-fixed material, in which a high frequency of nonreproducible sequence alterations was detected with the use of formalin-fixed tissues. Defined numbers of well-characterized tumor cells were amplified and analyzed by direct DNA sequencing. No nonreproducible sequence alterations were found in frozen tissues. In formalin-fixed material, up to one mutation artifact per 500 bases was recorded. The chance of such artificial mutations in formalin-fixed material was inversely correlated with the number of cells used in the PCR: the fewer cells, the more artifacts. A total of 28 artificial mutations were recorded, of which 27 were C-T or G-A transitions. Through confirmatory sequencing of independent amplification products, artifacts can be distinguished from true mutations. However, because this problem was not acknowledged earlier, the presence of artifacts may have profoundly influenced previously reported mutations in formalin-fixed material, including those inserted into mutation databases.

Analysis of nucleic acids from paraffin-embedded tissue blocks is crucial in today’s clinical research. It is known that the formalin fixation procedure lowers the success of polymerase chain reaction (PCR) amplification [1] because of cross-linking between protein and DNA [2]. Nevertheless, a great number of reports based on formalin-fixed paraffin-embedded tissues used for amplification and subsequent analysis have been published, and the results have been incorporated into databases. Use of the PCR has permitted the analysis of decreasing amounts of template, allowing genetic analysis of single cells in tissue sections [3]. The use of amplification techniques makes the analysis vulnerable for several reasons. Randomly scattered nucleotide substitutions due to misincorporation by the Taq DNA polymerase [4] are observed after cloning of PCR products and are well documented [4,5]. Direct sequencing of the amplified PCR product theoretically overcomes this problem, because the effect of such randomly distributed mutations should be masked by the consensus sequence. Mutations detected by direct sequencing are therefore generally considered as true, especially when nonambiguous, and the need for independent confirmation (starting from new amplification of the original sample lysate) may be overlooked. Nevertheless, we have previously noted a disturbing occurrence of nonreproducible mutations in studies involving amplification and direct DNA sequence analysis of the p53 gene in formalin-fixed samples of lung, breast, bladder, and skin cancer (unpublished data). To determine the presence and frequency of these artifacts we compared PCR amplification and direct sequencing analysis of frozen and formalin-fixed parallel tumor tissue, from one well-characterized tumor, under controlled conditions.

Materials and Methods

Sample Preparation

Clinical samples from a basal cell cancer containing known mutations [6] were used in this study. Biopsies were sliced immediately after excision; one part was snap-frozen and cryosectioned, and the other part was fixed in formalin and paraffin embedded. The 12–16-μm-thick sections were microdissected with a small scalpel (Alcon Ophthalmic knife 15°). The number of microdissected cells was estimated at a minimum of 1500 for the frozen sample and 2000 for the formalin-fixed sample. The number of microdissected cells available per PCR is based on this first estimate. The samples were transferred to tubes containing 50 μl PCR buffer (10 mM Tris-HCl (pH 8.3), 50 mM KCl). Cells were lysed by the addition of 2 μl freshly prepared proteinase K solution (25 mg/ml, dissolved in redistilled water) at 56°C for 1 hour, incubated with 0.5 volume Chelex slurry (1:1 w/v Chelex 100 resin/redistilled water) for 10 minutes at room temperature, followed by heat inactivation (95°C for 5 minutes). The mixture was centrifuged (5000 rpm for 5 minutes), and the supernatant was carefully removed by aspiration to a clean microcentrifuge tube. Dilution series were made to correspond to 300 to 10 cells per 2 μl.

PCR Amplification

Aliquots of the different dilutions were amplified into six shorter fragments in an outer multiplex PCR (covering 900 bp of exons 4–9 of the p53 gene), followed by inner specific PCRs for each exon. This technique [7] was developed especially to facilitate analysis of small samples, down to a single microdissected cell [3]. The outer amplification was performed for 35 cycles, using AmpliTaq and Stoffel Fragment AmpliTaq polymerases (Perkin-Elmer, Norwalk, CT). After dilution (25-fold for exons 4, 5, and 7–9 and 100-fold for exon 6), inner region specific amplifications for exons 4–9 were performed (35 cycles). One of the inner primers for each fragment was labeled with biotin to permit solid-phase sequencing of PCR templates. Several PCR amplifications were made for each dilution.

Sequence Analysis

Solid-phase direct DNA sequencing was performed essentially according to the methods described in refs. 7 and 8, with the use of Streptavidin-coated combs (AutoLoad Solid Phase Sequencing Kit; Amersham Pharmacia Biotech, Uppsala, Sweden) and automated laser fluorescent analysis (ALFExpress; Amersham Pharmacia Biotech). A total of 3600 bases (four repeats of 900 bases) per dilution were analyzed, except for the highest dilutions of formalin-fixed samples, where 4300 bases covering exons 5–9 were analyzed (because exon 4 did not amplify).

Recording of Sequence Alterations

The reference basal cell cancer was known to harbor two mutations (codons 130 and 285) in all of its parts [6]. These were here considered elements of the “correct” prototype nucleotide sequence. The sequences recorded (from dilutions of frozen and formalin-fixed parts of the tumor) were compared to the prototype sequence for the detection of additional alterations. All dilutions originated from the same microdissected frozen or formalin sample, and each dilution was amplified in several different outer PCRs. Additional alterations were considered artifacts when they did not appear in amplicons of different outer PCRs. All artifacts could be “confirmed,” however, by repeated analyses of amplicons of the same outer PCR product (by a new inner PCR and sequencing). The ratios between the total number of confirmed artifacts and number of bases sequenced for the respective dilutions were tabulated (Table 1). Detection of a mutation in this assay requires at least 20% of the amplified product to harbor the alteration.
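The recording rule above can be sketched in a few lines of Python. This is a minimal illustration only, not the authors' software: the function name, the alteration labels, and the three example PCRs are hypothetical, and "reproducible" is taken here to mean "seen in at least two independent outer PCRs".

```python
from collections import Counter

# The two known tumor mutations, treated as part of the prototype sequence.
# The labels are placeholders, not the actual base changes.
PROTOTYPE = {"codon130", "codon285"}

def classify_alterations(outer_pcrs):
    """Split additional alterations into reproducible ones and suspected artifacts.

    outer_pcrs maps an outer-PCR label to the set of alterations called in
    its sequence. Alterations seen in only one outer PCR are flagged as
    suspected artifacts; those seen in two or more are called reproducible.
    """
    counts = Counter(
        alt for calls in outer_pcrs.values() for alt in set(calls) - PROTOTYPE
    )
    reproducible = {alt for alt, n in counts.items() if n >= 2}
    artifacts = {alt for alt, n in counts.items() if n == 1}
    return reproducible, artifacts

# Hypothetical calls from three independent outer PCRs of one dilution.
calls = {
    "outer_1": {"codon130", "codon285", "site_A C>T"},
    "outer_2": {"codon130", "codon285"},
    "outer_3": {"codon130", "codon285", "site_B G>A"},
}
reproducible, artifacts = classify_alterations(calls)
print(sorted(artifacts))     # ['site_A C>T', 'site_B G>A']
print(sorted(reproducible))  # []
```

Repeating only the inner PCR on the same outer product would not change this classification, because both repeats share the same outer amplification.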

Table 1.
Amplification Efficiency and Frequency of Artificial Mutations in Relation to Tissue Material and Number of Cells per Analysis

Results


PCR Amplification

All dilutions of the frozen cells (200, 64, 20, and 10 cells per PCR) were amplified successfully for all exons (Table 1). For formalin-fixed cells the higher dilutions (10 and 20 cells per PCR) did not amplify exon 4 (which is the longest fragment, 350 bp), whereas dilutions corresponding to 40, 80, 150, and 300 cells amplified all exons (Table 1). To control the accuracy of cell concentrations in the dilutions, additional amplifications of microdissected samples with the exact number of cells known were performed. These experiments confirmed the results above.

Sequence Analysis

Among the frozen samples, no sequence alterations other than the two known mutations were detected in any of the dilutions (from 200 cells to 10 cells per PCR). Among formalin-fixed samples, a number of nonreproducible sequence alterations (ie, artificial mutations) appeared. The higher dilutions, 10 and 20 cells per PCR, showed one nonreproducible mutation for every 500 bases, whereas 40, 80, and 150 cells showed a lower but still substantial error rate. The results are shown in Table 1. The known mutations in this tumor were always present, confirming the origin of template. Additional sequence alterations, however, were found only in the formalin-fixed material and could not in any case be confirmed by repeated analysis (starting from the original sample lysate dilution), as exemplified in Figure 1. Independent amplifications from the same pool of formalin-fixed cells could show several different artificial mutations. In total, 28 artificial mutations were recorded in the formalin-fixed part of the tumor; 27 (96%) occurred at guanosine or cytosine positions and resulted in C-T or G-A transitions, and the remaining one was an A-T transversion. Eight artificial mutations (28%) were silent or were intron alterations, and 20 (72%) coded for missense or nonsense alterations.
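For readers less familiar with the terminology, the distinction between transitions and transversions used above can be expressed as a short Python sketch. The function name is illustrative; the purine and pyrimidine groupings are standard.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def substitution_type(ref, alt):
    """Classify a single-base substitution as a transition or a transversion.

    A transition stays within one chemical class (purine to purine or
    pyrimidine to pyrimidine); a transversion switches class.
    """
    ref, alt = ref.upper(), alt.upper()
    if ref == alt or not {ref, alt} <= (PURINES | PYRIMIDINES):
        raise ValueError(f"not a valid substitution: {ref}>{alt}")
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"

print(substitution_type("C", "T"))  # transition
print(substitution_type("G", "A"))  # transition
print(substitution_type("A", "T"))  # transversion
```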

Figure 1.
Direct sequencing data of exon 5 of the p53 gene from two independent PCR amplifications, each using 10 cells from, respectively, a frozen (top) and a formalin-fixed (bottom) part of the same tumor. The frozen sample displays the wild-type sequence, whereas ...

Discussion


This study shows that as much as one artificial mutation per 500 bases may be recorded in the analysis of formalin-fixed material. Approximately one-third of the artificial mutations (8 of 28) were silent or intronic. Such a mutation spectrum is expected if the mutations are distributed randomly, without biological selection. Silent mutations have not been observed by us before, in this or in previous studies of frozen skin tumors [6,9,10], including an extensive analysis of a xeroderma pigmentosum patient, in which we recorded 29 different mutations in various lesions [11].

Although the artificial mutations could never be confirmed by repeated analysis starting from the original sample lysate dilution, a repeated inner PCR performed on the same outer PCR showed the same artifact when sequenced. This suggests that the artifacts occur in or before the outer PCR and thus are not related to the sequencing procedure.

For an error to show up as a detectable sequence alteration (in direct sequence analysis) it must occur in the first cycle of outer amplification, in the presence of very few templates. When only one template is present (ie, only one strand of DNA) and an error occurs in the first cycle, the theoretical amount of mutated fragments should not be more than 50% of the final amplification product, assuming that the original template is amplified correctly in the second cycle. When one cell, which contains four templates of DNA (two strands on two alleles), is subjected to amplification the fragments containing artifacts should not make up more than 12.5%. This would place them below or, at best, at the limit of detection in our sequencing method. In this study, where 10 cells per PCR was the lowest number of cells used, an error should not be detectable. Nevertheless, 28 artificial mutations were detected, and, as exemplified in Figure 1, the fragments containing the artifact often made up approximately 50% of the sequence (which corresponds to half of the final product of amplified DNA). In addition, a few amplifications contained only the error sequence (as determined by a 100% mutant DNA sequencing signal; data not shown). Our conclusion is that only one or a few of the theoretical templates were truly available for amplification.
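The arithmetic behind the 50% and 12.5% figures can be reproduced with a small Python sketch. It rests on simplifying assumptions that are ours for illustration: exactly one strand is miscopied in the first cycle, and every strand is then amplified with equal efficiency. The function names are hypothetical; the 20% detection limit is the one stated in Materials and Methods.

```python
import math

def max_mutant_fraction(template_strands):
    """Upper bound on the mutant fraction of the final product.

    After the first cycle there are 2 * template_strands strands, of which
    one newly synthesized strand carries the error.
    """
    return 1 / (2 * template_strands)

def max_detectable_templates(detection_limit):
    """Largest number of amplifiable strands at which one first-cycle
    error still reaches the detection limit."""
    return math.floor(1 / (2 * detection_limit))

print(max_mutant_fraction(1))          # 0.5    (a single strand)
print(max_mutant_fraction(4))          # 0.125  (one cell: two strands on two alleles)
print(max_mutant_fraction(10 * 4))     # 0.0125 (10 cells, all strands amplifiable)
print(max_detectable_templates(0.20))  # 2
```

Under these assumptions an artifact from 10 intact cells would make up about 1% of the product, far below the 20% limit, which is why signals of 50% or more point to only one or two amplifiable strands.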

The exact mechanism for modification of DNA in formalin-fixed samples is not known. DNase activity is not believed to be the cause [12]. The rate of errors detected in the formalin-fixed material is much higher than the reported Taq DNA polymerase error frequency (2/10⁵ to 1/(9 × 10³)) [13,14]. Artifacts could be the consequence of formalin damaging or cross-linking cytosine nucleotides, on either strand, so that the Taq DNA polymerase would not recognize them and instead of a guanosine incorporate an adenosine (because of the so-called A-rule). Thereby an artificial C-T or G-A mutation would be created. In addition, damaged DNA has been described to promote jumping between templates during enzymatic amplification [15]. According to that theory, Taq DNA polymerase may insert an adenosine residue when it encounters the end of a template molecule (the same A-rule as above), then jump to another template and continue the extension. As a result, an artificial mutation may be produced and amplified. The actual frequency of errors would correspond, in addition to the Taq DNA polymerase’s normal error frequency, to the degree of damage and/or cross-linking of DNA. The detected frequency of artificial mutations, however, would also depend on the degree of “dilution” by correctly amplified fragments and, thereby, on the number of target templates in the first round of amplification. This corresponds well to the increase in artifacts we observed when fewer cells were used in the outer PCR. At a higher number of target cells (>300 in our study) there were enough nondamaged templates to dominate the amplification process (Figure 2A). For smaller amounts of cells only fragmented DNA may be present, requiring a few PCR cycles to achieve an in vitro repaired template that would yield an exponential amplification. The artifact mutations may then represent errors in the early repair process (Figure 2B), by, for example, the non-template-dependent addition of an A residue. This interpretation is supported by the finding of mutation signals on the order of 50–100% peak signals, indicating amplification of a single DNA copy.

Figure 2.
Schematic view of a plausible explanation for formalin-mediated artifacts. A: The results when intact target gene sequences (contiguous line), present in the original sample, quantitatively dominate the amplification process (bold line). In vitro repaired ...

In a former study [16], formalin-fixed lung cancers were analyzed for p53 mutations by both direct sequencing and single-strand conformation polymorphism (SSCP). In 50 tumors, 13 true mutations and 47 artificial mutations were found (a frequency of one artifact per 606 bases). When direct sequencing was used, the artifacts were easily distinguished from true mutations by confirmatory sequencing of independent PCR products. With the use of SSCP, many samples with artificial mutations also showed shifts in the confirmatory analysis. DNA sequencing of both shifts was needed to rule out artifacts (where the two shifts exhibited different nucleotide changes). The normal procedure is to consider the mutation confirmed if a shift appears in two separate runs. However, with an error rate of one artifact per 500 bases, a sample may have artifacts in two separate PCRs, although different ones. Hence artificial mutations may pose a problem in direct DNA-sequencing strategies and in many other molecular techniques (eg, SSCP, denaturing gradient gel electrophoresis) that are based on PCR amplification. In addition, we have noted a high frequency of artificial mutations in other studies of (formalin-fixed) cancers of different origins (unpublished data). When the Taq DNA polymerase was used, the frequency of artifacts was one per 683 bases in a study of endocrine samples and one per 821 bases in a study of breast tumor samples. Furthermore, one study of basal cell cancer samples was performed with the Pfu DNA polymerase, where, in amplified samples, the error frequency was one per 2050 bases (unpublished data). Because samples were collected from different pathology laboratories and different preparations of DNA template were used (with and without extraction, chelating agents, and proteinase K treatment), the artifacts do not seem to be dependent on any specific routine or treatment of the samples (other than the use of formalin fixation). As a result of this, an unknown number of incorrect mutations may have been reported in various studies and inserted into various databases when DNA from tissues fixed in formalin was analyzed. A significant part of the data in mutation databases is based on analysis of formalin-fixed material. An example of this is the IARC Database of somatic p53 mutations (http://www.iarc.fr/p53/homepage.htm), where 38% of reported somatic mutations (with information of origin) are from formalin-fixed tumors (Dr. T. Hernandez-Boussard, personal communication).
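The concern that two separate PCRs can each carry an artifact, so that a shift-based confirmation passes by chance, can be illustrated with a rough Python estimate. The Poisson model and the independence of the two amplifications are assumptions made here for illustration and are not part of the study; the 900-base region and the rate of one artifact per 500 bases are taken from the text.

```python
import math

def p_at_least_one_artifact(bases, bases_per_artifact=500.0):
    """Probability that a sequenced region contains one or more artifacts,
    assuming artifacts occur independently at a constant rate (Poisson)."""
    return 1.0 - math.exp(-bases / bases_per_artifact)

# Whole 900-base region covering exons 4-9, two independent amplifications.
p_one = p_at_least_one_artifact(900)
p_both = p_one ** 2

print(f"one PCR carries an artifact:    {p_one:.2f}")   # 0.83
print(f"both PCRs carry some artifact:  {p_both:.2f}")  # 0.70
```

Under this model most pairs of low-template amplifications would each show some alteration, which is why only sequencing both products, and checking that the nucleotide change is the same, separates artifacts from true mutations.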

In conclusion, this study has highlighted concerns that need to be dealt with when formalin-fixed archival specimens are used. Although PCR amplification and subsequent analysis appear successful, artificial mutations can be present at a high frequency. Thus our results emphasize the importance of confirmation from the biological source, which resolves the problem with artificial mutations.


We are grateful to Dr. Jacob Odeberg for valuable comments on the manuscript.


Address reprint requests to Joakim Lundeberg, Department of Biotechnology, KTH Royal Institute of Technology, Teknikringen 30, S-100 44 Stockholm, Sweden. E-mail: joakim.lundeberg@biochem.kth.se

Supported by grants from The Swedish Foundation for Strategic Research and The Swedish Cancer Foundation.

References


1. Ben-Ezra J, Johnson DA, Rossi J, Cook N, Wu AJ: Effect of fixation on the amplification of nucleic acids from paraffin-embedded material by the polymerase chain reaction. J Histochem Cytochem 1991, 39:351-354
2. Chalkley R, Hunter C: Histone-histone propinquity by aldehyde fixation of chromatin. Proc Natl Acad Sci USA 1975, 72:1304-1308
3. Pontén F, Williams C, Ling G, Ahmadian A, Nistér M, Lundeberg J, Pontén J, Uhlén M: Genomic analysis of single cells from human basal cell cancer using laser-assisted capture microscopy. Mutat Res 1997, 382:45-55
4. Eckert KA, Kunkel TA: DNA polymerase fidelity and the polymerase chain reaction. PCR Methods Appl 1991, 1:17-24
5. Hultman T, Bergh S, Moks T, Uhlén M: Bidirectional solid-phase sequencing of in vitro-amplified plasmid DNA. Biotechniques 1991, 10:84-93
6. Pontén F, Berg C, Ahmadian A, Ren ZP, Nistér M, Lundeberg J, Uhlén M, Pontén J: Molecular pathology in basal cell cancer with p53 as a genetic marker. Oncogene 1997, 15:1059-1067
7. Berg C, Hedrum A, Holmberg A, Pontén F, Uhlén M, Lundeberg J: Direct solid-phase sequence analysis of the human p53 gene by use of multiplex polymerase chain reaction and α-thiotriphosphate nucleotides. Clin Chem 1995, 41:1461-1466
8. Hultman T, Ståhl S, Hornes E, Uhlén M: Direct solid phase sequencing of genomic and plasmid DNA using magnetic beads as solid support. Nucleic Acids Res 1989, 17:4937-4946
9. Ren ZP, Hedrum A, Pontén F, Nistér M, Ahmadian A, Lundeberg J, Uhlén M, Pontén J: Human epidermal cancer and accompanying precursors have identical p53 mutations different from p53 mutations in adjacent areas of clonally expanded non-neoplastic keratinocytes. Oncogene 1996, 12:765-773
10. Ren ZP, Ahmadian A, Pontén F, Nistér M, Berg C, Lundeberg J, Uhlén M, Pontén J: Benign clonal keratinocyte patches with p53 mutations show no genetic link to synchronous squamous cell precancer or cancer in human skin. Am J Pathol 1997, 150:1791-1803
11. Williams C, Pontén F, Ahmadian A, Ren ZP, Gao L, Rollman O, Ljung A, Jaspers NGJ, Uhlén M, Lundeberg J, Pontén J: Clones of normal keratinocytes and a variety of simultaneously present epidermal neoplastic lesions contain a multitude of p53 gene mutations in a xeroderma pigmentosum patient. Cancer Res 1998, 58:2449-2455
12. Yagi N, Satonaka K, Horio M, Shimogaki H, Tokuda Y, Maeda S: The role of DNase and EDTA on DNA degradation in formaldehyde fixed tissues. Biotech Histochem 1996, 71:123-129
13. Lundberg KS, Shoemaker DD, Adams MW, Short JM, Sorge JA, Mathur EJ: High-fidelity amplification using a thermostable DNA polymerase isolated from Pyrococcus furiosus. Gene 1991, 108:1-6
14. Tindall KR, Kunkel TA: Fidelity of DNA synthesis by the Thermus aquaticus DNA polymerase. Biochemistry 1988, 27:6008-6013
15. Pääbo S, Irwin DM, Wilson AC: DNA damage promotes jumping between templates during enzymatic amplification. J Biol Chem 1990, 265:4718-4721
16. Yngveson A, Williams C, Hjerpe A, Lundeberg J, Söderkvist P, Pershagen G: p53 mutations in lung cancer associated with residential radon exposure. Cancer Epidemiol Biomarkers Prev 1999, 8:433-438

Articles from The American Journal of Pathology are provided here courtesy of American Society for Investigative Pathology