J Mol Diagn. Jul 2010; 12(4): 409–417.
PMCID: PMC2893624

Quantitative Expression Profiling in Formalin-Fixed Paraffin-Embedded Samples by Affymetrix Microarrays


To date, few studies have systematically characterized microarray gene expression signal performance with degraded RNA from formalin-fixed paraffin-embedded (FFPE) specimens in comparison with intact RNA from unfixed fresh-frozen (FF) specimens. RNA was extracted and isolated from paired tumor and normal samples from both FFPE and FF kidney, lung, and colon tissue specimens, and microarray signal dynamics at both the raw probe and probeset levels were evaluated. A contrast metric was developed to directly compare microarray signal derived from RNA extracted from matched FFPE and FF specimens. Gene-level summaries were then compared to determine the degree of overlap in expression profiles. RNA extracted from FFPE material was more degraded and fragmented than RNA from FF material, resulting in a reduced dynamic range of expression signal. In addition, probe performance was not affected uniformly and declined sharply toward the 5′ end of genes. The most significant differences in FFPE versus FF signal were consistent across three tissue types and enriched with ribosomal genes. Our results show that archived FFPE samples can be used to profile for expression signatures and assess differential expression similar to unfixed tissue sources. This study provides guidelines for application of these methods in the discovery, validation, and clinical application of microarray expression profiling with FFPE material.

Over the past decade, analysis of genome-wide patterns of gene expression using oligonucleotide microarrays has proven to be a powerful discovery tool in basic and translational research. Recently, this technology has been leveraged to develop gene expression–based tests designed to aid in the treatment and management of patients with diseases such as cancer.1,2,3,4 These tests measure the expression of a number of genes, referred to as an ‘expression signature’ of a clinical or pathological state, and use mathematical algorithms to derive patient scores. Traditionally, the discovery of gene expression signatures relevant for the development of clinical tests has relied on microgram quantities of high-quality RNA such as can be extracted from fresh or fresh-frozen (FF) tissue specimens.3,4,5 Subsequent studies are inevitably required to adapt expression signatures discovered in FF material6 to clinical specimens.7

Formalin fixation and paraffin embedding (FFPE) of clinical tissue specimens remains the standard method used to preserve tissue morphology for pathological diagnosis and sample archiving. Preserved samples, meticulously collected over many decades of work, are extremely rich sources of study material. A significant barrier to the development of such cancer tests is the fact that few biorepositories exist with sufficient numbers of FF samples that have annotated long-term follow-up data suitable for the design of retrospective clinical trials.8 However, working with RNA derived from FFPE specimens is confounded by several factors, such as variability in tissue handling/processing,9,10 tissue sources, and RNA extraction methods,11 all of which are known to affect the quality of RNA12 and can therefore be sources of bias in expression profiling studies. Formalin reacts with proteins and nucleic acids, causing cross-linking and chemical modification of RNA.13,14,15,16,17,18,19,20,21,22 This leads to lower yields and to shorter, more degraded RNA that lacks a poly-A tail, which must be taken into account when selecting a suitable reverse transcription protocol for expression profiling.23 Other factors, such as the degree of successful reversal of protein-nucleic acid cross-linking, chemical modifications to RNA, and the fact that shorter, more fragmented RNA molecules have statistically fewer hybridization sites for primer binding, all make FFPE RNA a less efficient template for reverse transcription or amplification.14,17,18,19

Although several studies have reported usable expression data from FFPE specimens20,24,25,26,27 and a few have characterized differences between samples with ‘intact’ (e.g., FF) and degraded (e.g., FFPE) RNA,7,12,24,25,28,29,30,31,32,33,34,35,36,37,38,39,40 these differences have not been systematically investigated. In this study, we performed a comparative analysis of archived, matched FF and FFPE Wilm's tumor and normal tissue pairs (‘quadsets’) as well as similar quadsets from lung and colon samples. RNA was amplified using the WT-Ovation FFPE System,20,41 which is optimized for amplification of nanogram quantities of FFPE RNA, and hybridized to standard Affymetrix U133 Plus 2 GeneChips. Here we show that although FFPE samples overall yielded RNA of poor quality relative to matched FF samples, we obtained data of sufficient quality for differential gene expression studies using combined random and poly-A priming whole-transcript amplification for oligonucleotide microarrays. In addition, we assess the reliability of microarray analysis using a novel contrast metric that can be used to derive filtered gene lists for microarray experiments using RNA from FFPE material.

Materials and Methods

Tissue Samples

Matched FFPE and frozen samples of Wilm's tumor and normal adjacent kidney tissue (‘quadsets’) were collected from patients at Children's Hospital Los Angeles according to an institutional review board–approved protocol. FFPE archived blocks and frozen tissue quadsets were obtained from three patients who underwent surgical resection 0.5, 1.5, and 7.5 years before tissue processing and nucleic acid extraction. For each patient sample, three 10-micron sections (approximately 1 cm² in area) were cut from both the formalin-fixed paraffin-embedded blocks and the matched frozen specimens and placed in microfuge tubes for immediate RNA extraction. Lung and colon quadset samples were independently processed by NuGEN Technologies as previously described (Technical Report #3, 2007, available at http://www.nugeninc.com/tasks/sites/nugen/assets/File/technical_documents/techdoc_wt_ov_ffpe_rep_03.pdf, last accessed May 21, 2010). Additional details of the individual samples used in this analysis can be found in supplemental Table S1 at http://jmd.amjpathol.org.

RNA Extraction

RNA was extracted and purified from Wilm's FFPE tissue sections using the commercially available Formapure nucleic acid extraction kit (Agencourt Biosciences, Beverly, MA). RNA was extracted from Wilm's frozen tissue sections using TRIzol (Invitrogen, Carlsbad, CA) and purified using the RNeasy Protect kit (Qiagen, Valencia, CA). RNA was further purified using DNase I treatment (Ambion, Austin, TX) to eliminate any contaminating DNA. RNA concentrations were measured using a Nanodrop ND-1000 spectrophotometer (Nanodrop Technologies, Rockland, DE). RNA integrity was evaluated from electropherograms, and the RNA integrity number (RIN),42 a correlative measure that indicates intactness of mRNA, was determined using the RNA 6000 Pico assay for the Bioanalyzer 2100 (Agilent Technologies, Santa Clara, CA).

RNA Amplification and GeneChip Hybridization

Purified RNA was subjected to whole-transcriptome amplification using the WT-Ovation FFPE system. For RNA extracted from frozen or FFPE sections, 10 ng or 50 ng of input RNA, respectively, was used to generate amplified Ribo-SPIA product. All clean-up steps were performed with RNAClean magnetic beads (Agencourt Biosciences). Five micrograms of WT-Ovation product were fragmented and labeled using the FL-Ovation Biotin V2 labeling module, and labeled product was hybridized to Affymetrix Human U133 Plus 2.0 (Wilm's quadset) or U133Av2 (colon and lung quadsets) GeneChips following the manufacturer's recommendations (Affymetrix, Santa Clara, CA).

Data Analysis

Raw CEL files were used in low-level analysis. Data preprocessing and gene-level summarization were done in Affymetrix Expression Console Software (version 1.1), where FFPE and FF samples were processed separately to avoid biases. All array data were processed in the statistical language ‘R’43 using several packages in Bioconductor.44 In particular, probe-level data were read with the affy package, probe sequence data were obtained using the Bioconductor annotation library (hgu133plus2), and Venn diagrams were generated using the limma package.45 Gene ontology overrepresentation analysis was conducted using the DAVID online tool.46 Data for lung and colon quadsets were kindly provided by NuGEN Technologies. Cross-tissue comparisons involving the two different Affymetrix GeneChips (i.e., U133A for colon and lung and Plus 2.0 for Wilm's quadsets) were based on the signal from 22,515 overlapping probesets. Please refer to NCBI Gene Expression Omnibus (http://www.ncbi.nlm.nih.gov/geo/) study GSE19249 to access Affymetrix CEL file data.


RNA Analysis

We first evaluated yields and some basic nucleic acid quality metrics for quadset RNA extractions, obtaining results consistent with previous reports in the literature.13,16,18,19,20,21,22,47,48,49 Representative data from Wilm's quadset show RNA yields were dependent on both the fixation method and the sample block archive age (Figure 1A). Overall, greater than twofold more RNA was extracted from FF than FFPE samples (t test, P < 0.02) and although not statistically significant, approximately twofold more RNA was extracted from more recently archived blocks (0.5 and 1.5 years) as compared with older blocks (7.5 years), an observation that was more pronounced in FFPE samples (more than fourfold more RNA).

Figure 1
Analysis of RNA from Wilm's tumor and paired kidney samples from FFPE and FF quadsets. A: Yields of RNA extracted from FF and FFPE samples archived for various block ages. B: Mean yields of amplified DNA from FFPE and FF generated using WT-Ovation FFPE ...

Next, we looked at the distribution of RNA sizes (to assess integrity of extracted RNA molecules) on agarose gel and by analyzing electropherogram traces (data not shown). In FFPE samples, the majority of RNAs were <500 nucleotides in length and the 18S and 28S ribosomal peaks were missing, whereas in FF samples the ribosomal peaks were clearly observable in all samples within a characteristic ‘smear’ of RNAs of various sizes, typical of high-quality total RNA extractions.50 Bioanalyzer software was used to assign RNA Integrity Number (RIN) values for extractions, ranging from 1 to 10, with 1 being the most degraded profile and 10 being the most intact.42 Data from the Wilm's quadsets show that RIN values were significantly lower in FFPE samples (mean, 2.2 ± 0.1) compared with FF samples (8.2 ± 0.6). Similar results were observed in the lung and colon quadsets (data not shown).

Finally, the suitability of extracted RNA as a template for reverse transcriptase and subsequent amplification for microarray hybridization was evaluated (Figure 1B). The mean concentration of amplification product (recorded after the second amplification round, before biotin labeling) from FF RNA was 1.5-fold greater than that from FFPE RNA (P < 0.0004), even though the amount of RNA input was fivefold greater for FFPE samples (as recommended by the manufacturer). Adjusting for RNA input, amplification yields from FF samples were about ninefold greater than those from FFPE samples. Additionally, FFPE RNA yield was twofold greater in recently archived blocks (0.5 and 1.5 years) compared with older blocks (7.5 years; P < 0.009). Although extractions from FFPE yielded lower quantities of RNA that appeared more degraded than RNA from FF, sufficient amplified product for microarray hybridization was generated from the quadsets evaluated. Together, these findings indicate that although FFPE tissues yield less RNA and less amplified cDNA than patient-matched FF tissues, this material is still suitable for whole-transcriptome amplification and yields sufficient product for microarray hybridization.34

Microarray QC Analysis

Standard QC analysis workflows, available in Affymetrix Expression Console Software (available at: http://www.affymetrix.com/support/downloads/manuals/expression_console_userguide.pdf, last accessed May 21, 2010), were used to derive QC metrics (Table 1). These metrics are commonly used to evaluate whether or not expression data are of sufficient quality for downstream analysis. The percentage of genes called present (%P) was ~1.4-fold higher in FF than FFPE. Scale factor is used to scale chip raw intensity to an arbitrary value and reflects the fact that FFPE signal is systematically lower than FF. Mean absolute deviation (MAD) shows that FFPE samples had significantly higher variance than FF. All quantitative measures were found to be consistent within the assay, while differing significantly between assays (i.e., FFPE versus FF sample preparations). Significant differences in QC metrics of FFPE and FF material indicate that probe modeling should be performed for FFPE and FF assays separately as these differences could potentially skew normalized data, confound results, and lead to false interpretations.51 To further explore microarray signal dynamics between FFPE and FF we looked at raw signal analysis to determine whether we could observe subsets of genes or probesets that report signal similarly from FFPE and FF.

Table 1
Data Quality Metrics Generated Using Affymetrix® Expression Console™ Software to Help Monitor Data Quality

Raw Signal Analysis

We next examined overall probe signal behavior in both assays at the raw probe level. Raw intensities of 604,258 probes for each assay were examined to determine key signal characteristics. Analysis of probe-level signal densities revealed a distinct difference in the shapes of the density plots, with FFPE data exhibiting a reduced dynamic range of raw probe signal compared with the FF signal density profile (Figure 2A). These density plots also illustrate the high reproducibility of raw signal distribution within the same sample type regardless of tissue type, block age, etc., with median within-assay raw signal Pearson correlations of 0.91 and 0.92 in FF and FFPE, respectively. A scatterplot of the log2 intensity of median signal across normal kidney FF samples versus the log2 intensity of median signal across normal kidney FFPE samples shows a Pearson correlation coefficient of 0.75, indicating significantly lower similarity in probe signal between the FF and FFPE assays (Figure 2B), in accordance with previously reported results with a different assay system.34
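
The between-assay comparison described above can be sketched as follows. This is a minimal illustration, not the analysis code used in the study (which was carried out in R): the function names and the toy intensities are hypothetical, and only the Python standard library is assumed.

```python
import math
from statistics import median


def log2_median_profile(arrays):
    """Per-probe log2 of the median raw intensity across arrays.

    `arrays` is a list of arrays; each array lists raw intensities
    in the same probe order (hypothetical input layout).
    """
    return [math.log2(median(values)) for values in zip(*arrays)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Toy data: three arrays per assay, four probes each (made-up values).
ff_arrays = [[100.0, 400.0, 1600.0, 6400.0],
             [120.0, 380.0, 1500.0, 7000.0],
             [90.0, 420.0, 1700.0, 6000.0]]
ffpe_arrays = [[150.0, 300.0, 700.0, 1500.0],
               [140.0, 320.0, 650.0, 1400.0],
               [160.0, 280.0, 720.0, 1600.0]]

ff_profile = log2_median_profile(ff_arrays)
ffpe_profile = log2_median_profile(ffpe_arrays)
r = pearson(ff_profile, ffpe_profile)
```

On real data the two profiles would each hold one value per probe (604,258 here), and `r` would correspond to the coefficient reported for Figure 2B.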

Figure 2
A: Raw signal density, FFPE (blue) and FF (red). B: Scatterplot of log2 raw intensity of median FFPE normal samples (y axis) versus median FF normal samples (x axis).

To study probe signal as a function of the 5′–3′ position of probes within a probeset, the average intensity of the probes was plotted as a function of probe position. Specifically, probes within each probeset were numbered directionally from the 5′ end to the 3′ end, and probe intensities were averaged by probe number across all transcripts independently for each sample. Figure 3 shows so-called ‘degradation plots’ of FFPE and FF profiles. We observe characteristic linear degradation curves in fresh-frozen samples but distinct nonlinear profiles in FFPE, suggesting a relatively sharp decay of signal intensity toward the 5′ end compared with the 3′ end. These data suggest that FFPE samples are more sensitive to the 3′-biased probe design of the GeneChips than FF samples and that RNA in FFPE samples is subjected to more chemical degradation, which occurs from the 5′ to the 3′ end.28,52
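
The averaging behind a degradation plot can be sketched in a few lines. The snippet below is an illustrative reimplementation under stated assumptions, not the code used to produce Figure 3: the input layout (one tuple per probe for a single array) and the function name are hypothetical, and whether intensities are supplied on the raw or log2 scale is left to the caller.

```python
from collections import defaultdict


def degradation_profile(probes):
    """Mean intensity by within-probeset probe number for one array.

    `probes` is an iterable of (probeset_id, position, intensity),
    where a smaller `position` means closer to the 5' end
    (hypothetical layout). Probes in each probeset are numbered
    0, 1, 2, ... from 5' to 3', and intensities are averaged by
    that number across all probesets.
    """
    by_probeset = defaultdict(list)
    for probeset_id, position, intensity in probes:
        by_probeset[probeset_id].append((position, intensity))

    sums = defaultdict(float)
    counts = defaultdict(int)
    for members in by_probeset.values():
        members.sort()  # order probes from the 5' end to the 3' end
        for number, (_, intensity) in enumerate(members):
            sums[number] += intensity
            counts[number] += 1

    return [sums[number] / counts[number] for number in sorted(sums)]


# Toy array with two probesets of two probes each (made-up values).
toy_probes = [("a", 10, 1.0), ("a", 20, 3.0),
              ("b", 5, 3.0), ("b", 50, 5.0)]
profile = degradation_profile(toy_probes)
```

Plotting one such profile per array, with probe number on the x axis, gives one line of a degradation plot; a profile that rises steeply toward the last probe numbers reflects signal concentrated at the 3′ end.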

Figure 3
The degradation plots, based on ordering probes within a probeset according to their 3′ position and then combining the signal from similarly located probes across the array. Each line corresponds to an array, red for FFPE and blue for FF, and ...

To further study differences in individual probe performance, we defined a ‘contrast metric’ as the median pairwise difference in log2 signals between the FF and FFPE assays. We calculated the difference in log2 probe signal for every matching pair of FFPE and FF samples, and the median difference across all pairs was used as a measure of assay dissimilarity for a given probe. We observe that probe contrast has a positive mean (i.e., systematically higher raw signal in FF samples compared with FFPE) and an asymmetrical shape, depicted in the contrast density plot in Figure 4A. The elongated right shoulder of the density plot illustrates that probes are not affected uniformly and suggests a nonuniform FFPE bias. To track the origin of the positive contrast, we evaluated probe contrast behavior within each probeset to determine whether probe performance is compromised for a group of probesets or whether the shape of the contrast metric curve is dependent on other variables such as probe GC content or probe interrogation position. Although we did not find any dependence of contrast on probe GC content, strand, or chromosomal location (supplemental Figure S1 at http://jmd.amjpathol.org), we observed that the median contrast density within probesets was highly similar to that observed at the raw probe level (Figure 4B), suggesting that probesets (i.e., genes) are also affected nonuniformly, with some genes’ expression levels being compromised more than others. This observation is further strengthened by the relationship between median and maximum contrast within a probeset (Figure 4C), which shows a tight linear dependence, indicating that the most compromised probes belong to probesets with high median contrast. Collectively, these results show that whereas probe behavior in FFPE versus FF varies from probeset to probeset, probe behavior is consistent within probesets and therefore within specific genes.
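
The contrast metric defined above can be written compactly. The following is a minimal sketch, assuming raw intensities for one probe are available as two lists in which position i of each list comes from the same matched FF/FFPE pair; the function names and toy values are hypothetical.

```python
import math
from statistics import median


def probe_contrast(ff, ffpe):
    """Median pairwise difference in log2 signal (FF minus FFPE).

    `ff[i]` and `ffpe[i]` are raw intensities of the same probe in
    the i-th matched sample pair. A positive value means the probe
    reports higher signal in FF than in FFPE.
    """
    if len(ff) != len(ffpe):
        raise ValueError("FF and FFPE samples must be matched pairs")
    return median(math.log2(a) - math.log2(b) for a, b in zip(ff, ffpe))


def probeset_summary(probe_contrasts):
    """Minimum, median, and maximum contrast of a probeset's probes."""
    return (min(probe_contrasts),
            median(probe_contrasts),
            max(probe_contrasts))


# Toy probe measured in three matched pairs (made-up intensities):
# log2 differences are 2, 3, and 4, so the contrast is 3.
toy_contrast = probe_contrast([400.0, 800.0, 1600.0],
                              [100.0, 100.0, 100.0])
toy_summary = probeset_summary([1.0, 2.5, 4.0])
```

Because the metric lives on the log2 scale, a contrast of c corresponds to a 2^c-fold lower raw signal in FFPE; the median and maximum returned by `probeset_summary` are the two quantities related in Figure 4C.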

Figure 4
A: Density function of probe contrast (i.e., median pairwise difference) in log2 FF and FFPE signal across normal (i.e., nontumorous) samples. Similar contrast function was observed in tumor samples. B: Density function of median probe contrast within ...

To further study probesets with the poorest probe performance in FFPE tissues, we analyzed a subset of probesets that showed a minimum probe contrast of >2 within each probeset [i.e., all of the probe intensities within these probesets are at least 4 times smaller (on the natural scale) in FFPE samples versus FF samples (supplemental Figure S2 at http://jmd.amjpathol.org)]. To determine the reproducibility of the observed FFPE bias across different tissue types, probe contrast metrics were calculated for the lung and colon quadsets. These quadsets were generated from RNA extracted using different protocols from those used for the Wilm's quadset samples but used the same amplification protocol. We observe that the probe contrast metric in lung and colon follows a characteristic asymmetrical density shape, as observed in kidney tissue (supplemental Figure S3 at http://jmd.amjpathol.org). Moreover, despite differences in expression between tumor and normal pairs in quadsets and between tissues, median contrast within probeset has a high degree of similarity and concordance across all tissues (Figure 5A). As depicted in the Venn diagram (Figure 5B), more than 3000 probesets had a median contrast of >2 in all three quadsets (supplemental Table S2 at http://jmd.amjpathol.org). This represents approximately 15% of the probesets in common between the two platforms studied (Affymetrix U133A and Plus 2.0 GeneChips). At least one-third of the probesets with a median contrast of >2 overlap in all three tissue quadsets (86% of lung, 36% of colon, and 33% of Wilm's median contrast probesets). In two-way comparisons, between 78 and 95% of the median contrast >2 probesets were detected in at least two of the three tissue quadsets.
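
The thresholding and cross-tissue overlap described above amount to a filter followed by a set intersection. The sketch below assumes per-probe contrasts have already been grouped by probeset for each tissue; the function names, probeset identifiers, and values are hypothetical.

```python
from statistics import median


def flagged_probesets(contrasts_by_probeset, threshold=2.0, summary=median):
    """Probesets whose summarized probe contrast exceeds `threshold`.

    `summary` is applied to each probeset's list of probe contrasts:
    `median` gives the median-contrast filter, `min` the stricter
    filter in which every probe must exceed the threshold.
    """
    return {probeset
            for probeset, contrasts in contrasts_by_probeset.items()
            if summary(contrasts) > threshold}


def shared_fractions(flagged_by_tissue):
    """Probesets flagged in every tissue, and for each tissue the
    fraction of its flagged probesets that fall in that shared set."""
    common = set.intersection(*flagged_by_tissue.values())
    fractions = {tissue: len(common) / len(flagged)
                 for tissue, flagged in flagged_by_tissue.items()
                 if flagged}
    return common, fractions


# Toy kidney data (made-up log2 contrasts, three probes per probeset).
kidney = {"ps1": [2.5, 3.0, 2.2],
          "ps2": [0.1, 2.5, 3.0],
          "ps3": [2.1, 2.4, 2.2]}
by_median = flagged_probesets(kidney)             # median contrast > 2
by_minimum = flagged_probesets(kidney, summary=min)  # all probes > 2

common, fractions = shared_fractions({
    "kidney": {"ps1", "ps2", "ps3"},
    "lung": {"ps1"},
    "colon": {"ps1", "ps2"},
})
```

A threshold of 2 on the log2 scale corresponds to a 2^2 = 4-fold difference on the natural scale, which is why the minimum-contrast filter selects probesets in which every probe is at least four times lower in FFPE. The per-tissue fractions play the role of the percentages quoted for lung, colon, and Wilm's.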

Figure 5
A: Density of median probe contrast within a probeset in kidney, lung, and colon tissues. B: Venn diagram of median within-probeset contrast larger than two in kidney, lung, and colon data sets.

Given the overlap between quadset tissues and the strong bias of specific genes across multiple tissues, we next used overrepresentation analysis of gene ontology categories and biological pathways46 to determine whether any of these genes are involved in common biological processes, share a cellular localization, or share similar molecular functions. The subset of probesets that showed the most pronounced FFPE bias (i.e., contrast metric of >2 in all three quadsets) revealed highly significant enrichment of specific groups of genes (supplemental Table S3 at http://jmd.amjpathol.org). The most significant biological categories were genes involved in protein biosynthesis, such as translation initiation, elongation, and ribosomal subunit genes. Seventy-four percent of the genes in the Gene Ontology cellular component category ‘cytosolic ribosome’ (57/77 genes with this tag in GO) were present in this list, a highly significant overrepresentation (P < 9e-25). In addition, we found enrichment of specific protein domains such as the ubiquitous eukaryotic RNA recognition motif (RRM) and nucleotide-binding, α-β plait domains, which are present in proteins that bind single-stranded RNA molecules such as splicing factors.53 Other intriguing gene ontology categories revealed in this analysis included molecular chaperonins involved in unfolded protein binding and genes localized to the endoplasmic reticulum and inner mitochondrial membranes. The overrepresentation analysis suggests that despite differences in tissue type, sample processing, and RNA extraction, there exists a nonrandom bias in FFPE-derived expression profiles. Therefore, investigators can expect to observe depleted expression signal from transcripts associated with RNA binding, especially the ribosomal machinery itself, perhaps because these RNA transcripts are more tightly bound to protein (than other RNA transcripts) when the samples are submerged in the formalin fixative and/or are chemically cross-linked to a greater extent than other transcripts.
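
To make the idea of overrepresentation concrete, the snippet below computes a plain one-sided hypergeometric P value. This is a stand-in chosen for illustration: the analysis above used the DAVID online tool, which reports its own statistic, so the numbers will not match. The function name is hypothetical, and the background size and gene-list size in the usage lines are assumed values, not figures from this study; only the 57 of 77 ‘cytosolic ribosome’ genes come from the text.

```python
from math import comb


def hypergeom_upper_tail(population, tagged, drawn, hits):
    """P(X >= hits) for X ~ Hypergeometric.

    population: number of genes in the background
    tagged:     background genes carrying the annotation
    drawn:      size of the gene list being tested
    hits:       annotated genes found in the list
    """
    total = comb(population, drawn)
    upper = min(tagged, drawn)
    favourable = sum(comb(tagged, i) * comb(population - tagged, drawn - i)
                     for i in range(hits, upper + 1))
    return favourable / total


# Small worked case: 10 genes, 4 tagged, 3 drawn, at least 2 hits.
# (C(4,2)*C(6,1) + C(4,3)*C(6,0)) / C(10,3) = 40/120 = 1/3
small_p = hypergeom_upper_tail(10, 4, 3, 2)

# Shape of the ribosome example. ASSUMED_BACKGROUND and
# ASSUMED_LIST_SIZE are placeholders, not values from the study.
ASSUMED_BACKGROUND = 13000
ASSUMED_LIST_SIZE = 2000
ribosome_p = hypergeom_upper_tail(ASSUMED_BACKGROUND, 77,
                                  ASSUMED_LIST_SIZE, 57)
```

With any plausible background, finding 57 of 77 annotated genes in a list that covers a minority of the genome yields a vanishingly small tail probability, which is the qualitative point of the reported enrichment.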

Gene-Level Analysis

Next, we studied gene-level data to determine the capabilities of FFPE microarray data in studying differential gene expression. Probe signal intensities were quantile normalized and gene-level expression values were summarized using the robust multiarray average (RMA), where probe modeling was performed independently for FFPE and FF samples. Independent t tests were performed at the gene level to compare Wilm's tumor and normal kidney expression in FFPE and FF samples (supplemental Table S4 at http://jmd.amjpathol.org). Considerable overlap was observed between differentially expressed gene lists detected in FFPE and FF under widely used and nonstringent mean fold-difference and t test P value thresholds (FD >2 and P value <0.01), the overlap representing more than 50% of the total number of differentially expressed genes in FFPE tissues (Figure 6A). Although the overlap observed is significant, the concordance in magnitude of fold change was only marginal, with fold changes varying significantly between FF and FFPE comparisons (Figure 6B). Importantly, we find that the proportion of genes concordantly over- or underexpressed in both FFPE and FF tissues increases significantly as a function of the observed mean fold-difference between tumor and normal (Figure 6C). For example, when the threshold for differential expression was increased to detect genes differentially expressed with a mean fold-difference of >5, the concordance between FFPE and FF was 90%. Recognizing that FF data are of higher quality and are considered the gold standard, we calculated the degree of overlap between FF and FFPE data at different P value cut-offs for the FFPE tumor/normal comparison while keeping the FF cut-off at the nonstringent thresholds stated above (Figure 7). We find that concordance increases significantly with a more stringent P value cut-off, while the number of differentially expressed genes is reduced. For example, at a P value cut-off of 0.005, we report only 60% of the differentially expressed genes found with the nonstringent cut-off, but the overlap between the FFPE and FF tumor/normal comparisons increases significantly to 80%. Therefore, applying more stringent thresholds than are customary for fresh-frozen sample studies (e.g., a 10-fold increase in P value stringency) will likely decrease the false-positive rate and yield higher accuracy for ‘noisier’ FFPE sample expression profiling studies.
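
The threshold sweep described above can be sketched as follows. This is an illustration under stated assumptions rather than the study's code: gene-level results are assumed to be available as dictionaries mapping a gene to its signed log2 fold-difference (tumor versus normal) and t test P value, and all names and numbers are hypothetical.

```python
import math


def call_de(results, min_fold=2.0, max_p=0.01):
    """Genes passing both the fold-difference and P value thresholds.

    `results` maps gene -> (signed log2 fold-difference, P value).
    """
    min_log2 = math.log2(min_fold)
    return {gene for gene, (log2_fd, p) in results.items()
            if abs(log2_fd) > min_log2 and p < max_p}


def overlap_fraction(ffpe_de, ff_de):
    """Fraction of FFPE-detected genes that are also detected in FF."""
    if not ffpe_de:
        return float("nan")
    return len(ffpe_de & ff_de) / len(ffpe_de)


def direction_concordance(ffpe, ff, genes):
    """Fraction of `genes` changing in the same direction in both assays."""
    genes = list(genes)
    if not genes:
        return float("nan")
    same = sum((ffpe[g][0] > 0) == (ff[g][0] > 0) for g in genes)
    return same / len(genes)


# Toy gene-level results (made-up values).
ffpe = {"g1": (3.0, 0.001), "g2": (1.5, 0.004),
        "g3": (-2.0, 0.02), "g4": (0.5, 0.001)}
ff = {"g1": (2.5, 0.001), "g2": (-1.2, 0.005),
      "g3": (-2.2, 0.001), "g4": (0.2, 0.5)}

ff_de = call_de(ff)                    # FF kept at nonstringent thresholds
ffpe_de = call_de(ffpe)                # FFPE at the same thresholds
ffpe_strict = call_de(ffpe, max_p=0.002)  # tighter FFPE P value cut-off

overlap = overlap_fraction(ffpe_de, ff_de)
concordance = direction_concordance(ffpe, ff, ffpe_de & ff_de)
```

Repeating `call_de(ffpe, max_p=...)` over a range of cut-offs while holding `ff_de` fixed, and recording both `overlap_fraction` and the size of the FFPE list at each step, gives the two curves summarized in Figure 7.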

Figure 6
A: Venn diagram, showing concordance of differential expression in gene expression of Wilm's tumor and normal adjacent kidney in FF and FFPE assays. B: Scatterplot of mean fold-difference between normal kidney and Wilm's tumor found significant in both ...
Figure 7
A side-by-side view of P value threshold (x axis) in FFPE tumor/normal comparison versus concordance with FF differentially expressed gene list under nonstringent cut-offs (bottom) and fraction of genes reported (top).


This study protocol, which uses recent improvements to nucleic acid extraction,12,16 including more effective reversal of the chemical cross-links in FFPE RNA54 and a combined random and poly-A priming approach (i.e., Ovation FFPE system20), demonstrates that whole-transcriptome microarray data of sufficient quality for downstream analysis can be obtained from FFPE specimens. Additionally, using such a protocol for whole-transcriptome amplification avoids the inherent bias of using gene-specific priming2,30,33,37,40,55,56 to look at only a small subset of the transcriptome and greatly facilitates studies of samples with low RNA yields (e.g., FFPE core biopsies).

To understand the comparability of FFPE and FF expression profiles, we studied lists of differentially expressed transcripts in the tumor and normal adjacent tissue across various tissue types. We looked at individual probe performance and confirmed that in FFPE samples, probe performance was more strongly affected by proximity to the 3′ end and declined sharply toward the 5′ end, which could reflect the fact that FFPE preservation induces some 5′ to 3′ chemical degradation of RNA through the fixation and paraffin-embedding process itself (e.g., chemical oxidation and heating).26,28 We observed a significant overlap between differentially expressed gene lists detected in FFPE and FF tumor/normal pairs using standard fold change/P value thresholds. The concordance increases significantly when higher fold change and more stringent P value cut-offs are applied, resulting in a smaller number of genes reported but, gauged against FF samples as benchmarks, presumably fewer false positives. Finally, independent of both RNA extraction protocol (column or magnetic bead based) and tissue type (e.g., Wilm's, colon, and lung), we found a nonrandom, systematic, gene-specific ‘FFPE bias.’ Some of this bias may be accounted for by differences in amplification efficiency of RNAs extracted from FF and FFPE samples (yields were ~9-fold greater in FF when adjusting for input RNA amount). Intriguingly, the genes most affected by the FFPE bias (i.e., contrast metric >2) were overrepresented with gene ontology terms for proteins involved in ‘translation initiation,’ ‘translation elongation,’ ‘RNA-protein complex assembly,’ ‘structural constituent of ribosome,’ and ‘RNA binding.’ Further investigations with the Affymetrix Human Exon 1.0 ST microarrays are underway to better characterize the ‘FFPE bias.’ These microarrays have probes arranged throughout the length of the transcript and are better suited for evaluating the effects of transcript length and location of probes in the RNA transcript than the 3′-biased microarrays used in the present study.

Although it remains unclear why the detection of transcripts representing translational machinery protein components is most compromised in FFPE sample analysis, these data suggest that these RNA transcripts were most tightly bound to proteins at fixation, which could make cross-linking or chemical modifications to RNA less efficiently reversed in a nonrandom and reproducible manner in multiple tissues. Although further experimentation is required on a larger sampling of FFPE specimens (e.g., additional tissue types and representation of older specimens), these efforts will be useful for devising reliable ‘mask’ sets to filter out the probesets most affected by the FFPE bias before downstream analysis to minimize false discovery.

The limitations of FFPE RNA in terms of nucleic acid integrity and suitability as a template for amplification are outweighed by the benefits of having an almost unlimited supply of study material for retrospective analyses. With several commercially available kits for nucleic acid extraction that appear to work equally well,16 and a robust RNA amplification kit such as the NuGEN Ovation FFPE system,20,41 we obtained functional, reproducible, and reliable microarray data from FFPE material from multiple tissues.


Despite the highly cross-linked and degraded nature of RNA in FFPE tissue blocks, functional gene array data can be routinely generated using commercially available RNA extraction and amplification kits.


We thank Drs. Victor Sementchenko, Gianfranco de Feo, and Hana Gage (NuGEN Technologies, Inc.) for lung and colon quadset data as well as insights and valuable discussions, and Drs. Hiroyuki Shimada, Ignacio Gonzalez, and Minerva Mongeotti (CHLA Department of Pathology) for providing Wilm's tissue samples used in this study and invaluable pathology expertise. We acknowledge the late Dr. James W. Jacobson, Ph.D., of the diagnostic biomarkers and technology branch of the National Cancer Institute.


Supported in part by the National Cancer Institute, Strategic Partnering to Evaluate Cancer Signatures (U01CA-114757, to T.J.T.).

Supplemental material for this article can be found on http://jmd.amjpathol.org.

Current address of D.A.: Affymetrix, Inc., Emeryville, CA.

Web Extra Material

Supplementary Figure 1:

Probe contrast vs. binned probe GC content [blue] and probe GC content histogram is shown below [red].

Supplementary Figure 2:

Density function of probe contrasts, i.e. difference in median log2 FF signal and median log2 FFPE signal across normal (i.e. non-tumor) samples [in blue], and the selected subset density of most affected probes, defined as those with median probeset contrast of > 2 [in red].

Supplementary Figure 3:

Scatterplot of median probe contrasts within probeset in kidney vs. lung [blue] and colon [red] tissue specimens (overlap in black).

Supplementary Table 2:

Probesets that show a contrast metric function of >2 in Wilm's, lung and colon quadsets (n=3,014). Columns indicate Affymetrix ID, Gene name and symbol, chromosomal location and length of the gene.


1. Dumur CI, Lyons-Weiler M, Sciulli C, Garrett CT, Schrijver I, Holley TK, Rodriguez-Paris J, Pollack JR, Zehnder JL, Price M, Hagenkord JM, Rigl CT, Buturovic LJ, Anderson GG, Monzon FA. Interlaboratory performance of a microarray-based gene expression test to determine tissue of origin in poorly differentiated and undifferentiated cancers. J Mol Diagn. 2008;10:67–77.
2. Paik S, Shak S, Tang G, Kim C, Baker J, Cronin M, Baehner FL, Walker MG, Watson D, Park T, Hiller W, Fisher ER, Wickerham DL, Bryant J, Wolmark N. A multigene assay to predict recurrence of tamoxifen-treated, node-negative breast cancer. N Engl J Med. 2004;351:2817–2826.
3. Raz DJ, Ray MR, Kim JY, He B, Taron M, Skrzypski M, Segal M, Gandara DR, Rosell R, Jablons DM. A multigene assay is prognostic of survival in patients with early-stage lung adenocarcinoma. Clin Cancer Res. 2008;14:5565–5570.
4. van de Vijver MJ, He YD, van't Veer LJ, Dai H, Hart AA, Voskuil DW, Schreiber GJ, Peterse JL, Roberts C, Marton MJ, Parrish M, Atsma D, Witteveen A, Glas A, Delahaye L, van der Velde T, Bartelink H, Rodenhuis S, Rutgers ET, Friend SH, Bernards R. A gene-expression signature as a predictor of survival in breast cancer. N Engl J Med. 2002;347:1999–2009.
5. Tothill RW, Kowalczyk A, Rischin D, Bousioutas A, Haviv I, van Laar RK, Waring PM, Zalcberg J, Ward R, Biankin AV, Sutherland RL, Henshall SM, Fong K, Pollack JR, Bowtell DD, Holloway AJ. An expression-based site of origin diagnostic method designed for clinical application to cancer of unknown origin. Cancer Res. 2005;65:4031–4040.
6. Lossos IS, Czerwinski DK, Alizadeh AA, Wechser MA, Tibshirani R, Botstein D, Levy R. Prediction of survival in diffuse large-B-cell lymphoma based on the expression of six genes. N Engl J Med. 2004;350:1828–1837.
7. Malumbres R, Chen J, Tibshirani R, Johnson NA, Sehn LH, Natkunam Y, Briones J, Advani R, Connors JM, Byrne GE, Levy R, Gascoyne RD, Lossos IS. Paraffin-based 6-gene model predicts outcome in diffuse large B-cell lymphoma patients treated with R-CHOP. Blood. 2008;111:5509–5514.
8. Oberli A, Popovici V, Delorenzi M, Baltzer A, Antonov J, Matthey S, Aebi S, Altermatt HJ, Jaggi R. Expression profiling with RNA from formalin-fixed, paraffin-embedded material. BMC Med Genomics. 2008;1:9.
9. van Maldegem F, de Wit M, Morsink F, Musler A, Weegenaar J, van Noesel CJ. Effects of processing delay, formalin fixation, and immunohistochemistry on RNA recovery from formalin-fixed paraffin-embedded tissue sections. Diagn Mol Pathol. 2008;17:51–58.
10. Chung JY, Braunschweig T, Williams R, Guerrero N, Hoffmann KM, Kwon M, Song YK, Libutti SK, Hewitt SM. Factors in tissue handling and processing that impact RNA obtained from formalin-fixed, paraffin-embedded tissue. J Histochem Cytochem. 2008;56:1033–1042.
11. Al-Mulla F. Utilization of microarray platforms in clinical practice: an insight on the preparation and amplification of nucleic acids from frozen and fixed tissues. Methods Mol Biol. 2007;382:115–136.
12. Ribeiro-Silva A, Zhang H, Jeffrey SS. RNA extraction from ten year old formalin-fixed paraffin-embedded breast cancer samples: a comparison of column purification and magnetic bead-based technologies. BMC Mol Biol. 2007;8:118.
13. Chung JY, Braunschweig T, Hewitt SM. Optimization of recovery of RNA from formalin-fixed, paraffin-embedded tissue. Diagn Mol Pathol. 2006;15:229–236. [PubMed]
14. Davies GN, Bevan IS, Lundemose JB, Smith H, Sweet C. Use of proteinase K for RT-PCR of cytokine mRNA in formalin fixed tissue. Clin Mol Pathol. 1996;49:M364–M367. [PMC free article] [PubMed]
15. Ding J, Ichikawa Y, Ishikawa T, Shimada H. Effect of formalin on extraction of mRNA from a formalin-fixed sample: a basic investigation. Scand J Clin Lab Invest. 2004;64:229–235. [PubMed]
16. Gilbert MT, Haselkorn T, Bunce M, Sanchez JJ, Lucas SB, Jewell LD, Van Marck E, Worobey M. The isolation of nucleic acids from fixed, paraffin-embedded tissues-which methods are useful when? PLoS ONE. 2007;2:e537. [PMC free article] [PubMed]
17. Krafft AE, Duncan BW, Bijwaard KE, Taubenberger JK, Lichy JH. Optimization of the isolation and amplification of RNA from formalin-fixed paraffin-embedded tissue: the Armed Forces Institute of Pathology experience and literature review. Mol Diagn. 1997;2:217–230. [PubMed]
18. Masuda N, Ohnishi T, Kawamoto S, Monden M, Okubo K. Analysis of chemical modification of RNA from formalin-fixed samples and optimization of molecular biology applications for such samples. Nucleic Acids Res. 1999;27:4436–4443. [PMC free article] [PubMed]
19. Mizuno T, Nagamura H, Iwamoto KS, Ito T, Fukuhara T, Tokunaga M, Tokuoka S, Mabuchi K, Seyama T. RNA from decades-old archival tissue blocks for retrospective studies. Diagn Mol Pathol. 1998;7:202–208. [PubMed]
20. Scicchitano MS, Dalmas DA, Bertiaux MA, Anderson SM, Turner LR, Thomas RA, Mirable R, Boyce RW. Preliminary comparison of quantity, quality, and microarray performance of RNA extracted from formalin-fixed, paraffin-embedded, and unfixed frozen tissue samples. J Histochem Cytochem. 2006;54:1229–1237. [PubMed]
21. Stanta G, Bonin S, Perin R. RNA extraction from formalin-fixed and paraffin-embedded tissues. Methods Mol Biol. 1998;86:23–26. [PubMed]
22. von Ahlfen S, Missel A, Bendrat K, Schlumpberger M. Determinants of RNA quality from FFPE samples. PLoS ONE. 2007;2:e1261. [PMC free article] [PubMed]
23. Farragher SM, Tanney A, Kennedy RD, Paul Harkin D. RNA expression analysis from formalin fixed paraffin embedded tissues. Histochem Cell Biol. 2008;130:435–445. [PubMed]
24. Coudry RA, Meireles SI, Stoyanova R, Cooper HS, Carpino A, Wang X, Engstrom PF, Clapper ML. Successful application of microarray technology to microdissected formalin-fixed, paraffin-embedded tissue. J Mol Diagn. 2007;9:70–79. [PMC free article] [PubMed]
25. Frank M, Doring C, Metzler D, Eckerle S, Hansmann ML. Global gene expression profiling of formalin-fixed paraffin-embedded tumor samples: a comparison to snap-frozen material using oligonucleotide microarrays. Virchows Arch. 2007;450:699–711. [PubMed]
26. Linton KM, Hey Y, Saunders E, Jeziorska M, Denton J, Wilson CL, Swindell R, Dibben S, Miller CJ, Pepper SD, Radford JA, Freemont AJ. Acquisition of biologically relevant gene expression data by Affymetrix microarray analysis of archival formalin-fixed paraffin-embedded tumours. Br J Cancer. 2008;98:1403–1414. [PMC free article] [PubMed]
27. Walter MA, Seboek D, Demougin P, Bubendorf L, Oberholzer M, Muller-Brand J, Muller B. Extraction of high-integrity RNA suitable for microarray gene expression analysis from long-term stored human thyroid tissues. Pathology. 2006;38:249–253. [PubMed]
28. Srivastava PK, Kuffer S, Brors B, Shahi P, Li L, Kenzelmann M, Gretz N, Grone HJ. A cut-off based approach for gene expression analysis of formalin-fixed and paraffin-embedded tissue samples. Genomics. 2008;91:522–529. [PubMed]
29. Rogerson L, Darby S, Jabbar T, Mathers ME, Leung HY, Robson CN, Sahadevan K, O'Toole K, Gnanapragasam VJ. Application of transcript profiling in formalin-fixed paraffin-embedded diagnostic prostate cancer needle biopsies. BJU Int. 2008;102:364–370. [PubMed]
30. Ravo M, Mutarelli M, Ferraro L, Grober OM, Paris O, Tarallo R, Vigilante A, Cimino D, De Bortoli M, Nola E, Cicatiello L, Weisz A. Quantitative expression profiling of highly degraded RNA from formalin-fixed, paraffin-embedded breast tumor biopsies by oligonucleotide microarrays. Lab Invest. 2008;88:430–440. [PubMed]
31. Hoshida Y, Villanueva A, Kobayashi M, Peix J, Chiang DY, Camargo A, Gupta S, Moore J, Wrobel MJ, Lerner J, Reich M, Chan JA, Glickman JN, Ikeda K, Hashimoto M, Watanabe G, Daidone MG, Roayaie S, Schwartz M, Thung S, Salvesen HB, Gabriel S, Mazzaferro V, Bruix J, Friedman SL, Kumada H, Llovet JM, Golub TR. Gene expression in fixed tissues and outcome in hepatocellular carcinoma. N Engl J Med. 2008;359:1995–2004. [PMC free article] [PubMed]
32. Furusato B, Shaheduzzaman S, Petrovics G, Dobi A, Seifert M, Ravindranath L, Nau ME, Werner T, Vahey M, McLeod DG, Srivastava S, Sesterhenn IA. Transcriptome analyses of benign and malignant prostate epithelial cells in formalin-fixed paraffin-embedded whole-mounted radical prostatectomy specimens. Prostate Cancer Prostatic Dis. 2008;11:194–197. [PubMed]
33. Bibikova M, Yeakley JM, Wang-Rodriguez J, Fan JB. Quantitative expression profiling of RNA from formalin-fixed, paraffin-embedded tissues using randomly assembled bead arrays. Methods Mol Biol. 2008;439:159–177. [PubMed]
34. Penland SK, Keku TO, Torrice C, He X, Krishnamurthy J, Hoadley KA, Woosley JT, Thomas NE, Perou CM, Sandler RS, Sharpless NE. RNA expression analysis of formalin-fixed paraffin-embedded tumors. Lab Invest. 2007;87:383–391. [PubMed]
35. Loudig O, Milova E, Brandwein-Gensler M, Massimi A, Belbin TJ, Childs G, Singer RH, Rohan T, Prystowsky MB. Molecular restoration of archived transcriptional profiles by complementary-template reverse-transcription (CT-RT). Nucleic Acids Res. 2007;35:e94. [PMC free article] [PubMed]
36. Glenn ST, Jones CA, Liang P, Kaushik D, Gross KW, Kim HL. Expression profiling of archival renal tumors by quantitative PCR to validate prognostic markers. Biotechniques. 2007;43:639–647. [PubMed]
37. Clark-Langone KM, Wu JY, Sangli C, Chen A, Snable JL, Nguyen A, Hackett JR, Baker J, Yothers G, Kim C, Cronin MT. Biomarker discovery for colon cancer using a 761 gene RT-PCR assay. BMC Genomics. 2007;8:279. [PMC free article] [PubMed]
38. Castiglione F, Degl'Innocenti DR, Taddei A, Garbini F, Buccoliero AM, Raspollini MR, Pepi M, Paglierani M, Asirelli G, Freschi G, Bechi P, Taddei GL. Real-time PCR analysis of RNA extracted from formalin-fixed and paraffin-embeded tissues: effects of the fixation on outcome reliability. Appl Immunohistochem Mol Morphol. 2007;15:338–342. [PubMed]
39. Stenman J, Rasanen J, Tenkanen T, Haglund C, Salo J, Orpana A, Paju A. Genome-controlled reverse transcriptase-polymerase chain reaction for targeted gene-expression analysis. Scand J Clin Lab Invest. 2006;66:597–606. [PubMed]
40. Haller AC, Kanakapalli D, Walter R, Alhasan S, Eliason JF, Everson RB. Transcriptional profiling of degraded RNA in cryopreserved and fixed tissue samples obtained at autopsy. BMC Clin Pathol. 2006;6:9. [PMC free article] [PubMed]
41. Linton K, Hey Y, Dibben S, Miller C, Freemont A, Radford J, Pepper S. Methods comparison for high-resolution transcriptional analysis of archival material on Affymetrix Plus 2.0 and Exon 1.0 microarrays. Biotechniques. 2009;47:587–596. [PubMed]
42. Schroeder A, Mueller O, Stocker S, Salowsky R, Leiber M, Gassmann M, Lightfoot S, Menzel W, Granzow M, Ragg T. The RIN: an RNA integrity number for assigning integrity values to RNA measurements. BMC Mol Biol. 2006;7:3. [PMC free article] [PubMed]
43. R Development Core Team. R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2008.
44. Ihaka R, Gentleman R. R: a language for data analysis and graphics. J Comput Graph Stat. 1996;5:299–314.
45. Smyth GK. Limma: linear models for microarray data. In: Gentleman R, Carey V, Dudoit S, Irizarry R, Huber W, editors. Bioinformatics and Computational Biology Solutions Using R and Bioconductor. New York: Springer; 2005. pp. 397–420.
46. Dennis G, Jr, Sherman BT, Hosack DA, Yang J, Gao W, Lane HC, Lempicki RA. DAVID: database for annotation, visualization, and integrated discovery. Genome Biol. 2003;4:P3. [PMC free article] [PubMed]
47. Jarzab M, Rozanowski P, Kowalska M, Zebracka J, Rudnicka L, Stobiecka E, Jarzab B, Stachura J, Pawlega J. Optimization of the method of RNA isolation from paraffin blocks to assess gene expression in breast cancer. Pol J Pathol. 2008;59:85–91. [PubMed]
48. Madabusi LV, Latham GJ, Andruss BF. RNA extraction for arrays. Methods Enzymol. 2006;411:1–14. [PubMed]
49. Rupp GM, Locker J. Purification and analysis of RNA from paraffin-embedded tissues. Biotechniques. 1988;6:56–60. [PubMed]
50. Copois V, Bibeau F, Bascoul-Mollevi C, Salvetat N, Chalbos P, Bareil C, Candeil L, Fraslon C, Conseiller E, Granci V, Maziere P, Kramar A, Ychou M, Pau B, Martineau P, Molina F, Del Rio M. Impact of RNA degradation on gene expression profiles: assessment of different methods to reliably determine RNA quality. J Biotechnol. 2007;127:549–559. [PubMed]
51. Heber S, Sick B. Quality assessment of Affymetrix GeneChip data. OMICS. 2006;10:358–368. [PubMed]
52. Lee J, Hever A, Willhite D, Zlotnik A, Hevezi P. Effects of RNA degradation on gene expression analysis of human postmortem tissues. FASEB J. 2005;19:1356–1358. [PubMed]
53. Clery A, Blatter M, Allain FH. RNA recognition motifs: boring? Not quite. Curr Opin Struct Biol. 2008;18:290–298. [PubMed]
54. Li J, Smyth P, Cahill S, Denning K, Flavin R, Aherne S, Pirotta M, Guenther SM, O'Leary JJ, Sheils O. Improved RNA quality and TaqMan Pre-amplification method (PreAmp) to enhance expression analysis from formalin fixed paraffin embedded (FFPE) materials. BMC Biotechnol. 2008;8:10. [PMC free article] [PubMed]
55. Abramovitz M, Ordanic-Kodani M, Wang Y, Li Z, Catzavelos C, Bouzyk M, Sledge GW, Jr, Moreno CS, Leyland-Jones B. Optimization of RNA extraction from FFPE tissues for expression profiling in the DASL assay. Biotechniques. 2008;44:417–423. [PMC free article] [PubMed]
56. Gianni L, Zambetti M, Clark K, Baker J, Cronin M, Wu J, Mariani G, Rodriguez J, Carcangiu M, Watson D, Valagussa P, Rouzier R, Symmans WF, Ross JS, Hortobagyi GN, Pusztai L, Shak S. Gene expression profiles in paraffin-embedded core biopsy tissue predict response to chemotherapy in women with locally advanced breast cancer. J Clin Oncol. 2005;23:7265–7277. [PubMed]

Articles from The Journal of Molecular Diagnostics : JMD are provided here courtesy of American Society for Investigative Pathology