Copyright © 2008 Binder et al; licensee BioMed Central Ltd.

"Hook"-calibration of GeneChip-microarrays: Chip characteristics and expression measures

1 Interdisciplinary Centre for Bioinformatics, University of Leipzig, D-04107 Leipzig, Germany
2 Interdisciplinary Center for Clinical Research, Medical Faculty; University of Leipzig, D-04107 Leipzig, Germany
3 Max-Planck-Institute for Molecular Cell Biology and Genetics, D-01307 Dresden, Germany

Corresponding author. Hans Binder: binder/at/izbi.uni-leipzig.de; Knut Krohn: krok/at/med.uni-leipzig.de; Stephan Preibisch: preibisch/at/mpi-cbg.de

Received May 27, 2008; Accepted August 29, 2008.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

Background

Microarray experiments rely on several critical steps that may introduce biases and uncertainty in downstream analyses. These steps include mRNA sample extraction, amplification and labelling, hybridization, and scanning causing chip-specific systematic variations on the raw intensity level. Also the chosen array-type and the up-to-dateness of the genomic information probed on the chip affect the quality of the expression measures. In the accompanying publication we presented theory and algorithm of the so-called hook method which aims at correcting expression data for systematic biases using a series of new chip characteristics.

Results

In this publication we summarize the essential chip characteristics provided by this method, analyze special benchmark experiments to estimate transcript related expression measures and illustrate the potency of the method to detect and to quantify the quality of a particular hybridization.
It is shown that our single-chip approach provides expression measures that respond linearly to changes of the transcript concentration over three orders of magnitude. In addition, the method calculates a detection call judging the relation between the signal and the detection limit of the particular measurement. The performance of the method in the context of different chip generations and probe set assignments is illustrated. The hook method characterizes the RNA-quality in terms of the 3'/5'-amplification bias and the sample-specific calling rate. We show that the proper judgement of these effects requires the disentanglement of non-specific and specific hybridization which, otherwise, can lead to misinterpretations of expression changes. The consequences of modifying probe/target interactions by either changing the labelling protocol or by substituting RNA by DNA targets are demonstrated.

Conclusion

The single-chip based hook-method provides accurate expression estimates and chip-summary characteristics using the natural metrics given by the hybridization reaction, with the potential to develop new standards for microarray quality control and calibration.

1. Background

DNA microarray technology enables experiments that measure RNA-transcript abundance (so-called gene expression or expression degree) on a large scale of genomic sequences. The quality of the measurement systematically depends on experimental factors such as the performance of the measuring "device", e.g., on the chosen array-type, the design of the chip-platform and -generation and on the particular probe design, on the one hand; and also on the quality of the sample, e.g. on the source of RNA and the used hybridization-pipeline including the protocol of RNA-extraction, -amplification and -labelling, on the other hand.
Other essential factors affecting the quality of the expression measures are the quality and up-to-dateness of the genomic information probed on the chip and, last but not least, the performance of the calibration algorithm which transforms raw intensity data into suitable measures of transcript abundance. This so-called calibration step aims at removing systematic biases from the raw data which, in the ideal case, would allow the determination of the exact number of transcript copies of every probed transcript and thus direct comparison of expression measures independently of the used array type and sample preparation protocol. Apparent sources of variance can be, as for each experimental technique, divided into technical and biological ones, as well as into systematic (see above) and random ones.

The quality of the chip measurement and of the subsequent data calibration is characterized by their accuracy (the systematic bias between the measured and true expression value), precision (the uncertainty in replicated measurements), sensitivity (the expression range potentially covered by the measurement) and specificity (the selective power of the measurement to respond only to the specific targets). The development of appropriate calibration methods requires in the first instance appropriate models and metrics to identify, to assign and to quantify the biases in each measurement. In the accompanying paper we presented the basics of the so-called hook-method, a simple and intuitive approach providing a natural metric system to characterize the hybridization on a particular array.
The method divides into two essential constituents: (i) the analysis of the data in terms of the competitive two-species Langmuir hybridization model using the so-called hook-plot and (ii) the correction of the raw intensities for parasitic effects such as the non-specific hybridization, saturation and sequence-specificity to output expression measures in intrinsic units which are defined by the properties of the measuring device. The hook method is a strict single-chip calibration approach which treats each array as an independent measurement. This way the method accounts for chip-specific systematic effects which the calibration step intends to correct.

In this paper we illustrate the performance of the hook method. We present examples dealing with different issues of array-measurements: the accuracy and precision of expression measures, the comparability of array experiments for different chip-generations, the effect of updating the probe assignments using latest genomic information, of RNA-quality and of different options of the preparation protocol such as labelling reagents and the type of the labelled molecule or replacing RNA-targets with DNA. We deliberately select a relatively wide range of different problems to illustrate the power of the method to estimate various systematic effects within a unified framework of chip characteristics and to demonstrate the potential of developing new correction algorithms.

In the first part of the paper we summarize the essential chip characteristics provided by the hook-method. In the second part special benchmark experiments are analyzed to estimate transcript related expression measures. The third part deals with hybridization quality control based on the hook analysis.

2. Chip characteristics

Hook parameters

(see Figure 1)
The corrected hook-data are well fitted by the Langmuir-adsorption model which predicts the theoretical curve shown in Figure 1.
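To make the shape of such a theoretical curve more tangible, the following minimal Python sketch evaluates a competitive two-species Langmuir isotherm for a PM/MM probe pair. It is not the fitting procedure of the hook method. The function name `hook_point`, the binding-strength parameter `x_n` and all default values are assumptions chosen for illustration, and the coordinates are assumed to be Δ = log10(PM) − log10(MM) and Σ = ½·(log10 PM + log10 MM). The factors `s` and `n` stand for the PM/MM gains of specific and non-specific binding.

```python
import math


def hook_point(R, x_n=1e-3, s=9.0, n=1.15):
    """Return (sigma, delta) for a PM/MM pair under a two-species Langmuir model.

    R   : specific-to-non-specific binding ratio (S/N) of the PM probe
    x_n : non-specific binding strength of the PM probe (illustrative value)
    s, n: PM/MM gain for specific and non-specific binding (illustrative values)
    Intensities are expressed as fractions of the saturation intensity.
    """
    # Total binding strength = non-specific part + specific part.
    x_pm = x_n * (1.0 + R)
    x_mm = x_n / n + x_n * R / s
    # Langmuir isotherm: occupancy saturates at 1 for strong binding.
    i_pm = x_pm / (1.0 + x_pm)
    i_mm = x_mm / (1.0 + x_mm)
    delta = math.log10(i_pm) - math.log10(i_mm)
    sigma = 0.5 * (math.log10(i_pm) + math.log10(i_mm))
    return sigma, delta


if __name__ == "__main__":
    # Scan the S/N ratio over many orders of magnitude and print the curve.
    for exponent in range(-2, 10):
        sigma, delta = hook_point(10.0 ** exponent)
        print(f"R=1e{exponent:+d}  sigma={sigma:7.3f}  delta={delta:6.3f}")
```

In this sketch Δ starts near log10(n) when non-specific binding dominates, rises as specific binding sets in, stays below log10(s), and falls back towards zero once both probes saturate, which gives the curve its hook-like shape.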
Different parts of the hook have been assigned to distinct hybridization ranges (N, mix, S and sat; see Figure 2). The N-range of the hook-curve is characterized by the variance of the underlying probe-level data, σ, which are well described by a normal distribution. The mean specific signal of the particular hybridization, <λ>, is calculated as log-mean of the S/N-ratio of the probe sets beyond a certain threshold (e.g. R > 0.5, see below). Note that the distribution of the specific signal is well approximated by an exponential decay in many cases. Then, the characteristic "decay" constant λ defines the Σ-range over which the probability of detecting a signal decays by one order of magnitude.

Hook curves of different chip generations

(see Figure 3)
The different shapes of the uncorrected hook curves of the U95 and U133 chips, particularly the broader N-range of the former one, can be explained by the partially suboptimal quality of the probe selection for the U95-generation (which also applies to the design of the DG1-chip shown in Figure 1).

We obtained analogous results for hundreds of GeneChip expression arrays of different specifications: chip generations, species (human, mouse, rat, drosophila, rice, arabidopsis etc.) and samples (patient cohorts, cell lines, benchmark experiments) [5]. Table 2 lists typical parameter-ranges obtained in these studies. For example, the PM/MM-affinity gain for specific hybridization shows that the central mismatch of the MM causes, on average, a nearly tenfold (s ~ 7–11) increase in the sensitivity of the PM-probes compared with that of the MM. In contrast, for non-specific binding one expects on average the same sensitivity for the PM- and MM-probes. The respective PM/MM-gain parameter however indicates a small but significantly increased PM-sensitivity, n ~ 1.05 – 1.25. We tentatively attribute this effect to false positive detections in the N-range, i.e. to a certain amount of specific hybridization among the absent probes (see below). The relatively narrow data-range of the obtained hybridization characteristics reflects the common physical-chemical basis of the method which is determined by properties such as the oligonucleotide density and size of the probe spots, the common MM probe-design and hybridization conditions. A particular example which demonstrates apparent inconsistencies between the expression estimates obtained from different chip-generations will be given below.

Detection call

The onset and further increase of specific binding gives rise to a characteristic breakpoint of the hook curve which clearly separates the N- and mix-hybridization ranges.
The corresponding change of the slope of the hook curve can be rationalized in terms of relatively strongly correlated PM- and MM-intensities in the N-range which progressively "decouple" with an increasing amount of specific binding because it affects the PM much more strongly than the MM. We use the breakpoint to classify the probe sets into absent and present ones in analogy with the detection call provided by MAS5 [11].

To verify the used break-criterion in a simple illustrative fashion we analysed two special chip hybridizations. The GeneChip Yeast Genome 2.0 Array (YG 2.0) contains probe sets to detect transcripts of both of the two most commonly studied species of yeast, Saccharomyces cerevisiae and Schizosaccharomyces pombe. The YG 2.0 array thus includes 5,744 probe sets for 5,841 of the 5,845 genes present in S. cerevisiae and 5,021 probe sets for all 5,031 genes present in S. pombe. The evolutionary divergence between S. cerevisiae and S. pombe over 500 million years ago caused enough sequence divergence between the two species to require selection of separate probe sets for all genes, even the closest cross-species orthologs [12]. Due to this sequence divergence one expects only weak cross-species hybridization (see Figure 4).
The second example was taken from the Golden Spike experiment in which PCR products from a Drosophila Gene Collection referring to 3,860 probes were spiked onto Drosgenome DG1-arrays [2]. On this array 10,131 probe sets out of the total number of 14,116 are called "empty" because they are not assigned to any of the added cRNA spikes. Again the absent rate of 70% agrees with the fraction of empty probes (~72%). Selective masking of either the spiked or the empty probe sets shows that the latter ones indeed accumulate in the N-region and are called absent whereas the spikes are predominantly flagged as present (see right part in Figure 4).

The selective masking in both of these examples shows that the simple break criterion gives rise to false present calls (of potentially absent probes) of less than 5 – 7% even if one neglects cross hybridization. The break-criterion provides a sort of detection limit for the specific expression signals. The detection call thus divides the probe sets into subsets with detectable and essentially not-detectable amounts of transcripts. The false present and false absent rates depend on the degree of cross hybridization and on other factors which will be addressed below. In the next section we present other examples showing that the hook method reasonably estimates the detection limit of the particular array in terms of present and absent calls.

The alternative calling-algorithm implemented in MAS5 calculates the so-called discrimination score (DS) of each probe pair which is directly related to its Δ-value [4,11]. Then, a one-sided Wilcoxon rank test is applied to the DS-values of each probe set together with appropriate threshold-settings to estimate whether the set is present or absent. The used test strongly penalizes negative PM-MM signal differences.
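The MAS5-style call described above can be sketched in a few lines of Python. This is a simplified illustration, not the vendor implementation. The discrimination score is assumed here to be DS = (PM − MM)/(PM + MM), and the threshold `tau` and the significance levels `alpha1` and `alpha2` are assumed values of the kind commonly quoted as defaults. The function names are hypothetical. The one-sided signed-rank p-value is computed exactly by enumerating all sign patterns, which is only practical for small probe sets such as the 11 to 16 pairs typical of these arrays.

```python
from itertools import product


def discrimination_scores(pm, mm):
    """Assumed form of the discrimination score for each probe pair."""
    return [(p - m) / (p + m) for p, m in zip(pm, mm)]


def signed_rank_p(values):
    """Exact one-sided p-value of the Wilcoxon signed-rank test (H1: median > 0)."""
    d = [v for v in values if v != 0.0]
    order = sorted(range(len(d)), key=lambda i: abs(d[i]))
    ranks = [0.0] * len(d)
    i = 0
    while i < len(order):
        # Group tied absolute values and give each the average rank.
        j = i
        while j + 1 < len(order) and abs(d[order[j + 1]]) == abs(d[order[i]]):
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2.0 + 1.0
        i = j + 1
    w_obs = sum(r for r, v in zip(ranks, d) if v > 0)
    # Under H0 every sign pattern is equally likely; count those at least as extreme.
    hits = sum(
        1
        for signs in product((0, 1), repeat=len(d))
        if sum(r for r, s in zip(ranks, signs) if s) >= w_obs
    )
    return hits / 2 ** len(d)


def detection_call(pm, mm, tau=0.015, alpha1=0.04, alpha2=0.06):
    """Return (call, p) for one probe set; thresholds are assumed defaults."""
    p = signed_rank_p([ds - tau for ds in discrimination_scores(pm, mm)])
    if p < alpha1:
        return "present", p
    if p < alpha2:
        return "marginal", p
    return "absent", p
```

Because the test only looks at the signs and ranks of DS − tau, a probe set whose MM intensities exceed the PM intensities throughout is called absent regardless of how strong its total signal is.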
More than 40% of all probe pairs in the N-range have a negative PM-MM difference (so-called "bright MM", because MM > PM), whereas their percentage steeply decreases with increasing Σ and virtually disappears in the S-range of the hook [14]. This trend explains the correlation between the call-rates obtained by both methods (see next section). For the examples presented here MAS5 provides a distinctly smaller (36%) and an equal (70%) absent rate for the yeast and golden spike hybridizations, respectively.

On the other hand, the hook criterion includes both the PM-MM difference in terms of the Δ coordinate and the mean total signal in terms of Σ. The latter value adds a second threshold which prevents probe sets with relatively strong mean signals from being called absent. Moreover, the break-criterion detects the change of the mutual correlation between the PM and MM signals caused by the onset of specific hybridization rather than a certain fixed signal level. As a result, the hook-criterion "dynamically" shifts with varying signal level using the break as a simple and reasonable landmark whereas the MAS5 threshold is statically and less intuitively given in terms of p-values typically predetermined by the default settings of the used analysis program.

3. RNA-expression

Benchmark experiments with variable transcript concentration

(see Figure 5)
The hook-method provides a virtually constant fraction of absent probes independent of the dilution step (see middle part in Figure 5).

In the U133 spiked-in series of Affymetrix, a set of selected RNA-transcripts (the spikes) is added in definite concentrations to the hybridization solution [3]. The hybridization cocktail also contains an RNA-extract from HeLa-cells to mimic complex hybridization conditions (see Figure 6). Spike probe sets without specific transcripts (0 pM) and with transcripts of only tiny concentrations (< 0.5 pM) assemble mainly within the N-range of the hook curve (see Figure 6). The fit of the hook-equation provides the S/N-ratio R for each set of spiked-in probes which linearly correlates with the spiked-in concentration (Figure 6).

Expression estimates

The hook method potentially provides four alternative expression measures for each probe set: the S/N-ratio R, which is obtained from the direct fit of the transformed two-species Langmuir isotherm to the hook curve; and PMonly, MMonly and PM-MM-difference estimates which are calculated as the mean generalized logarithm of the background- and sensitivity-corrected and de-saturated signal values averaged over the background distribution. The corrections for the latter three expression values are estimated from the hook-curve analysis (see Figure 7).
It turns out that all considered methods except MMonly are comparably precise at larger transcript concentrations c_sp-in > 2 pM, at which the transcripts are safely called present (see previous paragraph). Note that the direct fit of the hook equation to the data provides the S/N-ratio which represents only a rough measure of the expression degree. The PMonly and PM-MM estimates more precisely correct the signals for the non-specific background contribution. It is therefore not surprising that these measures outperform the S/N-ratio R at smaller c_sp-in-values in terms of precision. The MMonly expression values are by far the least precise, which is also not surprising because the specific signal level and thus the sensitivity of the MM-probe intensities are smaller by nearly one order of magnitude compared with the respective PMonly and PM-MM measures at a comparable non-specific background level. The coefficient of variation of the MMonly expression estimates exceeds 2 (CV > 2) over the whole concentration range, which is beyond the maximum of the scale used in Figure 7.

The hook-measures clearly outperform the RMA-values in terms of the accuracy of the expression values. Note that RMA uses a linear intensity approximation which ignores saturation at high transcript concentrations on the one hand and corrects the intensities for non-specific hybridization using a global background level on the other hand. As a consequence, RMA systematically underestimates the change of the expression values especially at high and small transcript concentrations (see also [5] for a detailed discussion). Note that RMA represents a multichip method which processes a series of chips to adjust the probe-specific sensitivities. In contrast, the hook method provides strictly single-chip estimates which are based on the intensity information of only one particular chip.
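The two quality criteria used in this comparison can be written down compactly. The text does not state the exact formulas, so the following Python sketch rests on assumptions: precision is taken as the coefficient of variation (standard deviation divided by mean) of replicate estimates, and accuracy is summarized by the slope of log-expression versus log-concentration, where a slope of 1 corresponds to a linear response. The function names and the example numbers are hypothetical.

```python
import math
import statistics


def coefficient_of_variation(replicates):
    """CV = sample standard deviation / mean of replicate expression estimates."""
    return statistics.stdev(replicates) / statistics.mean(replicates)


def loglog_slope(concentrations, expression):
    """Least-squares slope of log10(expression) against log10(concentration)."""
    xs = [math.log10(c) for c in concentrations]
    ys = [math.log10(e) for e in expression]
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


if __name__ == "__main__":
    conc_pm = [0.5, 2.0, 8.0, 32.0, 128.0, 512.0]   # made-up spike concentrations in pM
    linear = [3.0 * c for c in conc_pm]             # ideal, proportional response
    compressed = [c ** 0.7 for c in conc_pm]        # response that underestimates changes
    print("slope (linear):    ", loglog_slope(conc_pm, linear))
    print("slope (compressed):", loglog_slope(conc_pm, compressed))
    print("CV of replicates:  ", coefficient_of_variation([9.0, 10.0, 11.0]))
```

A method that systematically underestimates expression changes, as described for RMA above, would show up in this picture as a slope below 1.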
The PM-MM estimates are the most accurate among the methods at small transcript concentrations, presumably because the explicit use of the MM intensities corrects well for sequence-specific background effects not considered by the positional dependent sensitivity model used by the hook method. In this context we explicitly refer to the so-called effect of "bright" MM, i.e. about 40–50% of negative PM-MM intensity differences on each chip [17,18]. This systematic bias has been explained by the intrinsic purine-pyrimidine asymmetry of base pairings in the non-specific DNA/RNA probe/target duplexes [14,19,20]. The sensitivity correction used by the hook method explicitly corrects the raw intensity data for this sequence effect.

Reproducibility across GeneChip-generations

Up to now a large amount of microarray data has been collected in public repositories such as GEO (Gene Expression Omnibus of NCBI) or ArrayExpress (EBI) referring to a wide variety of different conditions, specimens and array-types. One important challenge in microarray analysis is to take full advantage of these previously accumulated data, e.g., for combining different datasets to get a more comprehensive view in comparative analyses. Difficulties related to the heterogeneous character of array platforms, chip types and hybridization protocols in most cases hinder such meta-analyses. Consistencies and inconsistencies between chip platforms and -types have been previously addressed in a number of studies [21-25].

A recent study reports that even identically composed probe sets containing identical numbers and sequences of probes on different GeneChip-types can produce significantly different values of gene expression in cross-chip comparisons for samples containing the same target RNA [10]. In particular, this study compares the newer HG-U133 plus 2.0 (P-chip) with the previous-generation HG-U133A (A-chip) array.
The nearly 55,000 probe sets of the former chip integrate the more than 22,000 probe sets of the HG-U133A chip and, in addition, the probe sets of the HG-U133B array. In the study both the A- and P-arrays were hybridized with the same Universal Human Reference RNA. For subsequent comparison of the expression values the authors masked the additional probe sets on the P-chip ("not A"-probes) and processed only the common probe sets present on both chips ("A"-probes) using MAS5 and a combination of global and invariant-set normalizations (see ref. [10] for details). The analysis revealed a number of differentially expressed genes which is much larger than the number expected by chance despite the identical probes and target RNA (see Figure 8).
We re-analyzed these chip-data using the hook-method (see the left part of Figure 8). In the next step we compare the hook-curves of the P- and A-chips to identify possible differences of their hybridization characteristics. Examples of raw and corrected hooks taken from this series are shown in Figure 3.
The widths of the hooks and thus the respective level of non-specific binding are virtually the same for the P- and A-arrays. The not-A-probe sets are, on average, distinctly less expressed than the A-probe sets as indicated by the more than twice as large amount of absent probes (%N = 64% versus 29%) and the smaller decay rate of the respective density distribution (λ = 0.45 versus 0.65). The percentage of absent probe sets on the P-chip (50%) represents the average of the respective contributions of A- and not-A-probes where the not-A-probes obviously add a considerably larger amount than the A-probes. The total density distribution of the P-chip agrees well with the distribution of the not-A-probes in the N-range and with that of the A-probes in the S- and sat-ranges. In summary, the hybridizations on both chips agree well in terms of the general target properties (N-background, decay rate) but differ with respect to the general probe characteristics (%N). The latter effect simply reflects the different probe-selections of the manufacturer for each chip type.

Besides these essentially common characteristics, the hook-analysis revealed one significant difference between the chip types, namely the significantly increased height parameter α for the A-chips. This parameter characterizes the PM/MM-gain of the specific signals, or, in other words, the mean incremental effect of introducing one central mismatch into specific probe/target duplexes. Here one expects however virtually identical α-values for the A- and P-chips because the mismatch design and the nominal probe length are identical for both array-types.
On the other hand, subtle deviations from the nominal probe design owing to deficiencies of fabrication and/or variations of the hybridization conditions in different preparations can affect the observed maximum PM/MM ratio: For example, the in-situ synthesis of the GeneChip probes usually produces a non-negligible fraction of truncated probe-oligomers not synthesized to full nominal length. This effect gives rise to systematic deviations from the Langmuir isotherm and, more importantly, it will affect the PM/MM-gain because the relative effect of one middle-mismatch is expected to increase with decreasing length of the probe oligomers [26,27]. Also the post-hybridization washing step upon chip preparation is expected to affect the apparent PM/MM-ratio and the binding law as well [28,29].

We suggest that subtle differences of the hybridization law due to details of chip-manufacturing and/or handling of the chips upon preparation as well as evolving instrumentation and instrument protocols give rise to slightly biased expression data between different array types and/or different batches of chips of the same type. The latter conclusion was derived from another chip series for which we observed a reversed relation of the PM/MM-gain, namely a larger value for the P-array compared with the A-array [5] (see also the two A-chips in Figure 3).

Updated probe sets

One possible approach to partially level out chip-type specific differences is the matching of the probe sets of different array types using genomic sequence information updated with respect to the original probe set assignment of the manufacturer. Recent studies show that significant percentages of existing GeneChip probe set definitions are no longer consistent with gene and transcript assignments in current versions of public databases.
The probe identity issue is of critical importance, as it significantly affects the expression values summarized on probe set level and thus their interpretation and understanding [30,31]. Dai et al. [30] performed a reanalysis of probe and probe set annotations resulting in publicly available, regularly updated probe set definitions for most of the GeneChip-types. A series of probe selection and grouping criteria utilizing the latest sequence and annotation information taken from databases such as RefSeq or Ensembl (gene, transcript and exon based) are applied. (i) This filtering removes "bad" probes either without or with multiple perfect match hits along the genomic sequence and, (ii) it re-arranges "redundant" probe sets addressing the same gene, transcript or exon into one probe set. The resulting updated probe sets contain variable numbers of probes ranging from four to more than thirty. The mean probe set size is increased for gene- and transcript-related sets (e.g., for the HG-U133A array: Ensembl (gene) ~ 14.9; Ensembl (transcript) ~ 13.9; RefSeq ~ 14.9) and decreased for exon-related sets (Ensembl (exon) ~ 9.3) compared with the original Affymetrix set definition (NetAffx ~ 11.1).

In Figure 9 [...] of about 2.1 are very similar for all considered cases. This result indicates that the expression degrees of present probe sets located in the mix-, S- and sat-ranges of the original hooks remain, on average, essentially unchanged after updating the probe sets.

The amelioration of the probe sets masks out a certain amount of "bad", i.e. falsely annotated or ambiguous probes and merges redundant probe sets (see above). As a consequence, the fraction of absent probe sets notably decreases from 34% (A-chip) and 50% (P-chip) to about 20% in both cases (see Table 3). The percentage of probe utilization inversely correlates with the reduction of the amount of absent probes detected by the hook method between the original and updated probe sets (see Table 3).
For example, about 70% of the probes of the A-chip but only 50% of the P-chip are used after updating the gene-annotations. The obtained common percentage of absent probe sets of 20% reflects the consistent filtering criteria applied to both chip types. Indeed, the verification of probe sets based on genomic sequence data comes out with similar percentages of modified and not-modified probe sets sharing the same target in the original and updated probe set definitions.

In summary, the verification of probe sets increases the amount of the probe sets detected as present ones on the one hand. Hence, the hook-calling criterion automatically removes the "bad" probe sets from further analysis. On the other hand, the mean expression degree and the hybridization characteristics reported by the ensemble of probes synthesized on the chip remain virtually unaffected by the redefinition step. Comparison of the updated expression measures of the slightly diverging probe sets shown in Figure 8 [...]

4. Hybridization control

Assessment of data quality is an important component of the analysis pipeline for gene expression microarray experiments. Essentially all steps of RNA-preparation (extraction, amplification, in-vitro transcription, labelling), hybridization, washing and signal detection can have significant effects on the extracted "apparent" expression values seen between different samples with consequences for subsequent downstream applications. There are, for example, "technical" factors associated with the correction for background fluorescence owing to bleed-over effects from surrounding probes on the arrays [32], or to spatial artefacts [33,34]. Another kind of effect is linked with the RNA integrity and the used amplification and labelling protocols [35-39]. In this section we demonstrate the potential of the hook-analysis to detect and to estimate variations of the data owing to RNA-quality, the effect of substitution of cRNA by cDNA and of the labelling protocol.
RNA-amplification bias

The amplification step of cRNA-preparation uses reverse transcriptase primers starting from the 3'-end of the original mRNA resulting in a population of 3'-biased, truncated transcript fragments. This 3'-overrepresentation gives rise to the systematic lowering of signal-intensities when the position of the probes shifts towards the 5'-end [35,40,41]. Hence, the probes designed for detecting one and the same transcript apparently report a progressively decreasing expression degree with increasing distance from the 3'-end of the transcripts. This is potentially detrimental for the expression value of the probe set summarized from individual probe-level data.

To illustrate the consequences of the 3'-biased amplification on the hook-data we ranked each probe in each probe set according to its position from the 3'-end, calculated the Δ- and Σ-coordinates as average value over probes no. #1 – #4 (subset closer to the 5'-end), #8 – #11 (subset closer to the 3'-end) and #1 – #11 (total probe set) and presented the hook-plots, the density distributions and the total Σ-coordinates as a function of the "sub-Σ-values", Σ_sub, in Figure 10.
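The subset averaging just described can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the authors' code: the hook coordinates are assumed to be Δ = log10(PM) − log10(MM) and Σ = ½·(log10 PM + log10 MM), averaged over the probes of a subset, and the 11 probe pairs are assumed to be supplied already ordered so that probe #1 lies closest to the 5'-end and probe #11 closest to the 3'-end. The function names and the intensity values are hypothetical.

```python
import math


def hook_coords(pm, mm):
    """Mean (sigma, delta) over the given probe pairs (assumed coordinate definitions)."""
    log_pairs = [(math.log10(p), math.log10(m)) for p, m in zip(pm, mm)]
    sigma = sum(0.5 * (lp + lm) for lp, lm in log_pairs) / len(log_pairs)
    delta = sum(lp - lm for lp, lm in log_pairs) / len(log_pairs)
    return sigma, delta


def sub_hooks(pm, mm):
    """Hook coordinates of the 5'-subset (#1-#4), 3'-subset (#8-#11) and total set."""
    if len(pm) != 11 or len(mm) != 11:
        raise ValueError("this sketch expects probe sets with exactly 11 probe pairs")
    return {
        "5prime": hook_coords(pm[0:4], mm[0:4]),    # probes #1 - #4
        "3prime": hook_coords(pm[7:11], mm[7:11]),  # probes #8 - #11
        "total": hook_coords(pm, mm),               # probes #1 - #11
    }


if __name__ == "__main__":
    # Made-up intensities that rise towards the 3'-end, mimicking a 3'-biased sample.
    pm = [100, 110, 120, 130, 200, 260, 330, 500, 650, 800, 1000]
    mm = [90, 95, 100, 105, 120, 130, 140, 160, 180, 200, 220]
    for name, (sigma, delta) in sub_hooks(pm, mm).items():
        print(f"{name:7s} sigma={sigma:.3f} delta={delta:.3f}")
```

Applied to every probe set of a chip, the three coordinate pairs would give the total hook and the two "sub-hooks" that can then be compared with each other.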
As an example, Figure 10 compares two biological replicates A and B of total RNA prepared from rat muscle hybridized on rat genome RG-230 GeneChip arrays. Before microarray analysis RNA integrity and concentration were examined on an Agilent 2100 Bioanalyzer (Agilent Technologies, Palo Alto, CA, USA) using the RNA 6000 LabChip Kit (Agilent Technologies) according to the manufacturer's instructions. Quantification of 28S and 18S ribosomal RNA before target amplification using the T7-protocol (see [42,36] and [43] and references cited therein) revealed virtually equal RNA quality from both preparations according to the 28S/18S-ratios of 1.45 (sample A) and 1.43 (B) (see Figure 10).

The total hook and the 3'- and 5'-"subhooks" of each sample are well described by the same theoretical function using a common set of parameters (see middle panel in Figure 10). [...] It reflects the larger average strength of specific binding owing to the larger fraction of specific full-length transcripts.

The smaller 3'/5'-ratio in the S-range, the smaller expression index and the larger decay constants, λ, of sample B compared with sample A reveal a generally larger fraction of specific transcripts due to more complete amplification and thus a better RNA-quality. The hook-analysis also reveals that the larger fraction of full length transcripts in sample B is accompanied by a slightly smaller width of the hook and a smaller fraction of absent probes (see middle panel of Figure 10). The narrowing of the hook upon improvement of RNA quality indicates a larger relative amount of non-specific binding. This trend seems peculiar because one might expect that the larger amount of specific transcripts reduces the amount of non-specific binding.
However, the more efficient amplification step in sample B results in a higher total number of full length transcripts and/or in a larger binding constant for non-specific binding and thus in an increased binding strength of non-specific binding which, in turn, gives rise to the increased level of cross-hybridization as indicated by the slightly narrower hook-curve. Note however that the decreased quality of the RNA-amplification only weakly shifts the rising branch of the hook curve, in contrast to the overall dilution effect shown in Figure 5.

Microarrays of the GeneChip design contain special probe sets for estimating the 3'/5'-amplification bias. They refer to relatively long transcripts such as β-actin and GAPDH with probe sets targeting the transcription of their 3'-, mid- (m), and 5'-regions. Small 3'/5'-signal ratios are generally thought to indicate small amplification bias and thus good amplification quality (see Figure 11).
In summary, the 3'/5'-ratios of the respective control probe sets are obviously insufficient for judging the amplification bias because non-specific hybridization keeps the signal of the 5' probe set at the same level as that of the 3' probe set, which misleadingly suggests good amplification quality. Consideration of the hook-coordinates of these probes and, more reliably, analysis of 3'-biased "sub-hooks" enables the separation of the N- and S-hybridization ranges and this way a clear identification of the 3'/5'-amplification bias.

Tissue specific RNA quality and normalization of microarray data

Measurement of gene expression is based on the assumption that an analyzed RNA sample closely represents the amount of transcripts in vivo. Transcripts show stability differences of up to several orders of magnitude raising the possibility that partial degradation during cell lysis and sample preparation causes a transcript-specific bias in the expression measures in addition to the amplification bias discussed in the previous section [37]. Different RNA quality measures, such as the 28S/18S ratio, the RNA integrity number (RIN) or a degradometer-score have been developed, verified (see [43] and references cited therein for an overview) and related to different microarray hybridization characteristics [36,39,42]. It was shown that the decrease in RNA integrity is often paralleled by the decrease of the percentage of present calls [37,39] which implies the reduction of the expression degree for degraded transcripts. Other studies however reveal more puzzling results, either with virtually no effects of degradation on expression or with opposite correlations between RNA-quality and weak and strong signals where the former ones increase and the latter signals decrease the worse the RNA becomes [38].
The integrity of the RNA extracted from different tissues systematically depends, among other factors, on the type of the tissue possibly and partly because of variations of the content and the activity of ribonucleases [37,39]. Estimation of RNA-quality and, if possible, appropriate correction for tissue-specific biases are thus essential steps in establishing tissue-specific expression profiles.

In Figure 12
For more detailed analysis we select two samples with relatively large and small percentages of absent probe sets, the RNA of which was extracted from superior cervical ganglion cells (scg) and from peripheral blood/dendritic cells (dc) (see arrows in Figure 12a).

The different shapes of the hook curves cannot be explained by a smaller amount of RNA (e.g. due to a smaller yield of cRNA synthesis), less-efficient labelling and/or suboptimal calibration of the scanner. In these cases one expects the shift of the "whole" hooks without considerable change of their width and of the decay of the density distribution (compare, e.g., with Figure 5).

The hook-coordinates of selected probe sets are highlighted by symbols in Figure 12

In parts e and f of Figure 12

These trends partly explain the puzzling results of a recent correlation analysis between signal intensities and the degree of RNA-degradation [38]: Our data show that, on the one hand, degradation of RNA increases the non-specific background level, with the consequence that the intensities of probes with small specific signal contributions effectively increase. On the other hand, the specific binding strength decreases upon RNA-degradation, with the consequence that the signals of strongly expressed transcripts decrease. The former effect mainly affects weak intensities whereas the latter effect is more relevant for stronger total signals. Both opposite effects contribute to the intensity of each probe with specific weights, giving rise to increased, decreased or even unchanged total signals.

In parts e-g of Figure 12

These qualitative discrepancies between both approaches uncover a fundamental problem of microarray normalization with no satisfactory solution yet (see, e.g., [45]). Note that in their analysis the authors used MAS5 together with global median normalization of the raw intensities [44].
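The two opposing effects of RNA degradation described above can be illustrated with a minimal numerical sketch. It assumes, purely for illustration, that the total probe intensity is the sum of a non-specific background and a specific contribution, and that degradation scales the background up and the specific part down by fixed factors. The function name, the factors and the intensity values are hypothetical and are not taken from the data discussed here.

```python
def total_signal(specific, background, bg_gain=1.0, sp_loss=1.0):
    """Total intensity as non-specific background plus specific signal.

    bg_gain (> 1) raises the background and sp_loss (< 1) lowers the
    specific part; both are hypothetical stand-ins for RNA degradation.
    """
    return background * bg_gain + specific * sp_loss


background = 100.0           # assumed non-specific level (arbitrary units)
bg_gain, sp_loss = 1.5, 0.5  # assumed degradation factors

for specific in (10.0, 100.0, 5000.0):
    intact = total_signal(specific, background)
    degraded = total_signal(specific, background, bg_gain, sp_loss)
    if degraded > intact:
        trend = "increased"
    elif degraded < intact:
        trend = "decreased"
    else:
        trend = "unchanged"
    print(f"specific={specific:7.1f}  intact={intact:7.1f}  "
          f"degraded={degraded:7.1f}  {trend}")
```

Under these assumed factors the weakly expressed probe gains intensity, the strongly expressed probe loses intensity, and the intermediate one is unchanged, which mirrors the qualitative argument in the text. A normalization that treats the total signal as a single quantity cannot distinguish these cases.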
The vertical bars in part b of Figure 12

Labelling protocol

In addition to the quality of start-RNA and the amplification bias there are other methodological differences such as the labelling reaction that can introduce systematic biases.

Figure 13
The sensitivity profiles of the N-hybridization range are very similar for both labelling protocols, with differences of less than 20% of the respective sensitivity value. Similar results were reported previously using either Biotin-UTP or Biotin-CTP [47]. The sensitivity terms decompose additively into "binding" contributions related to the effective free energy of the respective base pairing, and into a fluorescence contribution taking into account base-specific labelling [20]. Labelling is expected to decrease the binding contribution (because the bulky label disturbs the base-base interactions) and to increase the fluorescence contribution [19,20]. The obtained position-dependent sensitivity profiles reveal that labelling has, if at all, only little effect. On the other hand, the width of the hook curve and the decay constant of the density distribution for the Affy-protocol slightly exceed the respective values for the Enzo-labelling at identical percentages of absent probes (~33%) and at identical optical background levels in both preparations. The observed differences indicate the slightly smaller amount of non-specific binding and the stronger specific binding of the former preparation. Hence, the Affy-protocol performs slightly better than the previous Enzo-labelling because it reduces the non-specific background level and increases the effective binding strength for specific binding, in this way giving rise to both better specificity and better sensitivity of the method [26], in agreement with the results of special benchmark experiments [46]. The molecular origin of the observed differences is presently not clear and requires further analyses. Note, however, that the Enzo-protocol introduces a significantly higher fraction of biotinylated nucleotides with potentially deteriorated binding affinities, which provides a tentative explanation of the observed trends.
The stronger specific binding caused by the Affy-protocol is paralleled by stronger saturation effects at high intensities which, in turn, give rise to systematic differences between the S-sensitivity profiles of both preparations: the profiles of cytosine (C) and guanine (G) shift systematically towards smaller sensitivities whereas the T- and especially the A-profiles shift in the opposite direction. This vertical "compression" of the profiles was previously observed [20]. It reflects the fact that the stronger Watson-Crick base pairings of the C- and G-nucleotides are, on average over all probes, more affected by saturation than the pairings of T and especially A, which form weaker bonds. Note also that the saturation effect is much smaller for the MM, as expected. These results reveal that the hook-algorithm only incompletely corrects the individual probe intensities for saturation effects, probably because the intensity asymptote upon complete saturation is not a chip constant but a sequence- and thus probe-specific property owing to washing effects [29,48].

Replacing RNA targets with DNA

Microarray technology takes advantage of either of two types of chemical entities as the labelled target, RNA or DNA, considered to be virtually equivalent for the purpose of expression analysis. RNA is usually hybridized on "conventional" expression arrays whereas especially newer GeneChip generations such as exon- and tiling-expression arrays as well as genomic SNP- and re-sequencing-arrays use DNA-targets.

Figure 14

Inspection of the hook-curves reveals several effects caused by the substitution of RNA by DNA: Firstly, the sensitivity correction affects the hook-curve of the DNA-hybridization to a much lesser extent (compare the corrected and raw hooks).
For example, the width of the N-range of the raw RNA-hook (ΔΣ(N) ≈ 0.7) considerably exceeds that of the respective DNA-hook (ΔΣ(N) ≈ 0.3) whereas after correction the N-widths shrink to virtually identical values in both cases (ΔΣ(N) ≈ 0.2). Secondly, DNA/DNA hybridisation shifts the whole hook, and especially the background level, to smaller abscissa values, indicating a smaller mean intensity level; thirdly, substitution of RNA by DNA slightly increases the width of the hook (β) and the decay constant of the density distribution in the S-range (λ); and fourthly, it slightly reduces the vertical dimension of the hook (α). Moreover, the sensitivity profiles also indicate characteristic differences: in particular, the profiles for guanine (G) provide the largest contributions for DNA-binding to DNA-probes whereas the cytosine profiles are the largest in most cases for RNA-binding. The different target-entities give rise to D(NA)/R(NA)- and D/D-base pairings in the target/probe-duplexes and to R/R- and D/D-interactions for bulk duplexing of the targets in solution. The thermodynamic stability of specific 27-meric oligomer-duplexes was found to follow the order D/D < D/R < R/R with free energy ratios (37°C) of ΔG(D/D)/ΔG(D/R) ≈ 0.9 and ΔG(R/R)/ΔG(D/R) ≈ 1.3 [50]. Note that the PM/MM-gain α ≈ log(s) approximately refers to the free energy increment of one Watson-Crick pairing in 25-meric probe/target duplexes if one neglects the specific mismatch contribution. The decreased PM/MM-gain (α) of the DNA-hybridization thus corresponds to the weaker association of D/D versus D/R, where the ratio α(D/D)/α(D/R) ≈ 0.85 ± 0.05 roughly agrees with the expected free energy ratio. The slightly larger width of the DNA-hook indicates the smaller non-specific binding strength of the D/D-duplexes.
This difference and the larger variability of the RNA-hybridization were attributed to relatively stable, mismatched "G•u-wobble" base pairings in the non-specific R/D-duplexes (the lower case letter refers to the target, the upper case letter to the probe), which give rise to less specific binding and stronger scattering of the background compared with D/D hybridizations without such relatively stable mismatched pairings [49]. The latter D/D-hybridization is consequently more specific than the R/D-hybridization, as indicated by the larger decay constant (see Figure 14).

The sensitivity profiles also indicate systematic differences of base-pair interactions in both hybridizations. In particular, the relative values of the G- and A-profiles for the D/D-duplexes are considerably larger than those for the D/R-duplexes. Exactly this trend is expected from the relative interaction strength of canonical Watson-Crick pairings in the respective duplexes: D/D-pairings are symmetrical with respect to "bond-reversals" (i.e. C•g ≈ G•c > A•t ≈ T•a) in contrast to "unsymmetrical" D/R-interactions (C•g > G•c ≈ T•a > A•u) [19,20,50-52]. Hence, for D/D-duplexes one expects the relative enhancement of the G and A sensitivity terms compared with those in the D/R duplexes, in agreement with the observed profiles. Note, however, that the sensitivity profiles refer to effective binding strengths which include surface and bulk interactions as well [26,53]. Such effects give rise to specific differences between the S- and N-profiles, especially of the RNA-preparation, which implies relatively strong R/R-interactions in the respective bulk duplexes.

4. Summary and Conclusion

We presented a new method of microarray data analysis based on a physical model.
This so-called hook method pre-processes the raw intensity data for further downstream analyses on the one hand and, on the other hand, provides chip characteristics with potential applications in hybridization quality control and array normalization. In this publication we illustrate the diagnostic potential of the hook-method by means of different chip- and transcript-related characteristics in various situations:

- Using the data of spiked-in and dilution experiments it was shown that our single-chip approach provides accurate and precise expression measures over three orders of magnitude in units of the specific binding strength of the transcripts. The correction for saturation and probe-specific non-specific background assures linearity between the input (transcript concentration) and output (expression degree) measures. Among the four alternative measures, the PMonly and PM-MM-difference measures perform best, but the measure extracted from the S/N-ratio also provides satisfactory results.
- The "present/absent"-concept of detection calls originally introduced by Affymetrix provides straightforward, simple and helpful information which relates the signal of each transcript to the detection limit of the particular hybridization and, in addition, characterizes the mean "presence" of transcripts in the hybridization solution. The hook-method calculates an analogous measure based on the break-criterion reflecting the onset of specific hybridization. This criterion implicitly takes into account the different correlations between the PM and MM probes upon non-specific and specific hybridization and thus "dynamically" adapts to each particular hybridization. We have shown that this criterion classifies transcripts well into present and absent ones, using data taken from the two-species yeast 2.0 array and from the golden spike experiment with known batches of "empty" probes.
- The hook method performs reasonably well when comparing expression data of the same origin between two chip generations (HG-U133A and HG-U133 plus 2.0). The hook-diagnosis suggests that subtle differences of the hybridization law due to details of chip-manufacturing and/or -handling upon preparation give rise to slightly biased expression data between different array types and/or different batches of chips of the same type.
- The re-assembly and filtering of probe sets based on improved genomic information increases the number of probe sets detected as present. This result in turn shows that the hook-calling criterion applied to the original probe set definitions partly removes the "bad" (because of inconsistent probe assignments) probe sets from further analysis. The mean hybridization characteristics remain virtually unaffected by the redefinition step of the probe sets. The consequences of probe set-updating for the expression measures on transcript level will be studied separately.
- The 3'-biased RNA amplification gives rise to progressively decreased specific hybridization of probes with increasing distance of their position from the 3'-end of the transcript, which can be detected by hook-analysis using appropriate subsets of probes nearer to the 3'- and the 5'-end, respectively. This analysis properly differentiates between specific and non-specific hybridization, where the latter is, by definition, not affected by the 3'-biased intensity effect. Our data show that overall 3'/5'-signal ratios not considering the difference between specific and non-specific binding can lead to misinterpretations of the amplification bias.
- Hook analysis reveals detailed insights into consequences of tissue-specific RNA-quality differences on hybridization and expression measures.
Degradation of RNA increases the fraction of absent probes, paralleled by the decrease of the specific binding strength and counterbalanced by the increase of non-specific background hybridization. Improper separation of both opposite effects can mimic expression changes in the wrong direction. We suggest that the chip characteristics provided by the hook method can serve as calibration benchmarks for alternative normalization algorithms which take into account the different behaviour of the specific and non-specific signal in samples of varying RNA-quality.
- The variation of the labelling protocol and the substitution of RNA-targets by DNA modify the probe/target interactions. Hook analysis shows, for example, that DNA-targets and, to a smaller degree, the Affy-labelling protocol (no labelling of cytosines) improve the specificity of the method compared with RNA-targets and the previous Enzo-protocol, respectively. For DNA-targets the sequence correction is of much smaller impact because of the smaller sequence-induced variability of the raw intensities.

In summary, sequence correction and especially the quantification of the non-specific background contribution for each probe enable subtle diagnosis of the hybridization on each array. To extract this information the hook method combines the intensities of each PM/MM-probe pair and utilizes the different properties of both probe types. Here the MM behave like "weak-affine" PM and serve as intrinsic reference for the PM over the whole potential concentration range of the transcripts. We illustrated that this intrinsic referencing might be extremely useful for dealing with practical issues of expression analysis such as RNA-quality, hybridization control and calibration of expression measures. This publication outlined several potential applications of the method which will be addressed in our future work.

Competing interests

The authors declare that they have no competing interests.
Authors' contributions

HB designed and leads the project, carried out most of the analyses and wrote the paper. SP wrote the computer program for hook analysis and helped to draft the paper. KK added experimental expertise and helped to draft the paper. All authors read and approved the final manuscript.

Acknowledgements

The work was supported by the Deutsche Forschungsgemeinschaft under grant no. BIZ 6-1/4 and by grants from the Interdisciplinary Centre for Clinical Research at the Faculty of Medicine of the University of Leipzig (project Z03 to K.K.). SP thanks the International Max Planck Research School for Molecular Cell Biology and Bioengineering (IMPRS-MCBB) Dresden for funding.

References