Copyright © The Author 2005. Published by Oxford University Press. All rights reserved.

Genome-wide estimation of transcript concentrations from spotted cDNA microarray data

1 Department of Biostatistics, Institute of Basic Medical Sciences, University of Oslo, Norway
2 Department of Mathematics, University of Oslo, Norway
3 Norwegian Computing Center, Oslo, Norway
4 Department of Mathematics and Computer Science, Technische Universiteit Eindhoven, The Netherlands
5 Department of Radiation Biology, Health Enterprise Rikshospitalet-Radiumhospitalet, Oslo, Norway

* To whom correspondence should be addressed at Department of Biostatistics, University of Oslo, PO BOX 1122 Blindern, N-0317 Oslo, Norway. Tel: +4722851004; Fax: +4722851313; Email: frigessi/at/medisin.uio.no

Received May 31, 2005; Revised August 28, 2005; Accepted August 28, 2005.

The online version of this article has been published under an open access model. Users are entitled to use, reproduce, disseminate, or display the open access version of this article for non-commercial purposes provided that: the original authorship is properly and fully attributed; the Journal and Oxford University Press are attributed as the original place of publication with the correct citation details given; if an article is subsequently reproduced or disseminated not in its entirety but only in part or as a derivative work this must be clearly indicated. For commercial re-use, please contact journals.permissions/at/oxfordjournals.org

Abstract

A method providing absolute transcript concentrations from spotted microarray intensity data is presented. The number of transcripts per µg of total RNA or mRNA, or per cell, is obtained for each gene, enabling comparisons of transcript levels within and between tissues.
The method is based on Bayesian statistical modelling incorporating available information about the experiment from target preparation to image analysis, leading to realistically large confidence intervals for estimated concentrations. The method was validated in experiments using transcripts at known concentrations, showing accuracy and reproducibility of estimated concentrations, which were also in excellent agreement with results from quantitative real-time PCR. We determined the concentration for 10 157 genes in cervix cancers and a pool of cancer cell lines and found values in the range of 10^5–10^10 transcripts per µg total RNA. The precision of our estimates was sufficiently high to detect significant concentration differences between two tumours and between different genes within the same tumour, comparisons that are not possible with standard intensity ratios. Our method can be used to explore the regulation of pathways and to develop individualized therapies, based on absolute transcript concentrations. It can be applied broadly, facilitating the construction of the transcriptome, continuously updating it by integrating future data.

INTRODUCTION

Recent developments in molecular techniques, such as serial analysis of gene expression (SAGE), massively parallel signature sequencing (MPSS) and microarray technology, have opened the way for genome-wide exploration of the transcriptome (1–3). Such data increase our understanding of complex biological processes and diseases and are becoming useful in the design of molecular therapies (4). SAGE and MPSS provide quantitative and comparable measures of the transcript abundance, whose universality allows for integration into future studies. The complexity of SAGE and MPSS has, however, limited their utility (5). Efficient production of spotted glass-slide arrays has made microarray technology a widespread technique that is more suitable for high-throughput analysis.
The technique has provided valuable information on the relative transcript levels in tissues, but differences in experimental protocols and normalization methods make direct comparison of datasets between microarray studies very difficult (6). Improved methods to extract useful information from such data that lead to absolute rather than relative transcript concentrations would be of high value (6–8), facilitating the building of a universal transcript database. This is the goal of several public data repositories, including, for example, the Gene Expression Omnibus (GEO) (http://www.ncbi.nlm.nih.gov/projects/geo/) and SAGEmap (http://sagemap.wr.usgs.gov/index.asp). Extraction of absolute transcript levels from spotted microarray data is complicated owing to significant experimental variation and noise originating in the production and hybridization processes (7–9). The use of probes with different length and base composition, leading to differences in hybridization efficiency between probes, makes assessment of absolute levels difficult. Most analyses are based on intensity ratios between two biological samples, hybridized together in a single experiment. Normalization of the ratios reduces the influence of systematic effects, though absolute levels are lost, as well as possibly important biological information (10–12). Analysis based on intensities per se rather than ratios opens the way to calculating accurate transcript levels. We have developed a model based on a new principle that enables estimation of absolute transcript levels on a genome-wide scale by extended exploitation of microarray data. Once the concentrations have been estimated, new analyses are possible, including within-sample comparison, merging of datasets with a design lacking connectivity or based on amplified and non-amplified starting materials, cross-platform and cross-species comparisons and more general meta-analyses.
The technique was thoroughly validated on datasets with known mRNA concentrations. Moreover, we estimated the transcript concentrations of 10 157 genes and expressed sequence tags (ESTs) in 12 cervix cancers and a pool of 10 human cancer cell lines, and found values consistent with quantitative real-time PCR (qRT-PCR) data and with previously published data (13). We generated new views into the transcriptome, by comparing transcript abundance between genes or groups of genes within a population. The model follows the different steps of the microarray experiment, incorporating information associated with array, cDNA synthesis, hybridization and scanning characteristics. We computed the joint posterior distributions of the absolute transcript levels of all genes, describing dependencies between genes, both within and between individual samples. Uncertainties from sample preparation to imaging were coherently propagated in a global statistical approach, leading to realistically large confidence intervals around estimated concentrations. Few methods quantifying transcript concentrations from spotted microarray data have been developed so far. The approach proposed by Dudley et al. (14) requires hybridization of each sample with a reference of known concentration. Other methods rely on calibration of each array with additional techniques (15). The present method is the first to quantify absolute transcript levels from spotted microarray data without the need for calibration of each sample or gene individually. There are a few quantitative methods based on in situ synthesized arrays (16,17) and, notably, (18), which takes an empirical Bayesian approach, but the data produced from them are scarce, probably because of limited access to such arrays. The possibility of directly using the spotted microarray technology for the estimation of absolute transcript concentrations opens the way to a more comprehensive generation of transcript databases.
Results reported here were based on spotted cDNA microarrays, which feature particularly large experimental variation. Our technique can also be directly applied to spotted oligoarrays and can handle experiments based on amplified as well as non-amplified material.

MATERIALS AND METHODS

Principles

The idea is to follow conceptually the mRNA molecules through the microarray experiment, from cDNA synthesis to hybridization and subsequent washing (Figure 1).
Basic data are the average fluorescence intensities, background corrected or not, and standard deviations of each spot on the microarray slide. Intensities should be within the linear range of the scanner, and saturated intensities should either be excluded or corrected (19). Neither transformation nor normalization is performed. Non-connected datasets are allowed as long as the design includes at least one loop, such as a self–self or dye–swap hybridization. About 50 genes must be spotted at least in duplicate; this number is independent of the number of genes in the analysis, but is related to the identifiability of probe and pen effects. Our method succeeds in obtaining absolute concentrations because it makes explicit use of probe- and spot-related covariates, such as probe length and quantity, to describe probe-dependent hybridization efficiency. By means of duplicate spotting, we have many transcripts with more than one probe, and the effect of probe-dependent covariates can be estimated and further incorporated into the estimates of concentrations for genes spotted only once. Experiments with amplified material are handled like those with non-amplified material, but estimates are transformed back to the original scale. The model is simple and natural. We performed Bayesian estimation of its parameters and calculated the posterior joint distribution of all absolute transcript concentrations using Markov Chain Monte Carlo (MCMC; http://www.statslab.cam.ac.uk/~mcmc/) (20). This distribution reflects biological variation of and dependencies between the numbers of transcripts. Based on this distribution we estimated the number of transcripts for each gene in each sample together with their uncertainty, described by 95% credibility intervals. The values were given in terms of number of transcripts per µg of total RNA, mRNA or per cell, depending on the experimental protocol.
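As an illustration of the last step, the following is a minimal sketch of how a point estimate and a 95% credibility interval could be read off MCMC output for a single gene. It assumes the sampler has already produced a list of posterior draws of the transcript concentration. The function names `quantile` and `summarize_posterior`, the use of the posterior median as the point estimate and the equal-tailed form of the interval are illustrative assumptions and are not taken from the authors' software.

```python
from statistics import median


def quantile(sorted_xs, q):
    """Quantile of an already sorted list, using linear interpolation (0 <= q <= 1)."""
    pos = q * (len(sorted_xs) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_xs) - 1)
    frac = pos - lo
    return sorted_xs[lo] * (1 - frac) + sorted_xs[hi] * frac


def summarize_posterior(draws, level=0.95):
    """Posterior median and equal-tailed credibility interval from MCMC draws."""
    xs = sorted(draws)
    tail = (1 - level) / 2
    return {
        "median": median(xs),
        "lower": quantile(xs, tail),
        "upper": quantile(xs, 1 - tail),
    }


if __name__ == "__main__":
    # Hypothetical posterior draws (transcripts per µg total RNA) for one gene.
    draws = [2.1e7, 2.6e7, 1.9e7, 3.4e7, 2.8e7, 2.2e7, 4.0e7, 2.5e7]
    print(summarize_posterior(draws))
```

In practice the draws would come from the post burn-in part of the chain, and the same summary would be repeated for every gene and sample.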
Covariates

The steps of the microarray experiment were modelled as a binomial selection process, using covariates associated with cDNA synthesis, dye labelling, purification, hybridization and washing (Figure 1). Covariates associated with scanning were dye, photo multiplier tube (PMT) voltage and the scanner amplification factor. The dye covariate represented the dye effect in both labelling and scanning. The amplification factor was a measure of the increase in intensity per unit of increase in PMT voltage. The factor was determined once for each dye and scanner as the slope in log-linear plots of intensity versus PMT voltage (19). A covariate associated with image analysis was the hybridization factor, used to scale the estimated values to the true number of transcripts. It was determined with weighted linear regression of estimates versus true values in a dataset based on samples with known transcript concentrations. Such control samples in general show a more efficient cDNA synthesis and dye labelling than biological samples, owing to the high purity of these molecules. The samples are therefore not useful for calibration of intensities. However, after cDNA synthesis and labelling, the binding behaviour to the array slides of cDNA synthesized from the control samples resembles that of the cDNA from biological samples. This step is not dependent on the quality of the applied mRNAs, justifying the use of control samples for validation and determination of the hybridization factor. Under ordinary stable experimental settings it is sufficient to determine this factor once for each hybridization machine.

Statistical methods

The known quantity of material for sample t on array a is denoted as q_{t,a}, e.g. the weight of mRNA after amplification or of total RNA, as in our study on cervix cancer. For each gene g, let In (i), the
In (ii), the expected scanned intensity on spot s, array a, is modelled as
In (iii), we assume for the pixel-wise intensity measurement
In the statistical analysis of several arrays and samples, many of the unknown parameters are shared, like array, dye, pen, gene and probe related effects; all data involving sample t contribute information on the unknowns

Microarray experiments

Microarray slides produced at the Microarray Facility at Health Enterprise Rikshospitalet-Radiumhospitalet were used. The slides contained 18 432 spots printed with 32 pens. The probes were human cDNAs of known genes and ESTs, selected from the Research Genetics 40K I.M.A.G.E. clone selection (Invitrogen). Probe length ranged from 525 to >2000 bp. A total of 17 DNA control probes (Lucidea Universal ScoreCard, Amersham Biosciences) were printed in equal amounts on six subarrays. These control spots were used for validation of our method and for determination of the hybridization factor optimal for the experiments on cancers and cell lines. Samples from 12 cervix cancers (FIGO stages 2b–4a) and a pool of 10 cancer cell lines, originating from mammary gland and cervix adenocarcinoma, liver hepatoblastoma, testis embryonal carcinoma, glioblastoma, melanoma, liposarcoma, lymphoma, B-lymphocyte myeloma and T lymphoblast leukaemia, were analysed. Total RNA was isolated from the tumours by use of Trizol reagent (Life Technologies), whereas total RNA from the cell lines was commercially available (Stratagene). Labelled cDNA was synthesized from 20 µg total RNA using Superscript II transcriptase (Life Technologies) and Fairplay Microarray Labeling kit (Stratagene) in the presence of either Cy3-dUTP or Cy5-dUTP (Amersham Pharmacia). Each tumour sample was co-hybridized with the cell line sample in a dye–swap design, yielding a total of 24 microarrays, which were analysed jointly. Two control samples, each containing 17 different mRNA sequences pre-mixed at specific concentrations, were included in the hybridization mixture for analysis of the control spots (Lucidea Universal ScoreCard, Amersham Biosciences).
A total of 0.5 µl of each control sample was used, corresponding to a number of transcripts in the range of 5.8 × 10^5–5.8 × 10^9. RNA purity was optimal and equal for all samples and was therefore not currently used in our model. The slides were imaged at a resolution of 10 µm using an Agilent G2565BA scanner (Agilent Technologies). The laser power and the PMT voltage were 100%. Saturated spot intensities were corrected as described previously (19). Spot and background intensities were quantified using the GenePix 4.1 image analysis software (Axon Instruments). Bad spots, regions with high unspecific binding of dye and weak spots that were not automatically detected by the software were filtered out and excluded from further analysis. The background signal was very low for control spots, and hence no background correction of intensities was necessary. For other spots, we performed the analysis on intensities both with and without background correction (background subtraction). A detailed description of materials is given in Supplementary Data 4.

Quantitative real-time PCR

The estimated transcript concentrations of eight genes that covered the whole concentration range were compared with data obtained by use of qRT-PCR. The TaqMan PCR system (Applied Biosystems) with a 7500 Sequence Detector (Perkin-Elmer) was used. cDNA was synthesized from 2 µg of the total RNA used in the microarray experiments by use of Superscript II transcriptase (Life Technologies). Pre-designed, gene-specific TaqMan probe and primer sets (Applied Biosystems), consisting of a specific fluorogenic probe and a pair of oligonucleotides, were used to run standard qPCRs for CDK4, CTNNB1, HK2, MYC, CSTA, PPT2, CCND1 and PDK2 (Supplementary Data 5). We employed 1 ng cDNA for all genes except the low-abundance genes CCND1 and PDK2, for which 10 ng cDNA was used to increase the signal. The reactions were carried out in triplicate in a 25 µl reaction volume and a 96-well plate format.
The transcript concentration of each gene was calculated using the standard curve method and presented relative to the expression of TBP, which served as an internal, endogenous control (22).

RESULTS

Validation of the methodology

Two different dye–swap experiments using samples with known transcript concentrations bound to the control spots of our arrays were used for validation of our method. The spot intensities covered the whole detection range, from near background values to saturation. There was a highly linear relationship between estimates and true numbers of transcripts in double logarithmic plots (Figure 2).
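The kind of linearity check described above can be sketched as an ordinary least-squares fit on log10-transformed values. This is only an illustration under stated assumptions: the function name `loglog_fit` and the input numbers are hypothetical, and the unweighted fit shown here is a simplification that is not claimed to be the regression used by the authors.

```python
import math


def loglog_fit(true_counts, estimates):
    """Least-squares line through (log10 true, log10 estimate) pairs.

    Returns (slope, intercept, r_squared). All inputs must be positive.
    """
    xs = [math.log10(v) for v in true_counts]
    ys = [math.log10(v) for v in estimates]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


if __name__ == "__main__":
    # Hypothetical spike-in counts and hypothetical estimates for them.
    true_counts = [5.8e5, 5.8e6, 5.8e7, 5.8e8, 5.8e9]
    estimates = [7.0e5, 5.1e6, 6.3e7, 5.5e8, 6.4e9]
    print(loglog_fit(true_counts, estimates))
```

A slope close to 1 with a high r_squared would indicate that estimates track the true numbers proportionally over the whole range.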
We used the hybridization factor c = 4.31 × 10^−10 determined from the experiment in Figure 2A. We performed a further validation of our method by comparing the transcript concentrations of eight genes in cancer cell lines and primary tumours with corresponding data obtained with qRT-PCR. The transcript concentrations of 10 157 genes and ESTs in 12 cervical tumours and in a pool of 10 cancer cell lines were determined. Eight genes covering the whole concentration range were selected for qRT-PCR. We found a clear and strong correspondence between the qRT-PCR data and the transcript concentrations determined with our method (Figure 3A).
Standard log-ratio expressions based on normalized intensities of tumour and cell line samples also showed a significant correlation to qRT-PCR data (Figure 3B).

Transcript concentrations in cancer cell lines and cervix tumours

To test our model further, we analysed the transcript concentrations of 10 157 genes and ESTs determined for 12 cervical tumours and a pool of 10 cancer cell lines. We listed estimated concentrations for each gene and for each tumour, each with its 95% confidence interval, in a table available at http://www.nr.no/pages/samba/area_emr_smbi_transcount. Similarly for the cancer cell lines, we reported estimated concentrations and confidence intervals for each gene in a second table, which also includes the mean concentrations of the cervix tumours. This table is available at the same web site. Concentrations are reported based on intensities both with and without background correction. Background-corrected concentrations are systematically slightly smaller than uncorrected ones, the difference being minimal and of interest only for low concentrations <3 × 10^6. The transcript distributions were skewed, with a heavy tail towards higher concentrations for both the cell lines and tumours (Figure 4).
Transcript concentrations in the cell lines were estimated precisely, since the sample was hybridized 24 times. Much more data were available here than for the 12 tumour samples, each hybridized only twice. Reported 95% credibility intervals for each gene in the cell lines represent the precision achievable with our technique, given this level of replication of the experiment. Credibility intervals for each of the 12 cervix tumours depict the uncertainty of our estimates based on a single dye–swap. The accuracy of these estimated concentrations depends on the number of spots available for each gene. The standard deviation of the log-concentration for each of the 10 157 genes and ESTs in the 12 cervix tumours ranged between 3.6 and 9.82. The corresponding coefficient of variation ranged between 0.53 and 1.059. There were only 217 genes with a coefficient of variation >1. This shows that our confidence intervals have a good precision. Biological variability between the 12 cervix tumours is described by the spread of the estimated concentrations. These could differ by a factor of 10–100 between tumours (Figure 5).
Our method allows us to compare the transcript concentration of different genes within a single sample. The accuracy was high enough for detection of significant differences within individual tumours, differences that were consistent with qRT-PCR data. This is exemplified in Figure 7.
DISCUSSION

We have developed a method for estimating precisely the transcript concentration of individual genes directly from spotted microarray intensity data. The method allows comparison of the concentrations of different transcripts within and between single tumours, which opens the way to new insight into transcript dosage and pathway regulation. In contrast, standard ratio estimation only allows comparison of the same transcript between two tumour samples. The method is generally applicable, since the information about each microarray experiment required to achieve satisfactorily accurate estimates is easily available, though currently rarely made public. We encourage experimenters to make all data describing the experimental procedure available, submitting both channels separately, scanning parameters and measures of probe quantity to public repositories for microarray data like Arrayexpress (http://www.ebi.ac.uk/arrayexpress/), GEO and Cibex (http://cibex.nig.ac.jp/index.jsp). Since spotted microarrays are a widespread technology, our method will then facilitate the introduction of novel approaches to the study of the transcriptome. Our method is based on four main ideas: we incorporate an extended number of covariates compared with other models (7); we treat unequal numbers of replicates per gene; we use the binomial process, which better depicts experimental dynamics and allows for estimation of the critical parameters β_0, β_g and α_Cy3/α_Cy5; and we avoid normalization and imputation of missing values and build a bottom-to-top coherent stochastic model, fully propagating uncertainty. These elements were crucial for achieving reliable estimates of transcript concentrations. In datasets based on known transcript concentrations we demonstrated a high accuracy of our estimates, especially at intermediate concentrations. Our results were better than in Dudley et al. 
(14), which reported a significant discrepancy between estimated and true concentrations both at intermediate and lower levels. The accuracy was in fact comparable with that achieved from methods based on in situ synthesized arrays (16,17), even though that technology uses standardized manufacturing and hybridization, so that probe-specific biases are highly reproducible and predictable (25). Moreover, in datasets based on cervix tumours and cancer cell lines we found concentrations ranging from 10^5 to 10^10 transcripts per µg total RNA. Assuming an RNA content of ~1 µg per 10^5 cells (26), this corresponds to a range of 1–10^5 transcripts per cell. Previously published data on transcript numbers in humans are scarce. Zhang et al. (27) reported numbers ranging from 1 to 5300 per cell in human cancers, as determined by SAGE. However, higher numbers, up to 30 000 transcripts per cell, have been reported for genes in the mouse liver (13,28), making the number of 10^5, estimated for the highly abundant genes in our study, plausible. The skewed form of the transcript distributions observed in our work is also in agreement with earlier reports (27), and may be caused by underestimation at low concentrations, since the corresponding weak spots on the array are more frequently excluded than the bright ones. Comparison of our estimates and standard log-ratios with qRT-PCR data for a limited number of genes suggested that our estimates were more reliable than the ratios in reflecting the transcript levels. Discrepancies between microarray and PCR data have been reported with concern in previous studies (29). Our approach overcomes some of these difficulties, extracting information from microarray data that is more compatible with PCR results. Note that standard qRT-PCR itself does not provide accurate absolute expression levels (30).
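The per-cell range quoted above follows from a simple unit conversion, sketched here under the stated assumption of about 1 µg total RNA per 10^5 cells (26). The function name `transcripts_per_cell` and its default argument are illustrative only.

```python
def transcripts_per_cell(per_ug_total_rna, cells_per_ug=1e5):
    """Convert transcripts per µg total RNA into transcripts per cell.

    cells_per_ug is the assumed number of cells that yield 1 µg of total RNA.
    """
    return per_ug_total_rna / cells_per_ug


if __name__ == "__main__":
    # Lower and upper ends of the concentration range reported in the text.
    print(transcripts_per_cell(1e5))
    print(transcripts_per_cell(1e10))
```

With a different assumed RNA content per cell, the per-cell numbers shift proportionally, so the conversion is only as reliable as that assumption.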
The accuracy of our method was high enough to detect significant differences in the transcript concentration within individual tumours as well as between tumours, differences that were consistent with qRT-PCR data. Our estimates therefore reliably revealed true transcript concentrations with satisfactory precision. There are limitations to our methodology. Cross-hybridization and unspecific binding are not taken into account, and possible splice variants for some of the genes or the degree of homology between probe sequence and RefSeq sequence have not been considered. Currently, no analysis tools for microarray data address these aspects. Other covariates could easily be included in our model when available, such as target length and labelling efficiency, probably leading to higher accuracy in the estimates. Also of importance is the slow convergence of the currently implemented MCMC algorithm. Results can require up to a few days of computation time. Our software may require some ad hoc implementation, specific to new datasets and covariates. A major advantage of our model is that it can be directly applied to one- or multi-colour experiments and to data from spotted oligoarrays, using base composition of the probes as covariates rather than the probe length. Moreover, the hierarchical structure enables integration of biological information about the samples, such as patient survival data, and known interactions between genes, in a coherent Bayesian setting. If the mRNA weight is not available, and significant variability in the proportion of mRNA in total RNA is suspected, or if the hybridization factor is not available, it is possible to scale each sample so that the sums of estimated transcripts are equal. Direct comparison of such scaled concentrations is still possible between and within samples, but the interpretation as absolute concentrations is lost. Our method possesses several further beneficial features.
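The scaling fallback mentioned above, in which each sample is rescaled so that the sums of estimated transcripts are equal, can be sketched as follows. The function name `scale_to_common_total`, the dictionary layout and the choice of the mean of the sample totals as the default common total are assumptions made for this illustration; the text does not specify which common total is used.

```python
def scale_to_common_total(samples, target_total=None):
    """Rescale each sample so that its estimated transcripts sum to the same total.

    samples: dict mapping sample name -> dict mapping gene -> estimated concentration.
    target_total: common total to scale to; defaults to the mean of the sample totals.
    """
    totals = {name: sum(genes.values()) for name, genes in samples.items()}
    if target_total is None:
        target_total = sum(totals.values()) / len(totals)
    return {
        name: {gene: conc * target_total / totals[name] for gene, conc in genes.items()}
        for name, genes in samples.items()
    }


if __name__ == "__main__":
    # Hypothetical estimates for two samples and two genes.
    samples = {
        "tumour_1": {"MYC": 4.0e7, "CDK4": 1.0e7},
        "tumour_2": {"MYC": 2.0e8, "CDK4": 3.0e8},
    }
    print(scale_to_common_total(samples))
```

Ratios between genes within a sample are unchanged by this scaling, which is why within-sample comparison remains possible even though the absolute interpretation is lost.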
No normalization or imputation of missing values is needed. Our model automatically performs unsupervised normalization, much like ANOVA-based methods (11), since the main factors are present in Equation 1; in addition, we explicitly incorporate more sources of variability, including scanning. Current normalization methods are often platform-dependent and based on hypotheses about gene expression that are difficult to test. Misuse of normalization is rather common in practice (31). The need for balanced designs, also for linear mixed effect models (11), often leads to discarding genes or requires imputation of missing values. Current methods for imputation fail if values are not missing at random or if the proportion of missing values exceeds 20% (32). Our method does not impute missing values but can directly handle unbalanced datasets. Another characteristic of our method is that few constraints are imposed on the experimental design. The reference design is common because it allows in-house re-utilization of results. However, it requires stable reference samples and leads to low statistical power, while the reference is needlessly measured many times (33). Our method allows for re-utilization of results without the need for a reference. Thus it opens up new possibilities for meta-analyses (34). Such analyses are currently built on top of statistical tests to detect differential expression (35,36). Since the result of these tests may depend on experimental protocol and microarray platform, bias may lead to wrong conclusions. With our method, data from different studies can be combined at the basic level of transcript concentrations, regardless of whether studies use amplified or non-amplified starting material, cDNA or oligonucleotide platforms.
By making more experimental information publicly available, such as spot intensities and standard deviations of each channel, scanner settings and measures of probe quantity and sequence length, all microarray data can be re-used in new investigations, leading to a better exploitation of the data and more precise results. In particular, our method may contribute to new insight into the regulation of pathways and be useful in the development of improved therapeutic strategies in which knowledge of the absolute concentrations is directly utilized.

SUPPLEMENTARY DATA

Supplementary data are available at NAR online.
Acknowledgments

We thank L. Holden, E. Hovig, M. Langaas, O. Myklebost, T. Speed, T. Stokke and B. Ylstra for useful discussions, and V. Nygard for help with the software interface. Financial support was provided by The Norwegian Research Council (FUGE Bioinformatics; StAR; GeneStat), The Norwegian Microarray Consortium, Health Enterprise Rikshospitalet-Radiumhospitalet, The Norwegian Cancer Society and The Dutch BSIK/BRICKS Consortium.

Conflict of interest statement. None declared.

REFERENCES

1. Velculescu V.E., Zhang L., Vogelstein B., Kinzler K.W. Serial analysis of gene expression. Science. 1995;270:484–487.
2. Brenner S., Johnson M., Bridgham J., Golda G., Lloyd D.H., Johnson D., Luo S., McCurdy S., Foy M., Ewan M. Gene expression analysis by massively parallel signature sequencing (MPSS) on microbead arrays. Nat. Biotechnol. 2000;18:630–634.
3. Brown P., Botstein D. Exploring the new world of the genome with DNA microarrays. Nature Genet. 1999;21:33–37.
4. Lash A.E., Tolstoshev C.M., Wagner L., Schuler G.D., Strausberg R.L., Riggins G.J., Altschul S.F. SAGEmap: a public gene expression resource. Genome Res. 2000;10:1051–1060.
5. Polyak K.P., Riggins G.J. Gene discovery using the serial analysis of gene expression technique: implications for cancer research. J. Clin. Oncol. 2001;19:2948–2958.
6. Holloway A., van Laar R., Tothill R., Bowtell D. Options available—from start to finish—for obtaining data from DNA microarrays II. Nature Genet. 2002;32:481–489.
7. Butte A. The use and analysis of microarray data. Nature Rev. Drug Discov. 2002;1:951–960.
8. Slonim D. From patterns to pathways: gene expression data analysis comes of age. Nature Genet. 2002;32:502–508.
9. Churchill G. Fundamentals of experimental design for cDNA microarrays. Nature Genet. 2002;32:490–495.
10. Quackenbush J. Microarray data normalisation and transformation. Nature Genet. 2002;32:496–501.
11. Kerr M., Martin M., Churchill G. Analysis of variance for gene expression microarray data. J. Comput. Biol. 2000;7:819–837.
12. Newton M., Kendziorski C., Richmond C., et al. On differential variability of expression ratios: improving statistical inference about gene expression changes from microarray data. J. Comput. Biol. 2001;8:37–52.
13. Barth R., Gross K., Gremke L., Hastie N. Developmentally regulated mRNAs in mouse liver. Proc. Natl Acad. Sci. USA. 1982;79:500–504.
14. Dudley A., Aach J., Steffen M.A., Church G.M. Measuring absolute expression with microarrays with calibrated reference sample and an extended signal intensity range. Proc. Natl Acad. Sci. USA. 2002;99:7554–7559.
15. Townsend J., Hartl D. Bayesian analysis of gene expression levels: statistical quantification of relative mRNA level across multiple strains or treatments. Genome Biol. 2002;3:research 0071.1–0071.16.
16. Held G., Grinstein G., Tu Y. Modeling of DNA microarray data by using physical properties of hybridization. Proc. Natl Acad. Sci. USA. 2003;100:7575–7580.
17. Hekstra D., Taussig A., Magnasco M., Naef F. Absolute mRNA concentrations from sequence-specific calibration of oligonucleotide arrays. Nucleic Acids Res. 2003;31:1962–1968.
18. Dror R., Murnick J., Rinaldi N., Marinescu V., Rifkin R., Young R. Bayesian estimation of transcript levels using a general model of array measurement noise. J. Comput. Biol. 2003;10:433–452.
19. Lyng H., Badiee A., Svendsrud D.H., Hovig E., Myklebost O., Stokke T. Profound influence of non-linearity in microarray scanners on gene expression ratios: analysis and procedure for correction. BMC Genomics. 2004;5:10.
20. Beaumont M., Rannala B. The Bayesian revolution in genetics. Nature Rev. Genet. 2004;5:251–261.
21. Peterson A., Heaton R., Georgiadis R. The effect of surface probe density on DNA hybridization. Nucleic Acids Res. 2001;29:5163–5168.
22. Mocellin S., Rossi C., Pilati P., Nitti D., Marincola F. Quantitative real-time PCR: a powerful ally in cancer research. Trends Mol. Med. 2003;9:185–195.
23. Wong Y., Selvanayagam Z., Wei N., Porter J., Vittal R., Hu R., Lin Y., Liao J., Shih J., Cheung T., et al. Expression genomics of cervical cancer: molecular classification and prediction of radiotherapy response by DNA microarray. Clin. Cancer Res. 2003;9:5486–5492.
24. Zeeberg B., Feng W., Wang G., Wang M., Fojo A.T., Sunshine M., Narasimhan S., Kane D., Reinhold W., Lababidi S., et al. GoMiner: a resource for biological interpretation of genomic and proteomic data. Genome Biol. 2003;4:R28.
25. Li C., Wong W. Model-based analysis of oligonucleotide arrays: expression index computation and outlier detection. Proc. Natl Acad. Sci. USA. 2001;98:31–36.
26. Ausubel F.M., Brent R., Kingston R.E., Moore D.D., Seidman J.G., Smith J.A., Struhl K. Current Protocols in Molecular Biology. Vol 1. Canada: John Wiley and Sons, Inc.; 2000.
27. Zhang L., Zhou W., Velculescu V.E., Kern S.E., Hruban R.H., Hamilton S.R., Vogelstein B., Kinzler K.W. Gene expression profiles in normal and cancer cells. Science. 1997;276:1268–1272.
28. Hastie N., Held W., Toole J. Multiple genes coding for the androgen-regulated major urinary proteins of the mouse. Cell. 1979;17:449–457.
29. Etienne W., Meyer M., Peppers J., Meyer R. Comparison of mRNA gene expression by RT–PCR and DNA microarray. Biotechniques. 2004;36:618–626.
30. Shih S., Robinson G., Perruzzi C., Calvo A., Desai K., Green J., Ali I., Smith L., Senger D. Molecular profiling of angiogenesis markers. Am. J. Pathol. 2002;161:35–41.
31. Yang Y., Dudoit S., Luu P., Lin D.M., Peng V., Ngai J., Speed T. Normalisation for cDNA microarray data: a robust composite method addressing single and multiple slide systematic variation. Nucleic Acids Res. 2002;30:e15.
32. Troyanskaya O., Cantor M., Sherlock G., Brown P., Hastie T., Tibshirani R., Botstein D., Altman R. Missing value estimation methods for DNA microarrays. Bioinformatics. 2001;17:520–525.
33. Townsend J. Multifactorial experimental design and the transitivity of ratios with spotted DNA microarrays. BMC Genomics. 2003;4:41.
34. Moreau Y., Aerts S., De Moor B., De Strooper B., Dabrowski M. Comparison and meta-analysis of microarray data: from the bench to the computer desk. Trends Genet. 2003;19:570–577.
35. Rhodes D.R., Barrette T.R., Rubin M.A., Ghosh D., Chinnaiyan A.M. Meta-analysis of microarrays: interstudy validation of gene expression profiles reveals pathway disregulation in prostate cancer. Cancer Res. 2002;62:4427–4433.
36. Choi J.K., Yu U., Kim S., Yoo O.J. Combining multiple microarray studies and modeling interstudy variation. Bioinformatics. 2003;19(Suppl. 1):i84–i90.