Proc Natl Acad Sci U S A. Sep 12, 2006; 103(37): 13629–13634.
Published online Sep 1, 2006. doi:  10.1073/pnas.0601476103
PMCID: PMC1559406
Applied Biological Sciences

Accurately quantifying low-abundant targets amid similar sequences by revealing hidden correlations in oligonucleotide microarray data

Abstract

Microarrays have enabled the determination of how thousands of genes are expressed to coordinate function within single organisms. Yet applications to natural or engineered communities where different organisms interact to produce complex properties are hampered by theoretical and technological limitations. Here we describe a general method to accurately identify low-abundant targets in systems containing complex mixtures of homologous targets. We combined an analytical predictor of nonspecific probe–target interactions (cross-hybridization) with an optimization algorithm that iteratively deconvolutes true probe–target signal from raw signal affected by spurious contributions (cross-hybridization, noise, background, and unequal specific hybridization response). The method was capable of quantifying, with unprecedented specificity and accuracy, ribosomal RNA (rRNA) sequences in artificial and natural communities. Controlled experiments with spiked rRNA into artificial and natural communities demonstrated the accuracy of identification and quantitative behavior over different concentration ranges. Finally, we illustrated the power of this methodology for accurate detection of low-abundant targets in natural communities. We accurately identified Vibrio taxa in coastal marine samples at their natural concentrations (<0.05% of total bacteria), despite the high potential for cross-hybridization by hundreds of different coexisting rRNAs, suggesting this methodology should be expandable to any microarray platform and system requiring accurate identification of low-abundant targets amid pools of similar sequences.

Keywords: microbial ecology, optimization algorithm, cross-hybridization, free energy, rRNA

Biological systems typically contain families of homologous genes with varying degrees of sequence similarity. Because such related genes often have acquired new function or play roles in fine-tuning biological responses, it is important to accurately differentiate their expression. This is particularly difficult in multiorganism assemblages, such as microbial communities, where new approaches increasingly demonstrate that hundreds to thousands of genes with high sequence similarity can coexist (1–3). To understand the functional significance of such diversity or to use genes as markers for the occurrence of different organisms, it will be necessary to accurately detect specific targets in complex pools of similar sequences. Although microarrays are, in principle, well suited for this type of analysis, their intrinsic problems are exacerbated by this complexity (4–7), despite recent advances in experimental design (8).

In all microarray analyses, accuracy of specific signal detection is compromised by: (i) nonspecific interactions among probes and noncomplementary targets (cross-hybridization; refs. 9 and 10), (ii) stochastic variability in the measurements (noise; ref. 11), (iii) background level of the system (5), and (iv) unequal signal intensity among different complementary probe–target pairs for the same target abundance (unequal specific response; refs. 12 and 13). In single-organism expression analysis, some of these spurious contributions have been addressed by optimization of probe and array design (14–17) and by hybridization signal analysis using several probabilistic and free-energy-based models (6, 13, 18–20). These are primarily based on differential hybridization responses of short oligonucleotides (25-mer) differing by a single central mismatch. In this system, cross-hybridization is modeled as part of the noise/background, because low sequence identity among transcripts causes nonspecific interactions to be low. In addition, multiple probe pairs per target allow signal averaging and outlier rejection for improved signal quantitation. However, no current models fully meet the challenges presented by multiorganism assemblages, where large numbers of nontarget sequences with high similarity to the target can be present. In this situation, the specific target may be at such low abundance that the sum of even small levels of cross-hybridization in addition to noise and background may completely mask the specific signal. Moreover, none of the available analytical tools can easily be transposed among different technological platforms.

We reasoned the above challenge would best be addressed by explicitly determining and simultaneously accounting for all spurious contributions that obscure the underlying physical relationship between microarray signal intensity and target abundance. This led us to design an optimization algorithm based on a system of nonlinear equations, which express raw hybridization signal intensity through the true signal of a specific target and spurious signal contributions (Eq. 1; Materials and Methods). By iteratively solving these equations using best-linear-unbiased-estimation (BLUE) theory (Eq. 2; Materials and Methods), the algorithm determines the most likely values for all spurious contributions and the actual target abundances. These are initially inferred assuming a linear relationship between signal intensity and target abundance; however, because the signal saturates for higher target abundances as described by the Langmuir function (Eq. 3; Materials and Methods; see refs. 21–23), this relationship can be used to accurately estimate target abundances over the entire range. Furthermore, the algorithm depends on a priori knowledge of the cross-hybridization probability between any probe–target pair, yet experimental measurement is intractable for the number of probe–target interactions possible on typical microarrays. We therefore developed an analytical predictor (Eq. 4; Materials and Methods) to estimate the probability of cross-hybridization between any probe–target pair differing by an arbitrary number and location of mismatches. The estimate is based on the average free energy of binding ($\overline{\Delta G}$) of realizations of different probe–target interactions calculated from their sequence identity (Eq. 5; Materials and Methods). This probability is then used by the algorithm to calculate the cross-hybridization signal observed for each probe as a function of abundances of targets.

We focused on 50-mer oligonucleotides to test the ability of the methodology to determine and compensate for cross-hybridization between sequences differing by a range in number and location of mismatches. We then tested the method through a series of experiments aimed at differentiating ribosomal RNA (rRNA) of closely related bacteria of the genus Vibrio in both artificially assembled and natural coastal communities. This is a particular challenge, because (i) rRNAs are difficult to differentiate due to their high conservation (24, 25), and (ii) thousands of different rRNA genes have been shown to coexist within coastal communities, of which specific Vibrio taxa typically represent <0.1% (2, 26, 27).

Results and Discussion

We first validated the performance of the analytical cross-hybridization predictor against experimentally obtained data. We calculated all 19,321 (139 × 139) possible cross-hybridization values for the 139 probes contained on our test microarray (Eqs. 4 and 5). The probes were 50-mers targeting 16S and 23S rRNA regions of model Vibrio species and reference bacteria, and ranged in sequence identity from 14 to 100%. When the calculated values were compared with a total of 1,233 experimentally determined cross-hybridization values, they showed good correlation with r² = 0.996 and standard error <0.007 (Fig. 1). From these data, it is estimated that sequences 68–95% identical to a given probe will cross-hybridize to that probe with a probability of 0.01–0.7. Because the analytical predictor models cross-hybridization probability as a continuous function of sequence identity and is independent of target abundance, it gains in generality over previous approaches; these had experimentally established sequence identity cutoffs (75–87%), below which cross-hybridization can be considered negligible (28–32). However, these cutoffs were experimentally determined for specific target identities and relative abundances, so they are not easily transposed. In addition, our model introduces average $\overline{\Delta G}$ (rather than ΔG) for the prediction of probability of cross-hybridization among oligonucleotides of essentially any length. Although the model performance should be validated for oligonucleotides of differing length, we predict it should be feasible, because ΔG has already been shown to correlate well with the hybridization intensity of both long (50- to 70-mer; refs. 30, 31, and 33) and short (18- to 25-mer; refs. 12, 20, and 34) oligonucleotides. Thus, the method should be easily transposable to other microarray platforms.

Fig. 1.
Performance of the analytical cross-hybridization predictor. (a) Comparison of cross-hybridization probability obtained by analytical predictor (Eq. 4) with experimental data obtained by spike-in microarray experiments where total RNA of five different ...

The ability of the methodology to accurately identify specific targets was tested in a series of three sets of experiments representing increasingly realistic and complex conditions. The first set of experiments consisted of spiking known amounts of Vibrio cholerae RNA into artificial communities at 0.05–0.1% of total RNA by varying the amount of both total and spiked RNA (Fig. 2). Duplicate and triplicate samples of artificial RNA communities containing 12.5, 1.25, and 0.5 μg of eukaryal RNA with 6.25, 1.25, and 0.25 ng of V. cholerae RNA (between 0.05% and 0.1% of the total RNA) were labeled and further hybridized (see Materials and Methods for details).
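
As a quick check of the spike levels, the fraction of V. cholerae RNA in each artificial community can be recomputed from the amounts quoted above. The short sketch below is illustrative only; the function and variable names are ours, and the spike is treated as negligible relative to the eukaryal background when forming the percentage.

```python
# Spiked V. cholerae RNA (ng) paired with eukaryal background RNA (ug),
# as quoted in the text for the three artificial communities.
communities = [(6.25, 12.5), (1.25, 1.25), (0.25, 0.5)]


def spike_percent(spike_ng, background_ug):
    """Spike as a percentage of the background RNA (1 ug = 1000 ng)."""
    return 100.0 * spike_ng / (background_ug * 1000.0)


percents = [spike_percent(spike, background) for spike, background in communities]
for (spike, background), percent in zip(communities, percents):
    print(f"{spike} ng in {background} ug -> {percent:.2f}% of total RNA")
```

The three communities come out at 0.05%, 0.10%, and 0.05% of total RNA, respectively.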

Fig. 2.
V. cholerae (V.C.) signal within artificial communities before (a–c) and after (d–f) application of the algorithm (Eqs. 1 and 2). Artificial RNA communities contained 12.5, 1.25, and 0.5 μg of eukaryal RNA with 6.25, 1.25, and ...

The artificial RNA community illustrates the dramatic effect of noise and even low cross-hybridization probability when the specific target is at low abundance, even though the background here consists only of eukaryal RNA (Fig. 2 a–c). Although V. cholerae RNA present at high amounts (6.25 ng per array hybridization) was discernible even without application of the algorithm (Fig. 2a), cross-hybridization and noise completely masked the specific signal at lower amounts (1.25 and 0.25 ng per array hybridization), incorrectly suggesting the presence of additional bacteria (Fig. 2 b and c). Conversely, implementation of the algorithm (Eqs. 1 and 2) correctly predicted that V. cholerae was the only bacterium present even for the lowest spike (0.25 ng; Fig. 2 d–f). This is feasible, because the algorithm effectively identifies correlations in signal intensities for probes with various degrees of cross-hybridization to V. cholerae RNA to detect the presence of V. cholerae; the lack of such correlations for bacteria other than V. cholerae enables the algorithm to correctly predict the absence of other bacteria. This scenario is analogous to identification of hidden correlations in noisy data, which is a well-known problem in biology (e.g., the analysis of circadian rhythm), medicine (e.g., electrocardiogram interpretation), astronomy, and stock market analysis.

The relationship between the abundances returned by the algorithm, θalg, and the true abundances follows the Langmuir function (Eq. 3), in which three regimes can be identified: a linear regime for low target abundance (0.25–20 ng), a nonlinear regime (20–60 ng), and a plateau regime for high target abundances, where the hybridization signal intensity saturates. Thus, an estimation of target abundance from microarray data, θest, can be obtained from θalg by solving the Langmuir equation. The accuracy of θest in estimating true target abundances depends on the regime it falls in. In the linear regime, the average relative error is 7.8% of the true abundance, and the lowest RNA abundance of 0.25 ng (0.05% of total RNA) was detected with an accuracy of ±0.086 ng and a relative error of 34.4%; in the nonlinear regime, the average relative error is 28.5%.
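
For reference, the relative error quoted for the lowest spike follows directly from the absolute error and the true abundance given above (a worked restatement, not an additional result):

```latex
\[
  \text{relative error}
  = \frac{0.086\ \text{ng}}{0.25\ \text{ng}} \times 100\%
  = 34.4\%
\]
```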

In a second set of experiments, the accuracy of identification of specific targets was further tested under near natural conditions by spiking known amounts of V. cholerae RNA at realistic concentrations into coastal water samples (2–150 ng of V. cholerae total RNA contributing 0.1–1% of total RNA). This tests the high probability of cross-hybridization, because at the same site, hundreds of bacterial rRNA variants including multiple Vibrio taxa other than V. cholerae have been detected (2, 27). In all cases, V. cholerae was correctly identified and followed the same relationship observed for the artificial communities (Fig. 3). Thus, the methodology overall returned quantitative results over at least a 250-fold range from 0.25 to 60 ng of RNA, above which the signal saturates; however, the small error of 0.086 ng for the lowest measurement suggests the detection limit can potentially be lower. This dynamic range correlates well with the 500-fold dynamic range measured for single organism expression studies using 60- and 70-mer microarrays (33, 35) and is >10-fold higher than described for environmental studies using both oligonucleotide and cDNA microarrays (29, 36, 37).

Fig. 3.
Quantification of spiked V. cholerae RNA in artificial and natural RNA communities (Eqs. 1–5). RNA abundances returned by the algorithm (Eqs. 1 and 2) show Langmuir dependence on true abundances (Eq. 3) with linear (0.25 ng ≤ true RNA ...

The implementation of the methodology enabled accurate signal detection near the principal physical limit of the array scanner used in our experiments. The lowest detectable RNA abundance of 0.25 ± 0.086 ng corresponds to a specific signal of 7 ± 2.6 CCD counts for our microarray instrumentation [ArrayWoRx scanner (Applied Precision); 1–60,000 CCD counts]. Thus, this methodology retrieved quantitative signal at the level approaching a single count, which is the physical limit of sensitivity and accuracy of any microarray scanner. Although the characteristics of other microarray instrumentation may differ, this suggests that the application of the algorithm can significantly improve the sensitivity obtained using other experimental platforms.
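
The link between nanograms and CCD counts can be checked against the calibration given in Materials and Methods (about 30 counts per nanogram of target for a perfect-match pair). The sketch below is a rough consistency check under the assumption that this single calibration constant applies at the lowest spike; the variable names are ours.

```python
# Calibration quoted in Materials and Methods: ~30 CCD counts per ng of target.
counts_per_ng = 30.0

# Lowest detected abundance and its absolute error, in ng.
lowest_ng = 0.25
error_ng = 0.086

signal_counts = counts_per_ng * lowest_ng   # expected specific signal
error_counts = counts_per_ng * error_ng     # expected uncertainty in counts

print(f"signal ~ {signal_counts:.1f} counts, error ~ {error_counts:.2f} counts")
```

This gives roughly 7.5 ± 2.6 counts, close to the 7 ± 2.6 CCD counts reported above and far below the 60,000-count ceiling of the scanner.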

The last set of experiments was designed to test the ability of this methodology to differentiate closely related organisms within a real microbial community by evaluating the presence of several Vibrio (16S rRNA >95% sequence identity) for which the array contained probes. A total of seven taxa were identified in coastal seawater samples taken in June and August (Fig. 4). RNA of these taxa ranged from 1.80 to 60.07 pg/ml seawater representing between 0.005% and 0.15% of total community RNA with generally increasing representation during warmer water conditions. These observations are consistent with the relative abundance of the same Vibrio taxa measured by quantitative PCR (QPCR) at this and analogous sites (26, 27); we sought to further validate the quantification for Vibrio splendidus using reverse transcriptase QPCR (RT-QPCR). The estimates showed good agreement between microarray and RT-QPCR data with 10.12 ± 1.73 and 5.74 ± 0.08, and 3.28 ± 0.59 and 0.96 ± 0.12 copies of 16S rRNA (×10⁶) per milliliter of seawater for June and August, respectively. These results therefore suggest that the new method extends both the sensitivity and accuracy of identification of low-abundant targets; previously reported detection limits were between 1% and 5% of the total RNA or DNA within communities (31, 32, 36, 38, 39). A recent publication by Brown and colleagues (8) describes an optimized experimental protocol, which includes probes designed to reduce cross-hybridization to a minimum; this enabled detection of individual bacteria species at <0.1% of total bacteria. Thus, we suggest that a combination of such optimized technical protocols with the analytical methodology presented here may even further improve sensitivity and accuracy of microarrays.

Fig. 4.
Identification of seven naturally occurring closely related Vibrio taxa (16S rRNA sequence identity >95%) in coastal seawater samples (Eqs. 1–5). Relative RNA abundances (percent of total RNA), ranging from 0.005% to 0.15%, and absolute ...

Overall, the performance of the combined methodology suggests it will be possible to extend quantitative microarray analysis to complex systems consisting of multiple organisms. Application of this methodology to microarrays consisting of thousands of probes will require iterative optimization involving matrices with size equal to the number of probes; this is well within the capability of commonly available computational resources. As in any microarray application, the precondition for this methodology is the availability of data on coexisting sequence diversity. However, several gene families have already been sufficiently screened by PCR amplification and sequencing of clones, and metagenomic approaches make such information increasingly available for the entirety of coexisting genomes (1, 40). Therefore, expanding the methodology presented here to these larger data sets will enable comprehensive studies of microbial community composition and responses on relevant spatial and temporal scales. Because the method allows quantitative signal recovery, such studies will be able to take into account shifts in the expression of similar genes in single organisms or relative abundance or activity of specific taxa within communities, for example, in the identification of covariance of specific human gut and dental microflora with diseased and healthy states (41, 42). Moreover, our data suggest the methodology provides sufficient sensitivity to measure rare sequence types within communities. This is evident from the accurate recovery of Vibrio taxa at abundances that would require screening of thousands of clones in 16S rRNA gene libraries for positive identification. Such accurate identification of low-abundance sequence types was previously possible only by QPCR and indicates that microarrays will be applicable to a wide range of applications that require target quantification.
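
To give a sense of scale for the statement about matrix size, the sketch below estimates the memory needed to hold one dense square matrix whose dimension equals the number of probes. It assumes double-precision (8-byte) entries and dense storage, neither of which is specified in the text; the function name is ours.

```python
def dense_matrix_gigabytes(n_probes, bytes_per_entry=8):
    """Memory for one dense n x n matrix, in decimal gigabytes (assumed storage)."""
    return n_probes * n_probes * bytes_per_entry / 1e9


for n_probes in (139, 1_000, 10_000):
    print(f"{n_probes:>6} probes -> {dense_matrix_gigabytes(n_probes):.4f} GB")
```

Under these assumptions, a 10,000-probe array needs about 0.8 GB per matrix, whereas the 139-probe test array needs well under a megabyte.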

Materials and Methods

Optimization Algorithm.

To relate experimentally observed hybridization intensities with true target abundances present in the sample, we made the following assumptions: (i) Hybridization signal intensity is a linear superposition of signals due to background, signal-specific intensity, and cross-hybridization; (ii) a perfect-match probe–pair signal may vary between different probes because of their unequal specific response; and (iii) the binding of various targets to any probe is independent and noncompetitive (20). The signal intensity of any given spot on the array is modeled with a nonlinear equation of the form

[Eq. 1; equation image not reproduced]

where for each array experiment, Yjp is the mean logarithm of observed hybridization signal intensity for a given target species j, probe region p, in the presence of background ν, and noise η. θ is the target abundance in the original sample, matrix βjk describes cross-hybridization between probe j and target k for each probe region p, and bjj is the signal intensity of a perfect match probe–target pair j normalized to the relative abundance of target j, and which can vary between different probes. Error term η is outside the logarithmic relationship to account for higher levels of noise intrinsically associated with higher signal intensities. The task of the inversion algorithm is to determine the abundance θ of a specific target gene from the logarithm of hybridization signal intensity Y for all its complementary probes, jp, by accounting for all four major spurious contributions of cross-hybridization (β), noise (η), background (ν), and unequal specific response (b). In the absence of cross-hybridization, the matrix β is diagonal, and the estimation of target abundance θ is relatively straightforward. In the presence of cross-hybridization, however, βjk is a nondiagonal matrix. In this case, coefficients β are calculated by using Eq. 4, and the algorithm iteratively determines the optimal values for bjjp, νjp, and θk which justify the observed hybridization intensities.

This optimization is accomplished through iterative recursive linearization using best-linear-unbiased-estimation (BLUE) theory (43). The iterative process can be described in matrix notation. If an nth iterative solution of Eq. 1 is available, the (n + 1)th iteration is found as follows:

[Eq. 2; equation image not reproduced]

where X is the vector of all unknowns (νjp, bjjp, θk)ᵀ,

equation image

with J the total number of species, P the total number of probe regions, index m varying from 1 to 2JP + J, δjpm a unitary jp × jp matrix (i.e., δjpm = 1 for jp = m, 0 otherwise), and Σ and Ση are the prior covariance matrices of X(n) and η, respectively. To start the iterative process, the initial values of the unknowns b0, ν0, and θ0 were chosen as follows: (i) the signal intensity of a perfect-match probe-pair b0jj = constant, which in our experimental setup (ArrayWoRx scanner, Applied Precision CCD with counts from 1 to 60,000) was experimentally determined to be 30 ± 0.7 counts per nanogram of target [we chose b0jj = constant because the unequal specific response among different probe–target pairs (bjj) varied on average 20% among each other, as determined by array experiments containing 139 probe–target pairs where five bacteria were individually spiked]; (ii) the background ν0jp = mean observed signal intensity in case of zero true abundances, which is approximated as the measured intensity of 100 negative control spots in each microarray experiment; (iii) the rRNA abundance

equation image

and (iv) the prior covariance matrix Σ is diagonal with the variances of b, ν, and θ as diagonal elements. The optimal initial variances for b and θ were determined in computational experiments with synthetic data sets, which contained intensity data simulating the presence of different bacterial rRNA present at low levels in real experimental conditions of noise and background. The criterion to determine convergence of the algorithm was the relative difference between consecutive iterations <0.001%. Typically, 50 iterations were sufficient to reach convergence.
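
The flavor of this iterative linearization can be illustrated with a deliberately simplified sketch. It is not the BLUE update of Eq. 2: it assumes a signal model of the form log(ν + b Σk βjk θk), which is our reading of the definitions given for Eq. 1; it holds b and ν fixed at known values; it omits the noise term and the prior covariances; and it substitutes a plain Gauss-Newton step. The cross-hybridization matrix, the abundances, and all names below are hypothetical, chosen only for illustration.

```python
import numpy as np


def forward(theta, beta, b, nu):
    """Assumed model: log signal per probe, log(nu + b * sum_k beta[j, k] * theta[k])."""
    return np.log(nu + b * beta @ theta)


def estimate_abundances(y, beta, b, nu, max_iter=50, tol=1e-5):
    """Gauss-Newton stand-in for the iterative linearization (b and nu held fixed)."""
    # Initial guess: linear signal-abundance relation, ignoring cross-hybridization.
    theta = (np.exp(y) - nu) / b
    for iteration in range(1, max_iter + 1):
        signal = nu + b * beta @ theta
        # Jacobian of the log signal: d log(signal_j) / d theta_k.
        jacobian = (b * beta) / signal[:, None]
        step = np.linalg.solve(jacobian, y - np.log(signal))
        theta = theta + step
        # Stop when the relative change between iterations is below 0.001%.
        if np.linalg.norm(step) <= tol * np.linalg.norm(theta):
            break
    return theta, iteration


# Hypothetical 3-probe, 3-target example with modest cross-hybridization.
beta = np.array([[1.00, 0.30, 0.05],
                 [0.30, 1.00, 0.10],
                 [0.05, 0.10, 1.00]])
b, nu = 30.0, 5.0                        # counts per ng and background counts
theta_true = np.array([0.25, 0.5, 2.0])  # ng

y = forward(theta_true, beta, b, nu)     # noise-free synthetic log signals
theta_est, n_iter = estimate_abundances(y, beta, b, nu)
print(theta_est, n_iter)
```

In this noise-free toy case the iteration recovers the input abundances in a handful of steps. The full algorithm additionally estimates b and ν for every probe and weights each update by the prior covariances Σ and Ση, which is what allows it to cope with noisy data and with more unknowns than this square example contains.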

The relationship between RNA abundance of a particular target determined by the algorithm, θalg, and its true abundance, θtrue, is essentially linear for low-target abundances and approaches a plateau in the high limit of θtrue. This overall dependence follows the Langmuir function, which has been shown to accurately describe microarray signals (21–23):

[Eq. 3; equation image not reproduced]

with θC = 20 ng. In this type of relationship, three regimes of dependence can be identified. θC, or critical abundance, provides the approximate upper bound for the regime within which θalg depends linearly on θtrue. For θtrue, between 20 and 60 ng (from θC to ≈3θC), θalg depends nonlinearly on θtrue, and Eq. 3 has to be solved to obtain θtrue from θalg. For even higher abundances (>3θC = 60 ng), the plateau regime is reached and θalg does not increase with θtrue.
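
The three regimes, and the inversion needed in the nonlinear regime, can be sketched as follows. The regime boundaries come from the text (θC = 20 ng and 3θC = 60 ng). The functional form is an assumption: the sketch uses the standard hyperbolic Langmuir isotherm with a hypothetical saturation level `theta_max`, which may differ from the exact parameterization of Eq. 3.

```python
THETA_C = 20.0  # critical abundance in ng, as stated in the text


def regime(theta_true, theta_c=THETA_C):
    """Classify a true abundance (ng) into the three regimes described above."""
    if theta_true <= theta_c:
        return "linear"
    if theta_true <= 3 * theta_c:
        return "nonlinear"
    return "plateau"


def langmuir(theta_true, theta_max, theta_c=THETA_C):
    """Assumed hyperbolic Langmuir form; theta_max is a hypothetical saturation level."""
    return theta_max * theta_true / (theta_c + theta_true)


def invert_langmuir(theta_alg, theta_max, theta_c=THETA_C):
    """Solve the assumed Langmuir form for the true abundance."""
    if not 0 <= theta_alg < theta_max:
        raise ValueError("theta_alg must lie in [0, theta_max)")
    return theta_c * theta_alg / (theta_max - theta_alg)


theta_alg = langmuir(40.0, theta_max=100.0)
print(regime(40.0), invert_langmuir(theta_alg, theta_max=100.0))
```

The inversion becomes ill-conditioned as θalg approaches the saturation level, which is consistent with the larger relative error reported for the nonlinear regime and with the loss of quantitative information in the plateau regime.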

Analytical Cross-Hybridization Predictor.

To estimate the probability of cross-hybridization between any given probe–target pair, we calculated the probability of dissociation of a probe j and target k pair having a given binding free-energy ΔGjk. The relationship between binding free energy and cross-hybridization has been proposed (30, 31, 33, 44). Here we build upon these results. The probability of a target bound to a probe and, thus, the probability of cross-hybridization βjk can be estimated using the law of mass action

equation image

where C0,jk and C1,jk are the abundances of the bound and free targets after the reaction, respectively, as follows:

[Eq. 4; equation image not reproduced]

where coefficient b measures the probability of a perfect match target–probe pair remaining hybridized after the reaction and coefficient h is the ratio of energy of binding and energy of breaking the target–probe pair. Both coefficients b and h depend on the conditions of the reaction (e.g., temperature and salt concentration). Coefficients b = 0.75 and h = 8.47 were obtained by fitting Eq. 4 to experimental data consisting of 1,233 cross-hybridization values (for an experimental posthybridization washing temperature of 60°C; Fig. 1).

The ΔGjk/ΔGjj ratios can be calculated by using either mfold software (www.bioinfo.rpi.edu/applications/mfold) or the analytical approximation described by Eq. 5. To model free binding energy, we used the following set of assumptions: (i) ΔGjk decreases with the number of base pair bonds; (ii) the presence of a mismatch in the sequence weakens the neighboring bonds; and (iii) in the case of a number of consecutive mismatches (e.g., “loop” formation), a target may bind the probe out of the sequence within the loop. Furthermore, we reasoned that a more accurate calculation of cross-hybridization requires ΔGjk/ΔGjj averaged over realizations of different probe–target pair interactions, $\overline{\Delta G_{jk}}/\Delta G_{jj}$, because most nonspecific binding interactions are characterized by a distribution of binding affinities due to multiple effects including conformational changes and variation in the point of attachment in the sequence (45, 46). Thus, the estimation of $\overline{\Delta G_{jk}}/\Delta G_{jj}$ is reduced to the calculation of the expectation of the number of loops formed for a given sequence identity, and the ratio of the binding energies becomes:

[Eq. 5; equation image not reproduced]

where sequence identity is conventionally defined as

equation image

where N is the number of base pairs, pn and tn stand for probe and target base pairs in position n, and ε(pn, tn) = 1 if pn and tn are complementary or 0 otherwise (www.ebi.ac.uk/clustalw). For T < 70°C, coefficient

equation image

and coefficient

equation image

For T = 37°C, γ = 0.83 and sc = 0.5, whereas for T = 60°C, γ = 1.26 and sc = 0.6. As shown in Fig. 1b, ΔGjk/ΔGjj ratios obtained by mfold are scattered around $\overline{\Delta G_{jk}}/\Delta G_{jj}$, calculated by using Eq. 5.
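
The sequence identity that feeds Eq. 5 is, by the definition given above, the fraction of positions at which probe and target bases are complementary. A minimal sketch follows; it assumes a DNA alphabet, no gaps, and that the two sequences are already aligned so that position n of the probe faces position n of the target. The function and constant names are ours.

```python
# Watson-Crick complements for a DNA alphabet (assumed for illustration).
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def sequence_identity(probe, target):
    """Fraction of aligned positions where the target base complements the probe base."""
    if not probe or len(probe) != len(target):
        raise ValueError("probe and target must be non-empty and of equal length")
    # epsilon(p_n, t_n) = 1 if the bases are complementary, 0 otherwise.
    matches = sum(1 for p, t in zip(probe, target) if COMPLEMENT.get(p) == t)
    return matches / len(probe)


print(sequence_identity("ACGT", "TGCA"))  # fully complementary pair
print(sequence_identity("ACGT", "TGCC"))  # one mismatch in four positions
```

For 50-mer probes, each mismatch lowers the identity by 2 percentage points, so the 68–95% identity window discussed in Results corresponds to roughly 3 to 16 mismatches per probe.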

Detailed experimental procedures are available in Supporting Text, which is published as supporting information on the PNAS web site.

Acknowledgments

We thank Virginia Rich [Massachusetts Institute of Technology (MIT)] for printing the arrays and critical reading of the manuscript; the Biomicrocenter at MIT, in particular Sean Milton, for experimental design troubleshooting and scanning of slides; Hariharan Sunramanian (Northwestern University) for numerical experiments validating the analytical predictor; Valeriy Ivanov (MIT) for helpful discussions on the programming of the algorithm; Dana Hunt (MIT) for filtering coastal sea water and counting cells in whole sea water; and Gaspar Taroncher-Oldenburg for critical reading of the manuscript. This study was supported by grants from the U.S. Department of Energy Genomes to Life Program, the National Institute on Environmental Health Sciences–National Science Foundation Woods Hole Center for Ocean and Human Health, and the National Oceanic and Atmospheric Administration (M.F.P.), and by National Science Foundation Grant BES-0238903 (to V.B.).

Abbreviations

rRNA: ribosomal RNA
QPCR: quantitative PCR

Footnotes

Conflict of interest statement: No conflicts declared.

This paper was submitted directly (Track II) to the PNAS office.

Data deposition: The probe sequences reported in this paper have been deposited in the National Center for Biotechnology Information (NCBI) Probe Database (PUIDs 6103259–6103399).

References

1. Venter JC, Remington K, Heidelberg JF, Halpern AL, Rusch D, Eisen JA, Wu D, Paulsen I, Nelson KE, Nelson W, et al. Science. 2004;304:66–74.
2. Acinas SG, Klepac-Ceraj V, Hunt DE, Pharino C, Ceraj I, Distel DL, Polz MF. Nature. 2004;430:551–554.
3. Gans J, Wolinsky M, Dunbar J. Science. 2005;309:1387–1390.
4. DeRisi JL, Iyer VR, Brown PO. Science. 1997;278:680–686.
5. Baldi P, Hatfield WH. DNA Microarrays and Gene Expression. Cambridge, UK: Cambridge Univ Press; 2002.
6. Wu Z, Irizarry RA. J Comput Biol. 2005;12:882–893.
7. Urisman A, Fischer KF, Chiu CY, Kistler AL, Beck S, Wang D, DeRisi JL. Genome Biol. 2005;6:R78.
8. Palmer C, Bik EM, Eisen MB, Eckburg PB, Sana TR, Wolber PK, Relman DA, Brown PO. Nucleic Acids Res. 2006;34:e5.
9. Siegmund KH, Steiner UE, Richert C. J Chem Inf Comput Sci. 2003;43:2153–2162.
10. Binder H, Preibisch S. Biophys J. 2005;89:337–352.
11. Marshall E. Science. 2004;306:630–631.
12. Held GA, Grinstein G, Tu Y. Proc Natl Acad Sci USA. 2003;100:7575–7580.
13. Li C, Wong WH. Proc Natl Acad Sci USA. 2001;98:31–36.
14. Southern E, Mir K, Shchepinov M. Nat Genet. 1999;21:5–9.
15. Bowtell DE, Sambrook JE. DNA Microarrays: A Molecular Cloning Manual. Cold Spring Harbor, NY: Cold Spring Harbor Lab Press; 2002.
16. Rouillard JM, Herbert C, Zuker M. Bioinformatics. 2002;18:486–487.
17. Nielsen HB, Wernersson R, Knudsen S. Nucleic Acids Res. 2003;31:3491–3496.
18. Huang JC, Morris QD, Hughes TR, Frey BJ. Bioinformatics. 2005;21(Suppl 1):i222–i231.
19. Irizarry RA, Hobbs B, Collin F, Beazer-Barclay YD, Antonellis KJ, Scherf U, Speed TP. Biostatistics. 2003;4:249–264.
20. Zhang L, Miles MF, Aldape KD. Nat Biotechnol. 2003;21:818–821.
21. Hekstra D, Taussig AR, Magnasco M, Naef F. Nucleic Acids Res. 2003;31:1962–1968.
22. Dai H, Meyer M, Stepaniants S, Ziman M, Stoughton R. Nucleic Acids Res. 2002;30:e86.
23. Binder H, Preibisch S, Kirsten T. Langmuir. 2005;21:9287–9302.
24. Pace NR, Olsen GJ, Woese CR. Cell. 1986;9:325–326.
25. Hugenholtz P, Goebel BM, Pace NR. J Bacteriol. 1998;180:4765–4774.
26. Thompson JR, Pacocha S, Pharino C, Klepac-Ceraj V, Hunt DE, Benoit J, Sarma-Rupavtarm R, Distel DL, Polz MF. Science. 2005;307:1311–1313.
27. Thompson JR, Randa MA, Marcelino LA, Tomita-Mitchell A, Lim E, Polz MF. Appl Environ Microbiol. 2004;70:4103–4110.
28. Kane MD, Jatkoe TA, Stumpf CR, Lu J, Thomas JD, Madore SJ. Nucleic Acids Res. 2000;28:4552–4557.
29. Wu L, Thompson DK, Li G, Hurt RA, Tiedje JM, Zhou J. Appl Environ Microbiol. 2001;67:5780–5790.
30. Taroncher-Oldenburg G, Griner EM, Francis CA, Ward BB. Appl Environ Microbiol. 2003;69:1159–1171.
31. Denef VJ, Park J, Rodrigues JL, Tsoi TV, Hashsham SA, Tiedje JM. Environ Microbiol. 2003;5:933–943.
32. Rhee SK, Liu X, Wu L, Chong SC, Wan X, Zhou J. Appl Environ Microbiol. 2004;70:4303–4317.
33. Hughes TR, Mao M, Jones A, Burchard J, Marton MJ, Schelter J, Meyer MR, Kobayashi S, Dai H, He Y, et al. Nat Biotechnol. 2001;19:342–347.
34. Loy A, Schulz C, Lucker S, Schopfer-Wendels A, Stoecker K, Baranyi C, Lehner A, Wagner M. Appl Environ Microbiol. 2005;71:1373–1386.
35. Wang HY, Malek RL, Kwitek AE, Greene AS, Luu TV, Behbahani B, Frank B, Quackenbush J, Lee NH. Genome Biol. 2003;4:R5.
36. Bodrossy L, Stralis-Pavese N, Murrell JC, Radajewski S, Weilharter A, Sessitsch A. Environ Microbiol. 2003;5:566–582.
37. Cho JC, Tiedje JM. Appl Environ Microbiol. 2002;68:1425–1430.
38. Tiquia SM, Wu L, Passovets S, Xu D, Xu Y, Zhou J. BioTechniques. 2004;36:664–675.
39. Peplies J, Lau SC, Pernthaler J, Amann R, Glockner FO. Environ Microbiol. 2004;6:638–645.
40. Tyson GW, Chapman J, Hugenholtz P, Allen EE, Ram RJ, Richardson PM, Solovyev VV, Rubin EM, Rokhsar DS, Banfield JF. Nature. 2004;428:37–43.
41. Relman DA. Nat Genet. 2002;30:131–133.
42. Corby PM, Lyons-Weiler J, Bretz WA, Hart TC, Aas A, Boumenna T, Goss J, Corby AL, Junior HM, Weyant RJ, et al. J Clin Microbiol. 2005;43:5753–5759.
43. Kay SM. In: Prentice Hall Signal Processing Series. Oppenheim AV, editor. Vol 1. New York: Prentice–Hall; 1993. p. 589.
44. Wu C, Carta R, Zhang L. Nucleic Acids Res. 2005;33:e84.
45. Peterson AW, Wolf LK, Georgiadis RM. J Am Chem Soc. 2002;124:14601–14607.
46. Vijayendran RA, Leckband DE. Anal Chem. 2001;73:471–480.

Articles from Proceedings of the National Academy of Sciences of the United States of America are provided here courtesy of National Academy of Sciences