Copyright © 2009, American Society of Plant Biologists

Mapping Metabolic and Transcript Temporal Switches during Germination in Rice Highlights Specific Transcription Factors and the Role of RNA Instability in the Germination Process1[W][OA]

Australian Research Council Centre of Excellence in Plant Energy Biology, University of Western Australia, Crawley, Western Australia 6009, Australia (K.A.H., R.N., A.C., A.I., A.H.M., J.W.); and Max-Planck-Institut für Molekulare Pflanzenphysiologie, 14476 Potsdam-Golm, Germany (M.L., B.U.)

*Corresponding author; e-mail seamus/at/cyllene.uwa.edu.au.
2These authors contributed equally to the article.
3Present address: Max-Planck-Institut für Molekulare Pflanzenphysiologie, Am Mühlenberg 1, 14476 Potsdam-Golm, Germany.

Received September 12, 2008; Accepted December 3, 2008.

Abstract

Transcriptome and metabolite profiling of rice (Oryza sativa) embryo tissue during a detailed time course formed a foundation for examining transcriptional and posttranscriptional processes during germination. One hour after imbibition (HAI), independent of changes in transcript levels, rapid changes in metabolism occurred, including increases in hexose phosphates, tricarboxylic acid cycle intermediates, and γ-aminobutyric acid. Later changes in the metabolome, including those involved in carbohydrate, amino acid, and cell wall metabolism, appeared to be driven by increases in transcript levels, given that the large group (over 6,000 transcripts) observed to increase from 12 HAI were enriched in metabolic functional categories. Analysis of transcripts encoding proteins located in the organelles of primary metabolism revealed that for the mitochondrial gene set, a greater proportion of transcripts peaked early, at 1 or 3 HAI, compared with the plastid set, and notably, many of these transcripts encoded proteins involved in transport functions.
One group of over 2,000 transcripts displayed a unique expression pattern beginning with low levels in dry seeds, followed by a peak in expression levels at 1 or 3 HAI, before markedly declining at later time points. This group was enriched in transcription factors and signal transduction components. A subset of these transiently expressed transcription factors was further interrogated across publicly available rice array data, indicating that some were only expressed during the germination process. Analysis of the 1-kb upstream regions of transcripts displaying similar changes in abundance identified a variety of common sequence motifs, potential binding sites for transcription factors. Additionally, newly synthesized transcripts peaking at 3 HAI displayed a significant enrichment of sequence elements in the 3′ untranslated region that have been previously associated with RNA instability. Overall, these analyses reveal that during rice germination, an immediate change in some metabolite levels is followed by a two-step, large-scale rearrangement of the transcriptome that is mediated by RNA synthesis and degradation and is accompanied by later changes in metabolite levels.

Germination is a series of events that begins with imbibition, the uptake of water by the dry seed, followed by reinitiation of metabolic processes, elongation of the embryonic axis, and, by strict definition, terminates when part of the embryo emerges from the structures that surround it (Bewley, 1997). Germination can be divided into three phases; phases I and II are characterized by the rapid uptake of water and a plateau phase of water uptake, respectively. These phases represent a period of large metabolic change that primes the embryo to commence growth during phase III, when further uptake of water occurs (Bewley, 1997).
Once the process of germination has commenced, utilization of stored reserves for energy production is necessary before the plant becomes autotrophic by establishing photosynthesis. The importance of energy metabolism in the early stages of seed germination can be seen in studies that inhibit germination, in phase II, by the use of various bioactive compounds or mutants. The alterations of transcript signatures or profiles in these studies reveal that many are associated with energy production and associated biosynthetic pathways (Carrera et al., 2007; Bassel et al., 2008). Historically, regulation of germination has been described by the antagonistic interaction of the phytohormones abscisic acid (ABA) and GA, whereby ABA represses germination and GA promotes germination (Bewley, 1997; Holdsworth et al., 2008a). However, evidence is growing for a role of auxin during this process as well as the interaction of other phytohormones such as ethylene and brassinosteroids (Holdsworth et al., 2008a). Inhibition of transcription and translation has differential effects on germination potential. It was shown over 40 years ago that transcription was not required for de novo protein synthesis in imbibed seeds, which suggested that endogenous mRNA was utilized in early stages of the germination process (Dure and Waters, 1965). Recent studies on seed germination have shown that as many as 12,000 mRNA molecules are present in mature seeds in Arabidopsis (Arabidopsis thaliana) and barley (Hordeum vulgare; Nakabayashi et al., 2005; Sreenivasulu et al., 2008), consistent with the role of preexisting mRNA molecules playing a central role in germination. While transcriptional inhibition slows the progression of germination, radicle protrusion still occurs, although subsequent seedling growth is prevented. In contrast, inhibition of translation completely inhibits germination (Rajjou et al., 2004). 
Although seed development and germination have been studied for several decades, recent advances in our understanding of these complex processes have largely resulted from the expansion of available sequence data and the establishment of large-scale -omics technologies. In particular, for the dicot model, Arabidopsis, a number of studies utilizing transcriptomic, proteomic, and metabolomic methods to investigate seed maturation, dormancy, and germination have been published (Nakabayashi et al., 2005; Holdsworth et al., 2008a, 2008b), including one study that reports a correlation between transcript and metabolite data during the germination process (Fait et al., 2006). In comparison, there is a relative paucity of similar studies in monocots, particularly at the whole genome level, with respect to transcriptomic and metabolomic studies. While some transcriptome studies in wheat (Triticum aestivum) and barley have been performed (Watson and Henry, 2005; Wilson et al., 2005; Sreenivasulu et al., 2008), the lack of complete genome sequence data prevents comprehensive whole transcriptome analysis, including promoter analysis once coexpressed gene sets have been identified. For example, the most comprehensive transcriptome study in monocots to date, using barley (Sreenivasulu et al., 2008), reported that cis element searches were performed in homologous rice (Oryza sativa) promoters, as this sequence information is not yet available for barley. Also, time points sampled were 24 h after imbibition (HAI) or more apart (Sreenivasulu et al., 2008), meaning that early and potentially regulatory changes in the transcriptome have not yet been thoroughly investigated in monocots. Rice is an important food crop and was the first crop to have its genome sequenced, making it the model of choice for grass species.
Several conditions established rice as the optimal choice for global germination analysis in monocots: (1) the availability of whole genome sequence information; (2) an established growth system for studying germination (Howell et al., 2006, 2007); (3) widespread functional annotation information; and (4) the availability of Affymetrix whole genome rice microarrays representing 51,279 transcripts. All of these factors together enabled comprehensive transcriptome analysis in rice over a germination time course and the direct link of coexpressed genes with upstream sequence information for identification of potential regulatory sequence elements. Sampling transcripts in dry seeds (0) and 1, 3, 12, and 24 HAI and of metabolites at 0, 1, 3, 6, 12, 24, and 48 HAI has allowed a detailed examination of germination in rice. Furthermore, this study enables the investigation of the roles of transcriptional and posttranscriptional processes and whether changes in transcript levels drive changes in metabolites during this essential phase of plant growth and establishment.

RESULTS

Transcriptome and Metabolite Profiling of Early Stages of Rice Germination

We have previously characterized changes in water content and metabolic activity in rice embryos during germination up to 48 HAI and have observed the expected triphasic mode of water uptake with concomitant increases in oxygen uptake (Howell et al., 2006). Using this same experimental system, global changes in transcript levels were determined 0, 1, 3, 12, and 24 HAI using the Affymetrix Rice GeneChip, consisting of 57,381 probe sets representing 51,279 transcripts. Microarrays were performed in triplicate for each time point, and after normalization, analysis of the data revealed that the correlation between the replicates for each time point was greater than 0.98.
The total number of probe sets for analysis was reduced by removing ambiguous probe sets and those that were not called “present” in at least two replicates at one time point, resulting in a final present set of 24,150 transcripts (Supplemental Table S1). Of these, over 17,000 transcripts were present prior to imbibition (i.e. representing the mRNA stored in the dry seed). Differential expression analysis (with false discovery rate correction and a stringent cutoff of P < 0.01) revealed that 76% (18,372) of these transcripts changed in abundance over the time course and 67% (16,487) changed relative to 0 HAI (Supplemental Fig. S1; Supplemental Table S1). When successive time points were compared, there were relatively few changes in the first hour (59 up, four down), with most changes observed between 3 and 12 HAI (5,396 up, 4,935 down) and between 1 and 3 HAI (1,469 up, 1,276 down), while the number of changes between 12 and 24 HAI was considerably lower (420 up, 424 down; Fig. 1).
Metabolite analysis was performed on the same samples used for microarray analysis and additional samples collected 6 and 48 HAI. A total of 126 unique metabolites were detected in the rice embryo samples, and of these, 66 could be identified based on matching to previously run standards (Supplemental Table S2A). Statistical analysis of metabolite abundance revealed that most (93%) of the 126 metabolites detected showed significant (P < 0.05) changes in abundance between at least two time points sampled during the time course, and of the 66 metabolites identified, all were found to show significant changes in abundance (Fig. 1).

Comparing Patterns of Specific Metabolites and Transcripts during Germination

The striking changes in metabolite levels that occurred just 1 HAI were predominantly associated with major carbohydrate metabolism (Fig. 2A).
To compare these metabolite patterns with profiles observed for the 24,150 transcripts detected during rice germination, transcript abundance data were normalized to the highest value for each transcript and then hierarchically clustered, resulting in four main types of transcript profile patterns (Fig. 2B).

To understand the significance of these distinct patterns of transcript abundance and their relationship to the metabolome changes, three types of analysis were conducted that each provided a different insight into a molecular understanding of the germination process in rice. The first analysis was performed using the PageMan (Usadel et al., 2006) and MapMan (Thimm et al., 2004; Usadel et al., 2005) tools adapted for use with rice microarray data (see “Materials and Methods”). This type of analysis was performed by comparing only significant changes between successive time points and reveals which functional categories are significantly up- or down-regulated. PageMan analysis revealed that a variety of cellular processes were affected over the germination process and confirmed that the greatest number of significant changes were observed between 3 and 12 HAI (Fig. 3).
The above analyses reveal the characteristics of statistically significant changes between successive time points. This approach therefore primarily gives insight into the changes occurring in clusters 1 and 3, where many transcripts show large fold changes. It is not informative for sets of genes that do not change (i.e. cluster 4) and may also miss some changes that occur in clusters with smaller numbers of genes (i.e. cluster 2). Thus, a second analysis approach was carried out on changes in transcripts based on all transcript profiles (i.e. the 24,150-gene set; Fig. 2B).
Third, sequential changes in metabolic organelle function (plastids, mitochondria, and peroxisomes) were investigated during germination. In order to determine if transcripts that encode organelle proteins changed in a coordinated manner compared with that observed for all transcripts (Fig. 2B
We have previously suggested a sequential assembly of mitochondria during germination based on the examination of a limited number of genes (Howell et al., 2006), and that sequential pattern is supported by the analysis of the larger set of genes in this study (Supplemental Table S5B). For example, transcript abundance of genes involved in protein import and organelle gene transcription (e.g. the mitochondrial RNA polymerase) is relatively high in dry seeds and at early germination stages and either remains high or declines (i.e. cluster 3 or 4). In contrast, many of the transcripts encoding components associated with organellar protein synthesis increase at 3 HAI, while transcripts encoding components of the TCA cycle and the respiratory chain increase at 12 HAI (i.e. clusters 1B and 1A, respectively). Notably, cluster 2 of the mitochondrial set includes transcripts encoding membrane transport proteins, including phosphate and oxoglutarate/malate carriers, Graves disease protein, a transporter necessary for the accumulation of mitochondrial coenzyme A (Prohl et al., 2001), and proteins annotated as uncoupling proteins. In Arabidopsis, these proteins have been functionally shown to transport a variety of metabolites, including the components of the malate/oxaloacetate shuttle, which is an important link between mitochondrial and cytosolic metabolism (Palmieri et al., 2008). Finally, it was also seen that for some proteins that are encoded by small gene families and are involved in mitochondrial metabolism (e.g. the E1α subunit of the pyruvate dehydrogenase complex and cytochrome c), one isoform increased over the time period examined while another decreased (Supplemental Table S5B), suggesting that there may be a switch in the isoform utilized during seed maturation versus germination processes.
In combination, these three analysis approaches revealed an almost immediate change in the metabolome, followed by a two-step large-scale rearrangement of the transcriptome featuring metabolic organelle biogenesis and followed by increases in amino acids and components involved in cell wall and carbohydrate metabolism. However, this analysis does not explain what the switch or driver was for these phases in the germination process.

Transient Changes in the Transcriptome Indicate That 3 h May Represent a Specific Switch Point in the Germination Process

The above analysis of overrepresented and underrepresented functional categories revealed that transcription factors are underrepresented in clusters 1 and 4 (i.e. transcript levels that increase or remain stable) but are overrepresented in clusters 2 and 3 (i.e. transcript levels that increase only transiently or decrease; Fig. 5).

Given these interesting observations, we performed further analysis on the rice transcription factor set. A comprehensive list of rice transcription factors was collated from various databases and studies (as described in “Materials and Methods”), and it was found that transcripts for 1,786 of these were detected in at least one time point of this study. Their transcript profiles were analyzed by hierarchical clustering (Fig. 7A).
In contrast, cluster 2, characterized by a transient peak in expression at 3 HAI before decreasing, was found to be enriched in AP2-EREBP and WRKY family members (Fig. 7B). Transcription factors identified as belonging to cluster 2 (Fig. 7A
By searching for Arabidopsis homologs of the “germination-specific” transcription factors identified in this study and verifying their expression profiles using the eFP browser (Winter et al., 2007), we successfully identified two putative germination-specific transcription factors in Arabidopsis (Fig. 8B).

Transcripts Displaying Similar Profiles during Germination Share Common Sequence Motifs

Over 17,000 transcripts were observed in dry seeds, and over the germination time course, more than 18,000 of the 24,150 transcripts present in total were found to significantly change in abundance. A number of peaks in transcript abundance were observed, at 1 and 3 HAI (cluster 2) and 12 HAI (cluster 1B), while some transcripts present in dry seeds were observed to decrease (cluster 3; Fig. 2B).

Five distinct core sequence elements were found to occur within the different groups (above), indicated by color (two related elements in purple), with variations or reverse complements shown (Supplemental Table S6C). Elements that occurred in 70% or more of the sequences from the sets above (Supplemental Table S6, B and C) were taken and searched in the larger genome sets according to expression criteria (i.e. peak expression at one time point and less than 50% at all other time points; Table I). Sequence elements in the 1-kb promoter region were found to be significantly enriched at all time points except 0 HAI (Table I). Transcripts that peaked at 24 HAI contained six elements that were significantly underrepresented and six that were overrepresented, and of these, three were unique to this time point (Table I; Supplemental Table S6D). Transcripts that peaked at 3 HAI had seven elements overrepresented, the greatest number of elements overrepresented in any group, and one element underrepresented (Supplemental Table S6D). Interestingly, two of the elements overrepresented at 3 HAI were underrepresented at 24 HAI.
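The expression criterion used to build the larger gene sets (peak expression at one time point and less than 50% of that peak at all other time points) can be illustrated with a minimal Python sketch. The function name `peak_time_point`, the dictionary layout, and the example values are hypothetical and are not taken from the study; expression values are assumed to be on a linear, positive scale.

```python
def peak_time_point(profile, max_fraction=0.5):
    """Return the time point (HAI) of a unique expression peak, or None.

    profile: dict mapping time point -> expression value (linear scale).
    A transcript qualifies only if every other time point is strictly
    below max_fraction of the peak value.
    """
    peak_time = max(profile, key=profile.get)
    peak_value = profile[peak_time]
    if peak_value <= 0:
        return None
    for time, value in profile.items():
        if time != peak_time and value >= max_fraction * peak_value:
            return None
    return peak_time


# Hypothetical profiles over the sampled time points (0, 1, 3, 12, 24 HAI).
profiles = {
    "tx_a": {0: 10.0, 1: 20.0, 3: 100.0, 12: 30.0, 24: 15.0},  # sharp peak at 3 HAI
    "tx_b": {0: 80.0, 1: 90.0, 3: 100.0, 12: 95.0, 24: 85.0},  # broadly expressed
}
peaks = {name: peak_time_point(p) for name, p in profiles.items()}
```

Here `tx_a` would be assigned to the 3 HAI set, while `tx_b` fails the 50% rule and is left unassigned.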
When analyzing changes in transcript abundance, it is important to consider the role of mRNA degradation, particularly when it is evident that dramatic decreases in transcript abundance are occurring for large groups of transcripts after they peak in expression. In order to systematically investigate the role of mRNA decay during germination, 3′ UTRs were examined for enrichment of motifs in transcript subsets that showed peak expression at 3, 12, and 24 HAI. The presence of these predicted motifs (Supplemental Table S6C) and of 12 known RNA stability/instability-associated motifs was compared between the subsets and the “whole genome” set; however, this was somewhat restricted because only 3,027 genes have an annotated 3′ UTR in rice (Supplemental Table S6D). Nevertheless, a clear picture emerged, in that four elements were significantly enriched only at 3 HAI, two of which have been associated previously with RNA instability in Arabidopsis (Narsai et al., 2007) and tobacco (Ohme-Takagi et al., 1993; Table I). Interestingly, one sequence element (GAATAA) was associated with stable RNA transcripts (Narsai et al., 2007) and was enriched in transcripts peaking at 12 HAI (Table I). The presence of these putative motifs in the 3′ UTR together with 1-kb upstream motifs suggests that the complex regulation of transcript abundance occurs at the levels of both transcription and degradation during the course of germination.

DISCUSSION

This study provides a comprehensive profile of the transcriptome and metabolites during germination in the monocot model rice. A series of temporal switches in metabolites and transcripts is suggested that results in a reactivation of cellular metabolism to support growth. At the earliest time point analyzed in this study, 1 HAI, there was a greater proportion of the detected metabolites than the detected transcripts changing in abundance relative to the total number of changes observed throughout the time course of this study.
These early responses were then followed by the largest change in transcript abundances between 3 and 12 HAI, followed by relatively small changes in transcripts at subsequent time points. In contrast, changes in a large number of metabolites continued up to 48 HAI. This suggests that the early changes in metabolites arise from the activity of preexisting enzymes, as this occurs rapidly, possibly even before the energy-demanding process of translation has been fully activated to synthesize new proteins. However, the later changes in metabolites are more likely driven by transcription and translation, as they occur subsequent to changes in transcript abundance. Furthermore, the changes in transcript abundance that appear transitory in nature, defined in cluster 2, which are enriched in transcription factors but underrepresented in transcripts that encode proteins involved in metabolism, may represent a transition from the dormant state to an active growth state. The peak in transcripts in cluster 2 precedes the increase in abundance of approximately 8,000 transcripts (Fig. 2).

A comparison of our rice data with barley seed germination also reveals similarities, with transcripts encoding components involved in sugar, starch, and lipid metabolism being up-regulated, followed by those involved in photorespiration and photosynthesis (Sreenivasulu et al., 2008). Increases in cell wall modification, β-oxidation, tetrapyrrole biosynthesis, amino acid synthesis, energy metabolism genes, and also fermentative components such as ADH were also seen during barley germination (Sreenivasulu et al., 2008). Upon imbibition, there is an immediate increase in hexose sugars and organic acids that is already statistically significant at 1 HAI (Fig. 2A).

A number of analyses of transcription factors revealed similarities with previous studies and give insights into the regulatory processes that occur during germination.
Transcription factors preferentially expressed in the germinating embryo of barley, such as ARF, AUX/IAA, C2C2-GATA, and C3H-ARFs, were also observed here in rice. This study reveals a greater resolution of these events. Thus, for cluster 3, enriched in the PHD and HSF transcription factor families, and cluster 4, enriched in SET, it can be seen that these transcription factors are present in dry seeds and decrease or remain largely unchanged, respectively (Fig. 7, A and B).

The transcription factors in common with germination and anoxia profiles may also be significant, given that germinating seeds are thought to suffer from oxygen deficit (Bewley, 1997; Borisjuk et al., 2007). These peaked early (3 HAI) during germination (Fig. 8A).

Approximately 17,000 transcripts are stored in the dry rice embryo during seed development and maturation, compared with approximately 12,000 stored in both barley and Arabidopsis seeds (Nakabayashi et al., 2005; Sreenivasulu et al., 2008). This difference may simply be due to the number of probes represented on each array and the relative sizes of the genomes for each species. In an attempt to uncover insights into the regulatory mechanisms that cause changes in transcript abundance, the enrichment or depletion of sequence elements in the promoters or 3′ UTR was examined. Given that there may be over 2,000 transcription factors and hundreds of stability/instability elements, the prediction of such elements can be hard to interpret. Thus, our analysis was carried out in an attempt to determine if distinct regulatory steps were occurring during germination. Hence, we used stringent criteria with respect to sets of genes used to search for common sequence elements to reveal insights into the regulatory steps that may be occurring during germination.
Even with these strict criteria at each time point, with the exception of 1 HAI, a unique enrichment or depletion of groups of elements was displayed, consistent with the combinatorial model of gene regulation and also the fact that a number of transcriptional steps or switches occur during germination. Additionally, the large enrichment of elements associated with RNA instability in the 3′ UTR of transcripts that peaked at 3 HAI indicates that RNA degradation also plays a central role in defining changes in transcript abundance during germination. Although RNA degradation has previously been proposed to “clean out” transcripts that are present in the mature seeds (Rajjou et al., 2004; Nakabayashi et al., 2005), it can be seen in this study that several groups of transcripts decrease in abundance during early germination, as shown in cluster 3 (transcripts that decreased at 0–3 HAI; Fig. 2).

The combination of a specifically timed up-regulation of a suite of specific transcription factors and the degradation of both stored and early-induced mRNAs based on 3′ UTR sequences appears to be a key element in the coordination of at least some groups of transcripts during the early events in rice germination. These events appear to operate in a coordinated fashion with the induction of primary metabolic pathways, the biogenesis of organelles, and the establishment of the full metabolic profile in the germinating rice embryo.

MATERIALS AND METHODS

Rice Growth

Dehulled, sterilized rice seeds (Oryza sativa ‘Amaroo’) were grown under aerobic conditions in the dark at 30°C as described previously (Howell et al., 2006). Embryos were rapidly dissected from the endosperm and snap frozen in liquid nitrogen.

RNA Isolation, cDNA Synthesis, and Quantitative Reverse Transcription-PCR

Total RNA was isolated from rice embryos as described previously (Howell et al., 2006).
Three independent RNA preparations were used for each developmental stage/growth condition, and the concentration of RNA was determined spectrophotometrically.

Microarray Analyses

Transcriptomic analysis was performed using Affymetrix GeneChip Rice Genome Arrays (Affymetrix), and three biological replicates were analyzed for each time point. RNA quality was verified using an Agilent Bioanalyzer (Agilent Technologies) and spectrophotometric analysis (NanoDrop ND-1000; NanoDrop Technologies) to determine concentration and the A260-A280 and A260-A230 ratios. Preparation of labeled copy RNA from 2 to 3 μg of total RNA, target hybridization, as well as washing, staining, and scanning of the arrays were carried out exactly as described in the Affymetrix GeneChip Expression Analysis Technical Manual, using the Affymetrix One-Cycle Target Labeling and Control Reagents, an Affymetrix GeneChip Hybridization Oven 640, an Affymetrix Fluidics Station 450, and an Affymetrix GeneChip Scanner 3000 7G at the appropriate steps. Data quality was assessed using GCOS 1.4 (Affymetrix) before CEL files were imported into Avadis 4.3 (Strand Genomics) for further analysis. Raw intensity data were initially normalized using the MAS5 algorithm, allowing probe sets called present to be determined. Only those probe sets that were called present in at least two out of three replicates in at least one time point were included for further analysis. Ambiguous probe sets and bacterial controls were also removed, resulting in a final 24,150-gene set. All microarray data have been deposited in the ArrayExpress database (http://www.ebi.ac.uk/arrayexpress/) under the accession code E-MEXP-1766. Using the 24,150-gene set, probe intensities were analyzed using the GC-RMA algorithm and log transformed, and differential expression analysis was performed with P value correction (Benjamini and Hochberg, 1995) at the 0.01 level.
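As an illustration of two steps described above, the following Python sketch shows a present-call filter (at least two of three replicates at one or more time points) and a Benjamini-Hochberg adjustment of P values. It is not the Avadis implementation used in the study; the function names, data layout, and example P values are hypothetical.

```python
def called_present(calls_by_time, min_replicates=2):
    """True if a probe set is called present in at least min_replicates
    replicates at one or more time points.

    calls_by_time: dict mapping time point -> list of booleans,
    one per replicate (hypothetical layout).
    """
    return any(sum(calls) >= min_replicates for calls in calls_by_time.values())


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical raw P values for five transcripts.
raw_p = [0.001, 0.008, 0.039, 0.041, 0.6]
adj_p = benjamini_hochberg(raw_p)
significant = [p < 0.01 for p in adj_p]  # cutoff of 0.01, as in the text
```

With these made-up inputs only the first transcript remains significant after correction, which shows how the 0.01 cutoff on adjusted values is stricter than the same cutoff on raw values.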
This allowed the number of transcripts significantly changing to be calculated, which were then visualized on a heat map. For each of the 24,150 transcripts, the maximum expression was assigned a value of 1 and all other expression values were made relative to this, in order to carry out hierarchical clustering. Average linkage hierarchical clustering was carried out, and distinct clusters were uniquely colored for the genome (24,150), mitochondrial, chloroplast, peroxisome, and transcription factor sets. The differential expression analysis was carried out using Avadis 4.3 (Strand Genomics), while the heat maps and hierarchical clustering were all carried out using Partek Genomics suite software, version 6.3 (Partek). PageMan (Usadel et al., 2006) and MapMan (Thimm et al., 2004; Usadel et al., 2005) analyses were performed using a reduced set of unique probe sets (15,351). Of these, 9,098 were classified into nontrivial MapMan BINS based on the newly available rice mapping file, which was generated by a combination of automated searches in conjunction with minimal curation. In brief, rice protein sequences corresponding to the 15,351 probe sets were obtained from The Institute for Genomic Research (TIGR; version 5.0) and used for searches against five different databases: The Arabidopsis Information Resource (TAIR7) proteins (Swarbreck et al., 2008), SwissProt/Uniprot plant proteins (PPAP; Schneider et al., 2005), Conserved Domain Database (CDD; Marchler-Bauer et al., 2007), Clusters of Orthologous Groups (KOG; Tatusov et al., 2003), and InterProScan (Zdobnov and Apweiler, 2001). The programs used to perform the searches were BLASTP (Altschul et al., 1990) for TAIR7 and PPAP and RPSBLAST (Schaffer et al., 2001) for CDD and KOG. Database hits with bit scores lower than 50 were ignored as not significantly similar. 
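The scaling and clustering steps described earlier in this section (each transcript's maximum set to 1, followed by average linkage hierarchical clustering) can be sketched in plain Python as follows. This is only an illustration and not the Partek implementation: the Euclidean distance is an assumption, because the distance metric is not stated in the text, and the function names and example profiles are hypothetical.

```python
import math
from itertools import combinations


def scale_to_max(profile):
    """Express every value relative to the transcript's maximum (max becomes 1)."""
    peak = max(profile)  # assumes at least one positive value
    return [value / peak for value in profile]


def euclidean(a, b):
    """Euclidean distance between two profiles (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(profiles, n_clusters):
    """Naive agglomerative clustering with average linkage.

    Returns a list of clusters, each a list of profile indices.
    Suitable only for small examples (no caching of distances).
    """
    clusters = [[i] for i in range(len(profiles))]
    while len(clusters) > n_clusters:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            pairs = [(i, j) for i in clusters[a] for j in clusters[b]]
            dist = sum(euclidean(profiles[i], profiles[j]) for i, j in pairs) / len(pairs)
            if best is None or dist < best[0]:
                best = (dist, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]  # a < b, so index a stays valid
        del clusters[b]
    return clusters


# Hypothetical profiles at 0, 1, 3, 12, and 24 HAI.
raw = [
    [10.0, 20.0, 100.0, 40.0, 20.0],   # transient peak at 3 HAI
    [5.0, 10.0, 50.0, 20.0, 10.0],     # same shape, lower abundance
    [100.0, 80.0, 40.0, 10.0, 5.0],    # declining from dry seed
    [50.0, 40.0, 20.0, 5.0, 2.0],      # declining, lower abundance
]
scaled = [scale_to_max(p) for p in raw]
clusters = average_linkage(scaled, n_clusters=2)
```

Scaling to the maximum means transcripts are grouped by the shape of their profile rather than by absolute abundance, so the first two and the last two example profiles fall together.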
The results of all searches were compiled into one table, and reference mappings of the above-listed databases were then used to assign preliminary MapMan BINcodes to each of the rice proteins. In the next step, the bit scores (in the case of TAIR7, PPAP, CDD, and KOG) for each database hit were recorded and evaluated for each rice protein as a measure of the reliability for the assignment of the protein into certain BINs. To finally assign the protein to BINs, the bit scores of all database hits belonging to the same BIN were combined, allowing for multiple assigned BINcodes. In a subsequent step, the resulting BIN assignments were manually compared with the TIGR-based annotation and, in cases of ambiguity, checked against independent information available from gramene.org and the transcription factor database, resulting in more than 300 changes in assignments. Using this file, for both PageMan and MapMan, Wilcoxon rank sum tests with Benjamini-Hochberg false discovery rate control were used to determine statistically significant changes in specific BINs.

Generation of Transcription Factor and Organelle Lists

The transcription factor list was generated using three main sources: DRTF (Gao et al., 2006), RiceTFDB (Riano-Pachon et al., 2007), and Caldana et al. (2007). These lists were compiled, and all unique transcription factors were matched to the 24,150-gene set to generate a list of 1,786 transcription factors. To examine the transcripts encoding mitochondrial, chloroplast, and peroxisomal proteins, it was necessary to generate lists of transcripts known to encode proteins localized to these organelles. First, all large-scale experimental information to date on rice localization was gathered and the transcripts encoding these proteins were automatically assigned to that localization. To date, only a few large-scale localization studies have been carried out, so fewer than 300 could be assigned in this way.
In order to overcome this, all protein sequence information was downloaded for the 24,150 genes, and four primary sources were employed: (1) experimentally shown localization based on protein work (Heazlewood et al., 2003; Howell et al., 2006, 2007; Kleffmann et al., 2007; Schwacke et al., 2007); (2) seven predictor programs: Predotar (Small et al., 2004), Subloc (Chen et al., 2006), TargetP (Emanuelsson et al., 2007), WoLF PSORT (Horton et al., 2007), PTS1 Predictor (Neuberger et al., 2003), PProwler (Boden and Hawkins, 2005; Hawkins and Boden, 2006), and ChloroP (Emanuelsson et al., 1999); (3) Gene Ontology (GO)/keyword information from four databases: Gramene GO cell comp, Affymetrix GO cell comp, TIGR GO cell comp, and TIGR keyword (Yuan et al., 2005); and (4) localization information from orthologous genes in Arabidopsis (Arabidopsis thaliana). When several sources were used in combination, in order for a protein to be assigned to a localization, the cutoffs for these sources were set as follows: (1) for experimentally shown localization, no cutoff was required; (2) at least four out of the seven predictors had to show the same localization; (3) at least two of the four GOs had to be annotated to the same localization; and (4) the transcript had to have at least 50% orthology to the Arabidopsis gene with known localization. Orthology information and GO cellular component information was retrieved from the Gramene database (Jaiswal et al., 2006). When a transcript was annotated to a particular localization, a “source number” was assigned to represent the source used to determine this localization. The source numbers were representative as follows: 1, localization based on experimental evidence; 2, two of the four primary sources agreed on localization (i.e. cutoffs were met in at least two primary sources); 3, three out of four primary sources agreed on localization; and 4, all four of the primary sources agreed on localization. 
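The cutoffs for combining the four primary sources can be sketched as below. This is a hedged illustration only: the function names and data layout are hypothetical, "50% orthology" is interpreted here as percent identity to the Arabidopsis ortholog (an assumption), and the sketch stops at collecting which primary sources meet their cutoff rather than assigning the source numbers.

```python
from collections import Counter


def consensus(labels, min_votes):
    """Return the localization supported by at least min_votes entries.

    Missing predictions (None) are ignored. Returns None if no label
    reaches the threshold or if two labels both reach it.
    """
    counts = Counter(label for label in labels if label)
    qualifying = [label for label, votes in counts.items() if votes >= min_votes]
    return qualifying[0] if len(qualifying) == 1 else None


def primary_source_calls(experimental, predictors, go_terms,
                         ortholog_localization, ortholog_identity):
    """Collect the localization call of each primary source meeting its cutoff."""
    calls = {}
    if experimental:                      # (1) no cutoff required
        calls["experimental"] = experimental
    predicted = consensus(predictors, 4)  # (2) at least four of seven predictors
    if predicted:
        calls["predictors"] = predicted
    annotated = consensus(go_terms, 2)    # (3) at least two of four GO sources
    if annotated:
        calls["go"] = annotated
    # (4) at least 50% to the Arabidopsis gene (identity assumed here)
    if ortholog_localization and ortholog_identity >= 50.0:
        calls["orthology"] = ortholog_localization
    return calls


# Hypothetical transcript with no experimental evidence.
calls = primary_source_calls(
    experimental=None,
    predictors=["mitochondrion"] * 4 + ["plastid", None, None],
    go_terms=["mitochondrion", "mitochondrion", None, None],
    ortholog_localization="mitochondrion",
    ortholog_identity=62.0,
)
agreeing = len(set(calls.values())) == 1  # all qualifying sources agree
```

In this example three primary sources meet their cutoffs and agree on a mitochondrial localization; a disagreement between qualifying sources would correspond to the conflicting case described in the text.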
For some transcripts, there was only information from one primary source; therefore, the cutoffs for some sources were raised to maintain stringency. Thus, transcripts with a source number between 7 and 9 represent transcripts for which there was only information from one of the four primary sources, with numbers assigned as follows: 7, these transcripts had >70% identity with the orthologous gene in Arabidopsis with known localization (for peroxisomes, this cutoff was allowed to be lowered to >50%, as the prediction programs and other sources did not provide equivalent coverage for detecting peroxisomal genes); 8, for these transcripts, three of the four GO-related localization sources were annotated to be in the same localization (for peroxisomes, two out of four was sufficient); 9, at least four of the seven predictors agreed on localization. For peroxisomes, only one predictor was sufficient, as most of the prediction programs did not even have peroxisome as a choice of localization; therefore, the PTS1 Predictor default cutoff was deemed to be sufficiently stringent. The source number 10 shows that none of the sources produced any conclusive organelle localization information, even at the lowered standards, while a source number of 11 indicates that one or more of the cutoff criteria were met but the localization based on these methods was conflicting between sources.

Functional Annotation and Statistical Analysis

For each probe set, the GO annotations and transcript assignment were as retrieved from Affymetrix. The National Science Foundation rice microarray database was used to match each Affymetrix probe identifier to a National Science Foundation accession identifier and to a TIGR locus identifier. These TIGR locus identifiers were then entered into the TIGR rice database, and the putative function of the encoded proteins was derived (Yuan et al., 2005).
The Rice Annotation Project (RAP) database was also used to gather information about function, including the RAP description and RAP GO description. In order to gather this information, the files available from the RAP database were first used to convert each TIGR locus identifier to a RAP Os identifier. Lastly, in order to categorize the transcripts based on the FUNctional CATalogue (FUNCAT) of the encoded protein, the Australian National University genebins database was used for the whole genome set. Two FUNCATs were independently added: transcription factors, which was formed as a separate category based on DRTF (Gao et al., 2006), RiceTFDB (Riano-Pachon et al., 2007), and Caldana et al. (2007); and kinases, which was based on the rice kinase database (Dardick et al., 2007). For the organelle lists, the broad FUNCATs (Australian National University), the FUNCATs based on previously published data (Heazlewood et al., 2003), the FUNCAT of the orthologous gene in Arabidopsis, and manual annotation were used so that as many of the organelle genes as possible could be assigned a function. In order to compare the percentage of genes in a given FUNCAT within the genome set with the percentage of genes in that FUNCAT in a given cluster, z-score analysis was carried out to determine the significance of the difference between the two proportions, given that the sample sizes, frequencies, and percentages are known for each set. The z-scores were then matched to the cumulative standard normal table, and the P values were determined.

Public Rice Microarray Data Analysis and Comparison

In order to examine transcript abundance changes across different tissues under different conditions and compare these with the germination transcript abundance profiles generated from this study, rice array data were retrieved from the Gene Expression Omnibus within the National Center for Biotechnology Information database.
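The z-score comparison of FUNCAT proportions described under Functional Annotation and Statistical Analysis can be illustrated with a standard two-proportion z-test. The exact formula used in the study is not reproduced in this text, so the pooled-variance form below is an assumption, as are the function name and the two-sided P value. The example counts for the cluster are hypothetical; only the 1,786 transcription factors in the 24,150-gene set come from the text.

```python
import math


def two_proportion_z(count_cluster, n_cluster, count_genome, n_genome):
    """Pooled two-proportion z-test (assumed form, not taken from the study).

    Returns the z-score and a two-sided P value from the standard normal
    distribution, which is equivalent to looking z up in a cumulative
    standard normal table.
    """
    p_cluster = count_cluster / n_cluster
    p_genome = count_genome / n_genome
    pooled = (count_cluster + count_genome) / (n_cluster + n_genome)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_cluster + 1 / n_genome))
    z = (p_cluster - p_genome) / std_err
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return z, p_two_sided


# Hypothetical cluster: 200 transcription factors among 2,000 transcripts,
# compared with 1,786 transcription factors in the 24,150-gene set.
z, p = two_proportion_z(200, 2000, 1786, 24150)
print(f"z = {z:.2f}, P = {p:.2g}")
```

A positive z indicates that the category is over-represented in the cluster relative to the genome set. Because a cluster is a subset of the genome set, the two samples are not strictly independent, so the result is best read as an approximate indicator.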
All data were MAS5.0 normalized and then normalized against average ubiquitin expression for that array. These normalized array data were then compiled, and for each probe set, the maximum expression was set to 1.0 with all other data relative to this. This normalization allowed cross-comparison of arrays from all of the different studies at once. The arrays analyzed included all of the arrays from this study, together with publicly available rice genome arrays carried out on different tissues/conditions, including 7-d-old seedlings that were untreated, drought stressed, salt stressed, or cold stressed (GSE6901; Jain et al., 2007); seeds collected at 5 d following pollination, 10-d-old embryos, 10-d-old endosperms, seedling roots, seedling shoots, unpollinated stigmas (at anthesis), ovaries (at anthesis), mature anthers, and suspension cells (GSE7951; Li et al., 2007); aerobically grown coleoptiles (4 d) and anoxically grown coleoptiles (4 d; GSE6908; Lasanthi-Kudahettige et al., 2007); crowns and growing points under salt stress and control conditions in sensitive and tolerant mutants in subspecies indica and japonica (GSE4438; Walia et al., 2007); crowns and growing points under control and salt stress conditions in subspecies indica and japonica (GDS1383; Walia et al., 2005); and leaves following biotic stress and control treatments (GSE7256; Ribot et al., 2008).

Promoter Motif Analysis

Following expression analysis, distinct groups of transcripts appeared that showed peak expression at single specific time points within the time course. In order to study these coexpressed transcripts more closely, all 1-kb upstream regions of the 24,150 transcripts were retrieved, and these upstream regions were examined for putative cis-acting elements.
Programs designed to detect sequence elements generally have limits of less than 80 input sequences; thus, the list was distilled to uncover sequence elements that may be central to the regulatory processes that cause the changes in transcriptome observed. A “peak” was defined as a probe set having an expression value of 1.0 at that specific time point with expression levels of less than 0.5 at all other time points. Three main cis-element databases were used for this analysis. The first was the Rice Cis-Element Search database (Doi et al., 2008), which was used under default settings and searched for enrichment of known plant cis elements in the 1-kb upstream region. The second database used was the MEME Web server (Bailey et al., 2006), which was used under default settings with the length of the motif set to 6 to 8 bp and the number of motifs to find set to five (instead of the default of three). The MEME database could not process the large data sets, including the plastid and transcription factor 24-h peak subsets, so no output was generated for these. The third database used was the Regulatory Sequence Analysis Tools (Thomas-Chollier et al., 2008), which was used under default settings, with the only exception being that Markov modeling was selected for background calculation, as Oryza sativa was not available as a choice for the background model. The outputs from all of these databases are shown in Supplemental Table S6B. The lists of motifs from Supplemental Table S6B were then filtered to include only motifs present in more than 70% of all input sequences (Supplemental Table S6C), and the presence of these motifs was then examined in the whole genome and the genome “peaking” subsets, where a peak is as defined above.

3′ UTR Sequence Analysis

The full genome 3′ UTR and 5′ UTR sequences are available from TIGR. These were downloaded and filtered to retain only the 3′ UTRs.
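The “peak” definition used for the promoter analysis (a value of 1.0 at one time point after scaling to the maximum, and less than 0.5 at every other time point) can be sketched as follows. The function names and the example expression values are hypothetical, and the sketch scales each profile only across the germination time points (0, 1, 3, 12, and 24 HAI) rather than across all compiled arrays.

```python
def scale_to_max(profile):
    """Scale a {time_point: expression} profile so its maximum equals 1.0."""
    top = max(profile.values())
    if top <= 0:
        raise ValueError("profile has no positive expression value")
    return {time: value / top for time, value in profile.items()}


def peak_time(profile, threshold=0.5):
    """Return the time point at which the profile peaks, or None.

    A peak requires a scaled value of 1.0 at exactly one time point and
    scaled values below `threshold` at every other time point.
    """
    scaled = scale_to_max(profile)
    tops = [time for time, value in scaled.items() if value == 1.0]
    if len(tops) != 1:
        return None
    top = tops[0]
    others_low = all(v < threshold for t, v in scaled.items() if t != top)
    return top if others_low else None


# Hypothetical probe-set profiles keyed by hours after imbibition (HAI).
transient = {0: 10.0, 1: 200.0, 3: 90.0, 12: 40.0, 24: 20.0}
broad = {0: 10.0, 1: 200.0, 3: 120.0, 12: 40.0, 24: 20.0}
print(peak_time(transient))  # 1
print(peak_time(broad))      # None
```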
However, this only added up to 3,027 UTRs available for the “whole genome.” Taking this small number into consideration, it was not feasible to look at the organelle-specific and transcription factor peaking subsets analyzed for the promoter regions, as these lists were too small. Thus, for the 3′ UTR, the genes peaking in expression at 0, 1, 3, 12, and 24 HAI in the entire genome set were analyzed; however, there were still too few in the 0- and 1-HAI peaking subsets, so these could not be analyzed (Table I). In order to look at the enrichment of motifs in an objective manner, only the MEME Web server was used, as we were not searching for known regulatory elements. The settings were set to search for five motifs that are 6 to 8 bp (default) in each of the subsets, and the outputs are shown at the bottom of Supplemental Table S6D. It is important to note that setting the output to be five motifs can result in false present calls for motifs that are not significant when the input list is small; therefore, only the significantly enriched motifs (present in 60% to 70% of all input sequences) were included for further analysis (Supplemental Table S6, C and D). In addition to these putative predicted motifs, 12 motifs known to be associated with RNA stability/instability were examined for their presence in the genome (Table I; Supplemental Table S6D). Ten of these were motifs predicted to be associated with stability/instability of mRNA (Narsai et al., 2007), and two elements had previously been shown to be associated with RNA stability/instability (Newman et al., 1993; Ohme-Takagi et al., 1993).

Metabolomic Analysis

Data for the 126 nonredundant metabolites were analyzed by two-way differential comparisons to determine fold changes and associated P values, and the number of metabolites significantly changing was also visualized by heat map.
The heat map showing the number of significantly changing metabolites was generated using Partek Genomics suite software, version 6.3.

Extraction and Derivatization of Metabolites for Gas Chromatography-Mass Spectrometry Analysis

Metabolites were extracted and derivatized using a method modified from that of Roessner-Tunali et al. (2003). To each tube containing 20 to 40 mg of frozen tissue powder, 300 μL of cold (−20°C) Metabolite Extraction Medium (85% [v/v] HPLC-grade methanol [Sigma], 15% [v/v] untreated MilliQ water, and 100 ng μL−1 ribitol) was added, and tubes were vortexed briefly and shaken at 1,400 rpm for 15 min at 70°C. Tubes were then centrifuged at 13,000g for 3 min to pellet insoluble material, and supernatant was reextracted with chloroform. Aliquots (100 μL) of the methanol fraction were dried under vacuum in 1.5-mL microfuge tubes. Dried extracts were methoximated by adding 20 μL of a 20 mg mL−1 solution of methoxyamine hydrochloride in anhydrous pyridine (Sigma) and incubating at 30°C for 90 min with shaking at 1,400 rpm. For trimethylsilylation, 30 μL of N-methyl-N-(trimethylsilyl)trifluoroacetamide (Sigma) was transferred to each tube, and tubes were incubated at 37°C for 30 min with 1,400 rpm shaking. Ten microliters of an n-alkane retention index calibration mixture (0.29% [v/v] n-dodecane, 0.29% [v/v] n-pentadecane, 0.29% [w/v] n-nonadecane, 0.29% [w/v] n-docosane, 0.29% [w/v] n-octacosane, 0.29% [w/v] n-dotriacontane, and 0.29% [w/v] n-hexatriacontane dissolved in anhydrous pyridine) was then added to each tube, and reaction mixtures were vortexed and transferred to amber gas chromatography-mass spectrometry (GC-MS) vials with low-volume inserts and screw-top seals (Agilent Technologies) and allowed to rest for 4 h prior to beginning GC-MS analysis.
GC-MS Instrumental Analysis

Derivatized metabolite samples were analyzed on an Agilent GC/MSD system composed of an Agilent GC 6890N gas chromatograph (Agilent Technologies) fitted with a 7683B Automatic Liquid Sampler (Agilent Technologies) and 5975B Inert MSD quadrupole MS detector (Agilent Technologies). The gas chromatograph was fitted with a 0.25-mm (i.d.), 0.25-μm film thickness, 30-m Varian FactorFour VF-5ms capillary column with 10 m integrated guard column (Varian; product no. CP9013). GC-MS run conditions were essentially as described for GC-quadrupole-MS metabolite profiling on the Golm Metabolome Database Web site (http://csbdb.mpimp-golm.mpg.de/csbdb/gmd/analytic/gmd_meth.html; Kopka et al., 2005). Samples were injected into the split/splitless injector operating in splitless mode with an injection volume of 1 μL, purge flow of 50 mL min−1, purge time of 1 min, and a constant inlet temperature of 300°C. Helium carrier gas flow rate was held constant at 1 mL min−1. The GC column oven was held at the initial temperature of 70°C for 1 min before being increased to 76°C at 1°C min−1 and then to 325°C at 6°C min−1 before being held at 325°C for 10 min. Total run time was 58.5 min. Transfer line temperature was 300°C. MS source temperature was 230°C. Quadrupole temperature was 150°C. Electron-impact ionization energy was 70 eV, and the MS detector was operated in full scan mode in the mass-to-charge ratio range 40 to 600 with a scan rate of 2.6 Hz. The MSD was pretuned against perfluorotributylamine mass calibrant using the “atune.u” autotune method provided with the Agilent GC/MSD Productivity ChemStation software (version D.02.00 SP1; Agilent Technologies; product no. G1701DA).

GC-MS Data Analysis

Raw GC-MS data files in the proprietary ChemStation (.D) format were exported to generic NetCDF/AIA (.CDF) format with ChemStation GC/MSD Data Analysis software (Agilent Technologies).
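As a quick check on the oven program given under GC-MS Instrumental Analysis, the stated total run time of 58.5 min can be recovered from the hold and ramp segments. The helper below is a small illustrative sketch; the function name and the tuple layout for the ramps are assumptions made here.

```python
def oven_run_time(initial_temp, initial_hold, ramps):
    """Total run time in minutes for a temperature-programmed GC oven.

    ramps is a list of (target_temp_c, rate_c_per_min, hold_min) tuples,
    applied in order starting from initial_temp.
    """
    total = initial_hold
    temp = initial_temp
    for target, rate, hold in ramps:
        total += (target - temp) / rate + hold
        temp = target
    return total


# 70 C for 1 min; to 76 C at 1 C/min; to 325 C at 6 C/min; hold 10 min.
program = [(76, 1, 0), (325, 6, 10)]
print(oven_run_time(70, 1, program))  # 58.5
```

The segments contribute 1 min (initial hold), 6 min (first ramp), 41.5 min (second ramp), and 10 min (final hold).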
The NetCDF files produced were then processed using in-house MetaMiner software (A. Carroll and A.H. Millar, unpublished data) to carry out all peak detection, quantification, library matching, normalization, statistical analysis, and data visualization. Raw data processing in MetaMiner consisted of the following steps: retrieval of all extracted ion chromatograms (EICs), detection and integration of peaks in EICs, calculation of internally calibrated retention indices for all extracted peaks, matching of carefully selected analyte-specific EIC peaks to analytes in a custom mass spectral-retention index (MSRI) library of known and unknown metabolite derivatives (retention index error < 3 retention index units; Wagner et al., 2003; Schauer et al., 2005), and normalization of matched peak areas to the peak area of the internal standard, ribitol, and to fresh tissue weight of extracted samples. The MSRI library was constructed using publicly available AMDIS software (version 2.65) to extract MSRI information for authentic standard derivatives from standard runs and MSRI information for unknown analytes from representative analyses of complex biological extracts. In a few cases, certain analyte peaks were assigned a putatively known annotation based on matching to the Q_MSRI_ID MSRI library (version 2004-03-01) available from the Golm Metabolome Database (Kopka et al., 2005). In these cases, positive identification required a “weighted” mass spectral match score of greater than 90 and a retention index discrepancy of less than 2%. Unknown metabolite derivative peaks that could not be putatively identified by comparison with authentic standards or by matching against the Q_MSRI_ID library were annotated with a simple generic identifier with the syntax USH: name, match_score, where USH stands for “unknown spectral homolog,” name is the abbreviated name of top NIST02 mass spectral library match, and match_score is the “simple” match score reported by AMDIS. 
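Two of the processing steps listed above, retention index calibration against the n-alkane mixture and normalization of peak areas, can be illustrated with a short sketch. MetaMiner is unpublished, so none of this reflects its actual code: the linear interpolation between bracketing alkanes is an assumed (though common) way to compute retention indices under a temperature program, and the function names and example retention times are hypothetical. The alkane carbon numbers follow the calibration mixture described in the extraction method.

```python
from bisect import bisect_right

# Carbon numbers of the n-alkanes in the calibration mixture
# (dodecane, pentadecane, nonadecane, docosane, octacosane,
# dotriacontane, hexatriacontane).
ALKANE_CARBONS = [12, 15, 19, 22, 28, 32, 36]


def retention_index(rt, alkane_rts):
    """Interpolate a retention index from alkane retention times.

    alkane_rts must be sorted and in the same order as ALKANE_CARBONS.
    Peaks outside the calibrated range are extrapolated from the
    nearest pair of alkanes.
    """
    i = bisect_right(alkane_rts, rt) - 1
    i = max(0, min(i, len(alkane_rts) - 2))
    n_lo, n_hi = ALKANE_CARBONS[i], ALKANE_CARBONS[i + 1]
    t_lo, t_hi = alkane_rts[i], alkane_rts[i + 1]
    return 100 * (n_lo + (n_hi - n_lo) * (rt - t_lo) / (t_hi - t_lo))


def matches_library(ri, library_ri, tolerance=3.0):
    """Apply the retention index error cutoff (< 3 retention index units)."""
    return abs(ri - library_ri) < tolerance


def normalize_area(area, ribitol_area, fresh_weight_mg):
    """Normalize a peak area to the ribitol peak and to fresh tissue weight."""
    return area / ribitol_area / fresh_weight_mg


# Hypothetical alkane retention times in minutes.
alkane_rts = [10.0, 15.0, 21.0, 25.0, 32.0, 36.0, 40.0]
ri = retention_index(12.5, alkane_rts)
print(ri)                          # 1350.0
print(matches_library(ri, 1352))   # True
```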
Artifact peaks and common contaminants were identified by analysis of negative control samples prepared in the same manner as biological samples but without the inclusion of tissue. Signals corresponding to these artifacts were not used in biological interpretation. Automatic statistical analysis of processed data was carried out by calculating, for each set of biological replicates, the mean signal intensity for each metabolite, and then, for each metabolite, dividing the mean signal in treated sample sets by the mean signal in control sample sets to calculate fold difference and testing the statistical significance (P < 0.05) of this difference by Student's t test.

Supplemental Data

The following materials are available in the online version of this article.
[Supplemental Data]
Acknowledgments

We thank Ian Castleden from the Centre for Computational Systems Biology for help with the multiple localization predictions and sequence matching.

Notes

1 This work was supported by the Australian Research Council Centre of Excellence (grant no. CEO561495) and an Australian Research Council Australian Professorial Fellowship to A.H.M.

The author responsible for distribution of materials integral to the findings presented in this article in accordance with the policy described in the Instructions for Authors (www.plantphysiol.org) is: James Whelan (seamus/at/cyllene.uwa.edu.au).

[W] The online version of this article contains Web-only data.

[OA] Open Access articles can be viewed online without a subscription.