Copyright © The Author 2005. Published by Oxford University Press. All rights reserved

FAST DB: a website resource for the study of the expression regulation of human gene products

INSERM U685/AVENIR, Centre G. Hayem, Institut Universitaire d'Hématologie, Hôpital Saint Louis, 1 Avenue Claude Vellefaux, 75010 Paris, France

*To whom correspondence should be addressed. Tel: +33 1 53 72 21 30; Fax: +33 1 42 40 95 57; Email: auboeuf/at/stlouis.inserm.fr

Received May 18, 2005; Revised July 11, 2005; Accepted July 11, 2005.

The online version of this article has been published under an open access model. Users are entitled to use, reproduce, disseminate, or display the open access version of this article for non-commercial purposes provided that: the original authorship is properly and fully attributed; the Journal and Oxford University Press are attributed as the original place of publication with the correct citation details given; if an article is subsequently reproduced or disseminated not in its entirety but only in part or as a derivative work this must be clearly indicated. For commercial re-use, please contact journals.permissions/at/oupjournals.org

Abstract

Human genes use various mechanisms to generate transcripts with different exon content, which in turn generate multiple protein isoforms with different and even opposite biological activities. To understand the biological consequences of gene transcriptional activity modulation, it is necessary to take into account the capability of genes to generate distinct functional products, particularly because transcriptional stimuli also affect the exon content of their target gene products. For this purpose, we have developed a bioinformatics suite, FAST DB, which easily and accurately defines the exon content of all known transcripts produced by human genes.
In addition, several tools have been developed, including a graphical presentation of all gene products, a sequence multi-alignment of all gene transcripts and an in silico PCR computer program. The FAST DB interface also offers extensive links to website resources for promoter analysis and transcription factor binding site prediction, splicing regulatory sequence prediction, as well as 5′- and 3′-untranslated region analysis. FAST DB has been designed to facilitate studies that integrate transcriptional and post-transcriptional events to investigate the expression regulation of human gene products.

INTRODUCTION

About 95% of human genes contain exons (between 7 and 12 on average) separated by introns. Exons contain the information necessary for the production of proteins, whereas introns are removed during the splicing process that gives rise to messenger RNAs (mRNAs). The mRNAs are then exported to the cytosol where they are translated. Owing to the presence of exons separated by introns, a single gene can produce different mRNAs having various exon contents. At its 5′ end or within internal introns, a given gene can have different promoters driving the production of transcripts that have different 5′-untranslated regions (5′-UTRs) and that sometimes encode protein isoforms with different N-terminal domains (1,2). At its 3′ end or within internal exons and/or introns, a given gene can have different transcriptional termination sequences and/or polyadenylation sites allowing the production of different transcripts that have different 3′-UTRs and that eventually encode protein isoforms with different C-terminal domains (3–5). During the splicing process, different introns (or parts of introns) and different exons (or parts of exons) can be alternatively spliced (6–10). A given intron can be retained in an mRNA molecule (intron retention), whereas a given exon can be skipped (exon skipping or exon cassette).
The 5′ or the 3′ end of a given intron can be differentially selected (alternative 5′- or 3′-splicing site, respectively), which modifies the size of the exons included in the mRNA. It is estimated that 75% of these events occur in the translated regions of mRNAs and have consequences at the protein level (6–9,11). Alternative splicing events either generate splice variants encoding truncated proteins by the introduction of a stop codon, or yield protein isoforms that contain different domains. This allows a single gene to produce proteins with different properties regarding their stability or cellular localization, their ability to be regulated by post-translational modifications and to respond to signaling pathways, and their ability to interact with partners and/or to perform enzymatic reactions (6–9). The biological importance of such mechanisms is illustrated by genes involved in cell death, as a single gene can produce different protein isoforms with either pro- or anti-apoptotic effects (12). Moreover, the human genome sequencing project and the cloning and sequencing of an increasing number of human transcripts reveal that most human genes (between 40 and 70%) generate different transcripts, which contributes to increasing the diversity of the human proteome encoded by a limited number of genes (6–9,13,14).

Because a given gene produces different translatable mRNAs, the biological consequences of a change in gene expression cannot be predicted from the modulation of its transcription alone. This is particularly important because a transcriptional stimulus can ‘switch’ the promoter that drives the production of its gene products, and can also change the nature (exon content) of its target gene products.
Indeed, the identity of the promoter driving the expression of a gene can affect the nature of the splice variants produced by this gene and, as we have shown, transcriptional stimuli, such as steroid hormones, simultaneously control the transcriptional rate of their target genes and the nature (exon content) of the splice variants produced (15–18). Consistent with these observations, different transcription factors or transcriptional coregulators have different effects on splicing and 3′-end processing (17–23). In this context, studies of gene expression regulation need to account for the capability of human genes to produce different transcripts (24–26).

For this reason, we developed a bioinformatics suite, named FAST DB (Friendly Alternative Splicing and Transcripts Database), that makes it possible to define easily and accurately the exon content of the different known transcripts produced by human genes, based on a computerized analysis of human and mouse cDNAs and human expressed sequence tag (EST) libraries. In addition, a multi-alignment of all the transcript sequences of a given gene allows users to visualize the sequences that are common to these transcripts and those that are specific to some of them. This makes it easy to design probes for downstream experimental applications, in particular PCR amplification. Thanks to the FAST DB interface, users can design primers in a few minutes for PCR amplification of either all the gene products or specific variants, as well as for the co-amplification of splice variants giving rise to PCR products of different sizes. In addition, several links to various website resources are provided for the analysis of promoter regions and the analysis of 5′- and 3′-UTRs, as well as links to other recently developed splicing databases (27–32). FAST DB is therefore a bioinformatics tool designed for a rapid, extensive and accurate search to support integrated studies of gene transcriptional and post-transcriptional regulation.
MATERIALS AND METHODS

Filling FAST DB

All data contained in FAST DB were obtained through a computational analysis of sequences available from public libraries. FAST DB was developed in PERL v5.8.5 (www.perl.org) using the Bio::EnsEMBL, CGI, DBI, BioPerl, GD and PDF::API2 modules (www.cpan.org) on an AMD Athlon64 3000+ processor with 1.5 GB of RAM and with the Mandrake 10.1 Linux distribution (www.mandrakelinux.com). The FAST DB algorithm recovered all the exon sequences defined in EnsEMBL version 26 (homo_sapiens_core_26_35 database) and each of these exons was ‘blasted’ against two cDNA databanks using standalone BLAST v2.2.10. A full-length transcript databank was downloaded from the UCSC website (genome.ucsc.edu) and a partial mRNA databank was downloaded from the NCBI website (www.ncbi.nlm.nih.gov). These two databanks were formatted using Formatdb (‘formatdb -i downloaded_databank -pF -oT -sT’). BLAST was run using Bio::Tools::Run::StandAloneBlast and transcript sequences were recovered using fastacmd. All the recovered transcripts with an E-value < 10^-40 were aligned against genomic sequences using sim4. By parsing the sim4 output and using strict criteria (see Supplementary Material), each transcript was then selected or excluded.

The sim4 output also allowed for defining the different exons contained in each transcript. These exons were called ‘transcript exons’ and were clustered by genomic position. ‘Genomic exons’ were defined using the most frequent first and last positions of the different clustered ‘transcript exons’. To define alternative events generating the different products of a single gene, the FAST DB algorithm compared each ‘transcript exon’ with its corresponding ‘genomic exon’. FAST DB defined seven types of events (see Supplementary Material).
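The ‘genomic exon’ definition described above can be illustrated with a minimal Python sketch. It is not the FAST DB code (which was written in PERL): the function names are hypothetical, clustering is simplified to grouping intervals that overlap on the genome, and ties between equally frequent positions are broken arbitrarily. The actual clustering criteria are given in the Supplementary Material and are not reproduced here.

```python
from collections import Counter

def cluster_by_overlap(transcript_exons):
    """Group (start, end) intervals that overlap on the genome.

    Assumption: overlap is used as a stand-in for 'clustered by genomic
    position'; the criteria actually used by FAST DB may differ.
    """
    clusters = []
    for start, end in sorted(transcript_exons):
        if clusters and start <= clusters[-1]["end"]:
            # Interval overlaps the current cluster: extend it.
            clusters[-1]["members"].append((start, end))
            clusters[-1]["end"] = max(clusters[-1]["end"], end)
        else:
            # No overlap: open a new cluster.
            clusters.append({"end": end, "members": [(start, end)]})
    return [c["members"] for c in clusters]

def genomic_exon(members):
    """Return the most frequent first and last positions of a cluster."""
    starts = Counter(start for start, _ in members)
    ends = Counter(end for _, end in members)
    return starts.most_common(1)[0][0], ends.most_common(1)[0][0]

# Illustrative coordinates only (not taken from FAST DB).
exons = [(100, 200), (100, 200), (90, 200), (100, 210),
         (500, 600), (500, 650), (500, 600)]
consensus = [genomic_exon(members) for members in cluster_by_overlap(exons)]
print(consensus)
```

One limitation of this simplification is that a ‘transcript exon’ containing a retained intron would bridge two neighbouring clusters and merge them, so a real implementation needs an additional rule for such cases.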
To be defined as an alternative first exon, a ‘transcript exon’ had to be the first exon of at least one transcript and, if there were other internal exons at this genomic position, it had to start at least 10 nt upstream of the first position of the corresponding ‘genomic exon’. To be defined as an alternative last exon, a ‘transcript exon’ had to be the last exon of at least one transcript and, if there were other internal exons at this genomic position, it had to end at least 10 nt downstream of the last position of the corresponding ‘genomic exon’.

The FAST DB algorithm also defined several splicing events. An ‘alternative 3′-splice site’ event was defined when the first position of a ‘transcript exon’ was different from the first position of the corresponding ‘genomic exon’. An ‘alternative 5′-splice site’ event was defined when the last position of a ‘transcript exon’ was different from the last position of the corresponding ‘genomic exon’. An ‘intron retention’ event was defined when a whole intronic sequence was included in at least one ‘transcript exon’ sequence. FAST DB also defined an ‘internal exon deletion’ (IED) event when a ‘transcript exon’ presented an internal sequence deletion compared with the corresponding ‘genomic exon’ sequence. Such deleted sequences correspond in fact to small introns that are frequently not spliced out (data not shown). Finally, an ‘exon skipping’ event was defined when at least one transcript lacked a defined ‘genomic exon’.

Once the ‘genomic exons’ and the alternative events were defined, the FAST DB algorithm filled a MySQL database (using the DBI module for PERL). FAST DB used MySQL v4.0.20 (www.mysql.com). To decrease the loading time of its website, the FAST DB algorithm generates PNG files of gene and transcript representations (using the GD module for PERL) and PDF files (using the PDF::API2 module for PERL) and stores these files on a server. The same algorithm was used for the analyses of human ESTs and mouse cDNAs.
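The boundary-based rules above can be sketched as a small Python function. This is an illustration under stated assumptions, not the FAST DB implementation: coordinates are taken on the plus strand (so ‘upstream’ means a smaller coordinate), the class and function names are hypothetical, and the way the rules interact is our own choice (the splice-site rules are applied only to boundaries that are actually spliced, so the start of a first exon and the end of a last exon are not tested). Intron retention, IED and exon skipping are not covered, and deciding whether a gene truly has several alternative first or last exons would additionally require counting them per gene.

```python
from dataclasses import dataclass

MIN_TERMINAL_SHIFT = 10  # nt, the threshold quoted in the text

@dataclass(frozen=True)
class TranscriptExon:
    start: int
    end: int
    is_first: bool = False  # first exon of at least one transcript
    is_last: bool = False   # last exon of at least one transcript

def classify(exon, genomic_start, genomic_end, internal_exons_here):
    """Compare one 'transcript exon' with its 'genomic exon'.

    internal_exons_here: True if other, internal exons exist at this
    genomic position (plus-strand coordinates are assumed).
    """
    events = set()
    if exon.is_first:
        shift = genomic_start - exon.start  # positive when upstream
        if not internal_exons_here or shift >= MIN_TERMINAL_SHIFT:
            events.add("alternative first exon")
    elif exon.start != genomic_start:
        events.add("alternative 3'-splice site")
    if exon.is_last:
        shift = exon.end - genomic_end  # positive when downstream
        if not internal_exons_here or shift >= MIN_TERMINAL_SHIFT:
            events.add("alternative last exon")
    elif exon.end != genomic_end:
        events.add("alternative 5'-splice site")
    return events

# Illustrative coordinates only: a first exon starting 20 nt upstream.
print(classify(TranscriptExon(80, 200, is_first=True), 100, 200, True))
```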
Multi-alignment and in silico PCR

The FAST DB multi-alignment is available through the FAST DB graphical interface. This interface was created using PERL (CGI module) on an APACHE server v2.0.50 (www.apache.org). The FAST DB multi-alignment was performed using ClustalW and Partial Order Alignment. For a given gene, the sequences of ‘transcript exons’ corresponding to one ‘genomic exon’ were aligned using ClustalW. The results for each ‘genomic exon’ position were then assembled to present the multi-alignment of the different cDNAs. ‘Transcript exons’ containing a retained intron were each divided into two exons and one intron (or more introns and exons in the case of multiple consecutive retained introns). Each exon was aligned with its corresponding ‘genomic exon’, and then the intron sequence was inserted between both exons. In the case of a ‘genomic exon’ presenting an IED event, all the corresponding ‘transcript exons’ were aligned using Partial Order Alignment with the global alignment option (33,34). Nevertheless, owing to multi-alignment programming difficulties, in particular when ‘transcript exon’ sequences poorly overlap, some ‘transcript exons’ are not properly aligned (see Supplementary Material). To overcome this limitation, the sim4 alignment of each transcript sequence against the genomic sequence is made available (see below).

Housekeeping genes

We selected a set of 707 housekeeping (HK) genes by compiling results from several reports defining HK genes based on the presence of their transcripts in a wide variety of tissues (35–38). We linked all these genes with the FAST DB database, excluded redundant genes, assigned a unique ID to each remaining gene and created a new table of 707 IDs for statistical analyses.
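The compilation of the HK gene table described above (merge several published lists, link each name to a database gene, drop redundant entries, assign a unique ID) can be sketched as follows. This is a hypothetical illustration in Python: the function name, the name-to-gene mapping and the example lists are assumptions made for the sketch and are not taken from FAST DB or from the cited reports.

```python
def build_hk_table(reports, name_to_gene):
    """Merge gene lists from several reports into one non-redundant table.

    reports: iterable of lists of gene names as published.
    name_to_gene: mapping from any known gene name to the gene key used
    in the database (assumed to exist; illustrative only).
    Returns a dict {gene key: unique integer ID}.
    """
    table = {}
    for report in reports:
        for name in report:
            gene = name_to_gene.get(name.strip().upper())
            if gene is None:
                continue  # name could not be linked to the database
            if gene not in table:
                table[gene] = len(table) + 1  # next unique ID
    return table

# Illustrative input: two lists that overlap through an alias.
name_to_gene = {"GAPDH": "GAPDH", "G3PD": "GAPDH", "ACTB": "ACTB"}
reports = [["GAPDH", "ACTB"], ["g3pd", "ACTB", "UNKNOWN1"]]
print(build_hk_table(reports, name_to_gene))
```

Resolving aliases before deduplication matters here, because the same gene can appear under different names in different reports.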
RESULTS

By comparing the sequence of human genes with the sequence of their transcripts available from public libraries, the FAST DB algorithm provides the genomic organization and the exon content of transcripts corresponding to >12 000 human genes (see Materials and Methods and Supplementary Material). FAST DB contains genes defined by ‘good quality’ cDNAs only. Genes missing in FAST DB might not be defined by human cDNAs or ‘good quality’ cDNAs (see Supplementary Material).

More than 150 000 exons (10 exons per gene on average) have been defined in FAST DB using 80 000 transcripts (6 transcripts per gene on average). All the events defined by FAST DB and yielding different transcripts from single genes are based on a computerized analysis of known full-length and partial human cDNAs. For each gene, the analysis of the corresponding ESTs has been independently made available, which allows the prediction of additional events (see below). When possible, a similar analysis has been performed for mouse orthologous genes, allowing inter-species comparison (see below).

The FAST DB ‘SEARCH PAGE’ (http://193.48.40.18/fastdb/) offers different ways of accessing a given gene. FAST DB can be queried using either any name of the gene (see Supplementary Material and Figure 1A
After running the search engine, a gene or a list of genes is provided as a query result. Clicking on the requested gene opens the ‘MAIN PAGE’ of the gene. For example, Figure 1B

Analysis of the exon content of human gene products

In addition to general information regarding a requested gene through links to different website resources, such as EnsEMBL, NCBI, ExPASy (Figure 1B

As mentioned above, several mechanisms allow for differential selection of the exons (or part of them) that will be incorporated in the mature transcripts. For a requested gene, all the differentially selected exons defined by the FAST DB algorithm (see Materials and Methods and Supplementary Material) are listed at the bottom of the ‘MAIN PAGE’ (Figure 1B

The computerized analysis of the exon content of transcripts present in public libraries is based on transcript selection (elimination of poor-quality sequences) and on exon definition. Because using different criteria has consequences on the final analysis result, clicking on ‘ALTERNATIVE SPLICING’ in FAST DB provides links to other website resources that contain potential complementary information on transcript exon content (Figure 1B

Because ESTs provide further information regarding differentially selected exons but are potentially of poor quality, an independent analysis of ESTs is available by clicking on ‘ANALYSIS OF HUMAN GENE WITH ESTs’ (Figure 1B

In summary, FAST DB offers several ways of obtaining extensive information on the exons that are differentially selected within mature transcripts: analysis of full-length and partial human cDNAs and human ESTs, analysis of mouse cDNAs, links to other website resources and PUBMED, and analysis of transcripts entered by the users.
Importantly, all the sequences used by FAST DB can be downloaded by clicking on the ‘DOWNLOAD SEQUENCES’ button (Figure 1B

Graphical presentation of the transcripts and in silico PCR

To identify the transcripts used to establish the different events listed on the ‘MAIN PAGE’, users access the graphical representation (exon content) of each transcript by clicking on the ‘TRANSCRIPTS VIEW’ button (Figure 1B
The ‘IN SILICO PCR’ link provides users with a multi-alignment (see Materials and Methods and Supplementary Material) of all transcript sequences of a given gene (Figure 2B

Based on this multi-alignment, it becomes easy to design PCR primers. As shown in Figure 2B

As described above, users can select primers that flank an alternative region of a gene to co-amplify different splice variants and quantify the effect of a stimulus on the ratio of the different spliced products. Primers can also be selected within sequences shared by all known transcripts, to amplify all the gene products as a single PCR product and determine the impact of a transcriptional stimulus taking into account all the target gene products. Finally, a single variant can be amplified specifically by choosing primers within sequences that are specific to that splice variant. In addition, because the name of each exon appears on the multi-alignment, primers can be designed at the junction of exons to avoid amplification of genomic DNA.

Another application of the multi-alignment interface is to predict whether an alternative splicing event would have biological consequences at the protein level. Users can select a specific variable sequence and analyze it with Blastx (http://www.ncbi.nlm.nih.gov/BLAST/) to test whether this sequence encodes a known protein and with Interpro (http://www.ebi.ac.uk/interpro/) to test whether this sequence encodes a specific protein domain.

Analysis of HK genes

Because HK genes are widely expressed across tissues, studies of gene expression regulation are often performed using HK genes as internal controls (35–38). Interestingly, the genomic organization of HK genes has recently been shown to be different from that of tissue-specific (TS) expressed genes (37,38). HK genes are usually more ‘compact’ than TS genes, mostly because of the smaller size of their introns and because HK genes have fewer exons/introns than TS genes.
Nevertheless, to our knowledge, there is no general information available regarding the potential ability of HK genes to generate multiple transcripts. Because HK genes are essential in transcriptional studies, we have set up an analysis of HK gene products in FAST DB. For this purpose, we used 707 HK genes that had been defined in previous reports based on their wide expression in many tissues (35–38). Analyzing all the genes (other than HK genes) present within FAST DB, we observed that 3458 genes (~28%) out of 12 538 analyzed genes contain at least two alternative first exons and 2414 genes (~19%) contain at least two different last exons (Figure 3
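The rounded percentages quoted above follow directly from the gene counts. The short Python check below recomputes them; the helper name is hypothetical and only the three counts come from the text.

```python
def share(count, total):
    """Return count as a percentage of total."""
    return 100.0 * count / total

TOTAL_GENES = 12538      # analyzed genes
ALT_FIRST_EXONS = 3458   # genes with at least two alternative first exons
ALT_LAST_EXONS = 2414    # genes with at least two different last exons

print(f"alternative first exons: {share(ALT_FIRST_EXONS, TOTAL_GENES):.1f}%")
print(f"alternative last exons:  {share(ALT_LAST_EXONS, TOTAL_GENES):.1f}%")
```

To the nearest integer these are 28% and 19%, in agreement with the approximate values given in the text.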
Using the set of 707 HK genes, we confirmed previous findings that HK genes contain fewer exons than other genes (two introns fewer on average) and that small (<10 kb) HK genes are more frequent compared with small TS genes (20% versus 10%, respectively). Nevertheless, we observed that HK genes can generate multiple transcripts similarly to TS genes (Figure 3

In conclusion, despite a different genomic organization, HK genes are able to generate transcript diversity at a similar level to that of TS genes. Therefore, caution is required in designing primers when HK genes are used as transcription internal controls because the exon content of HK gene products might vary depending on the biological conditions. To help users select primers and avoid alternative regions in these genes, the list of 707 HK genes can be accessed by clicking the ‘List of housekeeping genes’ link on the FAST DB ‘SEARCH PAGE’ (Figure 1A

DISCUSSION

FAST DB has been designed to facilitate the study of the expression regulation of the various transcripts produced by human genes. This goal was achieved by:

(i) A clear and ‘intuitive’ presentation of the information.
(ii) The most complete set of information on the nature of the transcripts produced by human genes based on the analysis of full-length and partial cDNAs, as well as human ESTs. Links to other public databases that contain potential complementary information on the nature of transcripts produced by human genes are included. The possibility for users to enter their own transcript sequences and a PUBMED link enrich the information available. The analysis of mouse orthologous genes is provided for inter-species comparison.
(iii) A sequence multi-alignment of all transcripts produced by a single gene, which facilitates the design of probes for downstream experiments.
(iv) A link to a ‘List of housekeeping genes’ to help users design primers that are used as internal controls.
The rationale is based on our observation that HK genes generate transcripts of different exonic content.
(v) Links to website resources for promoter analysis and transcription factor binding site prediction, splicing regulatory sequence prediction, as well as for 5′- and 3′-UTR analysis, which facilitate studies integrating transcriptional and post-transcriptional aspects.

Knowing the exon content of transcripts is required for understanding the biological consequences of transcriptional stimuli. Indeed, genes can no longer be considered as ‘simple’ functional units. Genes are rather an ‘assemblage’ of exons that are differentially incorporated within the gene products that in turn generate protein isoforms with different biological activities or functions. The FAST DB analysis has been performed on 12 538 human genes defining 151 747 exons and we estimated that 31 318 exons are subject to regulation. This means that ~20% of total human exons are differentially integrated within gene products. This proportion is probably underestimated because the statistical analysis was performed using only full or partial cDNAs present within public libraries. This large number of exons differentially incorporated within gene products is in good agreement with the poor definition of exon and intron boundaries, which creates the right conditions for physiological regulation and evolution (8,10,53–55).

AVAILABILITY

FAST DB is freely available on the internet at http://193.48.40.18/fastdb/.

SUPPLEMENTARY MATERIAL

Supplementary Material is available at NAR Online.
Acknowledgments

This work was supported by the INSERM ‘AVENIR’ program (D.A.), Société Française d'Hématologie and Association Française contre les Myopathies (P.G.), INSERM ‘Poste Vert’ (N.M.) and Ligue Nationale Contre le Cancer (M.D.). Funding to pay the Open Access publication charges for this article was provided by INSERM.

Conflict of interest statement. None declared.

REFERENCES

1. Landry J.R., Mager D.L., Wilhelm B.T. Complex controls: the role of alternative promoters in mammalian genomes. Trends Genet. 2003;19:640–648.
2. Zhang T., Haws P., Wu Q. Multiple variable first exons: a mechanism for cell- and tissue-specific gene regulation. Genome Res. 2004;14:79–89.
3. Zhang H., Hu J., Recce M., Tian B. PolyA_DB: a database for mammalian mRNA polyadenylation. Nucleic Acids Res. 2005;33:D116–D120.
4. Tian B., Hu J., Zhang H., Lutz C.S. A large-scale analysis of mRNA polyadenylation of human and mouse genes. Nucleic Acids Res. 2005;33:201–212.
5. Beaudoing E., Gautheret D. Identification of alternate polyadenylation sites and analysis of their tissue distribution using EST data. Genome Res. 2001;11:1520–1526.
6. Stamm S., Ben-Ari S., Rafalska I., Tang Y., Zhang Z., Toiber D., Thanaraj T.A., Soreq H. Function of alternative splicing. Gene. 2005;344:1–20.
7. Kriventseva E.V., Koch I., Apweiler R., Vingron M., Bork P., Gelfand M.S., Sunyaev S. Increase of functional diversity by alternative splicing. Trends Genet. 2003;19:124–128.
8. Maniatis T., Tasic B. Alternative pre-mRNA splicing and proteome expansion in metazoans. Nature. 2002;418:236–243.
9. Hastings M.L., Krainer A.R. Pre-mRNA splicing in the new millennium. Curr. Opin. Cell Biol. 2001;13:302–309.
10. Smith C.W., Valcarcel J. Alternative pre-mRNA splicing: the logic of combinatorial control. Trends Biochem. Sci. 2000;25:381–388.
11. Sorek R., Shamir R., Ast G. How prevalent is functional alternative splicing in the human genome? Trends Genet. 2004;20:68–71.
12. Wu J.Y., Tang H., Havlioglu N. Alternative pre-mRNA splicing and regulation of programmed cell death. Prog. Mol. Subcell. Biol. 2003;31:153–185.
13. International Human Genome Sequencing Consortium. Initial sequencing and analysis of the human genome. Nature. 2001;409:860–921.
14. Grimwood J., Gordon L.A., Olsen A., Terry A., Schmutz J., Lamerdin J., Hellsten U., Goodstein D., Couronne O., Tran-Gyamfi M., et al. The DNA sequence and biology of human chromosome 19. Nature. 2004;428:529–535.
15. Cramer P., Pesce C.G., Baralle F.E., Kornblihtt A.R. Functional association between promoter structure and transcript alternative splicing. Proc. Natl Acad. Sci. USA. 1997;94:11456–11460.
16. Pagani F., Stuani C., Zuccato E., Kornblihtt A.R., Baralle F.E. Promoter architecture modulates CFTR exon 9 skipping. J. Biol. Chem. 2003;278:1511–1517.
17. Auboeuf D., Honig A., Berget S.M., O'Malley B.W. Coordinate regulation of transcription and splicing by steroid receptor coregulators. Science. 2002;298:416–419.
18. Auboeuf D., Dowhan D.H., Kang Y.K., Larkin K., Lee J.W., Berget S.M., O'Malley B.W. Differential recruitment of nuclear receptor coactivators may determine alternative RNA splice site choice in target genes. Proc. Natl Acad. Sci. USA. 2004;101:2270–2274.
19. Nogues G., Kadener S., Cramer P., Bentley D., Kornblihtt A.R. Transcriptional activators differ in their abilities to control alternative splicing. J. Biol. Chem. 2002;277:43110–43114.
20. Rosonina E., Bakowski M.A., McCracken S., Blencowe B.J. Transcriptional activators control splicing and 3′-end cleavage levels. J. Biol. Chem. 2003;278:43034–43040.
21. Auboeuf D., Dowhan D.H., Li X., Larkin K., Ko L., Berget S.M., O'Malley B.W. CoAA, a nuclear receptor coactivator protein at the interface of transcriptional coactivation and RNA splicing. Mol. Cell. Biol. 2004;24:442–453.
22. Monsalve M., Wu Z., Adelmant G., Puigserver P., Fan M., Spiegelman B.M. Direct coupling of transcription and mRNA processing through the thermogenic coactivator PGC-1. Mol. Cell. 2000;6:307–316.
23. Nagai K., Yamaguchi T., Takami T., Kawasumi A., Aizawa M., Masuda N., Shimizu M., Tominaga S., Ito T., Tsukamoto T., et al. SKIP modifies gene expression by affecting both transcription and splicing. Biochem. Biophys. Res. Commun. 2004;316:512–517.
24. Yeakley J.M., Fan J.B., Doucet D., Luo L., Wickham E., Ye Z., Chee M.S., Fu X.D. Profiling alternative splicing on fiber-optic arrays. Nat. Biotechnol. 2002;20:353–358.
25. Relogio A., Ben-Dov C., Baum M., Ruggiu M., Gemund C., Benes V., Darnell R.B., Valcarcel J. Alternative splicing microarrays reveal functional expression of neuron-specific regulators in Hodgkin lymphoma cells. J. Biol. Chem. 2005;280:4779–4784.
26. Pan Q., Shai O., Misquitta C., Zhang W., Saltzman A.L., Mohammad N., Babak T., Siu H., Hughes T.R., Morris Q.D., et al. Revealing global regulatory features of mammalian alternative splicing using a quantitative microarray platform. Mol. Cell. 2004;16:929–941.
27. Huang H.D., Horng J.T., Lee C.C., Liu B.J. ProSplicer: a database of putative alternative splicing information derived from protein, mRNA and expressed sequence tag sequence data. Genome Biol. 2003;4:R29.
28. Lee C., Atanelov L., Modrek B., Xing Y. ASAP: the Alternative Splicing Annotation Project. Nucleic Acids Res. 2003;31:101–105.
29. Thanaraj T.A., Stamm S., Clark F., Riethoven J.J., Le Texier V., Muilu J. ASD: the Alternative Splicing Database. Nucleic Acids Res. 2004;32:D64–D69.
30. Zheng C.L., Nair T.M., Gribskov M., Kwon Y.S., Li H.R., Fu X.D. A database designed to computationally aid an experimental approach to alternative splicing. Pac. Symp. Biocomput. 2004:78–88.
31. Pospisil H., Herrmann A., Bortfeldt R.H., Reich J.G. EASED: Extended Alternatively Spliced EST Database. Nucleic Acids Res. 2004;32:D70–D74.
32. Huang H.D., Horng J.T., Lin F.M., Chang Y.C., Huang C.C. SpliceInfo: an information repository for mRNA alternative splicing in human genome. Nucleic Acids Res. 2005;33:D80–D85.
33. Lee C., Grasso C., Sharlow M.F. Multiple sequence alignment using partial order graphs. Bioinformatics. 2002;18:452–464.
34. Grasso C., Lee C. Combining partial order alignment and progressive multiple sequence alignment increases alignment speed and scalability to very large alignment problems. Bioinformatics. 2004;20:1546–1556.
35. Warrington J.A., Nair A., Mahadevappa M., Tsyganskaya M. Comparison of human adult and fetal expression and identification of 535 housekeeping/maintenance genes. Physiol. Genomics. 2000;2:143–147.
36. Hsiao L.L., Dangond F., Yoshida T., Hong R., Jensen R.V., Misra J., Dillon W., Lee K.F., Clark K.E., Haverty P., et al. A compendium of gene expression in normal human tissues. Physiol. Genomics. 2001;7:97–104.
37. Eisenberg E., Levanon E.Y. Human housekeeping genes are compact. Trends Genet. 2003;19:362–365.
38. Vinogradov A.E. Compactness of human housekeeping genes: selection for economy or genomic design? Trends Genet. 2004;20:248–253.
39. Wingender E., Chen X., Hehl R., Karas H., Liebich I., Matys V., Meinhardt T., Pruss M., Reuter I., Schacherer F. TRANSFAC: an integrated system for gene expression regulation. Nucleic Acids Res. 2000;28:316–319.
40. Grabe N. AliBaba2: context specific identification of transcription factor binding sites. In Silico Biol. 2002;2:S1–S15.
41. Boardman P.E., Oliver S.G., Hubbard S.J. SiteSeer: visualisation and analysis of transcription factor binding sites in nucleotide sequences. Nucleic Acids Res. 2003;31:3572–3575.
42. Burden S., Lin Y.X., Zhang R. Improving promoter prediction for the NNPP2.2 algorithm: a case study using Escherichia coli DNA sequences. Bioinformatics. 2005;21:601–607.
43. Prestridge D.S. Predicting Pol II promoter sequences using transcription factor binding sites. J. Mol. Biol. 1995;249:923–932.
44. Bajic V.B., Seah S.H., Chong A., Zhang G., Koh J.L., Brusic V. Dragon Promoter Finder: recognition of vertebrate RNA polymerase II promoters. Bioinformatics. 2002;18:198–199.
45. Jacobs G.H., Rackham O., Stockwell P.A., Tate W., Brown C.M. Transterm: a database of mRNAs and translational control elements. Nucleic Acids Res. 2002;30:310–311.
46. Mignone F., Grillo G., Licciulli F., Iacono M., Liuni S., Kersey P.J., Duarte J., Saccone C., Pesole G. UTRdb and UTRsite: a collection of sequences and regulatory motifs of the untranslated regions of eukaryotic mRNAs. Nucleic Acids Res. 2005;33:D141–D146.
47. Lambert A., Fontaine J.F., Legendre M., Leclerc F., Permal E., Major F., Putzer H., Delfour O., Michot B., Gautheret D. The ERPIN server: an interface to profile-based RNA motif identification. Nucleic Acids Res. 2004;32:W160–W165.
48. Tabaska J.E., Zhang M.Q. Detection of polyadenylation signals in human DNA sequences. Gene. 1999;231:77–86.
49. Cartegni L., Wang J., Zhu Z., Zhang M.Q., Krainer A.R. ESEfinder: a web resource to identify exonic splicing enhancers. Nucleic Acids Res. 2003;31:3568–3571.
50. Fairbrother W.G., Yeo G.W., Yeh R., Goldstein P., Mawson M., Sharp P.A., Burge C.B. RESCUE-ESE identifies candidate exonic splicing enhancers in vertebrate exons. Nucleic Acids Res. 2004;32:W187–W190.
51. Pertea M., Lin X., Salzberg S.L. GeneSplicer: a new computational method for splice site prediction. Nucleic Acids Res. 2001;29:1185–1190.
52. Reese M.G., Eeckman F.H., Kulp D., Haussler D. Improved splice site detection in Genie. J. Comput. Biol. 1997;4:311–323.
53. Ast G. How did alternative splicing evolve? Nature Rev. Genet. 2004;5:773–782.
54. Yeo G., Hoon S., Venkatesh B., Burge C.B. Variation in sequence and organization of splicing regulatory elements in vertebrate genes. Proc. Natl Acad. Sci. USA. 2004;101:15700–15705.
55. Yeo G., Holste D., Kreiman G., Burge C.B. Variation in alternative splicing across human tissues. Genome Biol. 2004;5:R74.