Nucleic Acids Res. Aug 1, 1999; 27(15): 3219–3228.
PMCID: PMC148551

Intron-exon structures of eukaryotic model organisms.

Abstract

To investigate the distribution of intron-exon structures of eukaryotic genes, we have constructed a general exon database comprising all available intron-containing genes, together with exon databases for 10 eukaryotic model organisms: Homo sapiens, Mus musculus, Gallus gallus, Rattus norvegicus, Arabidopsis thaliana, Zea mays, Schizosaccharomyces pombe, Aspergillus, Caenorhabditis elegans and Drosophila. We purged redundant genes to avoid the bias that redundancy in the databases could introduce. After discarding questionable introns that do not contain correct splice sites, the final database contained 17 102 introns, 21 019 exons and 2903 independent or quasi-independent genes. On average, a eukaryotic gene contains 3.7 introns per kb of protein-coding region. The exon length distribution peaks at around 30-40 residues, and most introns are 40-125 nt long. The variable intron-exon structures of the 10 model organisms reveal two interesting statistical phenomena, which cast light on some previous speculations. (i) Genome size seems to be correlated with total intron length per gene. For example, invertebrate introns are smaller than those of human genes, while yeast introns are shorter than invertebrate introns. However, this correlation is weak, suggesting that factors other than genome size may also affect intron size. (ii) Introns smaller than 50 nt are significantly less frequent than longer introns, possibly reflecting a minimum intron size required for splicing.
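The filtering and counting steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not say what counts as a "correct" splice site, so the sketch assumes the common GT...AG boundary rule, and it assumes the introns-per-kb figure is pooled over all genes rather than averaged gene by gene. The names Gene, has_canonical_splice_sites, introns_per_kb and short_intron_fraction are hypothetical, and the sequences are toy data.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Gene:
    """Hypothetical record: one gene with its coding length and intron sequences."""
    name: str
    cds_length_nt: int   # length of the protein-coding region in nucleotides
    introns: List[str]   # intron sequences, written 5' to 3'


def has_canonical_splice_sites(intron: str) -> bool:
    """Assumed filter: keep introns that start with GT and end with AG."""
    seq = intron.upper()
    return len(seq) >= 4 and seq.startswith("GT") and seq.endswith("AG")


def kept_introns(genes: List[Gene]) -> List[str]:
    """Collect every intron that passes the assumed splice-site filter."""
    return [i for g in genes for i in g.introns if has_canonical_splice_sites(i)]


def introns_per_kb(genes: List[Gene]) -> float:
    """Pooled intron density: retained introns per 1000 nt of coding sequence."""
    total_cds = sum(g.cds_length_nt for g in genes)
    if total_cds == 0:
        raise ValueError("no coding sequence to normalise by")
    return 1000.0 * len(kept_introns(genes)) / total_cds


def short_intron_fraction(genes: List[Gene], threshold_nt: int = 50) -> float:
    """Fraction of retained introns shorter than threshold_nt."""
    introns = kept_introns(genes)
    if not introns:
        raise ValueError("no introns passed the filter")
    return sum(1 for i in introns if len(i) < threshold_nt) / len(introns)


# Toy data, invented for illustration only.
genes = [
    Gene("toyA", 1500, ["GT" + "A" * 60 + "AG", "CT" + "A" * 60 + "AG"]),
    Gene("toyB", 500, ["gt" + "c" * 30 + "ag", "GT" + "T" * 100 + "AG"]),
]

print(introns_per_kb(genes))         # density over the pooled toy genes
print(short_intron_fraction(genes))  # share of retained introns under 50 nt
```

With real data, the intron sequences and coding lengths would come from the annotated database entries, and the splice-site rule would need to match whatever criterion the full text of the paper specifies.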



Articles from Nucleic Acids Research are provided here courtesy of Oxford University Press
