Genome Res. 2002 Jul; 12(7): 1060–1067.
PMCID: PMC186627

Alu-Containing Exons are Alternatively Spliced


Alu repetitive elements are found in ~1.4 million copies in the human genome, comprising more than one-tenth of it. Numerous studies describe exonizations of Alu elements, that is, splicing-mediated insertions of parts of Alu sequences into mature mRNAs. To study the connection between the exonization of Alu elements and alternative splicing, we used a database of ESTs and cDNAs aligned to the human genome. We compiled two exon sets, one of 1176 alternatively spliced internal exons, and another of 4151 constitutively spliced internal exons. Sixty-one alternatively spliced internal exons (5.2%) had a significant BLAST hit to an Alu sequence, but none of the constitutively spliced internal exons had such a hit. The vast majority (84%) of the Alu-containing exons that appeared within the coding region of mRNAs caused a frame-shift or a premature termination codon. Alu-containing exons were included in transcripts at lower frequencies than alternatively spliced exons that do not contain an Alu sequence. These results indicate that internal exons that contain an Alu sequence are predominantly, if not exclusively, alternatively spliced. Presumably, evolutionary events that cause a constitutive insertion of an Alu sequence into an mRNA are deleterious and selected against.

Alu elements are short interspersed elements (SINEs), typically 300 nucleotides long, which account for >10% of the human genome (International Human Genome Sequencing Consortium 2001; Li et al. 2001). Despite their being genetically functionless, Alu elements have been suggested to have broad evolutionary impacts (Mighell et al. 1997; Szmulewicz et al. 1998; Hamdi et al. 1999; International Human Genome Sequencing Consortium 2001). Alus are found in all primates (including prosimians), but in no other organism (Kapitonov and Jurka 1996; Schmid 1996). Therefore, it is tempting to suggest that they have played a role in the evolution of primates. However, the nature of this role is still under debate.

It has been shown in numerous studies that fragments of Alu sequences may appear in mature mRNAs, sometimes in the protein-coding region (Makalowski et al. 1994; Yulug et al. 1995; Nekrutenko and Li 2001). Some Alu insertions were found to be translated in vivo. For example, translated splice variants of the biliary glycoprotein containing an Alu fragment were identified by Western immunoblot analysis (Barnett et al. 1993). Another example is that of the human decay-acceleration factor (DAF), in which 10% of its transcripts contain an Alu fragment. There are indications that the Alu-containing DAF mRNA is translated to create a peptide that differs from the common DAF by a hydrophilic carboxy terminus, which inhibits the migration of DAF into the cell membrane (Caras et al. 1987).

A recent study reports that transposable elements are found in the protein-coding regions of ~4% of human genes, and that Alu elements account for about one-third of these insertions (Nekrutenko and Li 2001). Under the assumption of 30,000 genes in the human genome, there should be ~400 genes that contain fragments of Alu elements in their protein-coding regions. The insertion of an Alu sequence into a mature mRNA may cause a genetic disease, but an Alu insertion may also contribute to protein variability and versatility (Makalowski et al. 1994).
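The ~400 figure follows directly from the numbers quoted above. A minimal sketch of the arithmetic, assuming the round values in the text (30,000 genes, 4% with a transposable element in the coding region, one-third of those being Alu); the function name and defaults are illustrative only:

```python
def expected_alu_coding_genes(n_genes: int = 30_000,
                              frac_with_te: float = 0.04,
                              alu_share: float = 1 / 3) -> int:
    """Rough count of genes carrying an Alu fragment in their coding region."""
    # Genes with any transposable element in the coding region (~1200).
    genes_with_te = n_genes * frac_with_te
    # About one-third of these insertions are attributed to Alu elements.
    return round(genes_with_te * alu_share)


print(expected_alu_coding_genes())  # 400
```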

The vast majority of the insertions of Alu sequences into mature mRNAs are splicing mediated (Makalowski et al. 1994; Nekrutenko and Li 2001). This is possible because both strands of Alu sequences contain motifs that resemble consensus splice sites (Makalowski et al. 1994). Mutations within intronic Alu sequences may yield active splice sites, that is, part of the intronic Alu sequence will be exonized.

In theory, an insertion of an Alu sequence into a mature mRNA, especially if it is in the protein-coding region, should be deleterious to the organism. Therefore, there must be a mechanism that allows such a large number of Alu insertions into the human transcriptome while leaving it unharmed. Using genomically aligned cDNAs and ESTs, we scanned the genome to locate Alu-derived internal exons. We show that all Alu-derived exons found in our study are alternatively spliced. Thus, from an evolutionary point of view, exonized Alu sequences increase the coding and regulatory versatility of the transcriptome, and at the same time, maintain the intactness of the genomic repertoire.


To obtain the intron-exon structures of human genes, we used the output of the LEADS software platform (Shoshan et al. 2001) that was run on the December 2000 draft human genome, and the cDNAs and ESTs from GenBank version 121. The software cleans the expressed sequences of repeats, vectors, and immunoglobulins. It then aligns the expressed sequences to the genome, taking alternative splicing into account, and groups overlapping expressed sequences into clusters that represent genes or partial genes (see Methods for a detailed description of the process).

Our search focused on internal exons, that is, exons that are flanked by at least one exon on the 5′ side and one on the 3′ side. We chose to work with internal exons because the prediction of terminal exons using EST alignments is problematic. We searched the LEADS output for cases of exon skipping, that is, internal exons that are skipped in some of the splice variants of a certain gene (alternatively spliced internal exons). We also created a set of constitutively spliced internal exons, that is, internal exons that are found in all detected splice variants of the gene. For these compilations, we first selected clusters containing four or more expressed sequences, in which at least one sequence was a cDNA (13,097 clusters). In this set of clusters, we searched for substructures of the cluster containing three exons separated by two introns. We took only those cases in which both introns agreed with the GT/AG, GC/AG, or AT/AC rules, and were not covered by expressed sequences. An internal exon was defined as an exon embedded between the two introns. An internal exon was classified as an alternative internal exon if there was at least one sequence that contained the three exons, and one sequence that contained both flanking exons, but skipped the middle one. A constitutive internal exon was defined as an internal exon supported by at least four sequences for which no alternative splicing was observed (Fig. 1). We limited our search to exons shorter than 400 bases, because the length of internal exons only rarely exceeds a few hundred bases (Deutsch and Long 1999).
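The classification rule can be sketched in a few lines. This is an illustrative simplification, not the LEADS implementation or the authors' Perl programs: each expressed sequence is assumed to be given as an ordered tuple of exon identifiers (hypothetical names such as "e1"), exon skipping is the only kind of alternative splicing modelled, and the intron checks and 400-base length limit are left out.

```python
from typing import Iterable, Sequence


def contains_run(path: Sequence[str], run: Sequence[str]) -> bool:
    """True if `run` occurs as a contiguous stretch of exons in `path`."""
    n = len(run)
    return any(tuple(path[i:i + n]) == tuple(run)
               for i in range(len(path) - n + 1))


def classify_internal_exon(paths: Iterable[Sequence[str]],
                           upstream: str, middle: str, downstream: str,
                           min_support: int = 4) -> str:
    """Classify `middle` as 'alternative', 'constitutive', or 'unclassified'."""
    paths = list(paths)
    # Sequences that splice all three exons together.
    including = sum(contains_run(p, (upstream, middle, downstream)) for p in paths)
    # Sequences that join the flanking exons directly, skipping the middle one.
    skipping = sum(contains_run(p, (upstream, downstream)) for p in paths)
    # Sequences that cover the middle exon at all (assumed meaning of "support").
    support = sum(middle in p for p in paths)

    if including >= 1 and skipping >= 1:
        return "alternative"
    if skipping == 0 and support >= min_support:
        return "constitutive"
    return "unclassified"


# One including sequence and one skipping sequence make the exon alternative.
print(classify_internal_exon([("e1", "e2", "e3"), ("e1", "e3")], "e1", "e2", "e3"))
```

Exons with fewer than four supporting sequences and no skipping evidence fall into neither set under this sketch, which mirrors the reason the two compiled sets cover only part of the genome.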

Figure 1
Schematic representation of the multiple alignment of the mRNAs of a microsomal glutathione transferase homolog gene with the genomic sequence. Three GenBank mRNAs (blue) align to the same genomic locus on chromosome 9, ...

Under the rules defined above, we obtained 4151 constitutively spliced internal exons (coming from 1662 clusters) and 1176 alternatively spliced internal exons (coming from 1042 clusters). These sets represent, of course, only a fraction of the real number of internal exons in the genome. There are several reasons for not identifying all internal exons. First, a large number of ESTs that represent intron contamination align to places in the genome that are normally introns. Because we searched only for exons flanked by introns that are not covered by expressed sequences, we may have missed introns masked by the contaminated ESTs. Second, for the set of constitutively spliced internal exons, we chose only exons supported by four sequences or more, namely from relatively highly expressed genes. This condition may have led to the exclusion of exons from genes poorly represented in the EST database (dbEST). And finally, we searched only the subset of genes for which a cDNA sequence had been deposited in GenBank.

A BLASTn search of the alternatively spliced internal exons against the NCBI Alu database (Claverie and Makalowski 1994) yielded 61 exons (5.2%) hitting an Alu sequence with an E score lower than 10^−10 (Table 1). These exons were declared Alu-containing exons. A second search of the database with the 4151 constitutive exons failed to identify even one Alu-like sequence. These results indicate that internal exons that contain an Alu sequence are predominantly, if not exclusively, alternatively spliced.
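The thresholding step might look as follows. This sketch assumes the BLAST results are available in the standard 12-column tab-separated layout (query ID first, E value in the eleventh column); the paper does not state which output format was used, and the exon and Alu identifiers in the example are made up.

```python
from typing import Iterable, Set


def alu_containing_exons(blast_tabular_lines: Iterable[str],
                         max_evalue: float = 1e-10) -> Set[str]:
    """Return query IDs that have at least one hit with E value below the cutoff."""
    hits = set()
    for line in blast_tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        query_id = fields[0]
        evalue = float(fields[10])  # eleventh column holds the E value
        if evalue < max_evalue:
            hits.add(query_id)
    return hits


example = [
    "exonA\tAluJo\t91.2\t102\t9\t0\t1\t102\t280\t179\t3e-25\t110",
    "exonB\tAluSx\t85.0\t40\t6\t0\t1\t40\t10\t49\t2e-04\t30",
]
print(alu_containing_exons(example))  # only exonA passes the cutoff
```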

Table 1
Features of Alu-Containing Alternatively Spliced Internal Exons

We further analyzed the Alu-containing exons to check their influence on the transcripts they are inserted into. As a reference set of exons, we used a set of 62 alternatively spliced internal exons compiled by Hide et al. (2001) from 52 genes on chromosome 22. In their study, Hide et al. (2001) used a rigorous in silico method to scan the annotated genomic sequence of chromosome 22 to identify alternatively spliced internal exons that are skipped in some of the transcripts. We took the set of exons from chromosome 22 as a set representing the normal population of alternatively spliced internal exons, and compared it with the set of Alu-containing exons we found.

Of our 61 Alu-containing alternatively spliced internal exons, 54 had an unambiguous coding-region annotation in the GenBank cDNAs. Of these, 45 (83%) were located within the protein-coding region and 9 (17%) within the 5′ untranslated region (UTR). No Alu-containing exons were found in the 3′ UTR. Although it is known that most expressed Alu sequences are found within the 3′ UTRs of mRNAs (Yulug et al. 1995), our finding is not surprising given that 3′ UTRs are mostly found in the terminal exon (Deutsch and Long 1999), whereas the exons in our study were internal ones. As seen in Figure 2, the distribution of Alu-containing exons along the mRNA was similar to the distribution of alternatively spliced internal exons from chromosome 22. The slight bias of Alu-containing exons toward being found in the 5′ UTR of the mRNA was not statistically significant.

Figure 2
Location of alternatively spliced internal exons within the mRNA. Data for 54 Alu-containing exons, for which there was noncontradictory information in the GenBank annotation, is presented in lighter shaded bars. Data of 62 alternatively spliced internal ...

However, the influence of the Alu-containing exons on the coding region of the protein is significantly different from the influence of the alternatively spliced internal exons from chromosome 22 (Fig. 3). In 38 cases (84%) of 45 Alu-containing exons that are located within the protein-coding region, the insertion of an Alu-containing exon results in a shortened protein, either through frameshift (30 cases, 66.6%) or through an in-frame stop codon within the Alu-containing exon itself (8 cases, 17.8%). In comparison, only 21 alternatively spliced internal exons (44%) from the chromosome 22 set yielded a premature termination, 18 of which (38%) caused a frameshift.
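The two routes to a shortened protein can be illustrated with a small classifier. In the study these effects were inferred manually from GenBank annotations, so the code below is only a sketch of the underlying logic; the `carried_over` argument (the 0 to 2 bases of an incomplete codon left by the upstream exon) and the category names are assumptions made for the example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def classify_inserted_exon(exon_seq: str, carried_over: str = "") -> str:
    """Classify the effect of inserting an exon into a protein-coding region."""
    # An exon whose length is not a multiple of 3 shifts the downstream frame.
    if len(exon_seq) % 3 != 0:
        return "frameshift"
    # Read codons in the frame set by the upstream exon.
    seq = (carried_over + exon_seq).upper()
    codons = (seq[i:i + 3] for i in range(0, len(seq) - 2, 3))
    if any(codon in STOP_CODONS for codon in codons):
        return "in_frame_stop"
    # Frame preserved and no stop codon: the exon adds amino acids.
    return "in_frame_insertion"


print(classify_inserted_exon("GCTTAAGCT"))  # the second codon, TAA, is a stop
```

The frameshift test is applied first here, so an exon that both shifts the frame and contains a stop codon is counted as a frameshift; whether the original tally resolved such cases the same way is not stated.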

Figure 3
Effect of exon insertion on the protein-coding region. Data for 45 Alu-containing exons occurring within the protein-coding region are presented in lighter shaded bars. Data of 48 alternatively spliced internal exons from chromosome 22 (Hide et al. 2001 ...

Only 6 (13%) Alu-containing exons neither contain stop codons nor affect the original termination codon. These exons can, therefore, be regarded as genuine domain donors. The lengths of these domains range between 15 and 42 amino acids, and their predicted isoelectric points vary from 3.4 to 11. The set of alternatively spliced internal exons from chromosome 22 behaves differently: 22 of the exons (46%) in this set are domain donors.

We suggest measuring the strength of the splice sites of an alternatively spliced internal exon by means of a retention ratio, which is calculated as the number of mRNA sequences that contain the alternatively spliced exon divided by the total number of mRNA sequences. In practice, the retention ratio for a gene or a locus was calculated as the observed number of expressed sequences that contain the alternatively spliced exon as well as the two flanking exons divided by the total number of expressed sequences aligned to the locus (see Table 1 for the number of sequences that confirmed each exon or skipped it). Most Alu-containing exons have a small retention ratio (average of 0.21), that is, they are only found in about one-fifth of all mRNA transcripts. This value is, of course, overestimated, because by necessity we took only loci in which there was at least one sequence showing an alternative internal exon. Loci with a small number of covering expressed sequences bias the ratio upward. Thus, the retention ratio for the 31 cases in which the number of sequences is 10 or above averages 0.11 (Fig. 4). In comparison, the average retention ratio of the 1115 alternatively spliced internal exons that do not contain Alu sequences is 0.41 (data not shown).
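A minimal sketch of the retention ratio and of the coverage cutoff follows. The counts in the example are invented purely to show how poorly covered loci pull the average upward; they are not taken from Table 1, and the function names are illustrative.

```python
from typing import Iterable, Tuple


def retention_ratio(n_including: int, n_total: int) -> float:
    """Sequences containing the exon and both flanking exons, over all sequences."""
    if n_total <= 0:
        raise ValueError("a locus needs at least one aligned sequence")
    return n_including / n_total


def mean_retention(loci: Iterable[Tuple[int, int]], min_sequences: int = 0) -> float:
    """Average retention ratio over loci covered by at least `min_sequences`."""
    ratios = [retention_ratio(inc, total)
              for inc, total in loci if total >= min_sequences]
    if not ratios:
        raise ValueError("no locus passes the coverage cutoff")
    return sum(ratios) / len(ratios)


# Hypothetical (including, total) counts for four loci.
loci = [(1, 2), (1, 4), (2, 20), (1, 10)]
print(mean_retention(loci))                    # all loci
print(mean_retention(loci, min_sequences=10))  # well-covered loci only
```

Because every selected locus has at least one including sequence, a locus with only two aligned sequences cannot score below 0.5, which is why restricting the average to well-covered loci lowers it.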

Figure 4
Retention ratios of highly covered Alu-containing exons. Retention ratio for each exon was calculated by the number of expressed sequences that contain the exon as well as both flanking exons, divided by the total number of sequences that contain both ...

Following the convention in the literature, we define the poly(A)-containing Alu sequence as the plus strand and the complementary poly(T)-containing sequence as the minus strand. A total of 52 of the 61 Alu-containing exons (85%) involve the minus strand. The uneven distribution between the strands is probably due to the fact that the minus strand of the Alu consensus sequence contains more motifs that resemble splice sites than the plus strand (Makalowski et al. 1994; Makalowski 2000). Table 3 enumerates the splice sites utilized by the Alu-containing exons and the location of these splice sites along the consensus Alu sequence. There were seven sites in the minus strand of the Alu sequence that were utilized as 5′ splice sites (donors), of which three had not been reported previously (Makalowski 2000). Twelve sites in the minus strand of the Alu sequence were utilized as 3′ splice sites (acceptors); all but one were not reported previously (Makalowski 2000). In the plus strand, we identified a single potential acceptor site and three potential donor sites, one of which was identified previously (Makalowski 2000).

Table 3
Potential Splice Sites in the Alu Consensus Sequence that are Utilized by Alu-Containing Exons

It has been proposed that Alu evolution proceeds through successive waves of fixation, in which each Alu subfamily is derived from a small number of source sequences belonging to an evolutionarily older subfamily (Jurka and Milosavljevic 1991; Batzer et al. 1996; Kapitonov and Jurka 1996). Key nucleotide positions are distinctive between Alu subfamilies (Jurka and Milosavljevic 1991; Batzer et al. 1996). We used a collection of 153,645 annotated Alu elements mapped to the human genome (Stenger et al. 2001) to determine the frequency of each Alu subfamily in the human genome. RepeatMasker (http://repeatmasker.genome.washington.edu/cgi-bin/RepeatMasker) was run on the DNA around each Alu-containing exon to determine the borders of the Alu in the DNA and the subfamily type. We found that older subfamilies (such as Alu-J and Alu monomers) are significantly over-represented (P < 6.4 × 10^−21) in the Alu-containing exons, whereas newer subfamilies (Alu-S and Alu-Y) are under-represented (Table 2).

Table 2
Distributions of Alu Subfamilies within the Genome and Alu-Containing Exons

The average length of an Alu-containing exon was 114 bases, with the longest exon being 286 bases, and the shortest 42 bases. As a typical Alu element contains 300 bases, the exons contain only a fraction of the Alu sequence. We used RepeatMasker to determine the borders of the Alu element on the genome. All Alu elements found within exons extended into at least one of the flanking introns. We found no case of an Alu element totally contained within an exon, but this might be due to the fact that we limited our search to exons shorter than 400 bases, and an insertion of a full Alu element into an exon would result in a very long exon. We note that full-length Alu elements have been found previously in terminal exons. However, our study excluded terminal exons. These results indicate that all 61 Alu-containing exons found in our set resulted from exonization of part of an intronic Alu element, rather than from direct insertion of an Alu element into a pre-existing exon.


From our results, it is clear that constitutive Alu-containing internal exons are either absent or very rare in the human transcriptome, whereas alternative Alu-containing internal exons appear frequently. Additionally, Alu-containing exons have a significantly lower average retention ratio than alternatively spliced internal exons that do not contain Alu. These findings imply that Alu splice-like sites that had evolved into strong constitutive splice sites were most probably selected against because of their interference with normal protein production. In contrast, mutational changes in Alu sequences resulting in the creation of weak splice sites are tolerated, especially if their retention ratio is low. There are several documented genetic diseases caused by a mutation that led to the creation of a strong splice site in an otherwise normal intronic Alu. For example, a G→C mutation in an Alu sequence within intron 3 of ornithine δ-aminotransferase (OAT) caused the creation of a strong donor site, consequently leading to the constitutive insertion of a novel Alu exon between exons 3 and 4. The insertion caused an in-frame stop codon, which led to OAT deficiency (Mitchell et al. 1991; Makalowski 2000). This is an example of the possible deleterious effect of an Alu-containing exon that has become constitutively inserted within a transcript.

According to our data, older subfamilies (monomers and Alu-J) are over-represented in the set of Alu-containing exons compared with their distribution in the genome (Table 2). Since, by definition, members of older subfamilies were retroposed to the human genome earlier than members of newer subfamilies, they had more time to diverge from the Alu ancestor. Members of the Alu-J subfamilies show ~86% identity to the Alu consensus sequence, whereas members of the Alu-S subfamilies show ~92%–93% identity (Kapitonov and Jurka 1996). Therefore, the bias toward older subfamilies in the set of Alu-containing exons may reflect the number of substitutions needed to create a functional splice site within the retroposed Alu sequence to allow for its exonization.

Another possibility that would explain the fact that we did not find constitutive Alu-containing internal exons is that old Alu-containing internal exons that became fixed show only a poor similarity to the consensus Alu sequence, and, therefore, could no longer be recognized by similarity searches as Alu derived.

We have chosen to focus on alternative splicing events of the exon-skipping type for two reasons. First, this type is the most frequent type of alternative splicing (Hide et al. 2001). Second, many unspliced ESTs found in the EST database (dbEST) represent sequenced introns (intron contamination) and contain Alu sequences, and, therefore, we preferred not to use unspliced expressed sequences as evidence for alternative splicing. In the exon-skipping type of alternative splicing, both variants are spliced: the skipping variant contains a large intron that skips the alternative internal exon, and the variant containing the exon has two introns, one on each side of the alternatively spliced internal exon.

Due to the strict nature of our search, not all alternatively spliced internal exons were retrieved, and, therefore, not all documented Alu-containing exons appear in our database. We have taken only exons flanked by true introns on both sides. A true intron was defined as an intron abiding by the GT/AG, GC/AG, or AT/AC rules, without any of its nucleotides covered by an expressed sequence. Due to the large number of ESTs that represent intron contamination and align to places in the genome that are normally introns, many true exon-skipping cases were most probably disregarded in our study. In the same manner, our database of constitutively spliced internal exons is probably only a fraction of the complete set of constitutively spliced internal exons in the genome, because, in addition to the demand that the exon will be flanked by true introns, we have taken into account only exons covered by at least four sequences. Finally, we examined only genes for which the cDNA was deposited in GenBank, disregarding clusters made entirely of ESTs.

The literature describes numerous individual studies in which Alu insertions were found within an mRNA. The vast majority of these cases are described as splice variants, with another splice variant that does not contain the Alu insertion in evidence. In the literature, we found two instances of internal Alu-containing exons that were reported to be found in all detected splice variants. Neither case appears in either our dataset of constitutive exons or in the alternative exons dataset. The reason for these exclusions was the alignment of intron-contaminated ESTs to these two loci. We have searched manually for ESTs matching these two loci. The human hematopoietic progenitor kinase (HPK1) contains an Alu-derived peptide in its carboxyl terminus. This Alu insertion was reported previously as fixed, that is, the Alu was present in all transcripts (Hu et al. 1996; Nekrutenko and Li 2001). We found 25 ESTs that skip the Alu-containing exon (exon 26), whereas only three sequences (two of them mRNAs) contained the exon (data not shown). The zinc finger gene ZNF177 has been reported to contain both an Alu and an L1 fragment in the constitutively spliced exon 4 (Baban et al. 1996; Landry et al. 2001). Apart from the two mRNAs reported by Baban et al. (1996), we failed to find a single EST that may be used to determine whether or not this exon is really constitutive. However, we predict that splice variants that do not contain this exon will be discovered in the future.

We have shown that exonized Alu elements are alternatively spliced. Thus, Alu elements have the evolutionary potential to enhance the coding capacity and regulatory versatility of the genome without compromising its integrity.


The Gencarta Database and its LEADS output were licensed from Compugen Ltd. (http://www.cgen.com). Briefly, the LEADS output was created as follows. ESTs and cDNAs from GenBank version 121 were cleaned of terminal vector sequences, and low-complexity stretches and repeats in the expressed sequences were masked. Sequences with internal vector contamination and sequences identified as immunoglobulins or T-cell receptors were discarded. In the next stage, expressed sequences were heuristically compared with the genome to find likely high-quality hits. They were then aligned to the genome by use of a spliced alignment model that allows long gaps. Only sequences having >94% identity to a stretch in the genome were used in further stages. Sequences having hits to more than one locus in the genome were analyzed to choose the correct locus, taking into account percent identity and intron content (to differentiate between genes and processed pseudogenes). Sequences mapping to two or more chromosomes, or sequences in which the inferred introns were longer than 400,000, were discarded as suspected chimeras. Low-quality sequence ends that disagreed with the DNA were trimmed. In the clustering and assembly stage, overlapping expressed sequences and corresponding genomic sequences were multiply aligned. Positions on the genomic sequence in which there is at least one sequence that opens or closes a long gap were considered splice sites. Where possible, long gaps begin with a GT or GC dinucleotide and end with an AG dinucleotide. The resulting multiple alignment is represented as a directed graph, in which each vertex represents the multiple alignment of sequences between two detected splice sites. An edge exists between two vertices if at least one sequence continues from the first multiple alignment to the second. Every sequence has a hyperedge consisting of the vertices through which it passes.
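The alignment-level filters described above can be summarized as a simple predicate. This is a sketch for illustration only: LEADS is proprietary, so the `SplicedAlignment` record and its fields are hypothetical, and the intron-length cutoff is assumed to be expressed in bases.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SplicedAlignment:
    """Hypothetical summary of one expressed sequence aligned to the genome."""
    identity: float                       # fraction of identical positions, 0..1
    chromosomes: List[str] = field(default_factory=list)
    intron_lengths: List[int] = field(default_factory=list)


def passes_filters(aln: SplicedAlignment,
                   min_identity: float = 0.94,
                   max_intron: int = 400_000) -> bool:
    """Keep alignments with >94% identity that do not look like chimeras."""
    if aln.identity <= min_identity:
        return False  # too divergent from the genomic stretch
    if len(set(aln.chromosomes)) > 1:
        return False  # maps to two or more chromosomes: suspected chimera
    if any(length > max_intron for length in aln.intron_lengths):
        return False  # implausibly long inferred intron: suspected chimera
    return True


good = SplicedAlignment(identity=0.98, chromosomes=["chr9"], intron_lengths=[1200, 5300])
print(passes_filters(good))
```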

The 13,097 clusters that contained at least 4 expressed sequences, of which at least 1 was a cDNA sequence, were selected for the internal-exon search. An intron was defined as a vertex containing only the genomic sequence, and a true intron as an intron abiding by the GT/AG, GC/AG, or AT/AC rules. An exon was defined as a vertex containing at least one expressed sequence, and an internal exon was defined as an exon embedded between two true introns. Substructures of the cluster containing three exons separated by two introns, in which the second exon is an internal exon, were searched. An internal exon was classified as an alternatively spliced internal exon if there was at least one sequence that contained the three exons, and one sequence that contained both flanking exons, but skipped the middle one. A constitutively spliced internal exon was defined as an internal exon covered by at least four sequences, for which no alternative splicing was observed. The search was limited to exons shorter than 400 bases.
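The true-intron test reduces to checking the terminal dinucleotides and the absence of expressed-sequence coverage. A minimal sketch, assuming the intron is supplied as a sense-strand DNA string and that coverage has already been counted (both hypothetical inputs):

```python
# Donor/acceptor dinucleotide pairs accepted by the GT/AG, GC/AG, and AT/AC rules.
ALLOWED_PAIRS = {("GT", "AG"), ("GC", "AG"), ("AT", "AC")}


def is_true_intron(intron_seq: str, n_covering_sequences: int = 0) -> bool:
    """True if the intron obeys a splice rule and no expressed sequence covers it."""
    seq = intron_seq.upper()
    if len(seq) < 4:
        return False  # too short to carry both dinucleotides
    donor, acceptor = seq[:2], seq[-2:]
    return (donor, acceptor) in ALLOWED_PAIRS and n_covering_sequences == 0


print(is_true_intron("GTAAGTCCTTTCAG"))  # GT...AG with no covering sequences
```

Checking the pair jointly, rather than donor and acceptor separately, rejects mixed combinations such as GT/AC that satisfy neither rule.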

Constitutive and alternative exons were searched using the PERL programs GetConstitutiveExons.pl and GetAlternativeCassetteExons.pl, respectively (http://www.kimura.tau.ac.il/~rotem/ALU/). Packages used by these programs for parsing the LEADS output, compiled for SUN architecture, can be downloaded from http://www.cgen.com/parse_LEADS. Exon datasets can be downloaded from http://www.kimura.tau.ac.il/~rotem/ALU. Both exon datasets were compared with the NCBI Alu database (ftp://ncbi.nlm.nih.gov/pub/jmc/alu/; Claverie and Makalowski 1994) using the BLASTn program with default parameters. Genomic sequences near the Alu-containing exons were extracted from LEADS clusters using the GetAlternativeCassetteExons.pl program. Exon-intron structures of genes containing an Alu exon were double-checked using the Sim4 program for spliced alignment (Florea et al. 1998). Isoelectric point was predicted using the Expasy online service http://www.expasy.ch/tools/pi_tool.html. Location in mRNA and influence on protein-coding regions were inferred manually from GenBank annotations. Data for alternatively spliced internal exons from chromosome 22 were calculated from Table 2 in Hide et al. (2001). Alu subfamilies, orientation, and borders on the genomic sequence were determined using RepeatMasker (http://repeatmasker.genome.washington.edu/cgi-bin/RepeatMasker). Subfamilies of genomic Alu sequences were inferred from http://dir.niehs.nih.gov/ALU/map (Stenger et al. 2001).


ftp://ncbi.nlm.nih.gov/pub/jmc/alu/; The NCBI Alu database.

http://dir.niehs.nih.gov/ALU/map; Database of Alu elements in the human genome from Stenger et al. (2001).

http://repeatmasker.genome.washington.edu/cgi-bin/RepeatMasker; The RepeatMasker program by Smit and Green.

http://www.cgen.com; Compugen home page.

http://www.expasy.ch/tools/pi_tool.html; A tool that computes isoelectric point (pI) and molecular weight (Mw).

http://www.kimura.tau.ac.il/~rotem/ALU/; Supplementary material from corresponding author.


We thank Dr. Galit Rotman for valuable review and discussion. We also thank the Compugen LEADS team for help in various productions.

The publication costs of this article were defrayed in part by payment of page charges. This article must therefore be hereby marked “advertisement” in accordance with 18 USC section 1734 solely to indicate this fact.


E-MAIL rotem@compugen.co.il; FAX +972-3-6409403.

Article and publication are at http://www.genome.org/cgi/doi/10.1101/gr.229302.


  • Baban S, Freeman JD, Mager DL. Transcripts from a novel human KRAB zinc finger gene contain spliced Alu and endogenous retroviral segments. Genomics. 1996;33:463–472.
  • Barnett TR, Drake L, Pickle W. Human biliary glycoprotein gene: Characterization of a family of novel alternatively spliced RNAs and their expressed proteins. Mol Cell Biol. 1993;13:1273–1282.
  • Batzer MA, Deininger PL, Hellmann-Blumberg U, Jurka J, Labuda D, Rubin CM, Schmid CW, Zietkiewicz E, Zuckerkandl E. Standardized nomenclature for Alu repeats. J Mol Evol. 1996;42:3–6.
  • Caras IW, Davitz MA, Rhee L, Weddell G, Martin DW, Jr, Nussenzweig V. Cloning of decay-accelerating factor suggests novel use of splicing to generate two proteins. Nature. 1987;325:545–549.
  • Claverie JM, Makalowski W. Alu alert. Nature. 1994;371:752.
  • Deutsch M, Long M. Intron-exon structures of eukaryotic model organisms. Nucleic Acids Res. 1999;27:3219–3228.
  • Florea L, Hartzell G, Zhang Z, Rubin GM, Miller W. A computer program for aligning a cDNA sequence with a genomic DNA sequence. Genome Res. 1998;8:967–974.
  • Hamdi H, Nishio H, Zielinski R, Dugaiczyk A. Origin and phylogenetic distribution of Alu DNA repeats: Irreversible events in the evolution of primates. J Mol Biol. 1999;289:861–871.
  • Hide WA, Babenko VN, van Heusden PA, Seoighe C, Kelso JF. The contribution of exon-skipping events on chromosome 22 to protein coding diversity. Genome Res. 2001;11:1848–1853.
  • Hu MC, Qiu WR, Wang X, Meyer CF, Tan TH. Human HPK1, a novel human hematopoietic progenitor kinase that activates the JNK/SAPK kinase cascade. Genes & Dev. 1996;10:2251–2264.
  • International Human Genome Sequencing Consortium. Initial sequencing and analysis of the human genome. Nature. 2001;409:860–921.
  • Jurka J, Milosavljevic A. Reconstruction and analysis of human Alu genes. J Mol Evol. 1991;32:105–121.
  • Kapitonov V, Jurka J. The age of Alu subfamilies. J Mol Evol. 1996;42:59–65.
  • Landry JR, Medstrand P, Mager DL. Repetitive elements in the 5′ untranslated region of a human zinc-finger gene modulate transcription and translation efficiency. Genomics. 2001;76:110–116.
  • Li WH, Gu Z, Wang H, Nekrutenko A. Evolutionary analyses of the human genome. Nature. 2001;409:847–849.
  • Makalowski W. Genomic scrap yard: How genomes utilize all that junk. Gene. 2000;259:61–67.
  • Makalowski W, Mitchell GA, Labuda D. Alu sequences in the coding regions of mRNA: A source of protein variability. Trends Genet. 1994;10:188–193.
  • Mighell AJ, Markham AF, Robinson PA. Alu sequences. FEBS Lett. 1997;417:1–5.
  • Mitchell GA, Labuda D, Fontaine G, Saudubray JM, Bonnefont JP, Lyonnet S, Brody LC, Steel G, Obie C, Valle D. Splice-mediated insertion of an Alu sequence inactivates ornithine δ-aminotransferase: A role for Alu elements in human mutation. Proc Natl Acad Sci. 1991;88:815–819.
  • Nekrutenko A, Li WH. Transposable elements are found in a large number of human protein-coding genes. Trends Genet. 2001;17:619–621.
  • Schmid CW. Alu: Structure, origin, evolution, significance and function of one-tenth of human DNA. Prog Nucleic Acid Res Mol Biol. 1996;53:283–319.
  • Shoshan A, Grebinskiy V, Magen A, Scolnicov A, Fink E, Lehavi D, Wasserman A. Designing oligo libraries taking alternative splicing into account. In: Bittner ML, Chen Y, Dorsel AN, Dougherty ED, editors. Microarrays: Optical Technologies and Informatics, Proc SPIE. Vol. 4266. Bellingham, WA: SPIE; 2001. pp. 86–95.
  • Stenger JE, Lobachev KS, Gordenin D, Darden TA, Jurka J, Resnick MA. Biased distribution of inverted and direct Alus in the human genome: Implications for insertion, exclusion, and genome stability. Genome Res. 2001;11:12–27.
  • Szmulewicz MN, Novick GE, Herrera RJ. Effects of Alu insertions on gene function. Electrophoresis. 1998;19:1260–1264.
  • Yulug IG, Yulug A, Fisher EM. The frequency and position of Alu repeats in cDNAs, as determined by database searching. Genomics. 1995;27:544–548.

Articles from Genome Research are provided here courtesy of Cold Spring Harbor Laboratory Press