Proc Natl Acad Sci U S A. Jun 21, 2005; 102(25): 9068–9073.
Published online Jun 10, 2005. doi:  10.1073/pnas.0502923102
PMCID: PMC1157042
From the Cover
Plant Biology

Gene movement by Helitron transposons contributes to the haplotype variability of maize


Different maize inbred lines are polymorphic for the presence or absence of genic sequences at various allelic chromosomal locations. In the bz genomic region, located in 9S, sequences homologous to four different genes from rice and Arabidopsis are present in line McC but absent from line B73. It is shown here that this apparent intraspecific violation of genetic colinearity arises from the movement of genes or gene fragments by Helitrons, a recently discovered class of eukaryotic transposons. Two Helitrons, HelA and HelB, account for all of the genic differences distinguishing the two bz locus haplotypes. HelA is 5.9 kb long and contains sequences for three of the four genes found only in the McC bz genomic region. A nearly identical copy of HelA was isolated from a 5S chromosomal location in B73. Both the 9S and 5S sites appear to be polymorphic in maize, suggesting that these Helitrons have been active recently. Helitrons lack the strong predictive terminal features of other transposons, so the definition of their ends is greatly facilitated by the identification of their vacant sites in Helitron-minus lines. The ends of the 2.7-kb HelB Helitron were discerned from a comparison of the McC haplotype sequence with that of yet a third line, Mo17, because the HelB vacant site is deleted in B73. Maize Helitrons resemble rice Pack-MULEs in their ability to capture genes or gene fragments from several loci and move them around the genome, features that confer on them a potential role in gene evolution.

Keywords: genome variability, Helitrons, bz locus, corn, polymorphisms

Recent studies have uncovered exceptional haplotype variability in maize. Comparison of the bz genomic region of McC (1, 2), a line that had been used extensively in genetic analyses, with that of the unrelated standard inbred B73 (3) revealed unexpected differences between them. First, retrotransposon clusters, which make up the bulk of the maize genome (4–6), differed in composition and location relative to the genes in the region so that the two sequences could be aligned only at the genes they had in common. Second, and most strikingly, some genes present in McC were absent from B73, indicating that genetic colinearity was violated within the species. Noncolinear haplotypes were also found in a comparison of the genomic intervals containing the z1C zein gene cluster in B73 and BSSS53, inbred lines derived from the same synthetic population (7). The lengths of the z1C regions in the two inbreds varied by 50% because of differences in the number of zein and other genes and in the sizes of the retrotransposon clusters flanking them. Similar extensive nonhomologies were reported between the allelic regions of inbreds B73 and Mo17 at three additional chromosomal locations in the genome (8). That study established that more than one-third of the predicted genes were present in just one inbred at the loci examined, although many of the unshared genes appeared to be truncated. The observation that genes not shared between inbreds violate the maize–rice colinearity usually displayed by shared genes prompted the authors to speculate that unshared genes originated from insertions of a yet-unknown nature rather than deletions.

High intraspecific haplotype variability is not restricted to maize, having been recently described in barley, another species with a large amount of repetitive DNA. A comparison of the Rph7 locus in two barley cultivars established that colinearity was restricted to <35% of the two sequences, principally because of differences in retrotransposon blocks (9). Interestingly, a gene encoding a truncated helicase was present in only one of the two cultivars. On the other hand, no cases of gene acquisition or loss were found in a comparison of two different orthologous regions between rice subspecies (10, 11). This finding suggests that the type of variation detected in maize and barley may not be a general feature of plant genomes. The functional significance of the “plus–minus” type of variation is also unclear, because the genes that vary among accessions of the same species are present in multiple copies (3), and many of them are clearly pseudogenes or gene fragments (8, 9). Independent of its generality or functional significance, the described variation raises an important question: How did it arise? Evidence presented here indicates that the apparent intraspecific violations of genetic colinearity in maize and, probably, barley, arise from the movement of genes or gene fragments by Helitrons, a recently discovered type of eukaryotic transposon (12).

Helitrons were found by computational analysis of genomic sequences from Arabidopsis, rice, and Caenorhabditis elegans (12) and were later reported to be the causative agents of two spontaneous mutations in maize (13, 14). These transposons account for 2% of the genomes of Arabidopsis and C. elegans but had escaped detection because they lack structural features, such as terminal inverted repeats or target site duplications, that can be easily detected by computer-assisted searches (15). Instead, the transposons have 5′-TC and 3′-CTRR termini, carry a 16- to 20-bp palindrome of variable sequence ≈10–12 bp upstream of the 3′ terminus, and insert invariably between host nucleotides A and T. The putative autonomous elements reconstructed from the Arabidopsis and rice genome sequences are large (5.5–15 kb) and encode proteins with homology to a DNA helicase and an ssDNA-binding protein. Although these proteins are not similar to known transposases, the predicted helicases share motifs with the replication-initiation proteins of rolling-circle (RC) replicons, which catalyze both cleavage and ligation of DNA. Hence, Helitrons were postulated to transpose by RC replication. However, the vast majority of Helitrons are nonautonomous elements that vary greatly in size and do not encode the set of proteins encoded by the putative autonomous element. Kapitonov and Jurka (12) argued that the Helitron's helicase and ssDNA-binding protein were likely to have evolved from host proteins recruited by ancestral RC transposons, because of the conservation of their exon–intron structure and their similarity to known host proteins. Along those lines, Feschotte and Wessler (15) proposed that the acquisition of host genes by RC elements must have occurred frequently enough to permit the eventual capture of useful genes or exons and viewed them as potential “exon-shuffling machines.”
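
The structural hallmarks listed above can be expressed as a simple sequence check. The following Python sketch is illustrative only and is not part of the analysis reported here: the function names, the `max_loop` parameter, and the toy sequences are assumptions introduced for the example, and a perfect-stem search is only one of several ways to look for a hairpin-forming palindrome.

```python
import re

# Watson-Crick pairing used to detect hairpin-forming (palindromic) sequences.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def has_helitron_termini(element: str, host_5p: str, host_3p: str) -> bool:
    """Check 5'-TC, 3'-CTRR (R = A or G) and insertion between host A and T."""
    element = element.upper()
    return (
        element.startswith("TC")
        and re.search(r"CT[AG][AG]$", element) is not None
        and host_5p.upper().endswith("A")
        and host_3p.upper().startswith("T")
    )


def longest_hairpin_stem(window: str, max_loop: int = 6) -> int:
    """Longest perfect stem (in bp) of an inverted repeat within `window`.

    The two arms may be separated by a loop of 0..max_loop nucleotides.
    """
    window = window.upper()
    best = 0
    for i in range(len(window)):          # i: innermost base of the left arm
        for loop in range(max_loop + 1):
            j = i + 1 + loop              # j: innermost base of the right arm
            k = 0
            while (
                i - k >= 0
                and j + k < len(window)
                and PAIR.get(window[i - k]) == window[j + k]
            ):
                k += 1
            best = max(best, k)
    return best


# Toy element and host flanks (hypothetical sequences, not from the paper).
toy_element = "TCTACTAAAAGGGCTAG"
print(has_helitron_termini(toy_element, "GGGA", "TCCC"))
print(longest_hairpin_stem("GCGCTTTTGCGC"))
```

In practice the stem search would be restricted to a short window near the 3′ terminus, because short inverted repeats occur by chance throughout any long sequence.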

The Helitrons recovered in the two spontaneous mutants of maize would appear to fit that view because both are large and carry fragments of several unrelated genes (13, 16). The two Helitrons described here, termed HelA and HelB, are also large and carry sequences from still other genes. There are several copies of these Helitrons in the maize genome. Together, HelA and HelB can account for all the genes that are present in the bz genomic region of McC but absent from the same region in B73. None of these genes appears to be related to those carried by the putative autonomous plant Helitrons; they were most likely recruited from the host and moved around the genome by Helitron elements.

Materials and Methods

Identification, Sequencing, and Analysis of B73 Clone b0511I12. An ≈10-kb fragment containing the four genes (cdl1, hypro2, hypro3, and rlk) that are present in the bz genomic region of McC, but not of B73, was used as the query to search the genome survey sequence (GSS) maize database of GenBank (Fig. 1A). Most of the sequences in this database are from the inbred B73. One of the highest-scoring hits in the blastn analysis was a 937-bp bacterial artificial chromosome (BAC) end sequence (GenBank accession no. CL205862) that had homology to the coding region of the predicted gene hypro2. That BAC end came from a clone (b0570E18) that had been anchored in the maize physical map (www.genome.arizona.edu). Based on the physical map, a subset of clones that overlapped with b0570E18 was selected. PCR experiments with template DNA from the selected BAC clones and primers designed according to the BAC end sequences were conducted to confirm the presence of the hypro2 sequence in these BACs and to determine which end of b0570E18 corresponded to the hypro2 sequence. Clone b0511I12 was chosen for sequencing because the physical map and the PCR experiments suggested that it had the hypro2 sequence in the middle. The BAC clone was sequenced by the shotgun sequencing strategy on an Applied Biosystems 3730xl DNA sequencer and analyzed as described in ref. 2.

Fig. 1.
Organization of the ≈10-kb gene sequences from the McC bz haplotype that are missing from the B73 bz haplotype. (A) Structure of the McC region in 9S (from ref. 3). Predicted genes are shown as pentagons pointing in the direction of transcription. ...

Characterization of a hypro2 cDNA Clone. The B73 cDNA clone (GenBank accession no. CO522311) was obtained from the University of Arizona (www.genome.arizona.edu/orders). A transposon minilibrary was made by following the manufacturer's (Finnzymes, Helsinki) instructions, and 10 randomly selected clones were sequenced from both ends. Sequencing reactions were performed with the ABI PRISM BigDye Terminator Cycle Sequencing Ready Reaction kit V3.1 (Applied Biosystems) and analyzed on an Applied Biosystems 3730xl DNA sequencer.

PCR Amplification and DNA Sequencing. The 5′ end of HelA-2 and its flanking sequence were amplified from total genomic DNA of the McC line by using primer pairs cdl-3/2hp-up1 and 2hp8/2hp-up2. The corresponding sequences at the 3′ end were amplified with primer pair cdl-up3/cdl7. All amplification reactions were carried out with the Expand High Fidelity PCR system (Roche). The PCR products were sequenced as described above. The following primers were used: cdl-3, ACATGGTTCCATCCACGCTT; 2hp-up1, GGTCTGTCGACTACGTTCCTT; cdl-up3, GCATCGCTGCATCAATGTCGAA; cdl7, GCAGTACAGGAGACTCGTA; 2hp8, TACAGGCACGCAGGAGCGTAGAA; and 2hp-up2, GGCTAACTGGCATGCTCTGTA.
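
As a quick sanity check on the primers listed above, their length, GC content, and an approximate melting temperature can be tabulated. The Python sketch below is illustrative only: the function names are assumptions, and the 2(A+T) + 4(G+C) rule of thumb is a rough estimate for short oligonucleotides, not the method used to design these primers.

```python
# Primer sequences as listed in the text above.
PRIMERS = {
    "cdl-3": "ACATGGTTCCATCCACGCTT",
    "2hp-up1": "GGTCTGTCGACTACGTTCCTT",
    "cdl-up3": "GCATCGCTGCATCAATGTCGAA",
    "cdl7": "GCAGTACAGGAGACTCGTA",
    "2hp8": "TACAGGCACGCAGGAGCGTAGAA",
    "2hp-up2": "GGCTAACTGGCATGCTCTGTA",
}


def gc_fraction(seq: str) -> float:
    """Fraction of G and C nucleotides in the primer."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def rule_of_thumb_tm(seq: str) -> int:
    """Approximate Tm in degrees C: 2 per A/T plus 4 per G/C (rough estimate)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


for name, seq in PRIMERS.items():
    print(f"{name}\t{len(seq)} nt\tGC={gc_fraction(seq):.2f}\tTm~{rule_of_thumb_tm(seq)} C")
```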

Sequence Data Deposition. The sequences described here have been deposited in GenBank under the following accession nos.: B73 clone b0511I12, AC159612; B73 hypro2 cDNA, DQ000639; and McC HelA-2, DQ003206.

Results


The Genes Absent from the B73 bz Genomic Region Are Present Elsewhere in the Genome. Fu and Dooner (3) found that the genes exhibiting plus–minus variation in the bz genomic region were members of multigene families, i.e., that other copies were present elsewhere in the genomes of B73 and various maize inbreds. A major question raised by that finding is whether those genes are also adjacent to each other at the other locations. To obtain a preliminary answer to this question, a blastn analysis of the GenBank maize GSS database was conducted by using as query the 9.9-kb fragment from McC that contains the sequences from the four genes that were found to be absent from the bz genomic region of B73 (Fig. 1A). These four genes were originally assigned tentative designations based on their homology to other plant genes encoding either confirmed or predicted proteins: cdl, a cell-division-like protein; hypro2 and hypro3, Arabidopsis hypothetical proteins; and rlk, a receptor-like kinase (1). The maize GSS database contains mostly sequences from the inbred B73, which have been contributed by projects that use enrichment procedures for the gene space in the genome (17, 18). The database also contains the end sequences from ≈475,000 B73 BAC clones, many of which have been anchored to the genetic map (6).

The analysis revealed that sequences 99% identical to those in McC are present elsewhere in the genome of B73 and that at least some of the genes are adjacent to each other. Based on the sequence overlaps, cdl, hypro2, and part of hypro3, which is now known to have been misannotated (see The Genic Content of the Helitrons, below), would form one contig, and the rest of hypro3 and rlk would form a separate contig (Fig. 1B). To confirm the linked arrangement of cdl and hypro2 in B73, the corresponding genomic fragment was isolated. One of the overlaps in the cdl-hypro2 contig revealed by blast was provided by an end sequence from a B73 BAC clone (b0570E18) that had already been anchored to the genetic map, close to the tip of 5S (www.genome.arizona.edu). The BAC end sequence was virtually identical to the predicted coding region of hypro2. With that BAC as seed, a larger BAC contig was identified in the B73 physical map, and the BAC clone b0511I12 was selected for sequencing after determining, on the basis of PCR experiments, that it probably contained the entire cdl-hypro2 contig.

The Plus–Minus Variation of the bz Genomic Region Arises from Gene Movements Mediated by Helitrons. Analysis of the BAC sequence revealed that a gene island consisting of eight genes was flanked on either side by retrotransposon clusters of the type originally described at the Adh1 locus (19) and later found to be of general occurrence in the maize genome (2, 20, 21). As shown in Fig. 2, at the left end of the gene island, there is a 6,000-bp fragment that includes cdl and hypro2 and has >99% identity with a corresponding 5,869-bp fragment from the Bz-McC genomic region. The sequences differ only in 11 SNPs, nine short indels of 1–7 bp, and one larger indel of 133 bp. Thus, the intraspecies breakdown of colinearity reported in the maize bz genomic region (3) can be partly explained by the movement of cdl and hypro2 from one location of the maize genome to another. Evidence presented below supports the argument that the movement is mediated by Helitrons. Hence, we have designated the element carrying cdl and hypro2 as HelA, the version of the element in 9S as HelA-1, and the one in 5S as HelA-2.

Fig. 2.
Organization of a 50-kb sequence from the B73 chromosome 5 BAC b0511I12 (GenBank accession no. AC159612). A gene island is flanked on either side by retrotransposon ...

An examination of their end sequences revealed that both fragments have typical Helitron features (Fig. 3), beginning with TC and ending with CTAG, sequence motifs that occur at the 5′ and 3′ termini, respectively, of Helitrons. Both have palindromic sequences 11 bp upstream of the 3′ terminus and are flanked by an A at the 5′ terminus and a T at the 3′ terminus. An alignment of their terminal sequences with those of previously described maize Helitrons reveals strong sequence conservation. Their 5′-terminal 12 bp, which are identical to each other, match those of the Helitron insertions in the sh2-7527 mutation (13) and the ba1-Ref mutation (14, 16) at 11 of the 12 positions. Their 3′-terminal 30 base pairs, which include the palindrome and differ from each other at only one site, are also conserved: the HelA-1 terminus matches those of the Helitrons in sh2 and ba1 at 26 and 24 positions, respectively. More importantly, the sequences flanking the insertion in the McC bz genomic region are identical to sequences flanking an AT dinucleotide in the bz genomic region of B73 (3) and of other lines, such as Mo17 (8) and A188 (Q. Wang and H.K.D., unpublished work), which also lack cdl and hypro2 at that location. The most parsimonious explanation for the polymorphism detected in today's modern inbreds is that the unoccupied site was never visited. This explanation is also in agreement with the current view that RC transposition does not result in excision of the element from the donor site (22). However, because the actual mechanism of transposition of Helitrons is not known, the sequence present in the bz genomic region of B73, Mo17, and A188 will be referred to as the “vacant site,” to leave open the possibility that a Helitron may have resided there at one time in the past and to distinguish it from the footprint-bearing “empty sites” produced by the excision of most class II DNA transposons.
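
The comparison between an occupied site and a vacant site can be illustrated with a small Python sketch: removing the element from the occupied haplotype and joining its flanks should reproduce a sequence found in the Helitron-minus line, with an AT dinucleotide at the junction. The function name, the `flank` parameter, and all sequences below are hypothetical and serve only to make the logic concrete.

```python
def predicted_vacant_site(occupied: str, start: int, end: int, flank: int = 10) -> str:
    """Join the host flanks on either side of the element occupied[start:end]."""
    left = occupied[max(0, start - flank):start]
    right = occupied[end:end + flank]
    return left + right


# Toy sequences (hypothetical, not taken from the bz region).
left_flank = "GGCATTGCCA"            # host sequence ending in A
helitron = "TCAAACCCGGGTTTCTAG"      # element beginning with TC, ending with CTAG
right_flank = "TGGACCTTGA"           # host sequence beginning with T

occupied = left_flank + helitron + right_flank       # Helitron-plus haplotype
minus_line = "CC" + left_flank + right_flank + "GG"  # Helitron-minus haplotype

start = len(left_flank)
end = start + len(helitron)
vacant = predicted_vacant_site(occupied, start, end)

# The joined flanks should be found intact in the Helitron-minus line ...
print(vacant in minus_line)
# ... and the host nucleotides at the junction should be A and T.
print(occupied[start - 1] + occupied[end])
```

With real data the flanks would be compared by alignment rather than exact substring matching, because SNPs and small indels between lines are expected.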

Fig. 3.
Termini of the maize Helitrons HelA-1 and HelB from McC 9S, HelA-2 from B73 5S, the Helitron insertions in mutants sh2-7527 (13) and ba1-Ref (14, 16), and the rice Helitron2_OS (12). Helitron sequences are in uppercase letters and the invariant host nucleotides ...

Maize lines are clearly polymorphic for the presence vs. absence of HelA-1 in 9S. The possibility that a similar type of polymorphism also occurs for HelA-2 in 5S was investigated by PCR. Primers based on sequences internal to and flanking HelA-2 in 5S were used to amplify the junctions between HelA-2 and 5S sequences in line McC (which is essentially W22, except for the bz genomic region) and a series of other Corn Belt inbreds. Positive amplification products were obtained from McC, W22, A188, A636, B73, H99, M14, Mo17, BSSS53, and 4Co63 but not from W23, suggesting that, similar to its counterpart in 9S, the 5S site is polymorphic for the presence vs. absence of a Helitron. In fact, a band for the vacant site could be amplified only with primers flanking HelA-2 from the inbred W23. The sequence of the PCR product from McC (GenBank accession no. DQ003206) confirmed that the correct junction sequences between HelA-2 and 5S had been amplified and established that the HelA-2 Helitrons of B73 and McC are 99.8% identical to each other. Thus, McC has a copy of HelA in 9S and another one in 5S, as does W22 (L. He and H.K.D., unpublished work). Most of the Corn Belt lines examined have the copy of HelA in 5S, but lack the one in 9S (3). W23 has a copy in 9S (3) but not in 5S.
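
The presence vs. absence data described above can be summarized as a small table. The Python sketch below records only the states stated in the text; entries marked `None` are lines for which the state of the 9S site is not given individually in this section. The data structure and function names are assumptions made for illustration.

```python
# True = HelA present, False = vacant site, None = not stated individually above.
HELA_SITES = {
    "McC":    {"9S": True,  "5S": True},
    "W22":    {"9S": True,  "5S": True},
    "W23":    {"9S": True,  "5S": False},
    "B73":    {"9S": False, "5S": True},
    "Mo17":   {"9S": False, "5S": True},
    "A188":   {"9S": False, "5S": True},
    "A636":   {"9S": None,  "5S": True},
    "H99":    {"9S": None,  "5S": True},
    "M14":    {"9S": None,  "5S": True},
    "BSSS53": {"9S": None,  "5S": True},
    "4Co63":  {"9S": None,  "5S": True},
}


def lines_with(site: str, state: bool) -> list[str]:
    """Sorted names of the lines whose `site` has the given state."""
    return sorted(name for name, sites in HELA_SITES.items() if sites[site] is state)


def is_polymorphic(site: str) -> bool:
    """A site is polymorphic if both occupied and vacant states are observed."""
    return bool(lines_with(site, True)) and bool(lines_with(site, False))


for site in ("9S", "5S"):
    print(site, "occupied:", lines_with(site, True), "vacant:", lines_with(site, False))
```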

Upon discovering that a Helitron accounted partly for the difference in apparent gene content of the bz genomic region of different inbred lines, an attempt was made to determine whether the plus–minus variation for the other gene sequences unique to the McC haplotype could also be explained by Helitron movement. Sequence alignments were performed between the variable genomic regions of McC and those of B73 and Mo17, both of which lack all four genes in the region (3, 8). The latter comparison proved fruitful. A 2,712-bp sequence was identified to be present in McC and absent from the corresponding location in Mo17. Similar to the sequence found in HelA, this sequence has typical Helitron features, so it has been called HelB. It begins with TC, ends with CTAG, is flanked by A and T residues at the 5′ and 3′ ends, respectively, and has an 8-bp palindromic sequence 10 bp upstream of the 3′ end (Fig. 3). Its termini are less related to those of previously described maize Helitrons than to those of HelA and appear to be closer to those of rice Helitron2 (12). An exact vacant site was identified in Mo17 but not in B73 or A188 (Q. Wang and H.K.D., unpublished work), haplotypes that may have suffered a deletion in this region (Fig. 4). The sequence separating the two Helitrons in McC is short, just 892 bp. Together, then, HelA and HelB can account for all the genes found to be present in the bz genomic region of McC but not of B73 (Fig. 1C).

Fig. 4.
The Helitron structure of bz haplotypes McC, B73, and Mo17. The Helitrons are shown as downward-pointing triangles, and the corresponding vacant sites in B73 and Mo17 are represented by thick short vertical lines. The vacant site for HelB is missing in ...

The Genic Content of the Helitrons. Fu et al. (1) concluded, on the basis of the differential hybridization of specific gene probes to RNA from either wild-type or a deletion mutant, that at least some of the genes now known to be carried by Helitrons were expressed. However, because these genes are members of multiple gene families, unambiguous evidence for their expression can be provided only by the isolation of their respective cDNAs.

A cDNA clone (GenBank accession no. CO522311) with 100% identity to the B73 hypro2 gene was identified in the maize EST database, sequenced in its entirety, and confirmed to be derived from the hypro2 gene of HelA-2 on the basis of its almost complete sequence identity (only one mismatch in 1,586 bp). The inferred hypro2 exon–intron structure is shown in Fig. 1D. The transcript begins close to the 5′ end of HelA, spans more than 3.5 kb of HelA sequence, and helps to define five exons, with exons 2 and 3 being separated by a large 1.8-kb intron. Conceptual translation of the transcript revealed premature stop codons in all reading frames, indicating that the cDNA clone does not encode a functional protein. Furthermore, careful examination of the gene's exon–intron structure showed it to be chimeric: exons 3–5 correspond to exons 6–8 of a putative glycosyl hydrolase (GH) in rice [National Center for Biotechnology Information (NCBI) protein database accession no. BAD36734] and Arabidopsis (accession no. BAB09947); exon 2 corresponds to exon 2 of the GH and exon 1 is of unknown origin. A 1.8-kb intron separates exon 2 from exon 6, but the sequences for the intervening GH exons 3–5 are completely missing from the genomic DNA. Thus, although hypro2 is expressed, it is clearly a pseudogene.
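
Two of the statements above lend themselves to a short worked example: the identity implied by one mismatch in 1,586 bp, and the scan for stop codons in each forward reading frame. The Python sketch below uses a toy sequence, not the hypro2 cDNA, and the function name is an assumption; a real analysis would also take the position of the putative start codon into account.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def stop_positions_by_frame(seq: str) -> dict[int, list[int]]:
    """Map each forward frame (0, 1, 2) to the 0-based positions of its stop codons."""
    seq = seq.upper()
    return {
        frame: [i for i in range(frame, len(seq) - 2, 3) if seq[i:i + 3] in STOP_CODONS]
        for frame in range(3)
    }


# Identity implied by one mismatch in 1,586 aligned bp (figures from the text).
aligned_bp = 1586
mismatches = 1
percent_identity = 100 * (aligned_bp - mismatches) / aligned_bp
print(f"{percent_identity:.2f}% identity")

# Toy sequence (hypothetical) with one stop codon in each forward frame.
toy = "ATGTAACTGACTAGC"
stops = stop_positions_by_frame(toy)
print(stops)
print(all(stops[frame] for frame in range(3)))
```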

No cDNAs corresponding to the three other genic sequences in HelA or HelB have been recovered, either from McC cDNA libraries (1) or by RT-PCR using mRNA templates from a diversity of McC and B73 tissues (data not shown). A reexamination of the structure of the predicted cdl gene shows that it, too, consists of the terminal exons of a gene with multiple exons. The cdl sequence is homologous to exons 8–10 from a family of genes encoding a cell-division-like protein in rice (NCBI protein database accession no. BAD53799) and Arabidopsis (accession no. AAN86163). These exons, separated by their respective introns, are present at the 3′ end of HelA-1, in the orientation opposite that of hypro2. As originally annotated (1), the hypro3 gene spanned sequences that are now known to be split between HelA and HelB, but a reexamination of the sequence reveals that hypro3 is shorter and contained entirely within HelA. The hypro3 fragment is actually found within the large intron of hypro2, in the opposite transcriptional orientation, and contains coding information for a truncated protein with high similarity to a rice putative serine protease (NCBI protein database accession no. BAD82560) and an Arabidopsis hypothetical protein (accession no. BAB11289). The genes encoding both of these proteins consist of five exons, of which exons 2, 3, and 4 and part of 5 are present in HelA. Finally, the rlk gene of HelB contains only part of the first exon of a two-exon gene annotated as a putative receptor-like protein kinase in rice (NCBI protein database accession no. BAA94519) and Arabidopsis (accession no. AAO64924). Thus, HelA and HelB resemble the two previously described Helitron elements of maize in carrying only gene fragments.

Discussion


Basis of the Plus–Minus Variation. Data presented here establish that the differences in the apparent gene content of the bz genomic regions of two Corn Belt inbreds reported by Fu and Dooner (3), and referred to as plus–minus variability, are due not to deletions from the (–) line, as originally thought, but to additions in the (+) line. The additions have been caused by the movement of complex Helitron transposons, which ferry gene fragments from one location of the maize genome to another. Hence, differences between allelic haplotypes do not appear to have the same basis as the differences between homeologous regions of the maize genome examined to date, which clearly originated by deletion (23–25). Since its paleotetraploid origin (26, 27), maize has been undergoing extensive gene loss as it approaches a diploid state. In most cases, entire genes have been deleted, although partial gene deletions have been documented (23). The gene loss or fractionation of the duplicated regions has been extensive. At the adh1 (23) and lg2 (25) orthologous regions, fractionation is nearly complete. In a comprehensive study of five other duplicated regions from different chromosomal locations, Lai et al. (24) found that at least 50% of the duplicated genes from the two orthologues had been lost over a very short period of time, estimated to be as short as 5 million years. Thus, the haplotype noncolinearity described at the bz locus is not simply a residual manifestation of this reduction to diploidy but an altogether different phenomenon that occurred more recently.

The reported noncolinearity between the bz-locus haplotypes in McC and B73 arose from the independent insertion of two different Helitrons ≈900 bp away from each other at a location just distal to bz in 9S. Both Helitrons are present in line McC and absent from lines B73 (3) and Mo17 (8). Both share the following structural features of Helitrons (12): (i) they begin with a TC (5′ end) and end with CTAG (3′ end), (ii) they have a 10- to 16-bp palindrome ≈11 bp upstream of the 3′ end, and (iii) they are inserted at an AT host dinucleotide. This site is referred to as the occupied site in McC and as the vacant site in B73 and Mo17.

The larger of the two Helitrons, HelA-1, is 5,869 bp long and carries in it fragments from three separate genes (cdl, hypro2, and hypro3) that have homology to rice and Arabidopsis genes. As documented in Results, none of these genes is complete. cdl and hypro3 are in one orientation and hypro2 is in the opposite orientation. Interestingly, hypro3 is contained within the long second intron of hypro2. An almost identical copy of this Helitron, termed HelA-2, was isolated from a chromosome 5 BAC clone of B73. HelA-2 is slightly longer (6,000 bp) as a consequence of a 133-bp indel polymorphism, yet its overall sequence is >99% identical to that of HelA-1. A 1.6-kb hypro2 transcript from HelA-2 was identified in the B73 EST collection, but the presence of premature stop codons in every reading frame indicates that this transcript does not encode a functional protein. Unlike most other transposons, Helitrons do not have clear terminal features, such as terminal inverted repeats or target-site duplication, that mark their limits. The availability, in this instance, of two closely related Helitron copies from two different genomic locations greatly facilitated the definition of the ends of the HelA transposon and the identification of the AT target dinucleotide in the B73 vacant site.

The smaller of the two Helitrons, HelB, is 2,712 bp long and carries in it a fragment of an rlk gene that has close homologues in rice and Arabidopsis. The vacant AT site for this Helitron is missing in B73 but present in Mo17. Thus, the ends of HelB could be determined only from an alignment of the McC and Mo17 genomic sequences. This comparison highlights the value of the vertical sampling of one genomic region for the precise identification of genomic sequences, such as those from complex Helitrons, that lack the strong structural features required for global genomic computational searches. A schematic diagram outlining the possible origin of an McC-type haplotype from a Mo17-type progenitor haplotype is presented in Fig. 5.

Fig. 5.
Diagram of the origin of an McC-type haplotype from a Mo17-type progenitor haplotype by the transposition of Helitrons. Helitrons HelA and HelB are represented as thick gray lines in the donor chromosomes. Putative RC-transposition intermediates are shown ...

Fu and Dooner (3) speculated that, if found to be common at other locations in the genome, the plus–minus variability uncovered at bz could contribute to the phenomenon of heterosis or hybrid vigor in maize. They reasoned that genes absent from certain polymorphic locations of the genome might be complemented by copies of those genes at other polymorphic locations. On the other hand, Song and Messing (7) showed that the level of expression of most zein genes that were present in one line and absent in another did not show simple additive patterns when the two lines were intercrossed. Recently, Brunner et al. (8) have found that plus–minus variability may be common throughout the maize genome. However, evidence presented here indicates that the sequences displaying that kind of variability are often gene fragments ferried around the genome by Helitron transposons. Therefore, this plus–minus variability, unlike that of the z1C1 locus, would contribute to heterosis, as envisioned originally, only when intact genes rather than gene fragments have been captured by Helitrons. Alternatively, the Helitron-mediated movement of large blocks of DNA into the vicinity of genes could lead to differences in gene expression from the placement of those genes within a novel sequence context.

Gene Movement by Helitrons. The Helitrons described here differ from those originally described in Arabidopsis, rice, and C. elegans, in that they lack sequences similar to replication protein A (RPA) and DNA helicases (12). They resemble, instead, other Helitron insertions previously described in maize. The Helitrons in the sh2-7527 (13) and ba1-Ref (14, 16) mutants and in a BAC clone of the 19-kDa zein gene family (16) are heterogeneous in size and contain portions of at least 11 different genes, none of which is related to RPA or DNA helicases. If, as suggested for helicase and RPA (12), these genes are being recruited from the host, the requirements for gene capture by nonautonomous maize Helitrons do not appear to be very stringent.

The capture of genes or gene fragments from the host by transposable elements has been documented in several plants (28–32). The maize Helitrons share several features with the Pack-MULEs (mutator-like elements) recently described in rice (33): Most of the sequences captured are gene fragments, not complete genes; a single element can contain fragments from multiple genes; sequence acquisition is at the DNA level, as indicated by the conservation of introns; and transcripts can initiate within the element (e.g., hypro2) or outside of the element, producing chimeric transcripts (13, 16). Based on the above features of Pack-MULEs, their abundance, and the large fraction (one-fifth) that contain fragments from multiple loci, Jiang et al. (33) have argued that Pack-MULEs have the potential to create new plant genes through the multiplication, rearrangement, and fusion of fragments from multiple genomic loci. Although maize Helitrons have just begun to be characterized (13, 16), their properties shared with Pack-MULEs suggest that they have a similar potential.

The mechanism of host-sequence acquisition by Helitrons is not known, but Feschotte and Wessler (15) have proposed a model based on the observation that the transposition of RC replicons, such as IS91 (22), has minimal cis requirements. The model postulates that RC replication initiates correctly at the 5′ end but that the normal 3′ palindrome termination signal is bypassed, leading to the replication and capture of adjacent sequences until a new cryptic downstream palindrome is encountered that can serve as a terminator. The capture of either complete or partial gene sequences by this mechanism would depend on where in a gene the Helitron was inserted initially. Helitrons that lose their mobilization machinery would become nonautonomous elements, although, possibly, the majority of nonautonomous elements has a different origin. Given the apparently minimal requirements for transposition, nonautonomous Helitrons could be very small, as are the nonautonomous Ds1 and dTph elements in maize and petunia, respectively (34, 35). In fact, the abundant nonautonomous Helitron elements Helitrony2 and Helitrony3 of C. elegans are just 249 and 195 bp long, respectively. It is conceivable that most of the complex Helitrons of maize are not derived from autonomous elements but from these much more numerous defective elements, which could readily pick up adjacent host sequences in the presence of an autonomous element.
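
The read-through model summarized above can be made concrete with a toy simulation. In the Python sketch below, replication starts at the 5′ end of the element and stops at the first termination signal that is not bypassed; bypassing the element's own 3′ signal extends the copied segment to the next, cryptic signal downstream and thereby captures the intervening host sequence. The function name, the `bypassed` parameter, and the sequences are hypothetical, and the sketch makes no claim about the actual mechanism, which is not known.

```python
def copied_segment(donor: str, start: int, terminator_ends: list[int], bypassed: int = 0) -> str:
    """Return the donor segment copied from `start` to the first terminator not bypassed.

    terminator_ends: 0-based exclusive end coordinates of candidate termination signals.
    bypassed: number of downstream termination signals that replication reads through.
    """
    downstream = sorted(e for e in terminator_ends if e > start)
    if bypassed >= len(downstream):
        raise ValueError("no termination signal left downstream")
    return donor[start:downstream[bypassed]]


# Toy donor locus (hypothetical): host A, element, host gene fragment, cryptic signal, host T.
element = "TCGGGGCTAG"      # uppercase: original element ending in CTAG
host_gene = "ccatgcatgg"    # lowercase: adjacent host gene fragment
cryptic = "CTAG"            # downstream cryptic termination signal
donor = "A" + element + host_gene + cryptic + "T"

ends = [1 + len(element), 1 + len(element) + len(host_gene) + len(cryptic)]

print(copied_segment(donor, 1, ends, bypassed=0))  # normal termination
print(copied_segment(donor, 1, ends, bypassed=1))  # read-through captures host sequence
```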


We thank members of the Dooner laboratory and the anonymous reviewers for constructive comments on the manuscript. This work was supported by National Science Foundation Grant DBI 03-20683 (to H.K.D. and J.M.).


Author contributions: J.L. and H.K.D. designed research; J.L., Y.L., and H.K.D. performed research; J.L., Y.L., J.M., and H.K.D. analyzed data; and H.K.D. wrote the paper.

This paper was submitted directly (Track II) to the PNAS office.

Abbreviations: BAC, bacterial artificial chromosome; GSS, genome survey sequence; RC, rolling-circle.

Data deposition: The sequences reported in this paper have been deposited in the GenBank database (accession nos. AC159612, DQ000639, and DQ003206).


1. Fu, H., Park, W., Yan, X., Zheng, Z., Shen, B. & Dooner, H. K. (2001) Proc. Natl. Acad. Sci. USA 98, 8903–8908.
2. Fu, H., Zheng, Z. & Dooner, H. K. (2002) Proc. Natl. Acad. Sci. USA 99, 1082–1087.
3. Fu, H. & Dooner, H. K. (2002) Proc. Natl. Acad. Sci. USA 99, 9573–9578.
4. SanMiguel, P. & Bennetzen, J. L. (1998) Ann. Bot. (London) 82, 37–44.
5. Meyers, B. C., Tingey, S. V. & Morgante, M. (2001) Genome Res. 11, 1660–1676.
6. Messing, J., Bharti, A. K., Karlowski, W. M., Gundlach, H., Kim, H. R., Yu, Y., Wei, F., Fuks, G., Soderlund, C. A., Mayer, K. F. & Wing, R. A. (2004) Proc. Natl. Acad. Sci. USA 101, 14349–14354.
7. Song, R. & Messing, J. (2003) Proc. Natl. Acad. Sci. USA 100, 9055–9060.
8. Brunner, S., Fengler, K., Morgante, M., Tingey, S. & Rafalski, A. (2005) Plant Cell 17, 343–360.
9. Scherrer, B., Isidore, E., Klein, P., Kim, J. S., Bellec, A., Chalhoub, B., Keller, B. & Feuillet, C. (2005) Plant Cell 17, 361–374.
10. Song, R., Llaca, V. & Messing, J. (2002) Genome Res. 12, 1549–1555.
11. Ma, J., Devos, K. M. & Bennetzen, J. L. (2004) Genome Res. 14, 860–869.
12. Kapitonov, V. V. & Jurka, J. (2001) Proc. Natl. Acad. Sci. USA 98, 8714–8719.
13. Lal, S. K., Giroux, M. J., Brendel, V., Vallejos, C. E. & Hannah, L. C. (2003) Plant Cell 15, 381–391.
14. Gallavotti, A., Zhao, Q., Kyozuka, J., Meeley, R. B., Ritter, M. K., Doebley, J. F., Pe, M. E. & Schmidt, R. J. (2004) Nature 432, 630–635.
15. Feschotte, C. & Wessler, S. R. (2001) Proc. Natl. Acad. Sci. USA 98, 8923–8924.
16. Gupta, S., Gallavotti, A., Stryker, G. A., Schmidt, R. J. & Lal, S. K. (2005) Plant Mol. Biol. 57, 115–127.
17. Whitelaw, C. A., Barbazuk, W. B., Pertea, G., Chan, A. P., Cheung, F., Lee, Y., Zheng, L., van Heeringen, S., Karamycheva, S., Bennetzen, J. L., et al. (2003) Science 302, 2118–2120.
18. Palmer, L. E., Rabinowicz, P. D., O'Shaughnessy, A. L., Balija, V. S., Nascimento, L. U., Dike, S., de la Bastide, M., Martienssen, R. A. & McCombie, W. R. (2003) Science 302, 2115–2117.
19. SanMiguel, P., Tikhonov, A., Jin, Y. K., Motchoulskaia, N., Zakharov, D., Melake-Berhan, A., Springer, P. S., Edwards, K. J., Lee, M., Avramova, Z. & Bennetzen, J. L. (1996) Science 274, 765–768.
20. Song, R., Llaca, V., Linton, E. & Messing, J. (2001) Genome Res. 11, 1817–1825.
21. Ramakrishna, W., Emberton, J., Ogden, M., SanMiguel, P. & Bennetzen, J. L. (2002) Plant Cell 14, 3213–3223.
22. del Pilar Garcillan-Barcia, M., Bernales, I., Mendiola, M. V. & de la Cruz, F. (2002) in Mobile DNA II, eds. Craig, N. L., Craigie, R., Gellert, M. & Lambowitz, A. M. (Am. Soc. Microbiol., Washington, DC), pp. 891–904.
23. Ilic, K., SanMiguel, P. J. & Bennetzen, J. L. (2003) Proc. Natl. Acad. Sci. USA 100, 12265–12270.
24. Lai, J., Ma, J., Swigonova, Z., Ramakrishna, W., Linton, E., Llaca, V., Tanyolac, B., Park, Y. J., Jeong, O. Y., Bennetzen, J. L. & Messing, J. (2004) Genome Res. 14, 1924–1931.
25. Langham, R. J., Walsh, J., Dunn, M., Ko, C., Goff, S. A. & Freeling, M. (2004) Genetics 166, 935–945.
26. Gaut, B. S. & Doebley, J. (1997) Proc. Natl. Acad. Sci. USA 94, 6809–6814.
27. Swigonova, Z., Lai, J., Ma, J., Ramakrishna, W., Llaca, V., Bennetzen, J. L. & Messing, J. (2004) Genome Res. 14, 1916–1923.
28. Talbert, L. E. & Chandler, V. L. (1988) Mol. Biol. Evol. 5, 519–529.
29. Bureau, T. E., White, S. E. & Wessler, S. R. (1994) Cell 77, 479–480.
30. Jin, Y. K. & Bennetzen, J. L. (1994) Plant Cell 6, 1177–1186.
31. Takahashi, S., Inagaki, Y., Satoh, H., Hoshino, A. & Iida, S. (1999) Mol. Gen. Genet. 261, 447–451.
32. Elrouby, N. & Bureau, T. E. (2001) J. Biol. Chem. 276, 41963–41968.
33. Jiang, N., Bao, Z., Zhang, X., Eddy, S. R. & Wessler, S. R. (2004) Nature 431, 569–573.
34. Gerlach, W. L., Dennis, E. S., Peacock, W. J. & Clegg, M. T. (1987) J. Mol. Evol. 26, 329–334.
35. Gerats, A. G., Huits, H., Vrijlandt, E., Marana, C., Souer, E. & Beld, M. (1990) Plant Cell 2, 1121–1128.

Articles from Proceedings of the National Academy of Sciences of the United States of America are provided here courtesy of National Academy of Sciences