Genetics. Jun 2007; 176(2): 1299–1306.
PMCID: PMC1894591

Transcriptional Interferences in cis Natural Antisense Transcripts of Humans and Mice

Abstract

For a significant fraction of mRNAs, their expression is regulated by other RNAs, including cis natural antisense transcripts (cis-NATs) that are complementary mRNAs transcribed from opposite strands of DNA at the same genomic locus. The regulatory mechanism of mRNA expression by cis-NATs is unknown, although a few possible explanations have been proposed. To understand this regulatory mechanism, we conducted a large-scale analysis of the currently available data and examined how the overlapping arrangements of cis-NATs affect their expression level. Here, we show that for both human and mouse the expression level of cis-NATs decreases as the length of the overlapping region increases. In particular, the proportions of the highly expressed cis-NATs in all cis-NATs examined were ~36 and 47% for human and mouse, respectively, when the overlapping region was <200 bp. However, both proportions decreased to virtually zero when the overlapping regions were >2000 bp in length. Moreover, the distribution of the expression level of cis-NATs changes according to different types of the overlapping pattern of cis-NATs in the genome. These results are consistent with the transcriptional collision model for the regulatory mechanism of gene expression by cis-NATs.

Biological processes such as development, metabolism, and response to external stimuli are conducted by the cooperative activities of many genes. To understand a biological process, it is essential to understand the regulatory network of the genes that compose it. After genome sequences had been determined, attempts to reveal regulatory networks of genes began (Wyrick and Young 2002; Encode Project Consortium 2004; Carninci et al. 2005; Levine and Davidson 2005). Gene expression is regulated mainly by proteins such as transcription factors. However, it has been found that ~20–30% of mammalian transcripts are targets of microRNAs, which bind to complementary mRNAs and inhibit their activation (Krek et al. 2005; Lewis et al. 2005; Stark et al. 2005; Carthew 2006). This suggests that the regulation of gene expression by RNAs is more ubiquitous and important than previously thought (Mattick 2001, 2004).

Cis natural antisense transcripts (cis-NATs) are composed of a pair of mRNAs that are transcribed from the opposite strands of DNA at the same genomic locus. The antisense mRNA regulates the expression level of the sense mRNA in a pair. As a result, cis-NATs affect the developmental processes such as neural, eye, and tooth formation (Potter and Branford 1998; Korneev et al. 1999; Alfano et al. 2005; Coudert et al. 2005; Korneev and O'Shea 2005), and various molecular functions such as X-inactivation, genomic imprinting, DNA methylation, RNA editing, and alternative splicing (Munroe and Lazar 1991; Kumar and Carmichael 1997; Moore et al. 1997; Lee et al. 1999; Tufarelli et al. 2003).

Although only ~40 cis-NATs had been experimentally verified, >2000 cis-NATs were predicted from analyses of the genomic and cDNA sequences of humans and mice (Lehner et al. 2002; Shendure and Church 2002; Kiyosawa et al. 2003; Yelin et al. 2003; Katayama et al. 2005). Recently, an analysis of a human oligo microarray showed that as many as 60% of surveyed loci on human chromosome 10 were predicted to encode cis-NATs (Cheng et al. 2005). Other chromosomes are also expected to encode comparably many cis-NATs. In addition, the presence of cis-NATs has been predicted for other eukaryotes as well as prokaryotes (Wagner and Simons 1994; Vanhee-Brossollet and Vaquero 1998; Makalowska et al. 2005). These observations imply that the regulation of gene expression by cis-NATs occurs more frequently than previously thought. Regulation of gene expression by RNAs may be evolutionarily advantageous, because it regulates gene expression quickly and saves the energy and time required to synthesize proteins. Chen et al. (2005b) showed that cis-NATs were encoded in genes with shorter intron sequences than other mRNAs.

Although a large number of cis-NATs have been predicted from various species, the regulatory mechanisms of gene expression by cis-NATs remain unclear. To understand the regulatory mechanisms, it will be crucial to know the features of cis-NATs that are important in the regulation of gene expression. Thus far, three models have been proposed for the regulation of gene expression by cis-NATs (Lavorgna et al. 2004).

The first model asserts that cis-NATs form a double strand through their complementary sequences, which leads to the inhibition of the function of mRNAs, including protein synthesis. In this model, it is expected that cis-NATs are overlapped in at least 6- to 8-bp regions to form stable double strands of RNA (Lai 2002; Lewis et al. 2005).

The second model involves epigenetic regulations such as the methylation of promoters and the conversion of the chromosome structure (Wutz et al. 1997; Reik and Walter 2001; Tufarelli et al. 2003). Through unknown mechanisms, the antisense mRNAs methylate promoters of sense mRNAs and inhibit the transcription of sense mRNAs. In addition, the antisense mRNAs convert the chromosomal structures in which cis-NATs are located and regulate the expression of the sense mRNAs. For the model of epigenetic regulations, the features of cis-NATs that are essential for the regulations are unclear.

The third model is transcriptional collisions. In the transcription of cis-NATs, RNA polymerases bind to the promoters of genes encoding sense and antisense mRNAs and synthesize mRNAs, moving toward the 3′-end of the genes. RNA polymerases clash in the overlapping region and inhibit their transcription (Figure 4). This model was implied from the analyses of the expression levels of cis-NATs in yeast (Peterson and Myers 1993; Puig et al. 1999; Prescott and Proudfoot 2002). In the analyses, the expression level of cis-NATs decreased as the length of the overlapping region of the cis-NATs increased. Moreover, the expression level of adjacent transcripts in tandem on the same strand of the yeast genome decreased when the terminator of the upstream transcript was removed. This suggested that RNA polymerases did not stop at the terminator of the upstream transcript and affected the transcription of the downstream transcript. Recently, the collision of Escherichia coli RNA polymerases was observed by atomic force microscopy (Crampton et al. 2006). This observation showed that RNA polymerases do not pass each other or displace one another, but instead stall against each other.

Figure 4.
Transcriptional collision model. When cis-NATs are transcribed by RNA polymerases, RNA polymerases bind to the upstream region of a gene encoding a sense mRNA and synthesize the complementary mRNA, moving to the 3′-end of the gene. Similarly, RNA ...

Here, we report the effects of the length and pattern of the overlapping regions of human and mouse cis-NATs on their expression level. Moreover, we show that adjacent human and mouse transcripts in a particular relative position also affect each other's expression level. These results are consistent with the transcriptional collision model, implying that the regulation of the expression of cis-NATs by transcriptional collisions is common among species.

MATERIALS AND METHODS

Analysis of the expression level of human cis-NATs:

To predict cis-NATs from human cDNA sequences, we collected a total of 46,675 human cDNA sequences, which consisted of 39,530 human Ensembl cDNA sequences (February 2006) (Hubbard et al. 2005), 6501 human Ensembl non-protein-coding sequences (February 2006), and 644 non-protein-coding sequences in RNAdb (Pang et al. 2005) that were mapped to the human genome by BLAT software (Kent 2002). We predicted 8964 cis-NATs, which had an at least 1-bp-long overlapping region, on the basis of their genomic location using in-house Perl scripts. Redundant cis-NATs were merged into the same group when cis-NATs overlapped in an at least 1-bp-long region in the genome. The number of the groups of cis-NATs was 2496. To examine the expression levels of human cis-NATs, we employed 15.5 million NlaIII human serial analysis of gene expression (SAGE) tags (November 2005) of all tissues in the NCBI SAGEmap database (Lash et al. 2000). Human SAGE tags were searched against a total of 46,675 human cDNA sequences according to the protocol for SAGE (Velculescu et al. 1995) using in-house Perl scripts. When a SAGE tag matched more than one transcript, these transcripts were removed from further analyses. As a result, among the 46,675 human cDNA sequences, 28,009 had unique SAGE tag assignments, and among the 2496 groups of human cis-NATs, 728 groups had unique SAGE tag assignments to all transcripts in each group. Some SAGE tags are supposed to be assigned to overlapping regions of cis-NATs. Although cDNA microarrays cannot distinguish the expression of mRNAs encoded in plus and minus strands of the genome in the same locus, SAGE tags are strand specific. Therefore, even though SAGE tags are produced from the overlapping regions of cis-NATs, the expression levels of sense and antisense mRNAs of cis-NATs can be measured separately.
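The prediction and grouping step described above can be illustrated with a minimal Python sketch. It is not the authors' in-house Perl code. The `Transcript` record, the function names, and the rule of merging pairs that share a transcript are all assumptions made for illustration. Coordinates are taken to be 1-based and inclusive.

```python
from collections import namedtuple

# Hypothetical record for a cDNA mapped to the genome (1-based, inclusive).
Transcript = namedtuple("Transcript", "name chrom strand start end")


def overlap_bp(a, b):
    """Number of genomic bases shared by two transcripts (0 if none)."""
    if a.chrom != b.chrom:
        return 0
    return max(0, min(a.end, b.end) - max(a.start, b.start) + 1)


def predict_cis_nat_pairs(transcripts, min_overlap=1):
    """Return name pairs on opposite strands sharing >= min_overlap bp."""
    pairs = []
    for i, a in enumerate(transcripts):
        for b in transcripts[i + 1:]:
            if a.strand != b.strand and overlap_bp(a, b) >= min_overlap:
                pairs.append((a.name, b.name))
    return pairs


def group_pairs(pairs):
    """Merge pairs that share a transcript into groups (union-find).

    Assumption: sharing a transcript is used here as a stand-in for the
    paper's rule of merging cis-NATs that overlap by at least 1 bp.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for a, b in pairs:
        parent[find(a)] = find(b)

    groups = {}
    for name in parent:
        groups.setdefault(find(name), set()).add(name)
    return sorted(sorted(g) for g in groups.values())
```

The pairwise scan is quadratic in the number of transcripts, which is acceptable for a sketch; a sorted sweep per chromosome would be the usual choice for tens of thousands of cDNAs.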

To examine the expression levels of the human cis-NATs, we compared the expression level of human cis-NATs with that of other human transcripts (i.e., human transcripts excluding cis-NATs, pseudogenes, and non-protein-coding mRNAs). We calculated the ratio of the expression level of a human cis-NAT to that of the other human transcripts. However, the expression levels of transcripts are known to be affected by the overall length of the transcripts (Castillo-Davis et al. 2002). Thus, we corrected the expression level of cis-NATs for their overall length: the expression level of a human cis-NAT was compared with that of other human transcripts of almost the same overall length as the cis-NAT. We removed human cis-NATs, pseudogenes, and non-protein-coding mRNAs from the human cDNA sequences and selected 2000 human transcripts whose overall length in the genome was close to the overall length of the human cis-NAT in the genome. We calculated the median of the expression levels of the selected 2000 human transcripts (supplemental Figure 1A at http://www.genetics.org/supplemental/). The median of the expression levels of the selected human transcripts was defined as a ratio of 1.0, and then the ratio of the expression level of a cis-NAT was calculated as follows:

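\[
R_{\text{cis-NAT}} = \frac{T_{\text{cis-NAT}}}{T_{(\text{all}-\text{cis-NAT}-\text{pseudo}-\text{noncoding})}}
\]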

where $R_{\text{cis-NAT}}$ is the ratio of the expression level of a human cis-NAT to that of other human protein-coding transcripts, $T_{\text{cis-NAT}}$ is the expression level of a human cis-NAT, and $T_{(\text{all}-\text{cis-NAT}-\text{pseudo}-\text{noncoding})}$ is the median of the expression level of other human transcripts (i.e., human transcripts excluding cis-NATs, pseudogenes, and non-protein-coding mRNAs) with almost the same overall length as the cis-NAT.
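The length-matched ratio defined above can be sketched in Python as follows. The function name, the argument layout, and the choice of taking the `n_matched` background transcripts nearest in genomic length are assumptions; the text states only that 2000 transcripts of close overall length were selected.

```python
from statistics import median


def expression_ratio(cis_nat_length, cis_nat_expr, background, n_matched=2000):
    """Ratio of a cis-NAT's expression to a length-matched background median.

    background: list of (genomic_length, expression_level) tuples for
    transcripts that are not cis-NATs, pseudogenes, or non-protein-coding.
    """
    # Assumption: "close in length" is taken as the n_matched nearest lengths.
    matched = sorted(background, key=lambda t: abs(t[0] - cis_nat_length))
    matched = matched[:n_matched]
    baseline = median(expr for _, expr in matched)
    if baseline == 0:
        raise ValueError("median background expression is zero")
    return cis_nat_expr / baseline
```

A ratio above 1.0 then corresponds to what the text calls a highly expressed cis-NAT, since the background median is defined as 1.0.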

Sense mRNAs of cis-NATs are located not only in the plus or the minus strand of the genome, but also in both strands of the genome. Some of the sense mRNAs encode proteins and others encode non-protein-coding mRNAs that may have some biological function such as the regulation of mRNA translation and stability (Mattick and Makunin 2006). The antisense mRNA in each group of cis-NATs is known to decrease the expression level of the sense mRNA in the group and to inhibit the activation of the sense mRNA (Wagner and Simons 1994; Kumar and Carmichael 1998; Vanhee-Brossollet and Vaquero 1998). Therefore, we recognized the cis-NAT with the lowest expression level in each group as the sense mRNA in the group. In this study, we used the expression level of the sense mRNA in each group to examine the expression levels of cis-NATs according to the overlapping arrangements in the genome. However, there is an assumption that under some regulatory mechanisms of cis-NATs, such as RNA masking and a double-stranded RNA-dependent mechanism, the expression level of a sense mRNA may not be the lowest in the group of cis-NATs and an inverse correlation of the expression levels may not be found between a sense and an antisense mRNA (Chen et al. 2005a; Katayama et al. 2005; Lapidot and Pilpel 2006). Therefore, we also examined the expression level of cis-NATs randomly selected (i.e., selected in a non-expression-level-dependent manner) (see supplemental material at http://www.genetics.org/supplemental/).

We examined the expression levels of cis-NATs as the overlapping regions increased in length. When cis-NATs included more than two transcripts (i.e., sense and antisense transcripts on both strands of the genome), the length of the overlapping region was defined as the distance between the farthest upstream and the farthest downstream genomic locations of the overlapping regions of cis-NATs in a group.
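The definition of overlap length for a group with more than two transcripts can be made concrete with a short Python sketch. The function name and the use of 1-based inclusive coordinates (hence the `+ 1`) are assumptions for illustration.

```python
def group_overlap_length(plus_spans, minus_spans):
    """Span from the leftmost to the rightmost sense/antisense overlap.

    plus_spans, minus_spans: lists of (start, end) genomic coordinates for
    the transcripts of one group on the plus and minus strands
    (assumed 1-based, inclusive).
    """
    overlaps = []
    for ps, pe in plus_spans:
        for ms, me in minus_spans:
            start, end = max(ps, ms), min(pe, me)
            if start <= end:  # the two transcripts share at least 1 bp
                overlaps.append((start, end))
    if not overlaps:
        return 0
    leftmost = min(s for s, _ in overlaps)
    rightmost = max(e for _, e in overlaps)
    return rightmost - leftmost + 1
```

For example, a minus-strand transcript at 450–900 that overlaps plus-strand transcripts at 100–500 and 850–1200 has two separate overlaps (450–500 and 850–900), and the group's overlap length under this definition is 451 bp.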

Analysis of the expression level of mouse cis-NATs:

To predict the mouse cis-NATs, we collected a total of 35,486 mouse cDNA sequences, which consisted of 33,252 mouse Ensembl cDNA sequences (April 2006), 1752 mouse Ensembl non-protein-coding sequences (April 2006), and 482 non-protein-coding sequences in RNAdb that were mapped to the mouse genome using BLAT. We predicted 5491 cis-NATs from the mouse cDNA sequences and redundant cis-NATs were merged into the same group. There were 1868 groups of mouse cis-NATs. We examined the expression levels of mouse cis-NATs using 3.6 million NlaIII mouse SAGE tags of all tissues in the NCBI SAGEmap database (November 2005) to compare them with the expression levels of human cis-NATs. Among the 35,486 mouse cDNA sequences, 21,982 had unique SAGE tag assignments, and among the 1868 groups of mouse cis-NATs, 704 groups had unique SAGE tag assignments to all transcripts, including alternative forms in each group. We examined the expression levels of mouse cis-NATs in the same way we examined those of humans.

Comparison of cis-NATs of humans and mice at the nucleotide level:

To find cis-NATs conserved between humans and mice, we compared 46,675 human cDNA sequences with 35,486 mouse cDNA sequences and vice versa using the all-against-all FASTA (Pearson and Lipman 1988) procedure. An expected (E)-value cutoff of 1.0 × 10^−20 was used. We selected the human and mouse cDNA sequences that matched reciprocally with an E-value less than the square root of the lowest E-value as candidates of orthologs. When human cis-NATs in both strands of the genome are orthologous to mouse cis-NATs in both strands of the genome, we recognized the cis-NATs as conserved between human and mouse.

RESULTS

Length of the overlapping region of human cis-NATs affects their expression level:

To predict cis-NATs from human cDNA sequences, we searched 46,675 human cDNA sequences. A total of 8964 cis-NATs were predicted and were clustered into 2496 groups, each of which consisted of sense and antisense transcripts as well as their alternative forms. To examine the expression levels of human cDNA sequences, ~15.5 million NlaIII SAGE tags were collected from all the human tissues available in the NCBI SAGEmap database (November 2005) (Lash et al. 2000) and were compared to 46,675 human cDNA sequences. Among the 8964 (2496 groups) cis-NATs, 2038 (728 groups) had unique SAGE tag assignments to all transcripts in each group.
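As a rough illustration of how SAGE tags can be assigned to cDNA sequences, the sketch below derives the tag expected from each transcript and keeps only tags that match a single transcript. It assumes the standard NlaIII SAGE convention (a 10-bp tag immediately 3′ of the 3′-most CATG site); the function names and the simplification of ignoring sites too close to the 3′ end are assumptions, not a description of the authors' scripts.

```python
from collections import defaultdict


def sage_tag(cdna, anchor="CATG", tag_len=10):
    """Tag following the 3'-most anchor site, or None if no full tag exists."""
    seq = cdna.upper()
    pos = seq.rfind(anchor)  # 3'-most NlaIII site on this transcript
    if pos == -1:
        return None
    start = pos + len(anchor)
    tag = seq[start:start + tag_len]
    return tag if len(tag) == tag_len else None


def unique_tag_assignments(cdnas):
    """Map transcript name -> tag, dropping tags shared by several transcripts.

    cdnas: dict of transcript name -> cDNA sequence (5'->3', transcript strand).
    """
    by_tag = defaultdict(list)
    for name, seq in cdnas.items():
        tag = sage_tag(seq)
        if tag is not None:
            by_tag[tag].append(name)
    return {names[0]: tag for tag, names in by_tag.items() if len(names) == 1}
```

Because each tag is read from the transcript's own 5′→3′ sequence, a sense and an antisense transcript covering the same genomic site yield different tags, which is why the two can be quantified separately.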

To examine whether the length of overlapping regions in cis-NATs affects their expression level, we investigated the relationship between the expression levels of cis-NATs and the length of the overlapping region in the genome. It should be noted that Castillo-Davis et al. (2002) found that highly expressed transcripts tended to have short introns, implying that the short cis-NATs may be expressed at a higher level than the long cis-NATs. To eliminate the effect of the overall length of a transcript on its expression level, we selected nonoverlapping transcripts whose overall lengths in the genome were almost the same as that of a cis-NAT in the genome and then compared the expression level of the cis-NATs with that of the selected transcripts (see materials and methods).

In Figure 1, the y-axis represents the ratio of the expression level of the human cis-NATs to that of the selected human transcripts. The proportion of highly expressed cis-NATs (ratio >1.0) in all cis-NATs examined was 36% when the overlapping region was between 1 and 200 bp. However, the proportion decreased to virtually zero when the overlapping regions were >2000 bp long (chi-square P < 10^−15). This result suggests that the expression level of cis-NATs decreases as the length of the overlapping region increases. In addition, the expression levels of cis-NATs may be influenced by other factors such as alternative transcripts of cis-NATs, the difference of the expression levels among SAGE tag libraries, GC content bias of SAGE tags (Margulies et al. 2001), the experimental methods for the analysis of gene expression, the ways of selecting sense mRNAs, the criteria for the length of the overlapping regions, and a set of human cDNA sequences used in this analysis. However, these factors did not change the overall distribution of the expression levels significantly (see supplemental material at http://www.genetics.org/supplemental/).
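The proportion of highly expressed cis-NATs per overlap-length class can be tabulated as in the following Python sketch. The function name and the half-open binning by lower bounds are assumptions, and the sketch does not reproduce the chi-square test reported in the text.

```python
from bisect import bisect_right


def high_expression_by_bin(records, bin_edges, threshold=1.0):
    """Fraction of cis-NATs with ratio > threshold in each overlap-length bin.

    records: iterable of (overlap_length_bp, expression_ratio).
    bin_edges: ascending lower bounds; the last bin is open-ended.
    Returns a list of (lower_bound, n_total, fraction_high or None).
    """
    counts = [[0, 0] for _ in bin_edges]  # [n_total, n_high] per bin
    for length, ratio in records:
        idx = bisect_right(bin_edges, length) - 1
        if idx < 0:
            continue  # shorter than the first lower bound
        counts[idx][0] += 1
        if ratio > threshold:
            counts[idx][1] += 1
    return [
        (lo, n, (high / n if n else None))
        for lo, (n, high) in zip(bin_edges, counts)
    ]
```

With `bin_edges=[1, 200, 2000]`, the three bins cover overlaps of 1–199 bp, 200–1999 bp, and ≥2000 bp; these boundaries are only an example and would need to be matched to the classes actually plotted in Figure 1.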

Figure 1.
Distribution of the expression level of human cis-NATs as the overlapping region in the genome increased in length. The x-axis shows the length of the overlapping exon and intron regions of human cis-NATs in the genome. The y-axis shows the ratio of the ...

Overlapping pattern of human cis-NATs affects their expression level:

Cis-NATs are classified into three types on the basis of overlapping patterns in the genome: head to head, tail to tail, and full overlap (Figure 2). “Full overlap” describes cis-NATs in which the sense mRNA lies entirely within the antisense mRNA. The numbers of head-to-head, tail-to-tail, and full-overlap types of human cis-NATs were 254, 476, and 1766 groups, of which 126, 230, and 356 groups, respectively, had unique SAGE tag assignments to all transcripts in each group. To examine whether the overlapping pattern of cis-NATs affects their expression level, we analyzed the expression level of human cis-NATs according to the overlapping patterns in the genome. Figure 3, A–C, shows the expression levels of human cis-NATs in the head-to-head, tail-to-tail, and full-overlap manners, respectively. The highly expressed cis-NATs decreased in quantity as the overlapping region increased in length for all types of cis-NATs. However, highly expressed cis-NATs in head-to-head and full-overlap manners decreased in quantity more than those in a tail-to-tail manner did. When the length of the overlapping region was <600 bp, 26.7% of cis-NATs in a head-to-head manner showed high expression (ratio >1.0) and 43.4% of cis-NATs in a tail-to-tail manner showed high expression. The proportion of highly expressed cis-NATs in a head-to-head manner (26.7%) was 1.6-fold lower than that in a tail-to-tail manner (43.4%) (Mann–Whitney U-test: P < 10^−2). Similarly, when the length of the overlapping region was <600 bp, 24.5% of cis-NATs in a full-overlap manner showed high expression. The proportion of highly expressed cis-NATs in a full-overlap manner (24.5%) was 1.7-fold lower than that in a tail-to-tail manner (43.4%) (Mann–Whitney U-test: P < 10^−3). Among the 356 cis-NATs in a full-overlap manner, 314 were cis-NATs where a sense transcript overlapped only in the intron regions of the antisense transcript in the genome. The mRNAs that overlapped in the intron regions showed the same feature of expression.
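The three overlapping patterns can be distinguished from genomic coordinates alone, as in this Python sketch. The function name and the returned labels are assumptions. The sketch assumes that coordinates increase along the plus strand, so that the 5′-end of a plus-strand transcript is its start and the 5′-end of a minus-strand transcript is its end.

```python
def classify_overlap(plus, minus):
    """Classify the overlap pattern of a plus/minus-strand transcript pair.

    plus, minus: (start, end) genomic coordinates with start <= end.
    Returns "head to head", "tail to tail", "full overlap", or "none".
    """
    ps, pe = plus
    ms, me = minus
    if pe < ms or me < ps:
        return "none"  # the two transcripts do not overlap
    # One transcript lies entirely within the other.
    if (ms <= ps and pe <= me) or (ps <= ms and me <= pe):
        return "full overlap"
    # Plus-strand transcript lies upstream: the two 3'-ends overlap.
    if ps < ms:
        return "tail to tail"
    # Otherwise the minus-strand transcript lies upstream: 5'-ends overlap.
    return "head to head"
```

Under this convention, convergent transcription (plus-strand gene to the left of the minus-strand gene) gives a tail-to-tail pair, and divergent transcription gives a head-to-head pair.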

Figure 2.
Classification of cis-NATs and nearby transcripts on the basis of their relative positions in the genome. Cis-NATs are classified on the basis of their relative positions in the genome: (a) cis-NATs in a head-to-head manner (5′-end to 5′-end), ...
Figure 3.
Distribution of the expression level of human cis-NATs according to overlapping patterns in the genome. The x-axis shows the length of the overlapping exon and intron regions of human cis-NATs in the genome. The y-axis shows the ratio of the expression ...

As many as 1450 human transcripts were found to be located within a distance of <1 kbp in the genome (Adachi and Lieber 2002; Koyanagi et al. 2005). To examine whether the expression levels of nearby transcripts decreased, we investigated the expression levels of human nearby transcripts. Figure 3D shows the expression levels of nearby transcripts where the 5′-end of a transcript is near the 5′-end of another transcript in the genome. Here, we call them “nearby transcripts in a head-to-head manner” (Figure 2). When the distance between the 5′-ends of transcripts was < ~50 bp, highly expressed transcripts (a ratio >1.0) were not observed (chi-square P < 10^−7).

Figure 3E shows the expression levels of nearby transcripts where the 3′-end of a transcript is near the 3′-end of another transcript in the genome. Here, we call them “nearby transcripts in a tail-to-tail manner.” Contrary to nearby transcripts in a head-to-head manner, the expression levels of nearby transcripts in a tail-to-tail manner did not change, regardless of the distance of the nearby transcripts in the genome (chi-square P = 0.7).

It is possible that some human transcripts registered in databases such as Ensembl are shorter than the natural transcripts (Makalowska et al. 2005). Nearby transcripts found in the Ensembl database may therefore, in fact, overlap in the genome, so that the apparent decrease in their expression would be an artifact of annotation. However, almost all nearby transcripts in a head-to-head manner were expressed at a low level when the distance between the transcripts was < ~50 bp. Therefore, the decrease in the expression level of nearby transcripts is likely to be a natural phenomenon.

Overlapping arrangements of mouse cis-NATs affect their expression levels:

Cis-NATs have been predicted for various species (Wagner and Simons 1994; Vanhee-Brossollet and Vaquero 1998; Wagner and Flardh 2002; Makalowska et al. 2005). If the regulatory mechanisms of cis-NATs in gene expression are conserved among species, cis-NATs of another species are expected to show similar effects on their expression levels. To evaluate whether the relationship between the overlapping arrangements and the expression levels of cis-NATs is conserved among species, we examined the expression levels of mouse cis-NATs. Almost the same number of cis-NATs (1771 groups) as that of humans was found in mouse cDNA sequences (Kiyosawa et al. 2003; Yelin et al. 2003). Mouse cis-NATs were compared to 3.6 million NlaIII mouse SAGE tags in the NCBI SAGEmap database. Although the number of mouse SAGE tags and the number of mouse cis-NATs (705 groups) assigned to unique SAGE tags were smaller than for humans, the distribution of the expression levels of mouse cis-NATs showed the same features as that of humans (supplemental Figure 2A at http://www.genetics.org/supplemental/): highly expressed (ratio >1.0) cis-NATs decreased in quantity when the overlapping regions in the genome increased in length. The proportion of highly expressed cis-NATs in all cis-NATs examined was 47% when the overlapping region was between 1 and 200 bp, and the proportion decreased virtually to zero when the overlapping regions were >2000 bp long (chi-square P < 10^−15).

For overlapping patterns in the genome, the distribution of the expression levels in mice changed in the same way as in humans (Mann–Whitney U-test: P = 0.52 between human and mouse cis-NATs in a head-to-head manner, P = 0.92 between those in a tail-to-tail manner, and P = 0.17 between those in a full-overlap manner) (supplemental Figure 2, B–D, at http://www.genetics.org/supplemental/). When the length of the overlapping region was <600 bp, 27.6% of cis-NATs in a head-to-head manner showed high expression (ratio >1.0) and 44.1% of cis-NATs in a tail-to-tail manner showed high expression. The proportion of highly expressed cis-NATs in a head-to-head manner (27.6%) was 1.6-fold lower than that of those in a tail-to-tail manner (44.1%) (Mann–Whitney U-test: P < 10^−2). Similarly, 25.9% of cis-NATs in a full-overlap manner showed high expression (ratio >1.0) and the proportion of highly expressed cis-NATs in a full-overlap manner (25.9%) was 1.7-fold lower than that in a tail-to-tail manner (44.1%) (Mann–Whitney U-test: P < 0.05). With nearby transcripts, when the distance between nearby transcripts in a head-to-head manner was < ~50 bp, highly expressed transcripts were not observed, as found in humans (chi-square P < 10^−5) (supplemental Figure 2E at http://www.genetics.org/supplemental/). The expression levels of nearby transcripts in a tail-to-tail manner did not change, regardless of the distance of the nearby transcripts in the genome (chi-square P = 0.9) (supplemental Figure 2F at http://www.genetics.org/supplemental/). These results suggest that the expression levels of mouse cis-NATs are affected by the overlapping arrangements in the genome in the same way as those of humans. This implies that the regulatory mechanisms of cis-NATs in gene expression are conserved between humans and mice.

However, there was a possibility that cis-NATs showed a similar distribution of the expression level between humans and mice because most human and mouse cis-NATs were orthologous (Liao and Zhang 2006). To address this possibility, we compared the distribution of the expression levels of cis-NATs that are not conserved between human and mouse. First, we compared human and mouse cDNA sequences by using FASTA (Pearson and Lipman 1988) and found that only 329 groups (11.9%) of human cis-NATs were conserved in mouse cis-NATs, although 34,670 (76.0%) of human cDNA sequences were conserved in mice. Human and mouse cis-NATs were supposed to be highly divergent in terms of cDNA sequences (Veeramachaneni et al. 2004; Makalowska et al. 2005). We removed the 329 groups of cis-NATs from 2765 groups of human and 1704 groups of mouse cis-NATs, which left 710 groups of human and 605 groups of mouse cis-NATs with SAGE tags assigned to all transcripts in each group. We examined the expression level of the human and mouse cis-NATs according to the length and the pattern of the overlapping region in the genome. They showed almost the same distribution of the expression level as those including cis-NATs conserved between humans and mice (Mann–Whitney U-test: P = 0.22 and P = 0.88 for humans and mice, respectively). These results suggested that the similarity of the distribution of the expression level of human and mouse cis-NATs was not due to the conservation of the cDNA sequences of the cis-NATs.

DISCUSSION

We found that the expression level of cis-NATs changed according to the overlapping arrangements of cis-NATs in the human and mouse genomes. The expression level of cis-NATs decreased when the overlapping regions increased in length. Moreover, the overlapping pattern of cis-NATs affects their expression level. Nearby transcripts in a particular position decreased their expression levels.

Here, we examined the expression level of cis-NATs using SAGE tags and oligonucleotide arrays in public databases. We obtained the same distribution of the expression level of cis-NATs at least in the same tissue or cell such as fetal brain, embryonic stem cell, and liver (supplemental material and supplemental Figures 3 and 4 at http://www.genetics.org/supplemental/). However, all SAGE libraries and the expression data of oligonucleotide arrays were produced from cells and tissues of humans and mice, not from a single cell. In addition, some cis-NATs may be expressed only in some developmental stages or at a specific time. Therefore, some cis-NATs may not be expressed concurrently in the same single cell.

Thus far, three models have been proposed for the regulation of gene expression by cis-NATs. Forming double strands of cis-NATs requires a minimum 6- to 8-bp overlapping region of cis-NATs. We found that the expression level of human and mouse cis-NATs decreased progressively as the length of the overlapping region increased. However, there is currently no report that the double-strand model produces such a length-dependent decrease. In addition, this model is not intended to explain the change of the expression levels of nearby transcripts. Epigenetic regulations are also considered to be involved in the regulation of gene expression by cis-NATs. However, there is currently no report that, under this model, the expression level of human and mouse cis-NATs decreases progressively as the length of the overlapping region increases.

Our results are consistent with the transcriptional collision model that has been proposed following the analyses of adjacent transcripts and cis-NATs in yeast (Puig et al. 1999; Prescott and Proudfoot 2002) and the observation of the collision of RNA polymerases by atomic force microscopy (Crampton et al. 2006). RNA polymerases bind to the upstream regions of genes and synthesize mRNAs, moving toward the 3′-ends of the genes. When opposite genomic strands in the same locus encode complementary mRNAs like cis-NATs, an RNA polymerase bound on a strand of the genome collides with the RNA polymerase bound on the opposite strand during the transcription of both strands of mRNAs (Figure 4). This leads to the inhibition of transcription. From this model, the frequency of the collisions of RNA polymerases is expected to increase when the overlapping regions increase in length. Moreover, overlapping patterns in the genome would affect the frequency of the collisions of RNA polymerases. In the case of cis-NATs in a head-to-head manner, the 5′-ends of mRNAs are the start position for the transcription of mRNAs. Overlapping at the 5′-end would inhibit the initiation of transcription and decrease the expression level of the cis-NATs (Figure 3A). Contrary to cis-NATs in a head-to-head manner, overlapping at the 3′-end would not decrease the expression level significantly when the overlapping region is short (Figure 3B). In the case of cis-NATs in a full-overlap manner, both the 5′- and 3′-ends of a transcript are overlapped. This would decrease the expression level, even when the length of overlapping regions is short (Figure 3C). For nearby transcripts, highly expressed nearby transcripts in a head-to-head manner decreased in quantity when the distance between the nearby transcripts was <50 bp (Figure 3D). However, the level of highly expressed nearby transcripts in a tail-to-tail manner did not decrease (Figure 3E). 
These results would occur if the start or end positions of transcripts were close to each other. In the initiation of transcription, RNA polymerases bind to the start position of transcription of mRNAs and cover the region between 55 bp upstream and 20 bp downstream (−55 to 20) of the start position (Korzheva et al. 2000; Lee and Young 2000; Murakami et al. 2002). Therefore, these findings suggest that nearby transcripts in a head-to-head manner inhibited the binding of RNA polymerases to the upstream regions of the transcripts, when the distance between the nearby transcripts in a head-to-head manner was <50 bp. In addition, the model of transcriptional collisions for cis-NATs may explain an observation that experiments of Northern hybridization showed smear bands of mRNAs at the genomic regions where cis-NATs were located (Kiyosawa et al. 2005). This implies that various lengths of single-stranded mRNAs may be produced by the inhibition of the transcription and unusual movements of RNA polymerases.

Among the 2462 groups of human cis-NATs, 874 groups did not include alternative forms of sense and antisense mRNAs, and among the 874 groups, 542 (62%) groups consisted of cis-NATs where a sense mRNA overlapped only the intron regions of the gene encoding the antisense mRNA in the genome. As shown in Figures 1 and 3, the expression level of cis-NATs overlapping the intron regions also decreased as the length of the overlapping region increased. As for the regulatory mechanisms of gene expression by cis-NATs overlapping the intron regions, a double-stranded RNA-dependent mechanism would be difficult to use in explaining the decrease of the expression level of the cis-NATs, because they cannot form double strands of mRNAs after transcription and pre-mRNA splicing of mRNAs. Although double strands of mRNAs may be formed before pre-mRNA splicing after transcription, it is unclear whether it occurs. In the meantime, transcriptional collisions reasonably explain that cis-NATs overlapping the intron regions affected the expression of the cis-NATs.

Our observations are consistent with the transcriptional collision model. However, they do not exclude other mechanisms by which cis-NATs regulate gene expression. In addition to the regulation by general transcription factors, cis-NATs employ several regulatory mechanisms, including transcriptional collisions (Lavorgna et al. 2004). Our findings will be useful for examining and understanding the regulatory mechanisms of cis-NATs in gene expression and furthermore will help in elucidating the regulatory network of genes and their evolution.

Acknowledgments

We are grateful to members of the Laboratory for DNA Data Analysis at the National Institute of Genetics for discussion and comments on the manuscript. This work was supported by a research grant from the Institute for Bioinformatics Research and Development, Japan Science and Technology Agency, and by a grant of the Genome Network Project from the Ministry of Education, Culture, Sports, Science and Technology, Japan.

References

  • Adachi, N., and M. R. Lieber, 2002. Bidirectional gene organization: a common architectural feature of the human genome. Cell 109: 807–809. [PubMed]
  • Alfano, G., C. Vitiello, C. Caccioppoli, T. Caramico, A. Carola et al., 2005. Natural antisense transcripts associated with genes involved in eye development. Hum. Mol. Genet. 14: 913–923. [PubMed]
  • Carninci, P., T. Kasukawa, S. Katayama, J. Gough, M. C. Frith et al., 2005. The transcriptional landscape of the mammalian genome. Science 309: 1559–1563. [PubMed]
  • Carthew, R. W., 2006. Gene regulation by microRNAs. Curr. Opin. Genet. Dev. 16: 203–208. [PubMed]
  • Castillo-Davis, C. I., S. L. Mekhedov, D. L. Hartl, E. V. Koonin and F. A. Kondrashov, 2002. Selection for short introns in highly expressed genes. Nat. Genet. 31: 415–418. [PubMed]
  • Chen, J., M. Sun, L. D. Hurst, G. G. Carmichael and J. D. Rowley, 2005a. Genome-wide analysis of coordinate expression and evolution of human cis-encoded sense-antisense transcripts. Trends Genet. 21: 326–329. [PubMed]
  • Chen, J., M. Sun, L. D. Hurst, G. G. Carmichael and J. D. Rowley, 2005b. Human antisense genes have unusually short introns: evidence for selection for rapid transcription. Trends Genet. 21: 203–207. [PubMed]
  • Cheng, J., P. Kapranov, J. Drenkow, S. Dike, S. Brubaker et al., 2005. Transcriptional maps of 10 human chromosomes at 5-nucleotide resolution. Science 308: 1149–1154. [PubMed]
  • Coudert, A. E., L. Pibouin, B. Vi-Fane, B. L. Thomas, M. Macdougall et al., 2005. Expression and regulation of the Msx1 natural antisense transcript during development. Nucleic Acids Res. 33: 5208–5218. [PMC free article] [PubMed]
  • Crampton, N., W. A. Bonass, J. Kirkham, C. Rivetti and N. H. Thomson, 2006. Collision events between RNA polymerases in convergent transcription studied by atomic force microscopy. Nucleic Acids Res. 34: 5416–5425. [PMC free article] [PubMed]
  • Encode Project Consortium, 2004. The ENCODE (ENCyclopedia Of DNA Elements) Project. Science 306: 636–640. [PubMed]
  • Hubbard, T., D. Andrews, M. Caccamo, G. Cameron, Y. Chen et al., 2005. Ensembl 2005. Nucleic Acids Res. 33: D447–D453. [PMC free article] [PubMed]
  • Katayama, S., Y. Tomaru, T. Kasukawa, K. Waki, M. Nakanishi et al., 2005. Antisense transcription in the mammalian transcriptome. Science 309: 1564–1566. [PubMed]
  • Kent, W. J., 2002. BLAT: the BLAST-like alignment tool. Genome Res. 12: 656–664. [PMC free article] [PubMed]
  • Kiyosawa, H., I. Yamanaka, N. Osato, S. Kondo and Y. Hayashizaki, 2003. Antisense transcripts with FANTOM2 clone set and their implications for gene regulation. Genome Res. 13: 1324–1334. [PMC free article] [PubMed]
  • Kiyosawa, H., N. Mise, S. Iwase, Y. Hayashizaki and K. Abe, 2005. Disclosing hidden transcripts: mouse natural sense-antisense transcripts tend to be poly(A) negative and nuclear localized. Genome Res. 15: 463–474. [PMC free article] [PubMed]
  • Korneev, S., and M. O'Shea, 2005. Natural antisense RNAs in the nervous system. Rev. Neurosci. 16: 213–222. [PubMed]
  • Korneev, S. A., J. H. Park and M. O'Shea, 1999. Neuronal expression of neural nitric oxide synthase (nNOS) protein is suppressed by an antisense RNA transcribed from an NOS pseudogene. J. Neurosci. 19: 7711–7720. [PubMed]
  • Korzheva, N., A. Mustaev, M. Kozlov, A. Malhotra, V. Nikiforov et al., 2000. A structural model of transcription elongation. Science 289: 619–625. [PubMed]
  • Koyanagi, K. O., M. Hagiwara, T. Itoh, T. Gojobori and T. Imanishi, 2005. Comparative genomics of bidirectional gene pairs and its implications for the evolution of a transcriptional regulation system. Gene 353: 169–176. [PubMed]
  • Krek, A., D. Grun, M. N. Poy, R. Wolf, L. Rosenberg et al., 2005. Combinatorial microRNA target predictions. Nat. Genet. 37: 495–500. [PubMed]
  • Kumar, M., and G. G. Carmichael, 1997. Nuclear antisense RNA induces extensive adenosine modifications and nuclear retention of target transcripts. Proc. Natl. Acad. Sci. USA 94: 3542–3547. [PMC free article] [PubMed]
  • Kumar, M., and G. G. Carmichael, 1998. Antisense RNA: function and fate of duplex RNA in cells of higher eukaryotes. Microbiol. Mol. Biol. Rev. 62: 1415–1434. [PMC free article] [PubMed]
  • Lai, E. C., 2002. Micro RNAs are complementary to 3′ UTR sequence motifs that mediate negative post-transcriptional regulation. Nat. Genet. 30: 363–364. [PubMed]
  • Lapidot, M., and Y. Pilpel, 2006. Genome-wide natural antisense transcription: coupling its regulation to its different regulatory mechanisms. EMBO Rep. 7: 1216–1222. [PMC free article] [PubMed]
  • Lash, A. E., C. M. Tolstoshev, L. Wagner, G. D. Schuler, R. L. Strausberg et al., 2000. SAGEmap: a public gene expression resource. Genome Res. 10: 1051–1060. [PMC free article] [PubMed]
  • Lavorgna, G., D. Dahary, B. Lehner, R. Sorek, C. M. Sanderson et al., 2004. In search of antisense. Trends Biochem. Sci. 29: 88–94. [PubMed]
  • Lee, J. T., L. S. Davidow and D. Warshawsky, 1999. Tsix, a gene antisense to Xist at the X-inactivation centre. Nat. Genet. 21: 400–404. [PubMed]
  • Lee, T. I., and R. A. Young, 2000. Transcription of eukaryotic protein-coding genes. Annu. Rev. Genet. 34: 77–137. [PubMed]
  • Lehner, B., G. Williams, R. D. Campbell and C. M. Sanderson, 2002. Antisense transcripts in the human genome. Trends Genet. 18: 63–65. [PubMed]
  • Levine, M., and E. H. Davidson, 2005. Gene regulatory networks for development. Proc. Natl. Acad. Sci. USA 102: 4936–4942. [PMC free article] [PubMed]
  • Lewis, B. P., C. B. Burge and D. P. Bartel, 2005. Conserved seed pairing, often flanked by adenosines, indicates that thousands of human genes are microRNA targets. Cell 120: 15–20. [PubMed]
  • Liao, B. Y., and J. Zhang, 2006. Evolutionary conservation of expression profiles between human and mouse orthologous genes. Mol. Biol. Evol. 23: 530–540. [PubMed]
  • Makalowska, I., C. F. Lin and W. Makalowski, 2005. Overlapping genes in vertebrate genomes. Comput. Biol. Chem. 29: 1–12. [PubMed]
  • Margulies, E. H., S. L. Kardia and J. W. Innis, 2001. Identification and prevention of a GC content bias in SAGE libraries. Nucleic Acids Res. 29: E60. [PMC free article] [PubMed]
  • Mattick, J. S., 2001. Non-coding RNAs: the architects of eukaryotic complexity. EMBO Rep. 2: 986–991. [PMC free article] [PubMed]
  • Mattick, J. S., 2004. RNA regulation: A new genetics? Nat. Rev. Genet. 5: 316–323. [PubMed]
  • Mattick, J. S., and I. V. Makunin, 2006. Non-coding RNA. Hum. Mol. Genet. 15 (Spec. no. 1): R17–R29. [PubMed]
  • Moore, T., M. Constancia, M. Zubair, B. Bailleul, R. Feil et al., 1997. Multiple imprinted sense and antisense transcripts, differential methylation and tandem repeats in a putative imprinting control region upstream of mouse Igf2. Proc. Natl. Acad. Sci. USA 94: 12509–12514. [PMC free article] [PubMed]
  • Munroe, S. H., and M. A. Lazar, 1991. Inhibition of c-erbA mRNA splicing by a naturally occurring antisense RNA. J. Biol. Chem. 266: 22083–22086. [PubMed]
  • Murakami, K. S., S. Masuda, E. A. Campbell, O. Muzzin and S. A. Darst, 2002. Structural basis of transcription initiation: an RNA polymerase holoenzyme-DNA complex. Science 296: 1285–1290. [PubMed]
  • Pang, K. C., S. Stephen, P. G. Engstrom, K. Tajul-Arifin, W. Chen et al., 2005. RNAdb: a comprehensive mammalian noncoding RNA database. Nucleic Acids Res. 33: D125–D130. [PMC free article] [PubMed]
  • Pearson, W. R., and D. J. Lipman, 1988. Improved tools for biological sequence comparison. Proc. Natl. Acad. Sci. USA 85: 2444–2448. [PMC free article] [PubMed]
  • Peterson, J. A., and A. M. Myers, 1993. Functional analysis of mRNA 3′ end formation signals in the convergent and overlapping transcription units of the S. cerevisiae genes RHO1 and MRP2. Nucleic Acids Res. 21: 5500–5508. [PMC free article] [PubMed]
  • Potter, S. S., and W. W. Branford, 1998. Evolutionary conservation and tissue-specific processing of Hoxa 11 antisense transcripts. Mamm. Genome 9: 799–806. [PubMed]
  • Prescott, E. M., and N. J. Proudfoot, 2002. Transcriptional collision between convergent genes in budding yeast. Proc. Natl. Acad. Sci. USA 99: 8796–8801. [PMC free article] [PubMed]
  • Puig, S., J. E. Perez-Ortin and E. Matallana, 1999. Transcriptional and structural study of a region of two convergent overlapping yeast genes. Curr. Microbiol. 39: 369–373. [PubMed]
  • Reik, W., and J. Walter, 2001. Genomic imprinting: parental influence on the genome. Nat. Rev. Genet. 2: 21–32. [PubMed]
  • Shendure, J., and G. M. Church, 2002. Computational discovery of sense-antisense transcription in the human and mouse genomes. Genome Biol. 3: RESEARCH0044. [PMC free article] [PubMed]
  • Stark, A., J. Brennecke, N. Bushati, R. B. Russell and S. M. Cohen, 2005. Animal MicroRNAs confer robustness to gene expression and have a significant impact on 3′UTR evolution. Cell 123: 1133–1146. [PubMed]
  • Tufarelli, C., J. A. Stanley, D. Garrick, J. A. Sharpe, H. Ayyub et al., 2003. Transcription of antisense RNA leading to gene silencing and methylation as a novel cause of human genetic disease. Nat. Genet. 34: 157–165. [PubMed]
  • Vanhee-Brossollet, C., and C. Vaquero, 1998. Do natural antisense transcripts make sense in eukaryotes? Gene 211: 1–9. [PubMed]
  • Veeramachaneni, V., W. Makalowski, M. Galdzicki, R. Sood and I. Makalowska, 2004. Mammalian overlapping genes: the comparative perspective. Genome Res. 14: 280–286. [PMC free article] [PubMed]
  • Velculescu, V. E., L. Zhang, B. Vogelstein and K. W. Kinzler, 1995. Serial analysis of gene expression. Science 270: 484–487. [PubMed]
  • Wagner, E. G., and K. Flardh, 2002. Antisense RNAs everywhere? Trends Genet. 18: 223–226. [PubMed]
  • Wagner, E. G., and R. W. Simons, 1994. Antisense RNA control in bacteria, phages, and plasmids. Annu. Rev. Microbiol. 48: 713–742. [PubMed]
  • Wutz, A., O. W. Smrzka, N. Schweifer, K. Schellander, E. F. Wagner et al., 1997. Imprinted expression of the Igf2r gene depends on an intronic CpG island. Nature 389: 745–749. [PubMed]
  • Wyrick, J. J., and R. A. Young, 2002. Deciphering gene expression regulatory networks. Curr. Opin. Genet. Dev. 12: 130–136. [PubMed]
  • Yelin, R., D. Dahary, R. Sorek, E. Y. Levanon, O. Goldstein et al., 2003. Widespread occurrence of antisense transcription in the human genome. Nat. Biotechnol. 21: 379–386. [PubMed]

Articles from Genetics are provided here courtesy of Genetics Society of America
