Copyright © The Author 2008. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oxfordjournals.org

Coexpression of Linked Genes in Mammalian Genomes Is Generally Disadvantageous

Department of Ecology and Evolutionary Biology, University of Michigan
Corresponding author. E-mail: jianzhi/at/umich.edu.
Marta Wayne, Associate Editor
Accepted April 19, 2008.

Abstract

Similarity in gene expression pattern between closely linked genes is known in several eukaryotes. Two models have been proposed to explain the presence of such coexpression patterns. The adaptive model assumes that coexpression is advantageous and is established by relocation of initially unlinked but coexpressed genes, whereas the neutral model asserts that coexpression is a type of leaky expression due to similar expressional environments of linked genes, but is neither advantageous nor detrimental. However, these models are incompatible with several empirical observations. Here, we propose that coexpression of linked genes is a form of transcriptional interference that is disadvantageous to the organism. We show that even distantly linked genes that are tens of megabases away exhibit significant coexpression in the human genome. However, the linkage is more likely to be broken during evolution between genes of high coexpression than those of low coexpression and the breakage of linkage reduces gene coexpression. These results support our hypothesis that coexpression of linked genes in mammalian genomes is generally disadvantageous, implying that many mammalian genes may never reach their optimal expression pattern due to the interference of their genomic environment and that such transcriptional interference may be a force promoting recurrent relocation of genes in the genome.
Keywords: gene order, linkage, gene expression, coexpression, evolution, mammals Introduction Nonrandom distribution of genes in a genome, a widespread phenomenon in prokaryotes (Lawrence 1999), has also been observed in various eukaryotes (reviewed in Hurst et al. 2004). In mammals, linked genes sharing similar expression patterns are often referred to as a gene cluster. For example, clusters of highly expressed genes (Caron et al. 2001), tissue-specific genes (Megy et al. 2003; Versteeg et al. 2003), broadly expressed genes (Lercher et al. 2002), and coexpressed genes (Fukuoka et al. 2004; Singer et al. 2005; Semon and Duret 2006) have been observed in the human genome. The general phenomenon of coexpression of linked genes has also been reported in other model eukaryotes such as the yeast Saccharomyces cerevisiae (Coghlan and Wolfe 2000; Kruglyak and Tang 2000; Huynen et al. 2001; Fukuoka et al. 2004; Lercher and Hurst 2006), nematode Caenorhabditis elegans (Lercher et al. 2003; Fukuoka et al. 2004), and fruit fly Drosophila melanogaster (Boutanaev et al. 2002; Spellman and Rubin 2002; Bailey et al. 2004; Fukuoka et al. 2004; Kalmykova et al. 2005). However, it is unclear as to how and why linked genes become coexpressed. The observation that genes involved in the same pathway (Lee and Sonnhammer 2003) or protein complex (Teichmann and Veitia 2004) and genes having similar functions (Cohen et al. 2000) tend to be linked suggests that coexpression of linked genes may be important to gene function (Hurst et al. 2002; Singer et al. 2005). This view, referred to as the adaptive model, assumes that it is beneficial for genes that require coexpression to be brought together via chromosomal rearrangement (Miller et al. 2004; Richards et al. 2005; Singer et al. 2005). The model predicts that once a coexpressed gene cluster is established, the linkage of the coexpressed genes should be evolutionarily maintained by purifying selection (Hurst et al. 2002; Singer et al. 
2005). Observations of functional similarity of coexpressed linked genes would support the adaptive model. However, when protein function is defined by Gene Ontology (GO), a study of Drosophila did not find functional similarity among coexpressed neighboring genes (Spellman and Rubin 2002). In humans, clusters of coexpressed linked genes that belong to the same functional category, as defined by GO, are rare (Fukuoka et al. 2004). Furthermore, although the evolutionary conservation of linkage between coexpressed genes in several yeasts supports the adaptive model (Hurst et al. 2002), considering the recent discovery of long-range coregulation (~100 kb, covering ~30 genes) of linked yeast genes (Lercher and Hurst 2006), the adaptive model implies that the gene order in the yeast genome must be highly organized. However, the high plasticity of yeast gene order revealed from a comparison of 11 species (Fischer et al. 2006) argues against this view. In addition, it is well known that chromatin structures control the expression of nearby genes, regardless of whether these genes are functionally related or not (Hurst et al. 2004; Sproul et al. 2005). For instance, the CD79B antigen gene, which is located between the human growth hormone cluster and its locus control region on chromosome 17, is expressed in the pituitary, although its function appears B-cell specific (Cajiao et al. 2004). Thus, it is possible that similar expression of linked genes has no adaptive value. A recent study on mammalian coexpressed linked genes suggested that coexpressed gene clusters are formed by a neutral evolutionary process (Semon and Duret 2006). That is, expression similarity of linked genes is due to transcriptional interference (Eszterhas et al. 2002) and is not necessarily advantageous. 
Here, transcriptional interference refers to influence of transcription of one gene on the transcription of another gene and can be due to shared cis-regulatory elements or chromatin structures among other things. Our ad hoc use of transcriptional interference is different from a more narrow definition used elsewhere (Shearwin et al. 2005). The neutral model for the formation of coexpressed gene clusters (Semon and Duret 2006) implies that gene expression patterns are not functionally important and thus can change freely during evolution, which is exactly the neutral model of transcriptome evolution (Khaitovich et al. 2004). Although some early studies had favored this neutral model (Khaitovich et al. 2004; Yanai et al. 2004), these studies were later shown to have either technical problems or alternative interpretations (Liao and Zhang 2006a). On the contrary, there is increasing evidence that a considerable fraction of genes in a genome are evolutionarily conserved in expression (Nuzhdin et al. 2004; Denver et al. 2005; Jordan et al. 2005; Khaitovich et al. 2005; Rifkin et al. 2005; Liao and Zhang 2006a; Whitehead and Crawford 2006; Xing et al. 2007). Because coexpression of neighboring genes is a widespread phenomenon (Semon and Duret 2006), it is unlikely that such gene clusters can be formed without any influence on fitness. Hence, neither the adaptive model nor the neutral model can adequately explain the existence of coexpressed gene clusters. Here, we propose that coexpression of linked genes is due to transcriptional interference that is detrimental to the organism. We test our hypothesis in humans, exploiting the availability of a comprehensive spatial gene expression data set (Su et al. 2004). We examined coexpression patterns of closely and distantly linked genes in humans and counted evolutionary losses of gene linkage using multiple mammalian genomes. 
Lower evolutionary conservation of linkage is found for pairs of genes with high coexpression than those with low coexpression, consistent with the predictions of our hypothesis. Based on these findings, we propose a model of the origin and evolutionary dynamics of coexpression of linked genes. Materials and Methods Genome Data and Annotations The human genome assembly used in the present study is NCBI version 35, in which the position and orthology annotation (to mouse, rat, and dog) of 34,404 known or predicted genes can be found in Ensembl Archive release v37 (http://feb2006.archive.ensembl.org/). Genome annotations were retrieved through BioMart (http://www.biomart.org/). There were several annotated homology relationships between human and other mammalian genes by Ensembl. We only considered homologous gene pairs annotated as unique best reciprocal hit (UBRH, meaning that they were UBRHs in all-against-all BlastZ searches) to be orthologous. By this definition, 10,500 human autosomal genes were found to have unambiguous orthologs in mouse (NCBI v34), rat (RGSC 3.4), and dog (CanFam 1.0) genomes. Analysis of the Microarray Data We obtained the expression information of human genes and mouse genes from the Gene Atlas V2 data set (http://symatlas.gnf.org/SymAtlas/) (Su et al. 2004). This data set comprises oligonucleotide microarray data in 73 human and 61 mouse normal tissues. To assign the expression data from probe sets to corresponding Ensembl genes, probe sequences of each probe set were aligned to the Ensembl cDNA sequences (human: Homo_sapiens.NCBI35.feb.cdna.fa; mouse: Mus_musculus.NCBIM33.feb.cdna.fa; http://www.ensembl.org/info/data/download.html) using BlastN (http://www.ncbi.nlm.nih.gov/blast/). Only those probe sets in which all perfect match probes perfectly matched to the same Ensembl gene were considered to be valid. 
The expression level detected by each probe set was obtained as the signal intensity (S) computed from MAS 5.0 algorithm (MAS5) (Hubbell et al. 2002). The S values were averaged among replicates. It should be noted that some genes are represented by more than one probe set on the microarray. Because it was not possible to tell which probe set provides the best expression measure of a target gene (Liao and Zhang 2006a), we arbitrarily chose the probe set with highest expression level (Jordan et al. 2005), which was defined by the summation of S across all the examined tissues. As a result, 16,457 human and 16,134 mouse Ensembl genes were assigned with microarray gene expression data. Removal of Duplicate Genes Duplicated genes are expected to have similar expression patterns by ancestry, and such genes, if generated by tandem duplication, are often located in physical proximity to one other. The presence of tandem duplicate genes will artificially generate a negative correlation between the expression similarity of 2 linked genes and the physical distance between them. Furthermore, duplicate genes are subject to the problem of off-target cross-hybridization in gene expression measurement; removing duplicate genes further eliminates the coexpression pattern artificially generated by cross-hybridization. We followed the conventional approach (Lercher et al. 2002, 2003; Singer et al. 2005) to remove this known artifact: First, to identify proteins belonging to the same gene family, an all-against-all BlastP search was performed on the entire protein data set of a genome (for genes having more than one isoforms, the longest peptides were used). To be conservative in the analysis, pairs of proteins with Blast E values <0.2 were considered to be members of the same gene family (Lercher et al. 2002). We then generated a duplicate-free data set by randomly keeping one member of each gene family and removing all other members. 
Consequently, a subset of 4,857 human autosomal genes that have expression data was retained. By the same approach, a set of 5,384 mouse genes without duplicates was obtained. Some of our analyses require the use of human genes and their orthologs in mouse, rat, and dog genomes. This requirement reduces the number of duplicate-free human genes for the analysis by ~25% (from 4,857 to 3,681). To maintain the statistical power and keep our data set representative of the whole genome, we generated a tandem duplicate–free data set, which contains 7,577 human genes that have expression data and have orthologs in the other 3 mammalian genomes. This data set is larger than the above duplicate-free data set because we now allow duplicate genes that are located on different chromosomes.

Expression Profile Similarity between Linked Genes

Following Gu et al. (2002), we measured the level of coexpression between 2 linked genes (say A and B) by ln[(1 + R)/(1 − R)], where R is Pearson's correlation coefficient of signal intensity S across all the tissues examined. Higher ln[(1 + R)/(1 − R)] indicates a higher level of coexpression. Using R instead of ln[(1 + R)/(1 − R)] does not change any of our results qualitatively. The chromosomal distance (D) between linked genes was defined by the distance (in nucleotides) between the transcription starting sites of the 2 genes, as annotated by Ensembl. In figure 3 to [...], except for the first bin, which has D from 1 × 10^6 to 2 × 10^6. Use of other bin sizes did not change our results qualitatively.
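The coexpression measure defined above can be illustrated with a minimal Python sketch. It is not the authors' code: the function names `pearson_r` and `coexpression` and the toy profiles are assumptions made for illustration, and returning an infinite value when |R| = 1 is a choice of this sketch. Note that ln[(1 + R)/(1 − R)] equals twice the Fisher z-transform of R.

```python
import math


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a profile with zero variance has no defined correlation")
    return sxy / math.sqrt(sxx * syy)


def coexpression(profile_a, profile_b):
    """Return ln[(1 + R)/(1 - R)] for two across-tissue expression profiles."""
    r = pearson_r(profile_a, profile_b)
    if abs(r) >= 1.0:
        # The transform diverges at |R| = 1; this sketch returns +/- infinity.
        return math.copysign(math.inf, r)
    return math.log((1 + r) / (1 - r))


# Hypothetical signal intensities S of three genes across four tissues.
gene_a = [1.0, 2.0, 3.0, 4.0]
gene_b = [1.0, 2.0, 3.0, 5.0]   # similar profile to gene_a
gene_c = [4.0, 1.0, 3.0, 2.0]   # dissimilar profile
similar = coexpression(gene_a, gene_b)
dissimilar = coexpression(gene_a, gene_c)
```

In this toy case `similar` is positive and `dissimilar` is negative, matching the statement that higher ln[(1 + R)/(1 − R)] indicates a higher level of coexpression.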
Evolutionary Conservation of Linkage When a gene pair is linked in both human and dog genomes, we regard the linkage to be old (or ancestral). Here and elsewhere in this paper, linkage means that 2 genes are located in the same chromosome. Although it is possible that 2 previously unlinked genes became linked in human and dog independently, such events have low probabilities and can be ignored. We analyzed the subset of human gene pairs with old linkages. Within this subset, if the linkage for a gene pair is maintained in both mouse and rat genomes, the linkage is said to be “conserved”; otherwise, it is nonconserved, meaning that the linkage is lost in one or both rodents. Results Coexpression of Distantly Linked Human Genes It is important to first examine whether the phenomenon of coexpression of linked genes exists for both closely and distantly linked genes as such knowledge can help understand the relative importance of different molecular mechanisms responsible for the phenomenon (Hurst et al. 2004). Some studies have attempted to address this question by examining the on/off expressional status of linked genes (Lercher et al. 2002; Semon and Duret 2006), whereas other studies examined the correlation of across-tissue expression profiles of adjacent genes (Hurst et al. 2002; Singer et al. 2005). Adjacent genes are linked genes without any other genes in between. Because chromosomal rearrangements between sex chromosomes and autosomes are rare and sex-linked genes have special functions and expression profiles (Lahn et al. 2001; Wang et al. 2001), here we limit our analyses to autosomal genes. From the 4,857 duplicate-free human autosomal genes (see Materials and Methods), we obtained 4,835 adjacent gene pairs. Let D be the distance in nucleotides between a pair of linked genes in a chromosome. 
We find a significant correlation between logD and the level of coexpression (ln[(1 + R)/(1 − R)], see Materials and Methods) (Pearson's correlation coefficient r = −0.1385, P < 10^−21; Spearman's correlation coefficient ρ = −0.1364, P < 10^−20), indicating that closer adjacent human genes have higher similarity in spatial expression profiles. Because microarray data are known to be noisy, to reduce the effect of stochastic background noise, we group linked genes with similar D and calculate average ln[(1 + R)/(1 − R)] for each group. The aforementioned pattern can be seen more clearly with binned data (fig. 1A) [...] ln[(1 + R)/(1 − R)] (fig. 1B).
The power to decipher the effect of linkage on gene coexpression is limited if only adjacent genes are analyzed because there are few adjacent genes with large intervening distances (e.g., 713 adjacent gene pairs with D > 1 Mb and 10 with D > 10 Mb in our data set). We thus analyze pairs of linked genes, without requiring them to be adjacent to each other. From 4,857 duplicate-free human autosomal genes (see above), we obtain 518,133 linked gene pairs with the genomic distances ranging up to 100 Mb. We then group the gene pairs according to their D values (see Materials and Methods) and calculate the average ln[(1 + R)/(1 − R)] for each group. We observe a strong negative correlation between logD and ln[(1 + R)/(1 − R)] (Pearson's r = −0.7121, P < 10^−80; Spearman's ρ = −0.6227, P < 10^−56; fig. 2) [...] r = 0.0198, P = 0.654; Spearman's ρ = −0.0700, P = 0.1122; supplementary fig. S1, Supplementary Material online). Because the correlation observed in the real human genome is computed from the data points with D varying from 10 kb to 100 Mb (fig. 2) [...] ln[(1 + R)/(1 − R)] is significant in nearly every category for the real genome (table 1). To estimate the chance probability of observing these correlations, we generate 1,000 permuted genomes by randomly swapping the gene names of the expression profiles. The chance probability is the fraction of permuted genomes in which the correlation is more negative than the correlation observed in the real genome. The result shows that the probabilities are <0.001 in categories <1, 1–5, and 5–25 Mb (table 1), indicating that the phenomenon of coexpression of linked genes extends to a distance of tens of megabases in humans, which can harbor several hundred genes. In addition to D, we also measure the distance between 2 linked genes by the number (N) of intervening genes between them. Consistent with figure 2, [...] ln[(1 + R)/(1 − R)] is significantly negative (supplementary fig.
S2, Supplementary Material online), indicating that our observation does not depend on how the distance is measured and that linked genes with >100 intervening genes are still significantly coexpressed.
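The permutation procedure described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' pipeline: it correlates logD with coexpression over individual gene pairs rather than over binned averages, all names (`distance_correlation`, `permutation_p`) and the toy genome are hypothetical, and R is clamped just inside ±1 so the log transform stays finite.

```python
import math
import random


def pearson_r(x, y):
    """Pearson's correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def coexpression(p, q):
    """ln[(1 + R)/(1 - R)], with R clamped just inside (-1, 1)."""
    r = max(min(pearson_r(p, q), 0.999999), -0.999999)
    return math.log((1 + r) / (1 - r))


def distance_correlation(genes, profiles):
    """Correlation between log10(D) and coexpression over all linked pairs.

    genes: list of (chromosome, transcription_start) tuples.
    profiles: list of expression profiles, in the same order as genes.
    """
    log_d, coexpr = [], []
    for i in range(len(genes)):
        for j in range(i + 1, len(genes)):
            if genes[i][0] != genes[j][0]:
                continue  # only pairs on the same chromosome are linked
            d = abs(genes[i][1] - genes[j][1])
            if d > 0:
                log_d.append(math.log10(d))
                coexpr.append(coexpression(profiles[i], profiles[j]))
    return pearson_r(log_d, coexpr)


def permutation_p(genes, profiles, n_perm=1000, seed=0):
    """Fraction of permuted genomes with a more negative correlation."""
    rng = random.Random(seed)
    observed = distance_correlation(genes, profiles)
    shuffled = list(profiles)
    more_negative = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # reassign expression profiles to genes at random
        if distance_correlation(genes, shuffled) < observed:
            more_negative += 1
    return observed, more_negative / n_perm


# Toy genome: 8 genes on one chromosome whose profiles drift with position.
toy_genes = [("chr1", 1 + 1000 * i) for i in range(8)]
toy_profiles = [[math.cos(0.3 * i + 2 * math.pi * k / 8) for k in range(8)]
                for i in range(8)]
observed_r, chance_p = permutation_p(toy_genes, toy_profiles, n_perm=200)
```

Shuffling the profiles while keeping gene positions fixed preserves both the distribution of distances and the set of expression profiles, so only the association between position and expression is randomized.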
To examine whether the phenomenon of long-range gene coexpression is universal in mammals, we apply the same method for generating table 1 to the mouse data (see Materials and Methods). Although the correlation between logD and ln[(1 + R)/(1 − R)] is significant when D is <1 Mb and when D is 5–25 Mb, the negative correlations do not exist for the groups of 1–5, 25–50, and 50–100 Mb in mouse (supplementary table S1, Supplementary Material online).

Weaker Evolutionary Conservation of Linkage between Genes of Higher Coexpression

If the long-range coexpression of linked genes in humans is an outcome of adaptive evolution, the gene order in a large part of the human genome must have been highly organized and evolutionarily preserved. An important test of the hypothesis of functional relevance and adaptive value of coexpression of linked genes is to measure the evolutionary conservation of linkage. If coexpression of linked genes is favored by natural selection, the linkage should be maintained during evolution. If coexpression of linked genes is a neutral phenomenon without functional consequences, no difference in conservation of linkage is expected between gene pairs with high levels of coexpression and those with low levels of coexpression. If coexpression of linked genes is detrimental, the linkage of highly coexpressed genes should be broken more often during evolution than that of poorly coexpressed genes. To test these hypotheses, we utilize the tandem duplicate–free 7,577 human genes that have orthologs in each of the mouse, rat, and dog genomes (see Materials and Methods). Based on the mammalian phylogeny shown in figure 3A, [...] ln[(1 + R)/(1 − R)] values of the conservatively linked genes and nonconservatively linked genes within each group. The results show that, for nearly every D range, nonconservatively linked human genes have a higher degree of coexpression than conservatively linked human genes (fig. 3B and C) [...]

One interesting question is whether the selection against coexpression (or interference) only acts on weakly to moderately coexpressed linked genes but not on strongly coexpressed linked genes. To define strongly coexpressed genes, we plotted the distribution of ln[(1 + R)/(1 − R)] for all 1,521,714 linked gene pairs (from 7,577 tandem duplicate–free genes used in figure 3A–C) [...] ln[(1 + R)/(1 − R)] values falling within the top 5% of the distribution (fig. 3D) [...]

Our transcriptional interference hypothesis predicts that the breakage of linkage between 2 genes would reduce the degree of their coexpression. We examine the difference between the expression profile similarity of human-linked gene pairs and that of their mouse orthologs, by using 26 human–mouse common tissues. The full list of these 26 tissues can be found in a previous study (Liao and Zhang 2006a). Because coexpression of linked genes is much weaker in mouse than in human (supplementary table S1, Supplementary Material online), there is a general trend of reduction in expression profile similarity between a gene pair in mouse compared with that in human (fig. 4) [...]
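The linkage categories used in this section (old, conserved, and nonconserved linkage, as defined in Materials and Methods) can be expressed as a short Python sketch. The function name `classify_linkage`, the data layout, and the toy chromosome assignments are hypothetical and serve only to illustrate the rule: a pair linked in both human and dog is treated as an old linkage, and it counts as conserved only if it is also linked in both mouse and rat.

```python
def classify_linkage(gene_a, gene_b, chrom_by_species):
    """Classify a gene pair as 'conserved', 'nonconserved', or None.

    chrom_by_species maps a species name to a dict that gives, for each
    human gene identifier, the chromosome carrying its ortholog in that
    species. None is returned when the pair is not an old linkage.
    """
    def linked(species):
        chroms = chrom_by_species[species]
        return chroms[gene_a] == chroms[gene_b]

    # Old (ancestral) linkage: same chromosome in both human and dog.
    if not (linked("human") and linked("dog")):
        return None
    # Conserved only if the linkage is kept in both rodents.
    if linked("mouse") and linked("rat"):
        return "conserved"
    return "nonconserved"


# Toy chromosome assignments for three hypothetical genes.
toy_chromosomes = {
    "human": {"g1": "1", "g2": "1", "g3": "1"},
    "dog":   {"g1": "5", "g2": "5", "g3": "7"},
    "mouse": {"g1": "4", "g2": "4", "g3": "4"},
    "rat":   {"g1": "5", "g2": "8", "g3": "5"},
}
status_12 = classify_linkage("g1", "g2", toy_chromosomes)
status_13 = classify_linkage("g1", "g3", toy_chromosomes)
```

Here the g1–g2 linkage is old but lost in rat, so it is nonconserved, whereas g1–g3 is not linked in dog and is therefore excluded from the analysis. Comparing mean coexpression between the conserved and nonconserved groups within each distance bin would then follow the comparison described in the text.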
Some authors suggested that reduced recombination can ensure the physical proximity of linked genes (Pal and Hurst 2003) and high recombination tends to disrupt gene linkage (Poyatos and Hurst 2006). Therefore, one expects to observe lower recombination rates between highly coexpressed genes than between poorly coexpressed genes, if coexpression of linked genes is beneficial. However, our analysis of the human genome shows that highly coexpressed linked genes actually have higher recombination rates (cM/Mb) than poorly coexpressed linked genes (supplementary fig. S4, Supplementary Material online). Although recombination rate and chromosomal rearrangement may not be independent from each other (Akhunov et al. 2003; Lindsay et al. 2006), our observation again argues against the adaptive model and neutral model but is consistent with our hypothesis that coexpression of linked genes is detrimental. Discussion There are generally 3 molecular mechanisms that could cause the coexpression of linked genes (Hurst et al. 2004). At the primary level, cis-acting elements directly affect the transcription of neighboring genes (Cho et al. 1998; Kruglyak and Tang 2000). This mechanism will only affect genes within a few kilobases of one another. At the secondary level, histone modifications spread from a locus control region to cosuppress the transcriptional activities of several linked genes until reaching boundary elements (Labrador and Corces 2002). This type of coregulation affects regions of up to a few hundred kilobases. At the tertiary level, transcriptional coregulation can happen in 2 ways. First, genes with certain cis-acting elements can come together to form the node of chromatin loops during transcription; such special formation of aggregated cis-elements is called the active chromatin hub (ACH); genes close to the ACH are accessible to transcription, whereas genes looping out are inaccessible (de Laat and Grosveld 2003). 
Second, arrangement of chromatin in compact chromosome territories can affect transcription; transcription is largely restricted to territory surfaces but suppressed within the interior (Cremer T and Cremer C 2001). In both of these tertiary-level regulations, effects are expected to range up to several megabases.

In the present work, we first report the phenomenon of very long–range (up to tens of megabases) coexpression of linked genes in the human genome. Although this result might suggest the importance of tertiary-level transcriptional regulations in humans, to our knowledge, there is no mechanism that has been demonstrated to regulate coexpression of linked genes at such large distances. Is it possible that our observation is merely an artifact? One potential caveat is the design of the microarray chip that is used to generate the gene expression data. For example, yeast cDNA arrays are designed with the probes printed in genomic order, and it has been suggested that previously observed periodicity of expression patterns of genes located in a chromosome (Cho et al. 1998; Cohen et al. 2000; Kruglyak and Tang 2000) is due to the spatial order of probes on the array (Lercher and Hurst 2006). Because the expression data used here are produced from oligonucleotide microarrays for which the probe positions appear random (Su et al. 2004), the spatial bias that occurred in the yeast cDNA arrays cannot explain our observation. Another possible caveat is the potential unequal levels of coexpression of linked genes on different chromosomes. If the level of coexpression is higher in small chromosomes than in large chromosomes for a given D, the results of figure 2 [...] ln[(1 + R)/(1 − R)] (Pearson's r = −0.7123, P < 10^−80; Spearman's ρ = −0.6408, P < 10^−60) as in figure 2. [...]

Contrary to the hypothesis that coexpressed gene clusters correspond to large chromatin domains (Hurst et al. 2002; Roy et al. 2002; Hurst et al. 2004; Sproul et al.
2005), a recent study showed that coexpression of mammalian genes is mainly due to the coregulation of 2 genes by shared promoters (Semon and Duret 2006). Our result favors the hypothesis of gene coregulation by large domains, which is consistent with the discovery in yeast (Lercher and Hurst 2006). Different from our approach Semon and Duret (2006) followed the method used in Lercher et al. (2002) to measure the expression profile similarity of 2 linked genes by calculating how often they are simultaneously “turned on.” One explanation for the inconsistency of our results with that of Semon and Duret (2006) is the fact that transcriptional background only affects the relative gene expression levels across different tissues but not a change of the on/off status of a gene in a particular condition. In such cases, it is more sensitive to measure coexpression of 2 genes by Pearson's correlation coefficient R. Other drawbacks of using the on/off status to measure expression profile similarities from microarray data have been thoroughly discussed in an earlier paper (Liao and Zhang 2006b). Previous investigators have used evolutionary conservation of linkage to study the potential adaptive value of linkage of coexpressed genes, but they did not use outgroups to separate the formation of new linkages from the breakage of old linkages (Hurst et al. 2002; Singer et al. 2005; Semon and Duret 2006; Poyatos and Hurst 2007; Ranz et al. 2007). Hence, if a pair of highly coexpressed genes is observed to be linked in one genome (species A) but not in another (species B), it is often interpreted as a breakage of linkage in species B. In fact, this observation could also be due to the formation of the linkage in species A since the separation of the 2 species. These 2 scenarios cannot be differentiated without the use of an outgroup genome. 
In the present study, we use the dog as an outgroup to identify those gene pairs that were ancestrally linked in the common ancestor of primates, rodents, and carnivores. We found more interchromosomal rearrangements during rodent evolution for gene pairs with high coexpression in humans than those with low coexpression (fig. 3) [...]

Our observations suggest no adaptive value for clustering of coexpressed genes in the human genome in general. Rather, linked genes are coexpressed simply because they share a similar transcriptional background. The existence of large genomic regions with a similar transcriptional background implies that many mammalian genes may never reach their optimal expression profiles because of the interference of the surrounding genomic environment. It should be noted that some authors proposed that the linkage of coexpressed genes may represent lineage-specific transient adaptations (Poyatos and Hurst 2007; Ranz et al. 2007). Although this scenario remains possible, it is extremely hard to test by comparative approaches. Furthermore, this scenario is not contradictory to our finding that coexpression of linked genes is generally deleterious over long-term evolution.

Note that we do not suggest that eukaryotic gene order is completely random. Apart from the gene clusters formed by gene duplication or operons (Lercher et al. 2003; Hurst et al. 2004), many clusters of functionally related genes do exist, such as clusters of genes encoding organelle-related proteins (Lefai et al. 2000; Elo et al. 2003; Alexeyenko et al. 2006) and genes encoding proteins in the same protein complex (Teichmann and Veitia 2004). However, it should be noted that some of these clusters actually do not show a high degree of gene coexpression (Alexeyenko et al. 2006). Together with our finding, it is clear that the phenomenon of coexpression and similar function of linked genes should be considered separately.
A recent study showed that gene expression profile corresponds poorly to gene function (Yanai et al. 2006). Apparently, there are factors other than gene function that determine a gene's expression. Because evolutionary changes of gene expression may play a more significant role than changes of protein sequence in phenotypic evolution (King and Wilson 1975; Carroll 2005), identifying such factors is of fundamental importance to our understanding of evolution. Our result implies that a change in gene location can facilitate expression evolution, which is similar to what was previously known as the positional effect (Festenstein et al. 1996; Milot et al. 1996; Kleinjan and van Heyningen 1998).

Our hypothesis that coexpression of linked genes is detrimental raises an important question: if such coexpression is deleterious, how can it be fixed in the first place? Here, we propose a model to explain this seemingly paradoxical phenomenon. We propose that although coexpression of linked genes is generally detrimental, the “mutation” that generates coexpression as a by-product may initially be advantageous. Figure 5 [...]
Chromosomal rearrangement is just one way to remove the transcriptional interference of linked genes (fig. 3) [...]

Conclusions

The observations presented in this study are consistent with neither the adaptive nor the neutral model. The results support our hypothesis that coexpression of linked genes in the human genome is a form of deleterious transcriptional interference. Because all genes are located in the neighborhood of other genes, such interference may be mechanistically inevitable. As a consequence, the expression profile of a gene may never be optimized in evolution. Rather, transcriptional interference may be the source creating instability and dynamics of the mammalian gene order. In light of this finding, it will be of great interest to identify those few genes that are tightly linked across a large number of mammals or vertebrates, as such exceptional instances of conserved linkage (e.g., Hox clusters) likely indicate gene coregulations that are beneficial to the organisms.

Supplementary figures S1–S8 and table S1 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).
Acknowledgments

We thank Xionglei He, Wendy Grus, Ondrej Podlaha, Zhi Wang, and Patricia Wittkopp for valuable comments. This work was supported by research grants from University of Michigan Center for Computational Medicine and Biology and National Institutes of Health to J.Z.