Copyright © 2006, EMBO and Nature Publishing Group

Communication between levels of transcriptional control improves robustness and adaptivity

1 Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA, USA
2 Department of Systems Biology, Harvard Medical School, Boston, MA, USA
3 Laboratory of Information and Decision Systems, Massachusetts Institute of Technology, Cambridge, MA, USA
a Department of Systems Biology, Harvard Medical School, 200 Longwood Ave., WAB 536 Boston, MA 02115, USA. Tel: +1 617 432 6401; Fax: +1 617 432 5012; E-mail: pamela_silver/at/hms.harvard.edu
* Present address: Department of Biological Sciences, State University of New York at Buffalo, Buffalo, NY 14260, USA
† Present address: Department of Biochemistry, Stanford University, Stanford, CA 94305, USA

Received June 14, 2006; Accepted September 18, 2006.

Abstract Regulation of eukaryotic gene expression depends on groups of related proteins acting at the levels of chromatin organization, transcriptional initiation, RNA processing, and nuclear transport. However, a unified understanding of how these different levels of transcriptional control interact has been lacking. Here, we combine genome-wide protein–DNA binding data from multiple sources to infer the connections between functional groups of regulators in Saccharomyces cerevisiae. Our resulting transcriptional network uncovers novel biological relationships; supporting experiments confirm new associations between actively transcribed genes and Sir2 and Esc1, two proteins normally linked to silencing chromatin. Analysis of the regulatory network also reveals an elegant architecture for transcriptional control. Using communication theory, we show that most protein regulators prefer to form modules within their functional class, whereas essential proteins maintain the sparse connections between different classes.
Moreover, we provide evidence that communication between different regulatory groups improves the robustness and adaptivity of the cell. Keywords: ChIP-chip, network adaptivity, network robustness, nuclear organization, transcriptional network Introduction The nucleus provides several mechanisms for regulating gene expression at the levels of chromatin organization, transcriptional initiation, RNA processing, and selective export via the nuclear pore complex. Groups of proteins that mediate these processes have been extensively characterized to provide insight into their mode of action within a living cell. For example, chromatin-immunoprecipitation experiments in combination with microarrays (termed ChIP-chip) have mapped the genomic occupancy of several protein classes in living cells. Genome-wide identification of binding sites has allowed for the inference of which genes are regulated by such factors. A number of previous studies have used genomic localization data from transcription factors (TFs) in order to build transcriptional regulatory networks in Saccharomyces cerevisiae (Lee et al, 2002; Bar-Joseph et al, 2003; Garten et al, 2005; Balaji et al, 2006a, 2006b). Other work has implicated histone modifying proteins and nucleosome remodelers (NRs) in regulating different gene expression programs (Ng et al, 2002; Robyr et al, 2002; Robert et al, 2004). However, a unified model that integrates the genome-wide interplay of all of these different protein regulators remains undefined. Achieving such a model is hindered by several technical difficulties. For example, different labs devoted to the study of particular classes of proteins often use disparate microarray technologies and statistical approaches to decide what constitutes a bona fide binding site. 
Ideally, combining genome-wide binding data from different labs would not only uncover new connections within specific fields of study, such as cooperativity among TFs, but also between diverse fields, such as the effect of NRs on TF recruitment. Moreover, a unified model could allow for a global, systems-level description of the eukaryotic transcriptional architecture. Here, we combine and normalize ChIP-chip data from multiple sources to gain a unified view of the interplay between functional groups of proteins in the budding yeast S. cerevisiae. We propose that these functional groups define discrete levels of the eukaryotic transcriptional architecture (Figure 1A
Results Building a transcriptional network We obtained genome-wide binding data for 317 regulators during normal, glucose-rich growth conditions. As different groups use microarrays that are comprised of open reading frames (ORFs) or intergenic regions (IGRs), we integrated the heterogeneous data by assigning each ChIP-chip measurement to its pertinent annotated gene (see Materials and methods). Further, we normalized the data to mitigate variability in statistical analyses used by different groups (Figure 1B Using the normalized data, we inferred biological connections between any two regulators by identifying significant binding relationships (e.g., TF X and HM Y bind to a significantly similar/dissimilar number of genes). Using communication theory, we developed two methods for identifying pairwise connections between regulators. First, we introduced a simple but powerful technique of filtered correlation coefficient. This analysis localizes the correlation calculation to the most relevant genes in the ChIP-chip data and detects linear binding relationships with great sensitivity (see Materials and methods). In order to uncover more general, nonlinear binding dependencies, we also measured the mutual information—the amount of information gained about the binding profile of one factor from knowledge of the binding tendency of another factor (see Materials and methods). Mutual information is a very natural and biologically meaningful measure of binding dependence that will ultimately help us decide whether two proteins participate in the same biological process. 
Finally, we combined the P-values from the two complementary pairwise approaches in order to increase the confidence of our overall predictions (Figure 1C To identify significant binding relationships between three or more regulators, we developed a semi-supervised clustering algorithm that preserves information about elements of a cluster to better capture groupwise binding dependencies between proteins (see Materials and methods). Our algorithm identified 35 highly significant clusters (P<10⁻⁶⁰), merging factors from different levels of the transcriptional architecture. Of these clusters, 26 were confirmed by published literature (Supplementary Table 2), thereby indicating that using ChIP-chip data in this manner was a viable way to infer biological relationships between different proteins. Based on our pairwise and groupwise statistical methodologies, we created a network algorithm (Figure 1C and D
Validation of network Many of our predicted binding relationships that occur near DNA represent previously reported protein–protein associations. We compared our network predictions to protein–protein interactions from several high-throughput and small-scale studies (Yu et al, 2004a). Yu et al reported a compendium of physical interactions between all yeast proteins, including 309 interactions between proteins considered in our work. Although protein–protein associations may form anywhere within a cell, our predicted binding relationships occur only near DNA; hence, we did not expect full overlap with the data set of Yu et al (2004a). Despite the noise in protein–protein experiments, 100 out of the 309 pertinent connections detected by previous studies were also found to be significant by our method (P<10⁻⁶⁰; Figure 3A
Our network algorithm also identified over 340 biological relationships confirmed by published literature (Supplementary Table 3). For example, Figure 2 New biological predictions Our resulting network uncovered a novel connection between proteins implicated in active gene expression and silent information regulator Sir2 (Figure 2B Our observations of Sir2's occupancy at active genes suggest a coupling between nuclear transport factors and silencing proteins such as Sir2, which colocalizes with Rap1 to the nuclear periphery (Gotta et al, 1997). To explore this connection further, we experimentally measured the genome-wide binding of Esc1, another silencing protein known to localize to the nuclear periphery (Supplementary Table 4). The genomic occupancy of Esc1 closely resembles that of Sir2, including binding at subtelomeric regions and a significant number of active genes (Figure 3D We validated our genome-wide localization experiments using previous literature and new experiments. Lieb et al (2001) previously performed genome-wide localization analysis for Sir2 in glucose medium. Figure 4A
Network analysis We next analyzed the topology of the network to quantify the interplay between different regulatory levels as defined in Figure 1A
Network robustness Communication between regulatory levels improved the robustness of the eukaryotic transcriptional network. For comparison to the overall network, we synthesized six subnetworks composed solely of interactions between nodes from a single level. In each subnetwork, one largest connected component (the largest set of nodes that are interconnected through some path) emerged; however, the overall network connected 33 more regulators than the sum of the individual subnetwork largest connected components (Figure 5A In our overall transcriptional network, proteins preferred to form modular subunits within their own level and communicate with other regulatory groups in a more selective manner. We defined intra- and inter-class affinity as the percentage of interactions realized between regulators from the same and from different classes, respectively. Each regulatory group exhibited a much higher intra- than inter-class affinity, indicative of each group's inherent modularity (Figure 5A Modularity within levels helps localize the deleterious effect of a dysfunctional regulator to its level. The flat lines in Figure 5B and C Essential proteins comprised a significant proportion of the hubs that link levels of the transcriptional architecture. Excluding histone modifications, the network consisted of 56 essential and 230 non-essential proteins (Winzeler et al, 1999). Essential proteins were more highly connected than non-essential nodes, with an average degree of 16 versus 11, respectively. We determined the number of regulatory groups each node linked, or its neighboring levels. We found that 73% of the essential proteins linked two or more levels and 50% connected four or more levels, compared to 42 and 14% of non-essential nodes, respectively (P<3 × 10⁻⁵, P<4 × 10⁻⁸). For example, the essential TF Rap1 has been implicated in the recruitment of regulators from several levels to active genes (Lieb et al, 2001; Bernstein et al, 2004; Casolari et al, 2004).
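The significance of an overlap such as the one between essential proteins and level-linking hubs can be illustrated with a hypergeometric tail test, which the Materials and methods name for this purpose. The following is a minimal sketch, not the authors' MATLAB code: the function name `hypergeom_tail` is hypothetical, and the counts in the example are approximations back-calculated from the percentages quoted above (73% of 56 and 42% of 230), so they are assumptions rather than reported data.

```python
from math import comb


def hypergeom_tail(k, n, m, N):
    """Probability that two random subsets of sizes n and m, drawn from a
    superset of size N, share k or more elements (right hypergeometric tail)."""
    upper = min(n, m)
    total = comb(N, n)
    # Sum exact integer counts first, then divide once to limit rounding error.
    favourable = sum(comb(m, i) * comb(N - m, n - i) for i in range(k, upper + 1))
    return favourable / total


if __name__ == "__main__":
    # Illustrative, assumed counts: 286 proteins in total, 56 of them essential,
    # about 138 linking two or more levels, about 41 of those essential.
    p = hypergeom_tail(k=41, n=138, m=56, N=286)
    print(f"P(overlap >= 41) = {p:.3g}")
```

Working with exact integer binomial coefficients avoids underflow for very small tail probabilities; the single division at the end converts the result to a float.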
Network adaptivity Regulators that preferentially bound to either active or inactive genes had opposing topological characteristics that significantly differed from the rest of the nodes in our network. We defined a factor as active/inactive if it was bound to a significant fraction (P<10⁻¹⁰) of the 20% most/least frequently transcribed genes (Holstege et al, 1998). The 31 active factors in our network displayed a high average degree (P<10⁻⁵) and low characteristic path length and diameter (P<10⁻⁵), indicative of fast propagation of information between regulators of a highly responsive, global process (Figure 5A These results suggest a model whereby increased communication between levels at active genes may improve the adaptivity and redundancy of the cell's response to changing conditions (Figure 5D Discussion By combining genome-wide binding data, we have defined a transcriptional architecture for S. cerevisiae. Our normalization of ChIP-chip data extracted more information at known interactions while attenuating the noise at unlikely interactions. Moreover, we applied aspects of communication theory to identify the connections between different regulatory levels of the transcriptional network. In the process, we introduced mutual information, filtered correlation, and semi-supervised clustering approaches for analyzing genome-wide binding data. Previous literature confirmed the validity of our method. Further, our integrative network approach accurately predicted novel biological phenomena, including unexpected connections between actively transcribed genes and silencing proteins Sir2 and Esc1. We validated these associations using ChIP-chip and quantitative PCR experiments. Hence, our network predictions represent an in silico screen for discovering new biological processes. We analyzed the topology of the network to quantify the communication between levels of the transcriptional architecture.
Our work formally showed that TFs and HMs associated in more localized, pathway-specific regulation, whereas NTs, RPs, and NRs controlled more responsive, global processes. We also found that the overall network had higher connectivity than level subnetworks, making it more robust to single and sequential in silico deletions. Further, regulatory levels exhibited high intra-class modularity, which localizes the effect of deletions to each level. Essential proteins often form the connections between the highly clustered levels. Moreover, increased communication between levels expedites the propagation of information at active genes and may improve the speed and redundancy of the cell's response to dynamic environmental conditions. Taken together, communication between levels of transcriptional control improves the robustness and adaptivity of the eukaryotic cell. Several recent papers have focused on characterizing the regulatory effect of a single class of protein regulators, TFs. Combining ChIP-chip and other data for TFs, the authors discovered combinatorial relationships that control specific gene expression programs in S. cerevisiae. For example, co-occurrence of TF DNA-binding motifs and the distance between motifs at bound promoter sequences was used to predict interacting TF pairs. Presence of DNA-binding motifs, coexpression, and TF partnerships were also used to reduce false binding sites in ChIP-chip experiments and to predict TF occupancy in untested environmental conditions (Garten et al, 2005; Beyer et al, 2006). In two separate studies, Hwang et al (2005a, 2005b) developed a general method for integrating numerous types of data that provide evidence for TF regulation and used it to accurately reconstruct the regulatory network for galactose utilization in yeast. Finally, Balaji et al (2006a, 2006b) combined genomic localization data from over a hundred TFs and revealed an ‘over-engineered' distributed network architecture for TF co-regulation in yeast. 
In this work, we found interactions not only between TFs but also between all the proposed levels of transcriptional control. Hence, unlike previous studies, we unified ChIP-chip data from several different laboratories that use varying microarray technologies and statistical platforms for measuring protein–DNA binding. Our integrative method standardized all ChIP-chip data to P-values that measure confidence of protein–DNA interactions in a uniform manner. We next developed a method for comparing the normalized data in a biologically meaningful manner by combining P-values from two complementary techniques—filtered correlation and mutual information. Mutual information makes a hard decision on classifying the data as 0's and 1's before the analysis, filtering out likely false positives. In contrast, the filtered correlation cost function considers a soft and continuous version of the data, capturing many likely false negatives omitted by the mutual information analysis. Combining P-values integrates evidence from both analyses, thereby causing fewer errors in deciding whether two factors have a significant binding dependence. Our work shows that integrating ChIP-chip data can provide new insight into the eukaryotic transcriptional architecture as a whole while also predicting novel interactions between individual components. As more genome-wide localization data sets become available, we believe that the statistical methodology presented here can be extended to mammalian cells. Moreover, we expect that future time-dependent ChIP-chip experiments from different developmental stages will allow for a dynamic description of the transcriptional architecture in complex organisms. 
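The combination of P-values from the two pairwise analyses, discussed earlier in this section, can be sketched in a few lines. The Materials and methods name Fisher's method for this step; the code below is an illustrative sketch under that assumption, not the authors' MATLAB implementation, and the function name `fisher_combine` is hypothetical. It uses the closed-form tail of the chi-square distribution for an even number of degrees of freedom, so no statistics library is needed.

```python
from math import exp, log


def fisher_combine(p_values):
    """Combine independent P-values with Fisher's method.

    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution with
    2k degrees of freedom under the null hypothesis. For even degrees of
    freedom the tail probability is exp(-X/2) * sum_{i<k} (X/2)^i / i!.
    """
    k = len(p_values)
    half_stat = -sum(log(p) for p in p_values)  # equals X / 2
    term, series = 1.0, 1.0
    for i in range(1, k):
        term *= half_stat / i  # (X/2)^i / i!
        series += term
    return exp(-half_stat) * series


if __name__ == "__main__":
    # Two hypothetical pairwise P-values, e.g. one from filtered correlation
    # and one from mutual information.
    print(fisher_combine([1e-25, 1e-30]))
```

For two P-values the formula reduces to p1·p2·(1 − ln(p1·p2)), which makes clear that the combined value is small only when the product of the two inputs is small.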
Materials and methods Data integration We obtained published genome-wide binding (ChIP-chip) data for TFs (Lieb et al, 2001; Wyrick et al, 2001; Lee et al, 2002; Bar-Joseph et al, 2003; Harbison et al, 2004; Kurdistani et al, 2004), NTs (Casolari et al, 2004; Hieronymus et al, 2004; Yu et al, 2004b), RPs (Geisberg and Struhl, 2004; Hieronymus et al, 2004; Kim and Iyer, 2004; Moqtaderi and Struhl, 2004; Yu et al, 2004b), NRs (Damelin et al, 2002; Ng et al, 2002; Santos-Rosa et al, 2003; Gelbart et al, 2005), HMs (Lieb et al, 2001; Robyr et al, 2002; Wang et al, 2002; Ng et al, 2003; Robert et al, 2004), and HSs (Bernstein et al, 2002, 2004; Kurdistani et al, 2004). Factors were placed into groups according to each protein's primary annotated function. For ORF microarray data, we mapped the ChIP-chip information at each ORF to its corresponding annotated gene. For intergenic microarray data, where each intergenic region can control zero, one, or two genes, we assigned each DNA probe to the gene that it most likely regulates using a many-to-many mapping. This algorithm uses the union of intergenic probe–gene assignment pairs from several different groups (Ren et al, 2000; Lieb et al, 2001; Simon et al, 2001; Wyrick et al, 2001; Damelin et al, 2002; Ng et al, 2003; Bernstein et al, 2004; Geisberg and Struhl, 2004; Harbison et al, 2004; Moqtaderi and Struhl, 2004). Moreover, when two or more intergenic fragments mapped to the same gene, the probe that contains the most amount of information was chosen. As ChIP-chip experiments contain more information at the tails of the binding distribution, we chose the most-bound fragment for multiple probes that were consistently bound and the least-bound fragment for multiple probes that were consistently not bound. Data normalization To normalize the ChIP-chip data sets, we used P-values as a source of binding information (Supplementary Table 1). 
Most data sets calculated P-values based on the single array error model (Ren et al, 2000). To make data sets from our group consistent, we converted the P-values from two-sided to one-sided. To find the missing P-values for the remaining quarter of the data sets, we estimated the mean and variance of the log binding ratio distribution of the unbound population of genes using the left side of the overall log binding ratio distribution. Based on the estimate of the unbound distribution, we assigned a P-value for each observed binding ratio (see Supplementary information). To facilitate dissemination of our results and to stimulate further research, we have included our unified data sets and our Matlab code used for the analysis (Supplementary Table 6, Supplementary file MatlabCode.zip).

Filtered correlation and mutual information To calculate the filtered correlation coefficient between normalized data vectors x and y for two proteins, ρ(x,y), we used maximum likelihood estimators to find the means of x and y across all genes, and the filtered variance and covariance for x and y across only genes bound by either protein. We found ρ(x,y) using these quantities and used the Student's t-test statistic to assign P-values for ρ(x,y) (see Supplementary information). To estimate mutual information, we discretized x and y to binary data vectors of bound (1) and unbound (0) gene–protein interactions, by choosing a threshold that maximizes the information at 356 known interactions (Supplementary Table 7). Using the discrete data, we estimated the marginal and joint distribution for the binary (Bernoulli) binding profiles X and Y of any two proteins and found their mutual information I(X;Y) as follows:

$$I(X;Y) = \sum_{x \in \{0,1\}} \sum_{y \in \{0,1\}} p(x,y) \log \frac{p(x,y)}{p(x)\,p(y)}$$

Next, we computed the P-values for I(X;Y) estimates using a hypergeometric test statistic.
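The mutual information estimate for two binarized binding profiles can be sketched as below. This is a plug-in estimate from empirical frequencies, offered as an illustration rather than the authors' MATLAB code; the function name `mutual_information`, the use of base-2 logarithms (bits), and the toy profiles are assumptions.

```python
from collections import Counter
from math import log2


def mutual_information(x, y):
    """Plug-in estimate of I(X;Y) in bits for two equal-length binary vectors,
    where 1 marks a bound gene and 0 an unbound gene."""
    if len(x) != len(y) or not x:
        raise ValueError("profiles must be non-empty and of equal length")
    n = len(x)
    joint = Counter(zip(x, y))  # empirical joint counts
    count_x = Counter(x)        # empirical marginal counts for X
    count_y = Counter(y)        # empirical marginal counts for Y
    mi = 0.0
    for (a, b), c in joint.items():
        p_ab = c / n
        # Cells with zero joint count never appear in `joint`,
        # so they contribute nothing, as the definition requires.
        mi += p_ab * log2(p_ab / ((count_x[a] / n) * (count_y[b] / n)))
    return mi


if __name__ == "__main__":
    # Hypothetical binding profiles over eight genes.
    factor_a = [1, 1, 1, 0, 0, 0, 0, 0]
    factor_b = [1, 1, 0, 0, 0, 0, 0, 1]
    print(mutual_information(factor_a, factor_b))
```

Because the estimate depends only on the joint frequency table, it captures both positive dependence (shared targets) and negative dependence (mutually exclusive targets), which matches the motivation given in the Results for using it alongside the filtered correlation.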
Finally, we used Fisher's method to combine the P-values from the two complementary pairwise approaches and obtained an overall P-value (PVT) for the pairwise dependence between two proteins (see Supplementary information).

Semi-supervised clustering Unlike hierarchical clustering, our semi-supervised clustering algorithm maintains information about the groupwise relationship between elements in each cluster (see Supplementary information). The algorithm keeps track of two groupwise information vectors (the average binding profile $x_k$ of all joined proteins in each cluster $C_k$ and the fraction of factors that occupy each gene, $f_k$) and uses these vectors to calculate the pairwise P-values (PVT) between partitions. At the start, the algorithm treats each of N elements (protein binding vectors) as a cluster and proceeds for N−1 iterations. At each iteration, the algorithm joins the two most similar partitions, based on the smallest pairwise P-value distance, until all N elements are unified into one partition. When merging two clusters $C_k$ and $C_l$ into cluster $C_o$, the algorithm updates the groupwise information vectors as size-weighted averages,

$$x_o = \frac{|C_k|\,x_k + |C_l|\,x_l}{|C_k| + |C_l|}, \qquad f_o = \frac{|C_k|\,f_k + |C_l|\,f_l}{|C_k| + |C_l|}$$

where $|C_k|$ represents the size (cardinality) of cluster $C_k$. To identify only highly significant clusters, we use a P-value threshold of 10⁻⁶⁰, more stringent than the pairwise threshold of 10⁻⁴⁰ for connecting nodes in the network.

Hypergeometric P-values To find the probability that k or more elements intersected subsets of n and m members at random (or the P-value for overlap of k) in a superset of size N, we summed over the right tail of a hypergeometric distribution:

$$P(\text{overlap} \ge k) = \sum_{i=k}^{\min(n,m)} \frac{\binom{m}{i}\binom{N-m}{n-i}}{\binom{N}{n}}$$

We used this method to measure the significance of overlaps between essential proteins and hubs, for the network validation in Figure 3

ChIP-chip and quantitative PCR ChIP-chip experiments for Sir2 and Esc1 were performed essentially as described (Casolari et al, 2004). All the experiments were performed in biological triplicates.
Immunoprecipitation was performed as described previously (Casolari et al, 2004). ChIPs were performed in biological duplicates as previously described (Lei and Silver, 2002) with the modification of using Dynal beads instead of Sepharose beads during immunoprecipitation. For immunoprecipitations, monoclonal anti-Myc (9E11, Santa Cruz) antibody was pre-coupled to pan-mouse IgG Dynal beads (Dynal Co.) followed by extensive washing. Immunoblotting was performed to confirm consistent protein levels and immunoprecipitation efficiency in each experiment. For quantitative PCR, primer sets spanning predicted novel associated genes were used. The results were compared against signals from an intergenic region to determine the magnitude of enrichment. The sequences of primers used were as follows: PGK1, GGACTTGAAGGACAAGCGTGTC and GCAATTCCTTAGCAACTGGAGCC; ILV5, AGATTGATCTGCAACTCCCGTG and ACCTTGGGAACCGTAACCGATC; RPA34, CGAGTTCAGCATACCAGATGG and CATTATCCTTGGCAGTGCTAGC; TEF2, CGGTCATGTCGATTCTGGTAAG and TCTCTGTGACCTGGAGCATC; intergenic region, GAAAAAGTGGGATTCTGCCTGTGG and GTTTGCCACAGCGACAGAAGTATAACC.

Network analysis Single in silico deletion for each protein regulator (i.e., no deletion of HSs) involved removing the protein's node and all links connected to it in both the overall network and the pertinent subnetwork. For each in silico deletion in both perturbed networks, we calculated the number of resulting disconnected nodes from the same/different level as the removed node. Sequential attacks against TFs involved removing nodes in a sequential and cumulative manner, starting with the most highly connected TF and proceeding in a descending order (Albert et al, 2000). Choosing the order of sequential deletions at random did not affect the overall conclusions. To find P-values for a measured topology of a class of m regulators (e.g., active factors), we repeated the network analysis for m randomly selected regulators in 10⁵ independent trials.
We counted the number of times, n, the same or more significant network topology occurred and assigned a P-value of n/10⁵. All the network analysis results remained consistent after incorporating corrections for level size and negative links (Supplementary Figure 3).

Supplementary Methods (204K, pdf)
Supplementary Tables and References (102K, pdf)
Supplementary Figure 1 (7.5M, pdf)
Supplementary Figure 2 (197K, pdf)
Supplementary Figure 3 (692K, pdf)
Supplementary Table 4 (999K, xls)
Supplementary Table 5 (334K, xls)
Supplementary Tables 1, 2, 3, 7, and 8 (86K, xls)
MATLAB code (348K, zip)
Supplementary Table 6 (45M, zip)

Acknowledgments We thank David Gifford, Tommi Jaakkola, Manolis Kellis, Robin Dowell, Sourav Dey, and Obrad Scepanovic for their keen insight. We thank Jessica Hurt, Natalie Farny, and Jake Wintermute for critical reading of the manuscript. This work was supported by NDSEG and NSF fellowships to AMT, Ryan scholarship to CRB, NIH post-doctoral fellowship to MCY, Charles Stark Draper Endowment to MZW, and NIH grants to PAS.