Copyright © 2006, Cold Spring Harbor Laboratory Press

A molecular-properties-based approach to understanding PDZ domain proteins and PDZ ligands

1 Massachusetts General Hospital, Center for Computational and Integrative Biology, Harvard University Medical School, Boston, Massachusetts 02114, USA
2 Massachusetts General Hospital, Gastrointestinal Unit, Harvard University Medical School, Boston, Massachusetts 02114, USA
3 Broad Institute of MIT and Harvard University, Cambridge, Massachusetts 02139, USA
4 Cardiovascular Division, Department of Medicine, Brigham and Women’s Hospital, Harvard University Medical School, Boston, Massachusetts 02115, USA
5 Corresponding author. E-mail Xavier/at/molbio.mgh.harvard.edu; fax (617) 643-3328.

Received March 6, 2006; Accepted May 8, 2006.

Abstract

PDZ domain-containing proteins and their interaction partners are mutated in numerous human diseases and function in complexes regulating epithelial polarity, ion channels, cochlear hair cell development, vesicular sorting, and neuronal synaptic communication. Among several properties of a collection of documented PDZ domain–ligand interactions, we discovered embedded in a large-scale expression data set the existence of a significant level of co-regulation between PDZ domain-encoding genes and these ligands. From this observation, we show how integration of expression data, a comparative genomics catalog of 899 mammalian genes with conserved PDZ-binding motifs, phylogenetic analysis, and literature mining can be utilized to infer PDZ complexes. Using molecular studies we map novel interaction partners for the PDZ proteins DLG1 and CARD11. These results provide insight into the diverse roles of PDZ–ligand complexes in cellular signaling and provide a computational framework for the genome-wide evaluation of PDZ complexes.
The 90-amino-acid PDZ domain found in Caenorhabditis elegans, Drosophila melanogaster, and mammalian genomes takes its name from the first three PDZ-containing proteins identified: the post-synaptic density protein PSD-95/SAP90, the Drosophila tumor suppressor and septate junction protein Discs-large, and the mammalian epithelial tight junction protein zonula occludens-1 (ZO-1) (Kennedy 1995). The structural features of PDZ domains permit them to mediate specific protein–protein interactions, which assemble large protein complexes involved in polarity, vesicle transport, phototransduction, ion channel signaling, and synaptic signaling (Sheng and Sala 2001; Nourry et al. 2003; van Ham and Hendriks 2003; Macara 2004). A single PDZ protein may participate in different aspects of cell polarization, suggesting that developmental timing, cellular context, and multiple binding partners are critical regulators of its multidimensional usage (Betschinger et al. 2003; Betschinger and Knoblich 2004). The importance of understanding PDZ proteins is underscored by the fact that disrupting or deregulating PDZ domain-containing proteins or their ligands results in >20 human Mendelian diseases, while mutational screens suggest that PDZ proteins such as DLG1 may be critical in epithelial tumorigenesis (Bilder 2004; Fuja et al. 2004; Wang et al. 2004; Stephens et al. 2005). PDZ domains bind to proteins via several mechanisms, the most common of which is binding to one of three classes of consensus carboxy-terminal motifs, although in a limited number of cases binding of PDZ domains to internal sites has been described (Songyang et al. 1997; Nourry et al. 2003; Penkert et al. 2004). Within a PDZ protein itself, the affinity of a particular PDZ domain for its corresponding ligand can be coupled to the engagement of protein partners located at neighboring PDZ or other domains, supporting complex temporal and hierarchical control of PDZ complexes in vivo (Penkert et al.
2004; Peterson et al. 2004). To generate a resource to study the interaction between PDZ proteins and PDZ ligands, we sought to integrate the protein recognition code of PDZ domains with publicly available genomic data sets. Motivated by our observation that 96% of PDZ-binding motifs were conserved across three mammalian species in a collection of literature-curated PDZ–ligand interactions, we systematically discovered a genome-wide set of 899 genes encoding classical PDZ-binding motifs conserved across these three species (the PDZ Conserved Binding Motif proteome, or PDZCBM). Uniquely, we also considered the possibility that expression profiles contain a specific enrichment in co-expression between the set of genes encoding a particular domain and the set encoding the respective cognate binding motif(s). We tested for and found connectivity at the level of mRNA, reflected by co-regulation between PDZ domain proteins and PDZ ligands. As a result, we provide an integrated view of PDZ and the PDZCBM with respect to co-expression patterns, cellular localization, interologs, and literature co-citation profiles to enable the prediction of known and novel PDZ complexes.

Results

To gain insights into PDZ-mediated biological processes, we developed a schema outlined in Figure 1.

Expression analysis of the human PDZ gene complement

Next, using gene expression data from 79 human tissues, we examined the tissue/cell expression distribution of the PDZ domain-encoding genes, since, for many genes, detailed expression patterns had not been previously reported (Su et al. 2004). See Figure 2A.

Characterization of PDZ–ligand interactions via a published literature survey

PDZ-binding motifs generally fall into three established motif classes, based on the residues at positions −2 and 0 with respect to the carboxyl amino acid of a protein (Sheng and Sala 2001; Nourry et al. 2003). In order to standardize definitions, we refer to a carboxy-terminal motif as “canonical” if the carboxy-terminal residues matched those shown in Figure 1. We next assessed the level of conservation of the PDZ-binding motif for each ligand in human, mouse, and rat via a reciprocal best-hit strategy. Among the 137 ligands harboring canonical carboxy-terminal motifs, the PDZ-binding motif was conserved in 95.6% (131/137) of the cases. In the remaining 4.4% (6/137) of cases, the ligand contained a canonical PDZ-binding motif, but only in the species in which the interaction was originally identified. Furthermore, among the 34 ligands containing non-canonical motifs, the PDZ motif was conserved in all cases. Taken together, these data suggest a bias for bona fide interactions to occur via conserved carboxyl motifs.

Systematic discovery of carboxy-terminal motifs identifies PDZ ligand motifs

The functional importance of the carboxy-terminal motif in PDZ domain recognition and the high level of conservation observed among verified ligands led us to investigate whether the carboxy-terminal PDZ-binding motif was actually common or rare as compared with other conserved carboxy-terminal tail motifs in mammalian genomes. To address this question in a systematic fashion, we started with the collection of 22,737 human reference mRNAs from the RefSeq database (downloaded from UCSC Genome Browser, 04/2005 freeze) and, for these, constructed genome-wide alignments among three other mammalian species (mouse, rat, dog) as described previously (Xie et al. 2005).
Ultimately, 13,913 mRNAs mapping to 11,044 unique genes could be aligned among human, mouse, dog, and rat based on the requirement that, upon translation, the stop codons for all four species occurred at the identical amino acid position. Since the three canonical PDZ-binding motifs have twofold degeneracy at the 0 and −2 positions, we examined the number of conserved instances of all possible twofold degenerate motifs at these positions in our database of aligned reference sequences. Theoretically, there are 20^4, or 160,000, possible consensus motifs in this search space (i.e., two amino acid positions, each with twofold degeneracy), and we observed at least one conserved instance of 44,099 out of the possible 160,000 motifs in our collection of sequences. For each of the 44,099 motifs, we counted the number of conserved instances, thereby generating a rank-ordered list of conserved twofold degenerate carboxy-terminal motifs. We called the carboxyl terminus conserved if sequences at positions −2 and 0 of all four species satisfied the consensus motif. In addition, we counted the frequency of all 399 observed conserved non-degenerate motifs at positions −2 and 0. Strikingly, the Class I motif was ranked number one among all possible twofold degenerate motifs, with (S/T) X (L/V) having 391 conserved instances. Furthermore, we observed that Class I tails occupied four out of the five top-ranked non-degenerate conserved motifs. These data imply that PDZ-binding motif Class I is likely to be the most strongly conserved carboxy-terminal binding motif in mammalian species (Fig. 2C).

Discovery of a genome-wide set of 899 potential PDZ ligands

Non-canonical motifs can interact with PDZ domains; however, the majority of interactions identified in our examination of the published PDZ–ligand interactions occurred via conserved canonical PDZ-binding motifs.
We therefore chose to focus on this subset in order to identify a more complete set of potential PDZ ligands. Specifically, we searched the International Protein Index (IPI) database for proteins with carboxy-terminal sequences that contained one of the three canonical PDZ-binding consensus motifs and for which rat or mouse orthologous proteins also contained the motif. This approach identified 505, 165, and 229 genes that encode proteins containing conserved Class I, II, and III PDZ-binding consensus motifs, respectively. This will be referred to as the PDZ Conserved Binding Motif (PDZCBM) gene set (n = 899) (Fig. 1).

The PDZCBM gene set: Cellular localization, functional annotation, and tissue distribution

Next, we characterized the PDZCBM gene set in parallel with the known literature ligands, analyzing binding motif distributions, cellular localization, and functional annotation using the Gene Ontology classification system. Such an analysis allows us to evaluate the extent to which our comparative genomics strategy recovered a set of genes that followed the characteristics of the literature PDZ ligands. Like the ligands previously reported in the literature, the majority of the PDZCBM gene set was found to have Class I motifs (56% of PDZCBM genes vs. 72% of literature ligands), with a similar proportion of Class II motifs (18% vs. 21%). Class III genes were not compared directly, with only five ligands of this class reported in the literature to date (Itoh et al. 1999; Mancini et al. 2000; Jelen et al. 2003). Structurally, two-thirds (66%) of Class I literature ligands are predicted by the TMHMM algorithm to harbor transmembrane domains, along with 48% of the proteins encoded by the PDZCBM gene set. The proportion of Class II motif genes with predicted transmembrane domains in the literature versus PDZCBM sets was 19% and 43%, respectively.
Aggregating all motif classes, both gene sets are significantly enriched in membrane localization in comparison with the human proteome based on Gene Ontology compartment annotation (GO: Integral to Membrane P-value = 8.64 × 10^−29 [literature] vs. 4.12 × 10^−23 [PDZCBM]; Bonferroni-corrected Fisher’s exact test; Fig. 3A).
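
The enrichment test cited above can be illustrated with a short sketch. It computes a one-tailed hypergeometric tail probability (the one-sided form of Fisher's exact test) and applies a Bonferroni correction. The function names and the example counts are hypothetical and chosen only for illustration; they are not the values or the code used in this study.

```python
from math import comb


def hypergeom_upper_tail(k, K, n, N):
    """P(X >= k) for a hypergeometric draw.

    N: genes in the background proteome
    K: background genes carrying the annotation
    n: genes in the test set (e.g. a ligand gene set)
    k: test-set genes carrying the annotation
    """
    upper = min(K, n)
    favorable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favorable / comb(N, n)


def bonferroni(p_value, n_tests):
    """Bonferroni-corrected P-value, capped at 1."""
    return min(1.0, p_value * n_tests)


# Hypothetical counts, for illustration only.
p = hypergeom_upper_tail(k=2, K=4, n=3, N=10)
print(p, bonferroni(p, n_tests=5))
```

Exact integer arithmetic via `math.comb` avoids any dependency beyond the standard library; for proteome-scale counts a statistics package would normally be used instead.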
To examine in an unbiased fashion the extent to which the literature and PDZCBM gene sets shared similar functions, GO biological process criteria were applied (see Figure 3B). Finally, we assessed the expression pattern of the PDZCBM gene set across the GNF tissue compendium. Interestingly, modules of strong neuronal and immune expression were observed for the PDZ genes themselves, suggesting sub-networks of PDZ–ligand genes having specific function in these tissues (Fig. 3C).

Protein domains of PDZCBM: Enrichment in RhoGEFs and RhoGAPs

The protein domain composition of the PDZCBM was next examined to identify domains that functionally cooperate with PDZ proteins (see Supplemental Table 3 for annotation of PFAM domains for each PDZCBM gene). This analysis was performed by comparing the distribution of PFAM domains found in the PDZCBM with that of the human proteome using a one-tailed Fisher’s exact test (cumulative hypergeometric probability distribution) to calculate the P-values. For the 12.1% (116/899) of the PDZCBM proteome not characterized by a PFAM domain, BLASTp analysis was performed against the non-redundant (nr) proteome database at NCBI, revealing homology with known proteins in some instances. The PDZCBM was enriched (P < 0.05) in several PFAM domains such as ion transport, immunoglobulin, and PH domains, many of which are typically found in proteins known to participate in PDZ-mediated processes and in the literature set of ligands (Supplemental Table 7). Interestingly, there was also a set of domains that was enriched in the PDZCBM but not in the literature subset, suggesting that the larger data set may yield additional insights into the functionality and mechanism of PDZ complexes (Fig.
3E).

Gene family analysis of proteins encoding conserved PDZ-binding motifs

The previously reported literature suggests that PDZ proteins often interact with multiple members of a gene family to execute their functions, as in the case of membrane conductance (e.g., Ca2+ and K+ ion channels), neuronal synaptic communication (e.g., glutamate receptors), and epithelial adhesion (e.g., claudins) (Balda and Matter 2000; Kim and Sheng 2004). Conversely, the presence of multiple gene family members with conserved PDZ-binding motifs may suggest that such a family functions as PDZ ligands. We define a gene family as a group of genes that share significant sequence similarity with common domain architectures. We therefore examined the extent to which novel ligands in the PDZCBM belonged to known PDZ-binding families. To this end, multiple protein sequence alignment was performed on the PDZCBM proteome using CLUSTALW, and phylogenetic trees were derived by neighbor-joining analysis. Interestingly, 48% (435/899) of the PDZCBM genes fall into 163 gene families with two or more members, with 49% (80/163) having at least one family member as a known PDZ ligand based on literature curation and/or co-citation with the term PDZ (see Supplemental Table 3 for full gene family annotation). As a result of phylogenetic analysis, ligands were categorized into (1) those occurring in a family, but where perhaps the extent of potential PDZ-binding ligands in the family is less appreciated; (2) published gene families, but where no member had been suggested or shown in vivo to be involved in PDZ complexes to our knowledge; (3) novel gene families with members containing PDZ-binding motifs; and (4) ligands not falling into gene families. As an example of the first category, the nectin-like immunoglobulin gene family contains five members, with four out of five harboring conserved Class II PDZ-binding motifs.
Two of the four nectin-like proteins (IGSF4 and IGSF4D) with motifs have been shown to bind PDZ proteins (Fig. 4A).
As an example of the third category, novel gene families were discovered based on the observation of phylogenetically related proteins among the PDZCBM, as was the case for XKR4, XKR6, and XKR7. Although these proteins do not contain recognizable PFAM domains, all three share homology with several additional mammalian proteins, CG32579 of D. melanogaster, and ced-8 in C. elegans. In fact, XKR6 is the ortholog of ced-8 (BLASTp 6 × 10^−26), and iterative searching of databases revealed an extended gene family with nine members in total, including the gene (XK) responsible for X-linked McLeod Neuroacanthocytosis Syndrome (Ho et al. 1994; Supplemental Fig. 2). The function and mechanism of action of XK and ced-8 are unclear, but both have characteristics of membrane transport proteins, and ced-8 appears to regulate the timing of apoptosis (Stanfield and Horvitz 2000). As a subfamily of a larger ced-8–like gene family, XKR4, XKR6, and XKR7 suggest PDZ complexes may be linked to execution of ced apoptotic pathways.

Mapping the PDZCBM gene set to Drosophila: Phenotypic profiling and interologs

Several recent studies have demonstrated the value of large-scale protein interaction maps and phenotypic screens in model organisms to understand complex cellular processes and identify human disease genes. In some instances, studies have sought to identify conserved orthologous interactions, which are referred to as “interologs.” Nevertheless, the transfer of interaction information from model organisms to mammalian systems, although powerful, is imperfect (Bork et al. 2004; Ramani et al. 2005). In this regard, we propose that information transfer is likely to be more robust if the two proteins observed to interact in model organisms have conserved interaction surfaces in mammalian species, as might be observed between a protein with a PDZ domain interacting with a protein harboring a conserved PDZ-binding motif.
Thus, we sought to further annotate the PDZCBM gene set members and their potential involvement in PDZ protein complexes by analyzing the orthologous set of proteins in Drosophila at the sequence, interaction, and functional level. We chose D. melanogaster since many functions of PDZ complexes were originally identified in this model organism, coupled with the availability of sequence, large-scale yeast-two-hybrid maps, and phenotypic screens (RNAi or mutagenesis). To accomplish our goal we first identified fly orthologs of the PDZCBM by a reciprocal best-hit strategy (E-cutoff 10^−10). We found Drosophila orthologs for 30% (149/505), 28% (46/165), and 32% (74/229) of Class I, II, and III genes, respectively, with 34.5% (93/269) having a conserved mammalian PDZ-binding motif (see Supplemental Table 3 for list of fly orthologs). For all 269 fly orthologs, RNAi or mutant phenotypes were assigned based on manual curation of GO biological process annotation, review of individual articles, and a large-scale RNAi screen performed to examine cell morphology and create annotation profiles (Kiger et al. 2003) (see Supplemental Table 4 for complete annotation profiles). Among those Drosophila orthologs with mammalian motifs and phenotypic profiles available, a higher percentage are associated with polarity, adhesion, ion transport, or neuronal synaptic processes (manifested by mutants causing abnormal bristle polarity, myoblast fusion, and dorsal closure, for example) than not: 43% (40/93) compared with 26% (24/93), respectively. These results highlight that, in some instances, known PDZ ligands important in the regulation of polarity and adhesion have evolutionarily conserved PDZ-binding motifs, as is the case for the mutants Van Gogh (VANGL1), yurt (EPB41L5), and rolling pebbles (TANC2) (Wolff and Rubin 1998; Menon and Chia 2001; Rau et al. 2001; Hoover and Bryant 2002).
We also identified novel proteins in the PDZCBM set with highly conserved PDZ-binding motifs, including but not limited to CG31534 (DKFZp434I0312), CG7323 (PLEKHG5), and CG56987 (RASSF8, previously known as C12orf2) (Supplemental Table S1). Based on a Drosophila yeast-two-hybrid (Y2H) interaction map and interolog concepts, CG7323 (PLEKHG5) interacts with PDZ protein l(2)02045 (GIPC1), while CG56987 (C12orf2) is capable of interacting with multiple PDZ-domain-containing proteins, such as MINT (APBA2) (Fig. 4F).

PDZ genes are co-expressed with ligands

For PDZ complexes to form and accomplish their biological functions, their components must be temporally and spatially co-localized. Some of this control should be contributed at the level of gene expression. Thus, we sought to determine whether co-expression patterns could be used as a tool to help predict the connectivity map between PDZ proteins and their ligands. To test this hypothesis, using the set of 270 interactions we identified in reviewing the current literature (see above), we examined whether the known PDZ proteins and their ligands were co-expressed more than expected by chance alone. In order to assess the level of co-regulation between PDZ and ligand gene sets, we applied the previously described nearest neighborhood analysis (NNA) methodology to a large-scale, publicly available, mRNA expression atlas of human samples (79 normal or transformed tissues/cells; 16,684 genes, 117 PDZ genes) (Mootha et al. 2003; Owen et al. 2003). We found an enrichment of known ligands in PDZ neighborhoods, as compared with equally sized sets of randomly selected genes (permutation testing, P < 0.001). Specifically, we observed 31 PDZ proteins that have ≥10 literature ligands in the top 250-gene neighborhood. Examining 1000 random ligand sets equal in size to the literature ligand set, we observed an average of 0.3 (SD 0.22, maximum of three) PDZ proteins having ≥10 random ligands in the 250-gene neighborhood (Fig.
5A).
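
The permutation test described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the published NNA implementation: neighborhoods are ranked by Pearson correlation, random ligand sets are drawn from all genes in the atlas, and every function and variable name is hypothetical.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a flat profile carries no correlation signal
    return sxy / sqrt(sxx * syy)


def neighborhood(gene, expr, k):
    """The k genes most correlated with `gene` (the gene itself excluded)."""
    others = [g for g in expr if g != gene]
    others.sort(key=lambda g: pearson(expr[gene], expr[g]), reverse=True)
    return set(others[:k])


def count_enriched(neighborhoods, ligands, min_hits):
    """Number of PDZ genes with at least `min_hits` ligands in their neighborhood."""
    return sum(1 for nb in neighborhoods.values() if len(nb & ligands) >= min_hits)


def permutation_test(expr, pdz_genes, ligands, k=250, min_hits=10, n_perm=1000, seed=0):
    """Compare the observed count with counts from random gene sets of equal size."""
    rng = random.Random(seed)
    neighborhoods = {p: neighborhood(p, expr, k) for p in pdz_genes}
    observed = count_enriched(neighborhoods, set(ligands), min_hits)
    pool = sorted(expr)  # assumption: random sets are drawn from all genes
    at_least = 0
    for _ in range(n_perm):
        fake = set(rng.sample(pool, len(ligands)))
        if count_enriched(neighborhoods, fake, min_hits) >= observed:
            at_least += 1
    return observed, (at_least + 1) / (n_perm + 1)
```

The defaults `k=250`, `min_hits=10`, and `n_perm=1000` mirror the neighborhood size, ligand threshold, and number of random sets quoted in the text; `expr` is assumed to map each gene symbol to its list of expression values across samples.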
Although certainly many of the PDZ proteins and their ligands are constitutively expressed and/or controlled at the level of translation or post-translational modification, we were interested in determining the fraction of known PDZ–ligand interactions that can be detected using co-expression patterns as measured by individual expression neighborhood indexes (see Methods). Examining individual PDZ gene expression neighborhoods, we detected 14% and 21% of the published PDZ–ligand interaction pairs in the 250- and 500-gene neighborhood radius, respectively. Given that there was significant co-expression of the known PDZ and ligand gene sets, it was of interest to assess the chance that any specific PDZ and potential ligand pair interacted based on co-regulation. The challenge of examining any given PDZ expression neighborhood for potential ligands is to know what correlation threshold will be the most sensitive and specific for identifying true ligands. One approach may be to consider only those ligands among the nearest neighbors of a PDZ within a radius of 250 or 500 genes, based on the above gene set results. Another approach, however, is to examine the distribution of Pearson’s pairwise correlation values between known PDZ–ligand interactions compared with that between PDZ and random sets of genes. Based on this analysis, we determined a pairwise Pearson’s correlation cutoff (PCC) of ρ > 0.40, since it corresponds to <5% of the random distribution (P-value < 0.05) (Fig. 5C).

Identification of novel PDZ complexes

Given the ability of correlation thresholds and NNA to capture known PDZ complexes, we next sought to test novel predictions based on NNA/PCC, the PDZCBM comparative genomics catalog, cellular localization, and literature parsing. The DLG family of PDZ proteins can interact with overlapping and different sets of PDZ ligands (Montgomery et al. 2004; Supplemental Fig. 1). See Figure 6A.

Coupling expression patterns to co-citation profiles

To gain further insights into the relationship of expression patterns between PDZ genes and ligands, we performed hierarchical clustering of pairwise correlation values between PDZ proteins and ligands (PDZCBM) in the GNF tissue compendium (see Figure 5D). Recent biochemical and genetic studies have demonstrated a central role for CARD11 as a positive regulator of antigen receptor signaling (Egawa et al. 2003; Hara et al. 2003; Jun et al. 2003; Newton and Dixit 2003). In this case, we applied the AND rule of logic operators to predict candidate CARD11 interactions with members of the PDZCBM, requiring that a candidate ligand be co-expressed at ρ > 0.40 AND have literature citations supporting a positive role in antigen receptor signal transduction. As a result, six out of 899 PDZCBM genes (RAC2, SH2D3C, FYN, SCAP2, TBC1D10A, and PKCα) emerged as potential CARD11 interactors (Liu et al. 1998; Marie-Cardine et al. 1998; Black et al. 2000; Reczek and Bretscher 2001; Yu et al. 2001; Itoh et al. 2002; Sakakibara et al. 2003; Sugie et al. 2004; Matsumoto et al. 2005; Rahmouni et al. 2005). Among the six candidate ligands, previous studies have demonstrated that SH2D3C over-expression enhances IL2 production. See Figure 6D.

Discussion

In this study, we present a strategy for the discovery of novel PDZ complexes based on the integration of co-expression, comparative genomics, and citation profiles. As an integral part of this strategy, the identification and expression patterns of the human PDZ domain-containing gene complement as well as that of 899 genes that encode proteins with conserved PDZ-binding motifs are reported. Using publicly available tissue expression profiles, we show that a distinct subset of PDZ genes and ligands are preferentially expressed in the immune system, a number comparable to the subset of PDZ genes/PDZ ligands enriched in the nervous system.
Inspection of the tissue compendium revealed that PDZ genes such as RAPGEF6 and paralogs of evolutionarily conserved polarity components such as APBA2 and LIN7C are prominently expressed in lymphocytes (Fig. 1). Similar to the PDZ gene expression catalog, the PDZCBM catalog serves as a resource. The significance of the conservation rate of the PDZ-binding motif(s) was evaluated by conducting a systematic interrogation of all possible twofold degenerate carboxy-terminal motifs in four mammalian species. Importantly, the Class I motif is the most conserved carboxy-terminal motif in the four species examined. Further, sequence alignments and domain annotation suggested that PDZ ligands (PDZCBM) fall into gene families; coupling this with information derived from literature parsing, interolog data, and co-expression enables their placement in biological pathways. In terms of integrative strategies, co-expression profiles have previously been employed to infer functional or physical interactions. However, few experimentally verified interactions have been documented prospectively, especially in mammalian systems. In addition, it has not been reported whether such profiles contain a specific enrichment in co-expression between the set of genes encoding a particular domain and the set encoding the respective cognate binding motif(s). Using the GNF expression atlas of human tissues, we demonstrated enrichment of known PDZ ligand genes in the expression neighborhoods of PDZ domain-encoding genes. In support of this observation, a retrospective mining of co-expression patterns between PDZ domains and known ligands enabled the detection of 21% of 239 PDZ–ligand interactions published over the past 10 yr. This level of sensitivity between co-expression and physical interaction is consistent with that seen in a systems view of the C. elegans interactome (Li et al. 2004).
Prospectively, and as proof of principle, the co-expression patterns in this data set suggested a novel interaction between DLG1 and BCR, which was then experimentally verified. DLG1 is a tumor suppressor involved in the regulation of cell cycle progression through not entirely clear mechanisms, but it interacts with proteins such as PBK (a mitotic Ser/Thr kinase) and APC (Matsumine et al. 1996; Gaudet et al. 2000). Here, we demonstrated that DLG1 interacts and co-localizes with BCR at the mitotic spindle. Recent proteomic surveys have also reported that PDZ ligands such as ACTN4, HAPIP, and NADRIN as well as potential PDZ ligands by our analysis (e.g., CGI-23) are localized to the midbody (Skop et al. 2004). These observations suggest that PDZ proteins, mirroring the role of PDZ complexes in other synaptic processes, may regulate the synaptic connection between cells at cytokinesis. Our approach is complementary to other approaches such as yeast two-hybrid screens (Y2H) and proteomic surveys, since no one approach will identify all known PDZ protein complexes. This is most directly demonstrated by the modest level of overlap between identified protein complexes based on mass spectrometry or Y2H screens in interaction maps published to date (Uetz et al. 2000; Ito et al. 2001; Stanyon et al. 2004). One strength of correlative profiling compared with Y2H screens is that our methodology extracts information from cells/tissues in their endogenous state as opposed to the environment of yeast cells. On the other hand, mass spectrometry-based approaches account for post-translational modifications, which are not reflected in co-expression patterns. Extending our approach to the analysis of gene expression data sets derived from mucosal epithelial surfaces, solid tumors, and embryonic tissues may provide additional insights into co-expression patterns of PDZ genes with PDZ ligands.
In addition, the current PDZCBM catalog is likely to under-represent the number of potential PDZ ligands. An expanded PDZCBM could be discovered by designing algorithms to detect conserved PDZ-binding motifs in splice variants of genes and by utilizing the entire spectrum of observed literature PDZ-binding motifs to broaden the comparative genomics searches. Lastly, the impact of microRNAs on the expression and composition of PDZ complexes provides an important avenue for future exploration, given that 42 of the PDZ-encoding genes and numerous ligands are predicted to be miRNA targets (John et al. 2004). In summary, we have developed a systematic computational platform, based on the integrative analysis of the biological properties of PDZ domain-encoding genes and ligands, to facilitate further understanding of the PDZ complexes in health and disease.

Methods

Identification of PDZ domain-containing proteins

A reference catalog of genes that encode PDZ domain-containing proteins was obtained by searching the SMART ([http://smart.embl-heidelberg.de/] [IPR001478]) and SCOP superfamily (release 1.65 [http://supfam.org/SUPERFAMILY/cgi-bin/scop.cgi?ipid=SSF50156]) databases. These databases recorded 479 and 145 PDZ-encoding proteins, respectively, in the human genome. The protein-encoding sequences and gene IDs were obtained and mapped to LocusLink (currently known as Entrez Gene) or UniGene identifiers to reveal a non-redundant set of 136 genes encoding proteins with PDZ domains (Supplemental Table 1). In addition, the protein sequences of five PDZ domain proteins were used to perform a TBLASTN search of ESTs, which identified no further PDZ-encoding genes. For each of the 136 genes, the reference protein sequence (for those with multiple isoforms, the longest isoform) was used to search against the conserved domain database (CDD; http://www.ncbi.nih.gov/Structure/cdd/cdd.shtm) to identify the number of PDZ domains (E-cutoff 10^−5) encoded by a particular gene.
These 136 genes were found to encode 237 unique PDZ domains, based on the longest reference sequence.

Literature PDZ–ligand interactions

In order to curate a set of ligands confirmed by experimental data in the literature, we queried the PubMed database using PDZ as keyword and identified 1262 articles from the period of January 1995 to November 2004. Based on review of the abstract and then of the text when appropriate, 525 publications regarding PDZ proteins in mammalian systems (human, rat, mouse) were analyzed. These articles reported on a total set of 270 unique binary interactions involving 185 ligands, where the experimental evidence demonstrated that the interactions were via a PDZ domain and/or the carboxyl terminus of the ligand (see Supplemental Table 2 for the list of interactions). Our recorded set of interactions is equivalent to the set of non-redundant mammalian PDZ–ligand interactions in PDZbase (as of July 2005), which was published after the initiation of our work (Beuming et al. 2005). In our data set, there were fewer interactions than publications reviewed, since some publications reported identical interactions as others, experimental evidence was lacking that the interaction was mediated by a PDZ domain and/or cytoplasmic tail, or reported interactions mapped to a non-PDZ domain, although we recorded a small subset of the latter (n = 7). Further examination of these 185 carboxy-terminal non-redundant ligands revealed that 145 were interacting via canonical PDZ carboxy-terminal binding motifs: Class I (X-[S/T]-X-[L/V]-COOH); Class II (X-Y-X-V; X-F-X-A; X-Y-X-I; X-V-X-I; X-V-X-V; X-I-X-V-COOH); and Class III (X-[D/E]-X-[L/V]-COOH). Motifs were defined as “canonical” based on the reviews of Sheng and Sala (2001) and Nourry et al. (2003). For the 270 interactions, we identified probe sets that met our filtering criteria (see below) in the current data set for both the PDZ and ligand in 88% (239/270) of interactions.
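
The three canonical motif classes defined above translate directly into pattern matching on the last three residues of a protein. The following is a minimal sketch of that idea; the regular expressions follow the class definitions in the text, while the function names and the example sequences are hypothetical and not taken from the study.

```python
import re

# Patterns cover positions -2, -1, and 0; position -1 is unconstrained.
CLASS_PATTERNS = {
    "I": re.compile(r"[ST].[LV]$"),
    "II": re.compile(r"(?:Y.V|F.A|Y.I|V.I|V.V|I.V)$"),
    "III": re.compile(r"[DE].[LV]$"),
}


def classify_c_terminus(sequence):
    """Return 'I', 'II', 'III', or None for a protein sequence."""
    tail = sequence.strip().upper().rstrip("*")  # drop a trailing stop symbol
    for name, pattern in CLASS_PATTERNS.items():
        if pattern.search(tail):
            return name
    return None


def conserved_class(human, orthologs):
    """Motif class of the human protein if at least one ortholog shares it.

    Assumption: conservation means the ortholog falls in the same class,
    not that its terminal residues are identical.
    """
    human_class = classify_c_terminus(human)
    if human_class is None:
        return None
    if any(classify_c_terminus(o) == human_class for o in orthologs):
        return human_class
    return None


# Hypothetical tails, for illustration only.
print(classify_c_terminus("MKKQETSV"))   # Class I tail: T at -2, V at 0
print(conserved_class("MKKQETSV", ["MRKQESSL", "MRKQEAAA"]))
```

The classes are mutually exclusive because they differ at position −2, so the order in which the patterns are tried does not affect the result.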
All references for the 270 interactions are available on request.

PDZ ligand identification

A stand-alone Perl script was written such that for each human protein (reference sequence or predicted) in the International Protein Index (IPI) (June 2004 build), the carboxy-terminal 10 amino acids were extracted and checked for the presence of the consensus Class I, Class II, or Class III PDZ-binding motifs described above. To identify redundant IPI protein entries with consensus-positive motifs encoded by the same gene, pairwise BLASTp analysis of the entire protein sequence was performed, coupled with assignment to each protein of unique identifiers including UniGene and/or LocusLink ID (Entrez Gene ID). Next, we identified the subset of genes that had a reference protein sequence (either NP or XP accessions) confirming that the carboxyl terminus indeed matched a consensus PDZ-binding motif. If no reference protein sequence was available for a given database entry (e.g., for some UniGene entries), we manually examined all available evidence such as gene predictions, EST alignments, and comparative genomics to ascertain whether a gene model or a cDNA sequence encoded a carboxy-terminal PDZ-binding motif. We employed a combination of the NCBI functions BLASTp, TBLASTN, and/or BLINK analysis at the proteome level and genomic alignments via the BLAT function of the UCSC Genome Browser to find mouse/rat orthologs and to crosscheck human gene models. This level of stringency was applied because in some instances the protein sequence in the IPI database actually represented a partial fragment that by chance had a consensus motif, whereas the full-length protein did not. Next, the HomoloGene and InParanoid databases were used to identify the reciprocal best hit in the mouse and rat proteome (E-value < 10⁻¹⁰).
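The initial motif screen in this procedure can be illustrated with a short sketch. The original analysis used a Perl script that is not reproduced here; the Python version below is only an assumed equivalent, with a hypothetical function name (`classify_c_terminus`) and regular expressions written directly from the Class I, II, and III definitions given above.

```python
import re

# Canonical carboxy-terminal motifs as listed in the text, covering
# positions -3 to 0. "." stands for X (any residue) and "$" anchors the
# pattern at the carboxyl terminus.
PDZ_MOTIFS = {
    "Class I": re.compile(r".[ST].[LV]$"),
    # X-Y-X-V, X-F-X-A, X-Y-X-I, X-V-X-I, X-V-X-V, X-I-X-V
    "Class II": re.compile(r".(?:Y.V|F.A|Y.I|V.I|V.V|I.V)$"),
    "Class III": re.compile(r".[DE].[LV]$"),
}


def classify_c_terminus(protein_seq, tail_len=10):
    """Return the motif class matched by the C-terminal tail, or None.

    tail_len mirrors the 10 carboxy-terminal residues extracted in the text;
    only the last four residues decide the match.
    """
    tail = protein_seq.upper()[-tail_len:]
    for class_name, pattern in PDZ_MOTIFS.items():
        if pattern.search(tail):
            return class_name
    return None


# Made-up sequences, used only to exercise each branch.
for seq in ("MKTAYIAKQRETDV", "MKTAYIAKQREYYV", "MKTAYIAKQRGDAL", "MKTAYIAKQRAAAA"):
    print(seq[-4:], classify_c_terminus(seq))
```

Because the three classes differ at position −2, a given carboxyl terminus can match at most one of them, so the order in which the patterns are tried does not matter.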
If no HomoloGene or InParanoid entry was available, manual curation was performed as above, identifying the reciprocal best mouse and/or rat sequence by BLASTp analysis of a non-redundant protein database (NCBI). Perfect conservation of the consensus motif in either the rat or mouse ortholog was required for inclusion in subsequent analyses. The above criteria and filtering led to the identification of a total of 899 proteins with conserved PDZ-binding motifs at their carboxy-terminal ends (505 Class I, 165 Class II, and 229 Class III genes; see Supplemental Table 3).

Carboxy-terminal motif search

We started with 22,737 reference mRNA sequences from the RefSeq database (downloaded from the UCSC Genome Browser, 04/2005 freeze). Upon translation, we were able to align 13,913 of these RefSeqs to orthologous mouse/dog/rat reference sequences based on the requirement that all four species had aligned stop codons. The alignments were extracted from nucleotide whole-genome alignments between human/mouse/rat/dog generated by the UCSC Genome Browser and translated to amino acid sequences as described previously (Xie et al. 2005). If a RefSeq accession, when translated, had a different stop codon position in human versus mouse, the RefSeq was not used in the analysis. In other words, if the stop codon followed amino acid 555 in human and amino acid 554 in mouse, the sequence was not used for further analysis. Further mapping revealed that these 13,913 reference sequences mapped to 11,044 unique genes based on Entrez/LocusLink IDs. To ascertain the level of conservation of the known consensus PDZ-binding motifs (I, II, and III) that exhibit twofold degeneracy at positions −2 and 0 (e.g., Class I: X-[S/T]-X-[L/V]-COOH), we conducted an unbiased search for all possible twofold degenerate carboxy-terminal binding motifs at positions −2 and 0 in the 11,044 aligned reference sequences. In theory, there are 160,000 possible consensus motifs in a twofold degenerate motif search space.
Of the 160,000 theoretically possible consensus motifs, we observed 44,099 upon examination of the translated reference sequences among all species in our collection of 11,044 aligned genes. We called a site “conserved” if the sequences at positions −2 and 0 of all four species (human/dog/mouse/rat) satisfied one of the possible 44,099 motifs. Subsequently, we counted the total number of genes with conserved instances for each of these 44,099 twofold degenerate motifs. As a result, we generated a rank-ordered list, from the most conserved twofold degenerate motif at positions −2 and 0 to the least conserved. We observed that the PDZ Class I motif ranked number one among twofold degenerate motifs, with the largest number of conserved instances. In addition, we derived counts for all 399 conserved non-degenerate motifs at positions −2 and 0, with four of the five top-ranked motifs corresponding to individual Class I motifs.

Domain, GO, co-citations, and phylogenetic analysis of ligands

Transmembrane domains were predicted by the TMHMM algorithm (http://www.cbs.dtu.dk/services/TMHMM/) (Krogh et al. 2001). We used the published DAVID 2.0 program to compute enrichments of both the transmembrane and GO biological processes by using Fisher’s exact probability with Bonferroni corrections, which identifies functional categories over-represented in a gene list relative to the representation within the proteome of a given species (http://david.niaid.nih.gov/david/version2/index.htm) (Dennis Jr. et al. 2003; Hosack et al. 2003). Domains of ligands were identified by querying the PFAM database against the reference protein accession of each potential ligand in the PDZCBM proteome. Supplemental Table 3 contains the PFAM domain for each potential ligand.
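Returning to the carboxy-terminal motif search described above, the conservation call and the ranking step can be sketched as follows. This is a minimal illustration under stated assumptions, not the original pipeline: the gene names and residues are invented, only two candidate motifs are scored, and the helper names (`is_conserved`, `rank_motifs`) are hypothetical. The text does not say how the 160,000 figure was counted; one counting consistent with it treats each motif as two residue slots at position −2 and two at position 0, giving 20⁴ combinations.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Assumed counting: two residue slots at each of the two positions.
assert len(AMINO_ACIDS) ** 4 == 160_000


def is_conserved(site, motif):
    """site: one (residue at -2, residue at 0) pair per species.
    motif: (allowed residues at -2, allowed residues at 0).
    The site is conserved if every species satisfies the motif."""
    allowed_m2, allowed_0 = motif
    return all(r_m2 in allowed_m2 and r_0 in allowed_0 for r_m2, r_0 in site)


def rank_motifs(sites_by_gene, motifs):
    """Count genes with a conserved instance of each motif, most frequent first."""
    counts = Counter()
    for site in sites_by_gene.values():
        for name, motif in motifs.items():
            if is_conserved(site, motif):
                counts[name] += 1
    return counts.most_common()


# Toy alignment columns in the order human, dog, mouse, rat (invented data).
sites_by_gene = {
    "geneA": [("T", "V"), ("T", "V"), ("T", "V"), ("T", "V")],
    "geneB": [("S", "L"), ("T", "L"), ("S", "L"), ("S", "V")],
    "geneC": [("D", "V"), ("E", "V"), ("D", "L"), ("D", "V")],
    "geneD": [("T", "V"), ("A", "V"), ("T", "V"), ("T", "V")],  # dog breaks the motif
}
motifs = {
    "[S/T]-X-[L/V]": (frozenset("ST"), frozenset("LV")),
    "[D/E]-X-[L/V]": (frozenset("DE"), frozenset("LV")),
}
ranking = rank_motifs(sites_by_gene, motifs)
print(ranking)
```

In a full search the `motifs` dictionary would hold every candidate residue pair at both positions rather than two hand-picked entries.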
To identify gene families in the PDZCBM proteome, CLUSTALW (http://align.genome.jp/sit-bin/clustalw) was employed using default parameters to build a phylogenetic tree derived by neighbor-joining analysis applied to pairwise sequence distances. If the distance between any two proteins was <0.40, a threshold that empirically corresponded to known gene families and was consistent with common domains as a crosscheck, we considered such proteins a family. Co-citation profiles for the ligands with the term PDZ were generated with the MILANO (Microarray Literature Based Annotation) software (http://milano.md.huji.ac.il/) (Rubinstein and Simon 2005; Supplemental Table 3). Similarly, co-citation profiles for each of the PDZ and PDZCBM genes were identified for a common set of 40 citation terms representing biological processes in which PDZ genes have been implicated in the literature (Fig. 6).

Mendelian disease identification

To identify PDZ or PDZCBM genes known to result in Mendelian or complex diseases, the morbidmap file was downloaded (ftp://ftp.ncbi.nih.gov/repository/OMIM/morbidmap). In Supplemental Table 5, we provide a list of human PDZ genes and known or potential PDZCBM ligands associated with Mendelian diseases. In addition, we list complex traits, mutations found in primary tumors/cell lines, and translocations in humans involving PDZ/ligand genes. Those PDZ/ligands targeted by naturally occurring mutations or those derived by ENU mutagenesis of mouse or zebrafish are also furnished.

Microarray filtering and annotation mapping

The public version of the GNF human expression atlas version 2 (Su et al. 2004) was obtained from Novartis (http://wombat.gnf.org/), including the primary .cel files, which used the U133a Affymetrix chip and a custom chip (GNF1H). The data set contains the expression values of 33,690 probes reflecting normalization of each array to a set of 100 housekeeping genes common to both the U133a and custom GNF1H array.
Subsequently, global median scaling across the arrays was performed, resulting in the expression values across samples for each probe set. The absent/present calls were analyzed, and probe sets with 100% absent calls across all 79 human tissues were not included in the analysis. The data set was further filtered by requiring that a probe set have a threshold value >20 in at least one sample and a maximum–minimum expression value >100. This filtering left 28,852 reliable probe sets for further analysis. For each U133a probe set meeting the above criteria, its corresponding UniGene ID and LocusLink ID were identified based on the combined annotation tables provided at http://wombat.gnf.org/ and the NetAffx Web site (http://www.affymetrix.com). For the custom GNF1H chip, the mRNA/EST used to design the probe set was searched with BLAST against the exemplar sequences of the UniGene database (Build 116). Of the 28,852 probe sets, 26,789 mapped to 16,811 UniGene IDs.

Neighborhood and statistical analysis

Neighborhood analysis was performed using a stand-alone Perl script, described previously (Mootha et al. 2003) and used here with some modifications, which measures the level of enrichment in co-expression between two gene sets. In brief, for each query PDZ gene in the atlas, we rank order all other genes in the atlas on the basis of Euclidean distance of gene expression from the query gene after a Z-score transformation of each probe set across the 79 different sample types. For example, the PDZ neighborhood index N_R(G), for an individual PDZ gene G and a radius R = 250, is defined as the number of ligands (known from the literature or potential, in our case) among the 250 genes ranked closest to G. We subsequently plot the frequency distribution of N_250, for example.
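The neighborhood index defined above can be written down compactly. The sketch below is an assumed re-implementation in Python rather than the Perl script used in the study; the function names, the toy expression profiles, and the choice of the population standard deviation for the Z-score are all illustrative assumptions.

```python
import math
from statistics import mean, pstdev


def zscore(profile):
    """Z-score one probe set across samples; a flat profile maps to zeros."""
    mu, sd = mean(profile), pstdev(profile)
    if sd == 0:
        return [0.0] * len(profile)
    return [(x - mu) / sd for x in profile]


def neighborhood_index(query, expression, ligands, radius):
    """Count ligands among the `radius` genes closest to `query`.

    expression: dict mapping gene -> list of expression values (one per sample).
    Closeness is the Euclidean distance between Z-scored profiles.
    """
    z = {gene: zscore(profile) for gene, profile in expression.items()}
    ranked = sorted(
        (gene for gene in z if gene != query),
        key=lambda gene: math.dist(z[query], z[gene]),
    )
    return sum(1 for gene in ranked[:radius] if gene in ligands)


# Invented four-sample profiles; the real atlas has 79 sample types.
expression = {
    "PDZ1": [1, 2, 3, 4],
    "LIG_A": [2, 4, 6, 8],   # same shape as PDZ1 after Z-scoring
    "LIG_B": [4, 3, 2, 1],   # opposite shape
    "OTHER": [1, 2, 3, 5],
    "FLAT": [5, 5, 5, 5],
}
ligands = {"LIG_A", "LIG_B"}
print(neighborhood_index("PDZ1", expression, ligands, radius=2))
```

Because every profile is Z-scored first, two genes with proportional expression patterns (such as `PDZ1` and `LIG_A` here) end up at distance zero regardless of their absolute expression levels.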
In order to test the statistical significance of the co-expression of ligands with PDZ domain-encoding genes, we randomly generated 1000 sets of random ligands equal in size to the gold standard “literature PDZ ligands” (n = 192) or the sum of conserved potential PDZ ligands of Class I, II, and III (i.e., the PDZCBM, n = 899) and calculated for each set the weighted frequency distribution of N(R = 100,250,500). The empirical P-value was defined by permutation testing as the fraction of weighted frequency distributions of random ligand sets that exceeded the observed frequency distribution for literature ligands, and similarly for the composite conserved potential PDZ ligands of Class I, II, and III (both P < 0.01 at radius R = 250 and R = 500). We allowed many-to-one mappings of probe sets to genes. We adopted this strategy because in 1000 random samplings of gene sets equivalent in size to the PDZCBM (n = 899), we observed that on average 33.6% of genes were represented by multiple probes (minimum 27.5% and maximum 42.0%), with 35.4% and 43.1% of the actual PDZCBM and PDZ genes, respectively, mapping to multiple probes. These data support our use of the many-to-one mapping strategy, since the background random gene sets used to calculate empirical P-values had similar numbers of multiple probe sets as compared with the PDZ or PDZCBM. In addition, if we had chosen one-to-one mapping, there would have been an insufficient number of ligands and PDZ genes to perform the analyses described in this paper. To arrive at a reasonable threshold of correlation between any pair of PDZ gene and potential ligand that may suggest a functional or physical interaction, we performed permutation testing. The distributions of pairwise Pearson’s correlation coefficients (PCCs) were computed comparing PDZ genes and known interactors with a random set of ligands.
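The permutation test for neighborhood enrichment described above follows a generic pattern that can be sketched briefly. The version below is a simplification under stated assumptions: it compares a single summary statistic rather than the weighted frequency distributions used in the study, it counts random sets that are at least as extreme as the observed value (a conservative choice), and the names `empirical_p_value`, `universe`, and `statistic` are hypothetical.

```python
import random


def empirical_p_value(observed_stat, universe, set_size, statistic,
                      n_perm=1000, seed=0):
    """Fraction of random gene sets scoring at least as high as observed.

    universe: list of all candidate genes to sample from.
    statistic: function mapping a set of genes to a number, for example the
    total count of set members found inside PDZ gene neighborhoods.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    at_least_as_extreme = 0
    for _ in range(n_perm):
        random_set = set(rng.sample(universe, set_size))
        if statistic(random_set) >= observed_stat:
            at_least_as_extreme += 1
    return at_least_as_extreme / n_perm


# Toy setting: 100 genes, 10 of which lie in the neighborhood of a PDZ gene.
universe = [f"g{i}" for i in range(100)]
neighborhood = set(universe[:10])


def overlap(gene_set):
    return len(gene_set & neighborhood)


p_enriched = empirical_p_value(10, universe, 10, overlap)
p_trivial = empirical_p_value(0, universe, 10, overlap)
print(p_enriched, p_trivial)
```

With 1000 permutations the smallest nonzero value this estimator can return is 0.001, so a result of zero is best reported as P < 0.001.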
We generated 1000 random ligand gene sets equivalent in size to the PDZCBM genes (n = 899) and plotted the average frequency distribution of the coefficients for the permutations in Figure 4B.

Hierarchical clustering

To represent the expression profiles of PDZ genes and the PDZCBM Class I, II, and III ligands, hierarchical clustering with the centroid linkage method was performed using dCHIP (Schadt et al. 2001), using 1 − r as the distance metric, where r is the Pearson correlation coefficient, and the relative expression levels are displayed. For the identified 136 PDZ domain-encoding genes, there were 195 probe sets representing 107 of the 136 PDZ genes on either the U133a or custom GNF1H arrays that met our own (described above) as well as dCHIP’s default filtering criteria. For the PDZCBM genes, 86% (433/505), 84% (139/165), and 89% (204/229) of Class I, Class II, and Class III genes, respectively, were represented by at least one probe on the combined U133a and custom GNF1H microarrays. Heatmaps of hierarchical clustering of tissue expression and correlation values are based on a single probe set per gene chosen at random so as not to bias the visual presentation. The probe IDs for all PDZ and PDZCBM genes are included in Supplemental Table 6. We also performed hierarchical clustering using dCHIP on the co-citation expression vectors for PDZ and PDZCBM described above.

Immune enrichment analysis

In order to ascertain whether any individual PDZ gene or PDZCBM gene set member is globally enriched in tissues of the immune system, we used the Wilcoxon rank sum test. To calculate the Wilcoxon statistic, we divided the 79 tissue/cell types into those of immune system origin (n = 22) and those that are not part of the immune system (n = 57), making two sample classes. The list of tissues/cells considered of immune system origin can be found in Supplemental Table 1. The P-values from the Wilcoxon test for each probe representing a PDZ or PDZCBM gene were then calculated.
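For readers unfamiliar with the rank sum test, the calculation for a single probe can be sketched as below. This is an illustrative implementation only: it uses the normal approximation without tie or continuity corrections and reports a two-sided P-value, whereas the text does not state which variant or software was used; the function names and expression values are invented.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def rank_sum_test(immune, other):
    """Return (z, two-sided P) for the rank sum of the immune samples."""
    n1, n2 = len(immune), len(other)
    ranks = average_ranks(list(immune) + list(other))
    w = sum(ranks[:n1])                       # rank sum of the immune class
    mean_w = n1 * (n1 + n2 + 1) / 2
    sd_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean_w) / sd_w
    p = math.erfc(abs(z) / math.sqrt(2))      # two-sided normal tail
    return z, p


# Invented expression values for one probe in two tissue classes.
immune_values = [10, 11, 12, 13, 14]
other_values = [1, 2, 3, 4, 5, 6, 7]
z, p = rank_sum_test(immune_values, other_values)
print(round(z, 3), p)
```

A positive `z` indicates higher ranks, and therefore higher expression, in the immune class; each P-value would then be compared against the multiple-testing threshold described in this section.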
Significance thresholds after multiple hypothesis testing were established by dividing the P-value 0.05 by the number of PDZ probes tested.

Identification of D. melanogaster orthologs and phenotypic characterization

We employed a reciprocal best-hit strategy using an E-value cutoff of 10⁻¹⁰ to identify potential Drosophila melanogaster (http://flybase.bio.indiana.edu/blast/) orthologs for each of the 136 human PDZ and 899 PDZCBM-encoding genes based on the reference protein sequences of the human genes. For each identified ortholog, the FlyBase record was reviewed individually to determine whether the gene was a known fly mutant, its GO annotation, its links to primary literature, and whether it had been tested in an RNAi screen for cell morphology (Kiger et al. 2003). We noted which of the orthologs have been linked to processes that PDZ complexes participate in, defined as actin cytoskeleton rearrangements, polarity (manifested by abnormal bristle morphology, for instance), adhesion (related phenotypes include wound closure), synaptic growth and transmission, and ion channel function (known or predicted by homology with known channels). Supplemental Table 4 contains detailed phenotypic information on fly orthologs.

Immunofluorescence and BRET assays

The bioluminescence resonance energy transfer (BRET) assay was performed as described previously (Lopez-Ilasaca et al. 2005). For immunofluorescence, 5 × 10⁵ U2OS cells were plated on coverslips. After 24 h, cells were transiently transfected with Transfectin (Biorad) following the manufacturer’s recommendations. Twenty-four hours post-transfection, coverslips were washed in PBS, then fixed and permeabilized in 3.5% PFA/0.1% Tween for 10 min at RT. Cells were incubated with primary antibody (anti-HA [Covance]; anti-FLAG [Sigma, MO]) for 1 h at RT.
Coverslips were washed in PBS and incubated with secondary antibody (AlexaFluor 568 anti-mouse IgG [Molecular Probes]; AlexaFluor 350 conjugated-phalloidin [Molecular Probes]) for 1 h at RT before being mounted with Aqua Poly/Mount (Polysciences). Slides were visualized on an Olympus AX70 microscope for BCR and DLG1 and a confocal microscope. For CARD11/SH2D3C immunofluorescence, 7.5 × 10⁶ Jurkat cells were electroporated with 15 μg of GFP-SH2D3C wild-type or GFP vector DNA. Twenty hours post-electroporation, dead cells were removed by centrifugation with Ficoll Paque. Cells were washed twice in media and allowed to rest for 1 h at 37°C in serum-containing media. After 1 h, cells were resuspended in serum-containing media and loaded on poly-L-lysine (PLL)-coated slides previously incubated with 10 μg/mL each of CD3 and CD28 antibody for 2 h at 37°C. After the indicated time points, the cell suspension was removed, and slides were fixed and permeabilized in 3.5% PFA/0.1% Tween for 10 min at RT. Slides were washed in PBS and blocked for 30 min in 1% BSA/PBS before incubation with primary polyclonal CARD11 antibody (Apotech ALX-210–903). Coverslips were washed in PBS and incubated with secondary antibody AlexaFluor 568 anti-rabbit IgG (Molecular Probes) and AlexaFluor 350 conjugated-phalloidin (Molecular Probes) diluted in blocking solution for 1 h at RT, before being mounted, sealed, and visualized as above.

Plasmids and yeast two-hybrid screen

An IGSF4C plasmid was generously provided by Dr. Y. Murakami (Fukuhara et al. 2001). The carboxy-terminal mutation IGSF4C-AAA was made by PCR-mediated mutagenesis, replacing the three most carboxy-terminal residues with alanine (A) residues. The Myc-CNKSR1 constructs were made by PCR of subregions of the 729-amino-acid wild-type-encoding gene corresponding to the following amino acids: SAM 1–210; CRIC 210–279; PDZ 279–363; PRO/PH-CT 363–279.
The yeast two-hybrid screen used full-length CNKSR1 as bait, employing a brain cDNA library. The IL-2 promoter luciferase construct was a kind gift of J.O. Liu (Sun et al. 1998). The BCR cDNA was a gift of Dr. J. Groffen (Children’s Hospital of Los Angeles, CA). A 3×HA-tagged BCR gene was cloned into the pKH3 vector at 5′ BamHI and 3′ EcoRI sites by releasing the BCR gene from a GST-tagged BCR gene in pLEF vector (Rudert et al. 1996) by using BamHI and EcoRI partial digestion. Mutant BCR (mutBCR) was constructed by replacing the last four residues LTKL with AAAA by PCR. A Flag-tagged SH2D3C gene was constructed by subcloning the coding sequence of SH2D3C (Open Biosystems, accession no. BC032365) into the pCMV-FLAG5 vector at 5′ BamHI and 3′ SalI sites after being amplified by PCR. GFP-tagged SH2D3C was inserted into the pEGFP-C1 vector (Clontech) at 5′ BglII and 3′ SalI sites by releasing the full-length gene from the Flag-tagged SH2D3C construct by BamHI and SalI double digestion. GST-tagged SH2D3C was inserted into the modified pEBG-GST vector (pEBG-GST2) at 5′ BamHI to 3′ SalI sites by releasing the full-length gene from the Flag-tagged SH2D3C construct by BamHI and SalI double digestion. All GFP-tagged DLG1 constructs were from Dr. Craig Garner (Wu et al. 1998). All constructs were confirmed by DNA sequencing.

Cell culture and transient transfection

Human embryonic kidney 293T cells were maintained in Dulbecco’s modified Eagle’s medium (Invitrogen) supplemented with 10% fetal calf serum, 100 units/mL penicillin G, and 100 μg/mL streptomycin sulfate at 37°C and 5% CO2. One day before transfection, 1.5 × 10⁶ 293T cells were plated in 2 mL of DMEM medium per well in a six-well plate. The cells were transiently transfected with the indicated amount of plasmids by using Lipofectamine 2000 (Invitrogen) according to the manufacturer’s instructions.
Twenty-four hours later, cells were rinsed in ice-cold PBS and lysed with rotation for 30 min at 4°C in modified RIPA buffer (50 mM Tris-HCl, pH 7.6, 150 mM NaCl, 1% Triton-X-100, 0.1% SDS, 5 mM EDTA, 0.5% sodium deoxycholate, 10 mM sodium fluoride, 1 mM sodium vanadate, 0.4 mM PMSF, and protease cocktail [1 tablet for 10 mL buffer, Roche]), and insoluble materials were removed by centrifugation at 14,000 rpm for 15 min. Jurkat cells (1 × 10⁷) were electroporated with 15 μg of GST vector or GST-SH2D3C. Twenty-four hours post-electroporation, cells were lysed in 1% Triton lysis buffer, and GST pull-downs were probed for endogenous CARD11. For immunoprecipitation, cell lysates were incubated with appropriate specific antibodies for 3 h at 4°C and subsequently mixed with antibody affinity gel (goat affinity-purified antibody to mouse IgG [ICN Pharmaceuticals]) for an additional 90 min at 4°C. The immunoprecipitates were washed three times with modified RIPA buffer. The immunoprecipitated proteins and total cell lysates were resolved by SDS-PAGE, transferred to Immobilon-P transfer membranes (Millipore), and immunoblotted with the indicated antibodies. Horseradish peroxidase-conjugated anti-mouse or anti-rabbit antibodies (DakoCytomation California) were used as secondary reagents. Detection was performed by enhanced chemiluminescence with the Western Lightning Chemiluminescence Reagent (PerkinElmer Life Sciences).

Acknowledgments

We thank Dan Podolsky, Joe Avruch, Vamsi Mootha, Brian Seed, and Frederick Alt for their support and helpful review of the manuscript. We thank Julio Bernabe-Ortiz for his work on the protein–protein interaction assays. We thank A.I. Su for his helpful discussions. C.G. is supported by a Crohn’s and Colitis Foundation Research Fellowship Grant. R.X. is supported by the Faculty Development Fund (GI unit) and the CCFA.

Footnotes

[Supplemental material is available online at www.genome.org.] Article published online before print.
Article and publication date are at http://www.genome.org/cgi/doi/10.1101/gr.5285206 References