Copyright: © 2006 Lagerström et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

The G Protein–Coupled Receptor Subset of the Chicken Genome

Department of Neuroscience, Uppsala University, Uppsala, Sweden
Philip Bourne, Editor
University of California San Diego, United States of America
* To whom correspondence should be addressed. E-mail: robert.fredriksson/at/neuro.uu.se
¤ Current address: Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
Received January 26, 2006; Accepted April 6, 2006.

Abstract

G protein–coupled receptors (GPCRs) are one of the largest families of proteins, and here we scan the recently sequenced chicken genome for GPCRs. We use a homology-based approach, utilizing comparisons with all human GPCRs, to detect and verify chicken GPCRs from translated genomic alignments and Genscan predictions. We present 557 manually curated sequences for GPCRs from the chicken genome, of which 455 were previously not annotated. More than 60% of the chicken Genscan gene predictions with a human ortholog needed curation, which drastically changed the average percentage identity between the human–chicken orthologous pairs (from 56.3% to 72.9%). Of the non-olfactory chicken GPCRs, 79% had a one-to-one orthologous relationship to a human GPCR. The Frizzled, Secretin, and subgroups of the Rhodopsin families have high proportions of orthologous pairs, although the percentage of amino acid identity varies. Other groups show large differences, such as the Adhesion family and GPCRs that bind exogenous ligands.
The chicken has only three bitter Taste 2 receptors, and it also lacks an ortholog to human TAS1R2 (one of three GPCRs in the human genome in the Taste 1 receptor family [TAS1R]), implying that the chicken's ability and mode of detecting both bitter and sweet taste may differ from the human's. The chicken genome contains at least 229 olfactory receptors, and the majority of these (218) originate from a chicken-specific expansion. To our knowledge, this dataset of chicken GPCRs is the largest curated dataset from a single gene family from a non-mammalian vertebrate. Both the updated human GPCR dataset and the chicken GPCR dataset are available for download.

Synopsis

Man and chicken are very different, but how is that difference related to our respective gene repertoires? The authors studied the family of G protein–coupled receptors (GPCRs), which in man contains about 791 proteins. These are found in the cell membrane, where they recognize substances, thereby functioning as mediators of signals across the cellular membrane. GPCRs respond to physiologically important substances such as hormones and neurotransmitters. In this paper, the publicly available genomic sequence from the domestic chicken is used to identify the entire repertoire of GPCRs in this species. The authors found 557 GPCRs and compared the chicken and human receptors; they concluded that out of the 328 chicken receptors that are not involved in olfaction, more than 250 have a corresponding human receptor. The majority of the differences between the chicken and man are within three groups of GPCRs: the receptors for olfaction, the receptors for bitter taste, and the receptors involved in the immune system. The chicken GPCR sequences obtained here will be useful for identification of GPCRs in other species that are more distantly related to man, such as fish or insects. The domestic chicken represents the leading experimental model among the avian species and also serves as an important source of food worldwide.
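The headline counts above can be cross-checked with simple arithmetic. The sketch below is illustrative only: the variable names are hypothetical, and it uses nothing beyond the totals quoted in the abstract (557 GPCRs, 229 olfactory receptors, 79% one-to-one orthologs among the non-olfactory receptors).

```python
# Counts quoted in the abstract (variable names are illustrative).
total_chicken_gpcrs = 557
olfactory_receptors = 229
one_to_one_fraction = 0.79  # share of non-olfactory GPCRs with a single human ortholog

# Non-olfactory receptors are what remains after removing the olfactory ones.
non_olfactory = total_chicken_gpcrs - olfactory_receptors

# Approximate number of one-to-one human-chicken pairs implied by the 79% figure.
approx_orthologous_pairs = round(one_to_one_fraction * non_olfactory)

print(non_olfactory)             # 328
print(approx_orthologous_pairs)  # 259
```

The result, 328 non-olfactory receptors of which roughly 259 pair one-to-one with a human receptor, agrees with the synopsis statement that more than 250 of the 328 have a corresponding human receptor.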
Introduction Several vertebrate genomes have been sequenced since the release of the first draft sequence of the human genome in 2001 [1,2], but the first project to fill the large evolutionary gap between mammals and fish was the completion of the genome of the red jungle fowl, Gallus gallus, released in December 2004 [3]. Besides bridging the gap between mammals and other vertebrates, the chicken represents the leading experimental model among the avian species and serves also as an important source of food worldwide. The initial chicken genome annotation suggested between 20,000 and 23,000 protein-coding genes [3], which is similar to what has been estimated for the human genome [4,5]. It is, however, well known that automatic annotations of new genomes are error-prone, and tremendous work is left in annotation of the different protein families. The prediction of coding sequences of multi-exon genes is complicated, and one of the best programs, Genscan, has a sensitivity and specificity of about 90% for detecting exons, meaning that the average gene with about ten exons is very likely to have at least one exon wrongly predicted. Moreover, it has been estimated that the number of exons that have both splice sites correctly predicted by Genscan is as low as only 19% [6]. This shows that the likelihood of correctly predicting multi-exon genes is fairly low. This has, of course, a substantial impact on subsequent analysis such as phylogeny and calculations of evolutionary distances, when gene predictions are used instead of curated full-length genes. The superfamily of G protein–coupled receptors (GPCRs) is one of the largest families of proteins in the human genome [1,2] and probably also in most other vertebrate species [7]. The GPCR family has approximately 800 members in man, and these participate in numerous important physiological functions and are also targets for many therapeutic drugs. 
Their natural ligands are particularly diverse and include ions, organic odorants, amines, peptides, proteins, lipids, nucleotides, and photons, all of which are able to activate GPCRs. The name GPCR indicates that these receptors interact with G-proteins, but the main common characteristic of GPCRs is seven stretches of about 20–35 consecutive amino acid residues that show a high degree of hydrophobicity and represent α-helices that span the plasma membrane. The complete repertoire of GPCRs has been analyzed for several genomes, such as the human [8,9], mouse [8], mosquito [10], and pufferfish [11] genomes. The numbers of GPCRs in gene-prediction datasets from 13 eukaryotic genomes, not including the chicken genome, were also recently investigated [7]. Most of the genome-wide analyses have, however, been performed on raw gene predictions, which makes reliable phylogenetic analyses impossible. Currently, only the repertoire of GPCRs in the human [8,9] and mouse [8] genomes has been analyzed using curated-sequence datasets. Both physiological and structural features have been used to classify GPCRs [12–14]. These classification systems were constructed before the completion of human and other vertebrate genomes and hence did not classify atypical receptors not yet identified, such as most of the Adhesion (long N-termini–transmembrane-7 [LN-TM7]) family receptors [15] and the bitter-taste receptors. In addition, these classification systems did not phylogenetically subdivide the large Rhodopsin family, which has approximately 660 members in humans, into groups. Recently, we performed large-scale systematic phylogenetic analyses, including the majority of the GPCRs in the human genome [9]. This provided us with the GRAFS system showing five main families of GPCRs named Glutamate (G), Rhodopsin (R), Adhesion (A), Frizzled/Taste2 (F), and Secretin (S). Moreover, we subdivided the large Rhodopsin family into four groups: α, β, γ, and δ.
The grouping was performed with strict phylogenetic criteria where some atypical human receptors were placed into a group designated Other. In this paper, we scan the recently sequenced chicken genome for GPCRs, using a multitude of methods to obtain a nearly complete set of chicken GPCRs. We manually edit and verify, i.e., curate, the coding regions of each of the GPCRs (557 in total), to provide the first high-quality collection of GPCR sequences from the full genome of a non-mammalian species. We perform maximum-likelihood phylogenetic analysis on these chicken GPCRs together with 750 human GPCRs, which is an updated version of the dataset used previously ([9]; unpublished data), which comprises the entire human GPCR family. We present a strategy of detecting and verifying genes from a genomic assembly and evaluate the accuracy of Genscan predictions in relation to our homology-based gene-prediction approach. Results Identification and Verification of Chicken GPCRs Initially, 102 known chicken GPCRs from the nr (http://www.ncbi.nlm.nih.gov/BLAST/Blast.cgi) database at NCBI were identified using BLASTP with all human GPCRs as baits [9]. In Figure 1
Furthermore, three additional steps were taken to identify GPCR sequences that were not represented among the Genscan predictions. In step 2, we aligned all human GPCRs as well as 505 putative chicken GPCRs against the chicken genome using TBLASTN. All genomic positions that were hit by a human GPCR, but not by a chicken GPCR, were manually inspected for additional new chicken GPCRs. Only the best hit from each search was considered. Step 2 identified 25 new chicken GPCRs. In step 3, we first produced neighbor-joining phylogenetic trees for all chicken and human GPCRs. Subsequently, in all cases where a one-to-one orthologous relationship between a human and a chicken sequence was not present, we manually identified all human sequences that did lack an orthologous chicken sequence and used these to search the chicken genome using TBLASTN. All genomic positions containing a human hit but not a chicken GPCR were then manually inspected for additional new chicken GPCRs. This resulted in the identification of 22 additional chicken GPCRs. In step 4, we used a strategy similar to that used in step 2, but here we used a limited number of GPCRs as baits and inspected all genomic positions, not only the top scoring, for additional new chicken GPCRs. This step identified 18 new chicken GPCRs. In total, 455 new chicken GPCRs were identified. The chicken GPCR dataset was divided into Adhesion, Frizzled, Glutamate, Secretin, Rhodopsin, and Taste 2 families based on the human dataset. The large Rhodopsin family was further subdivided into α, β, γ, δ, and Olfactory. Table S1 describes the step in which receptors from the different families were identified. The Accuracy of Genscan Predictions It has to be noted that in this process one crucial, but tedious, step was the manual editing/assembly/verification step. 
Here, wrongly predicted regions were excluded, new splice sites were selected to correct frame-shifted regions of the protein, and the most likely start codon was selected based on alignment with the human orthologous protein. In addition, missing regions were identified by translated alignments against the genomic regions in cases where comparison with the human genome indicated that sequence was expected to be found. To illustrate the importance of this process, we performed pairwise alignments between the corrected protein and the initial Genscan prediction for all non-olfactory sequences with a human ortholog from step 1. From each of these alignments, the percentage identity between the sequences was calculated, on a scale where 100 means completely identical and 0 means no identity. The result of this comparison can be seen in Figure 2.
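To make the identity measure concrete, the following is a minimal sketch of how a percentage identity can be computed from a pairwise alignment. The function name, the gap symbol, and the choice of denominator (all alignment columns in which at least one sequence has a residue) are assumptions made for illustration; the passage above does not state which denominator was used, and other conventions give somewhat different values.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of identical residues between two aligned sequences.

    Both inputs must come from the same pairwise alignment, so they have
    equal length. Columns where both sequences carry a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    columns = 0   # columns with a residue in at least one sequence
    matches = 0   # columns where both sequences carry the same residue
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap and res_b == gap:
            continue
        columns += 1
        if res_a == res_b:
            matches += 1

    return 100.0 * matches / columns if columns else 0.0


# Toy example (made-up sequences): a curated protein versus a prediction
# that misses one residue and differs at the last position.
curated = "MKTQAV"
predicted = "MKT-AL"
print(round(percent_identity(curated, predicted), 1))  # 66.7
```

A prediction identical to the curated protein scores 100, while a prediction that drops or mis-predicts an exon loses identity in proportion to the affected columns.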
When comparing species, one factor that is often discussed is the sequence identity of orthologous genes, as this gives a measure of the evolutionary distance between the two species.

Phylogenetic Analysis

Phylogenetic analysis was performed by first calculating neighbor-joining trees for each of the ten groups described in Table 1 (all except Other), and then mapping maximum-likelihood branch lengths onto the neighbor-joining topology using TreePuzzle. The topology for the Adhesion tree was calculated using maximum parsimony. The naming of the chicken receptors follows the guidelines of CHICKBASE hosted at the Roslin Institute (http://www.thearkdb.org) (see Figure 3).
In addition to the ten subgroups presented in the phylogenetic trees, 19 chicken GPCRs could not be classified into any of these subgroups. This is similar to the case with the human GPCRs, where 26 receptors could not be placed into any of the phylogenetic groups, and these are therefore placed into a group designated Other. It is important to note that the receptors in this group are, in general, not related to each other, although a few of these do have high sequence identity to one other receptor from the Other group. To assess possible sequence relationships between these and the human GPCRs, a BLAST database was built from the protein sequences of all the human GPCRs, and the 19 chicken GPCRs were compared against the database using BLASTP. In Table 2, we present the results from this analysis. It is interesting to note that one chicken GPCR, here designated ggNOVEL26, appears to lack clear similarity to any human GPCR using BLASTP alignments (cut off E-value > 10). We chose to include this clearly atypical GPCR as it shows similarity to the 7tm_1 model, the model derived from the Rhodopsin family of GPCRs, using the online version of RPS-BLAST. In step 4 (Figure 1
Figure 4 In group IV, the ortholog to human cadherin EGF LAG seven-pass G-type receptor 2 (CELSR2) is missing, while groups III and V are identical between the species regarding gene content. Group VI apparently has expanded in mammals or, more specifically, in humans. Chicken ggGPR111/115 may be the common ancestor of human GPR111 and GPR115 based on chromosomal localization and phylogenetic relationship. GPR111 and GPR115 are positioned in close proximity on human Chromosome 6p12.3 and may therefore be the result of a local duplication from the common ancestor GPR111/GPR115, still present as a single copy in the chicken. In group VII, there is a one-to-one relationship between the human and the chicken, while group VIII lacks a chicken ortholog to human GPR56. Figure 4 The Taste 1 receptor family (TAS1R) consists in the human genome of three different GPCRs—TAS1R1, TAS1R2, and TAS1R3—which function as protomers in heterodimeric complexes [21–23]. Intriguingly, chickens appear to lack an ortholog to human TAS1R2. We also searched the chicken EST databases (http://www.chick.umist.ac.uk) for mRNA sequences representing the TAS1R2 receptor, but were unable to identify any sequence representing this receptor. The signaling combination TAS1R1/TAS1R3 is responsible for the l-glutamate taste (umami), whereas TAS1R2/TAS1R3 detects sweeteners [22,24]. The loss of TAS1R2 in the chicken was confirmed while this manuscript was under preparation [25]. The human genome contains a group of retinoic acid–inducible genes (RAIGs) or G protein–coupled receptor family C (GPRC) GPRC5A–5D [26–29]. The genes for these orphan receptors are well conserved in the chicken, although human GPRC5A and GPRC5D may represent a duplication in mammals of GPRC5A/5D that we found in chicken. This scenario is also strengthened by their localization in close proximity on the human Chromosome 12p13.2–p13.1, which is a syntenic region to chicken Chromosome 1 where ggGPRC5A/D is positioned [30]. 
The chicken appears to lack the ortholog for the human GABABR (gamma aminobutyric acid–binding receptor) 1. Studies have shown that human GABABR2 is unable to function without the GABABR1 unit, which is responsible for the ligand binding [28,31,32]. No EST sequences could be found for the GABABR1 receptor in the chicken (http://www.chick.umist.ac.uk); however, a search with the human GABABR1 revealed a short stretch of 23 identical amino acids matching an unlocalized chicken scaffold (chrUn: 96630977–96631045), which may represent a part of a GABABR1 ortholog, partially missing in this assembly. If this is not the case, GABABR receptors in the chicken could function in a very different mode compared with GABABR receptors in mammals, which would be interesting to explore further. The Frizzled receptor family (FZD; Figure 4 The human genome contains 25 functional human Taste 2 receptors (T2Rs) responsible for detecting bitter-tasting compounds in addition to ten pseudo genes [35]. However, the chicken genome contains only three T2Rs (Figure 4 Figure 4 The Rhodopsin family of GPCRs has previously been divided into α-, β-, γ-, and δ-groups based on phylogeny [9]. The α-group consists of serotonin, dopamine, adrenergic, muscarinic, histamine, and trace amine receptors (TARs); endothelial differentiation G protein–coupled receptors (EDGRs); melanocortin, prostaglandin, and opsin receptors; and several orphan GPCRs. In humans, this group contains a total of 101 receptors, while we found 92 in the chicken genome (Figure 5 The adenosine receptors (ADORAs) are relatively well conserved; however, human ADORA2B appears to be represented by two subtypes in the chicken, giving rise to ggADORA2Ba and ggADORA2Bb, where ggADORA2Ba appears to be evolving rapidly. All four chicken ADORAs have previously been cloned. 
In the databases, two unique sequences are named adenosine receptor 2B and, to avoid confusion, we have renamed those ggADORA2Ba and ggADORA2Bb according to our phylogeny. The chicken histamine receptor family (HRH) consists of ggHRH1, ggHRH2, and three human HRH3-like subtypes: ggHRH3, ggHRH3n1a, and ggHRH3n1b (Figure 5 The chicken genome contains five visual pigment genes, the rod pigment (rhodopsin) [41] and the four cone pigments—iodopsin (red) [42], blue-sensitive opsin, violet-sensitive opsin, and green-sensitive opsin [43]. As previously described, the human OPN1LW (red) and OPN1MW (green) are the result of a local duplication on the X chromosome [44]. Besides these already known genes, we identified a novel opsin-related gene in the chicken: ggOPSINn1 localized basally in the opsin-cluster (Figure 5 The TAR family has been subjected to different evolutionary events in different species; for example, several subtype-specific expansions both in zebrafish and rodent genomes. The human genome contains TAR1, TAR3, TAR4, TAR5, and PNR, while TAR2 is a pseudogene [45]. However, the TAR2 subtype seems to be present as a functional receptor both in rodents and chickens (Figure 5 The β-group in the Rhodopsin family consists of 43 human receptors (unpublished data) and 45 potential chicken receptors (Table 1; Figure 5 The chicken genome contains six neuropeptide Y receptor (NPYR) subtypes compared to four subtypes in human [49,50]. Besides NPY1R, NPY2R, NPY4R (PPYR1), and NPY5R—which both genomes contain—the chicken genome also contains NPY6R, homologous to human pseudogene npy6r, and NPY7R, homologous to zebrafish Npy7r [50,51]. NPY7R is most closely related to NPY2R, based on phylogeny, and appears to be an ancient relative to this gene, which is missing in mammals. Although npy6r is a pseudogene in the human, as well as in many other mammals, it is present as a functional receptor in the mouse, but appears to be missing completely in the rat [52]. 
Previous phylogenetic analysis of PRLHRs from fish, chicken, and mammals show that these receptors group with NPYRs with high bootstrap support [53]. Based on chromosomal localization and phylogenetic relationship, it was suggested that the tetraploidization events generated four copies of the ancient PRLHR gene—PRLHR1, PRLHR1b, PRLHR2, and PRLHR2b—which were differentially retained in different genomes. The mammalian genomes appear to contain only the PRLHR1 subtype, while the chicken genome contains PRLHR1, PRLHR1b, and PRLHR2, and the teleost genome from pufferfish contains the PRLHR1b and the PRLHR2b subtypes [53]. The chicken genome is missing the ortholog for the human orexin 1 receptor (HCRTR1), while the endothelin receptor family and the orphan GPR73 contain more subtypes than do the human families. The γ-group contains receptors such as the chemokine, opioid, galanin, and somatostatin receptors [9]. The chicken genome contains 46 potential γ-group GPCRs, while the human genome contains 64 (Table 1; unpublished data). We expanded this group with 29 novel potential chicken GPCRs (Table 1). The main differences are in the chemokine receptor family, which has expanded in mammals, and in the formyl peptide-binding receptor family, which is missing in the chicken (Figure 5 Compared to the human genome, the chicken genome apparently has lost several receptors from the CXCR subclass (Figure 5 The somatostatin receptors 1–5 display a clear one-to-one orthologous relationship (Figure 5 The fourth Rhodopsin family group, the δ-group, contains the olfactory receptors, the nucleotide-binding receptors, the glucoprotein receptors, several orphan receptors, and the mas-related G protein–coupled receptor (MRG) cluster 9 (Figure 5 The second case regards the MRG cluster which, in the human genome, contains one mas-1 oncogene receptor gene and nine MRGs [61]. 
The chicken genome contains only one clear ortholog to this family, ggMAS (Figure 5 The third case is the olfactory receptors. The human genome holds 388 functional olfactory receptors [63,64], while the chicken genome contains at least 229. A majority of the chicken olfactory receptors (n = 218) represent an expansion of genes similar to the human 01.01.01/OR5BF1 gene [3] (for nomenclature see [64,65]). However, the eight functional chicken olfactory genes that have been cloned so far—COR1–6, COR7a, and COR7b [66]—are all localized in close proximity on chicken Chromosome 5 and are not part of that expansion. Instead, receptors COR1–6 form a separate sub-tree close to group 11.31.01–11.31.05 of human olfactory receptors, all tightly positioned together on human Chromosome 11 (for phylogenetic tree, see Dataset S1; for nomenclature, see [65]). Based on their phylogenetic relationship, the chicken COR1–6 and human 11.31.01–11.31.05 may share a recent common ancestor. In addition to these six chicken receptors, COR7a, COR7b, and the novel receptors ggOR62, ggOR220, and ggOR221 also position outside the 01.01.01/OR5BF1 gene expansion group. COR7a, COR7b, and ggOR220 group close to human olfactory receptors 11.47.01–11.47.03, while ggOR62 and ggOR221 group close to human receptors 11.44.01 and 15.02.01, respectively. All 229 chicken olfactory receptors represent unique genomic positions in the chicken assembly because Genscan was set to exclude predicted alternative transcripts. All 557 reported chicken receptors can be found in Dataset S2 (description) and Dataset S3 (sequences). Discussion In this paper, we present a collection of 557 manually curated GPCR sequences from the chicken genome. The sequences were obtained through a four-step search procedure with a high degree of manual verification, and it is likely that this dataset contains most of the GPCR sequences present in the current assembly of the chicken genome. 
The aim of both the manual curation and the assembly step is to ensure that all pseudogenes, i.e., genes with coding regions interrupted by stop codons, are excluded and that the exon–intron organizations are correct. It is well established that automatic prediction and annotation of proteins from genomic sequences is highly error-prone [6]. We show that only 62 out of the 158 non-olfactory chicken GPCRs with a clear orthologous relation to a human GPCR were correctly predicted by Genscan (41%) (Figure 2).

It is interesting to note that in a study by Hillier et al. describing the initial annotation of the chicken genome, the overall average sequence identity between 10,094 protein sequences orthologous between the human and the chicken was found to be around 80% [3]. It has to be noted that these sequences are all computer predictions on a draft genome assembly. The percentage identity could be even higher if it were based on curated sequences, using the same reasoning as above. However, the dataset used by Hillier et al. contains only “core orthologs”, i.e., sequences that are conserved between the human, chicken, and Fugu. It is possible that the automatic procedure used to obtain these “core” sequence predictions has enriched the dataset for highly conserved sequences. Our results suggest that the orthologous GPCRs are, in general, less well conserved between the human and the chicken than the average protein. This is also indicated by data presented by Hillier et al., where the gene ontology (GO) category “GPCR-signaling”, containing 323 orthologous pairs, was the 16th least-conserved category at primary-sequence level, out of 20 categories from the biological process GO-tree [3]. According to the definition, this GO-class contains, apart from GPCRs, other proteins associated with the signaling cascade of GPCRs.
Assuming that there are around 250 orthologous pairs representing GPCRs in this category (which is approximately what we identified), other proteins, such as G-proteins, peptides, and enzymes, constitute around 33% of the proteins in this GO category. Many of these proteins, for example the G-proteins, are generally known to be well conserved between species [67,68]. Taken together, these data suggest that GPCRs, even when confounding factors such as gene duplications, expansions, and deletions are considered, evolve more rapidly than most other protein families.

GPCRs constitute 3.2% and 5.2% of the genes in the human and mouse genomes, respectively, considering that both these genomes have about 25,000 protein-coding genes [4,5]. The main difference in the GPCR repertoire between the mouse and the human genomes can be attributed to gene expansions of olfactory receptors in the mouse. The initial estimates of the chicken genome indicate that it also contains about 20,000–23,000 protein-coding genes [3], and considering the number of GPCRs that we find, the overall percentage of GPCRs is lower in chickens, between 2.2% and 2.4%. This difference can be explained by the fact that the chicken has a lower number of olfactory receptors. If the olfactory receptors are excluded, the overall percentage of GPCR genes is 1.65% for the human genome and between 1.32% and 1.43% for the chicken. We have previously shown that the percentages of all protein predictions that are GPCR sequences are, in general, similar in different vertebrates as well as in invertebrates [7]. The only large deviation between species was found to be related to large expansions of certain GPCR families, interestingly always receptors for non-endogenous ligands. Examples are the chemosensory receptors in the nematode (Caenorhabditis elegans), gustatory receptors in insects, and olfactory receptors in the mouse [7].
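The proportions discussed here are simple ratios of GPCR counts to estimated gene totals. The sketch below shows the calculation under stated assumptions: the helper name `gpcr_share` is hypothetical, the chicken total is taken as the upper estimate of 23,000 genes, the human total as 25,000 genes, and the human GPCR count as the figure of about 791 given in the synopsis. All gene totals are estimates, so the outputs are indicative only.

```python
def gpcr_share(gpcr_count: int, total_genes: int) -> float:
    """Return GPCR genes as a percentage of all protein-coding genes."""
    if total_genes <= 0:
        raise ValueError("total_genes must be positive")
    return 100.0 * gpcr_count / total_genes


# Chicken: 557 curated GPCRs, of which 229 are olfactory receptors.
# 23,000 is the upper gene-count estimate quoted in the text (an assumption here).
chicken_all = gpcr_share(557, 23_000)
chicken_non_olfactory = gpcr_share(557 - 229, 23_000)

# Human: about 791 GPCRs (synopsis figure) over about 25,000 genes.
human_all = gpcr_share(791, 25_000)

print(f"{chicken_all:.1f}% {chicken_non_olfactory:.2f}% {human_all:.1f}%")
# prints: 2.4% 1.43% 3.2%
```

With these inputs the sketch gives about 2.4% and 1.43% for the chicken and 3.2% for the human genome, in line with the upper chicken figures and the human figure quoted above; a different assumed gene total shifts the percentages accordingly.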
The overall percentage of GPCRs for endogenous ligands is thus remarkably constant for all the bilateral species investigated in detail so far. Our phylogenetic analysis with all known human and chicken GPCRs is the first detailed comparison between the repertoire of GPCRs in non-mammalian and mammalian species. This analysis shows the orthologous pairs of the chicken and human GPCRs, and we conclude that all the main groups of GPCRs, with the exception of Taste 2 and the olfactory subset of the Rhodopsin γ-group are, in general, well conserved between the human and chicken. The average sequence identity between orthologous pairs of chicken and human proteins is around 73% in their TM regions, but there is a considerable variation in identity between the different families and groups of GPCRs. The Adhesion family displays the lowest percentage identity (68.8%) between orthologous pairs, and this could be due to the fact that the Adhesion GPCRs utilize the TM regions mostly as a membrane-anchor and signal-transmission unit, and not primarily for complex ligand interactions. FZD is well conserved between the human and the chicken, regarding both repertoire and primary sequence (81.4%), which could relate to their important role in basic functions such as controlling cell fate, proliferation, and polarity during metazoan development [33,34]. In fact, FZD is the only GPCR family that is close to the figure of 80% conservation that is reported as the mean value for the entire set of orthologous pairs in the chicken and the human genomes [3]. 
It is also interesting to note that the proteins in the Rhodopsin family β-group appear to evolve more slowly (77.3%) than the other Rhodopsin groups (69.2%–73.8%); this could be due to the fact that all ligands in this family are peptides and that peptide ligands may require more interaction points than smaller non-peptidergic substances—which is likely to conserve the structure and thus the amino acid sequence of the Rhodopsin β-group receptors. According to our phylogenetic analysis, 259 of the 557 chicken receptors have a one-to-one ortholog in the human genome. It is, however, important to note that comparison of only two genomes may provide some wrong conclusions about orthologous relationships in individual cases. This is because deletions of one member in each of two related pairs in both species (double loss) may cause topology that wrongly indicates those genes that are orthologs. We used data from the rodent or fish genomes to clarify in more detail the true phylogenetic relationship in cases where no clear one-to-one orthologous relationships were present—for example TAR2 and NPY6R, which are both pseudogenes in the human but are functional in the mouse. Overall, the orthologous pairs for many of the GPCR groups are, in general, remarkably well conserved between the human and the chicken, despite the fact that this family of proteins appears to evolve relatively fast considering the primary sequence. For example, the Secretin and the Frizzled families display a one-to-one human–chicken orthologous relationship for all but one protein in each of the two families (Figure 4 Olfaction is mediated by GPCRs expressed in the olfactory epithelium, and it is one of the major neurosensory functions by which vertebrates such as humans and chickens investigate their external chemical environment [69]. The 388 functional olfactory receptors in the human genome can be divided into class I and class II based on phylogenetic criteria [63,64]. 
The chicken genome contains at least 229 potentially functional class II olfactory receptors, while class I receptors appear to be missing. Class I olfactory receptors are present both in teleosts and in mammals, and have long been considered to recognize water-soluble odorants, while class II receptors mediate the effects of airborne odorants [64,70]. A majority of the chicken olfactory receptors, 218 genes, represent an expansion of genes most similar to the human 01.01.01/OR5BF1 gene in class II [3] (for nomenclature, see [64,65]). It is not known which ligands these novel proteins recognize, but it is likely that these ligands are some kind of volatile airborne substance. Several studies have addressed the deficient homing ability in anosmic birds; these studies indicate that birds use the olfactory system for navigation, particularly in unknown terrain [71–73]. Birds apparently also use the olfactory system for discriminating between individuals and finding their nests, for finding food, and for avoiding toxic insects and dangerous predators [74–76]. One possibility could be that the large number of class II olfactory receptors could be involved in such functions. The gustatory system in humans can detect and differentiate between hundreds of compounds, allowing us to avoid toxic compounds and to select nutritious food [77]. Three of the five taste modalities—sweet, bitter, salt, sour, and l-glutamate (umami)—are mediated through GPCRs. Sweet and umami are mediated by the TAS1Rs which, in humans, consist of three different GPCRs—TAS1R1, TAS1R2, and TAS1R3 [21–23]—while bitter taste is mediated by the Taste 2 receptor family (TAS2R) [35–37]. The TAS1Rs function as protomers in heterodimeric complexes [21–23], where the dimer complex between TAS1R1 and TAS1R3 is responsible for the l-glutamate taste (umami), whereas the combination of TAS1R2 and TAS1R3 detects sweeteners [22,24]. Intriguingly, chickens appear to lack an ortholog to human TAS1R2. 
This may imply that the chicken's mode of detecting sweet taste differs from that of humans, since the TAS1R2 unit, which is missing in the chicken, is the interaction point for sweet-tasting compounds such as aspartame and neotame, while the TAS1R3 unit is responsible for the intracellular signaling [78]. However, sweet compounds such as lactisole, brazzein, and cyclamate have been shown to interact directly with the TAS1R3 unit [78–80], which could imply a sweet-detecting ability despite the lack of a TAS1R2 unit. The human genome contains 25 functional genes that code for T2Rs, which are responsible for detecting bitter-tasting compounds [35]. Intriguingly, the chicken genome contains only three bitter-tasting T2Rs (Figure 4).

Humans are trichromatic, i.e., they have the ability to discriminate between three different colors (wavelengths), while the chicken is tetrachromatic. The chicken genome contains five visual pigment genes: the rod pigment (rhodopsin) [41] and the four cone pigments, which are iodopsin (red) [42], blue-sensitive opsin, violet-sensitive opsin, and green-sensitive opsin [43]. Besides these already known genes, we have identified a novel opsin-related gene in the chicken, ggOPSINn1, located basally in the phylogenetic opsin cluster (Figure 5).

In summary, we scanned the recently sequenced chicken genome for GPCRs to obtain manually edited and verified coding regions of a total of 557 GPCRs. To our knowledge, this provides the first high-quality collection of GPCR sequences from a full genome of a non-mammalian species. Our phylogenetic analysis on the curated chicken GPCRs, together with 750 human GPCRs, clarifies the differences between the GPCR repertoires that may relate to the functional differences between these two species. Our curated GPCR dataset from the chicken genome could serve as a basis for annotating this important protein family in other vertebrates, as well as in invertebrates.

Materials and Methods

Generation of a Genscan dataset.
A set of in silico–predicted chicken genes was obtained from the February 2004 genome assembly by the following procedure. The chromosomal files were divided into smaller files of 2 MB. We scanned each of these files with Genscan [84], using the human parameter file HumanIso.smat, because no Genscan dataset was available for download at that point. Thereafter, we gathered all the complete predictions into a unique set of predicted chicken genes.

Identification of GPCRs from the Genscan dataset using BLASTP (step 1).

An in-house GPCR dataset consisting of 403 human non-olfactory receptors ([9,15,85–87]; unpublished data) was searched against a database built from the chicken Genscan dataset using BLASTP. All hits with an E-value better than 0.1 were extracted into a temporary file. From this file, all duplicates, based on the Genscan number, were automatically removed using a custom-made C++ program (available upon request). From the resulting single-copy set, the sequences that were true GPCRs were extracted by searching them against a database consisting of the entire RefSeq database [88], all human non-olfactory GPCRs, and 347 human olfactory GPCRs [65]. The chicken Genscan sequences that did not hit any of the human GPCRs among the five top hits in a BLASTP search with a cut-off at E = 10 were classified as non-GPCRs and removed. This resulted in 870 putative predicted chicken GPCRs. These were tentatively annotated by searching every hit against a database consisting of all human GPCRs using BLASTP with a cut-off at E = 10, with subsequent naming according to the most significant human hit with a gg (for G. gallus) prefix. During manual editing, we found that many chicken GPCRs had to be built from several (sometimes more than 30) different predictions, each of which contained only a small part of the final chicken GPCR sequence. After manual editing, 390 chicken GPCR sequences remained.

Identification of GPCRs from the chicken genome using TBLASTN (step 2).
A set of chicken GPCRs that were not found in the Genscan dataset was obtained in the following way. The genomic position was identified for 505 putative chicken GPCRs by aligning each sequence to the genome assembly with BLAT 3.0 [89] and defining the highest scoring alignment as the position for each gene. In a similar way, the best position in the chicken genome was identified for each human GPCR sequence using translated BLAST (TBLASTN) with a cut-off at 1e−6 [90]. The set of putative new chicken GPCRs was then identified as the positions in the chicken genome where an alignment with a human GPCR was not overlapped by any chicken GPCR. The genomic material aligning to the human GPCR was downloaded, and a final version of each of the chicken GPCRs was manually assembled and edited.

Identification of missing GPCRs from initial phylogenetic analysis (step 3).

An initial phylogenetic analysis was performed as described below, with the longest possible version of all sequences from each family and group, using both neighbor-joining and maximum-parsimony analysis. From these trees, all cases in which the chicken dataset lacked an ortholog present in the human dataset were identified. This resulted in a dataset of human GPCRs consisting of eight Glutamate, 24 Rhodopsin (α), seven Rhodopsin (β), 21 Rhodopsin (γ), 14 Rhodopsin (δ), and three Secretin GPCRs. This dataset, consisting of 77 human GPCRs in total, was searched against the sequence of the entire chicken genome using TBLASTN. All hits with an E-value better than 1e−6 were manually compared against the chicken GPCRs that had so far been collected. All new sequences were collected and subjected to manual assembly and verification.

Identification of residual GPCRs from the chicken genome (step 4).
A set of human GPCRs consisting of one human sequence from each of the Secretin, Glutamate, Adhesion, Taste2, and Frizzled families, together with one sequence from each of the 13 subgroups of the Rhodopsin family [9], was defined. These were combined with all human sequences from the Other group into a dataset of 47 human GPCR sequences. This dataset was searched against the sequence of the entire chicken genome using TBLASTN. All hits with an E-value better than 0.1 were manually compared against the chicken GPCRs that had been collected so far by literal word searches using a bash script. Stretches of 20 amino acids from each hit were used as search words. All new sequences were collected and subjected to manual assembly and verification.

Manual curation of chicken GPCRs.

All tentative chicken GPCRs were manually assembled, corrected, and verified. This was done using EditSeq and MegAlign from the DNASTAR package (DNASTAR, Madison, Wisconsin, United States), EMBOSS [16], ClustalW [91], and the web-based services BLAST [90] and BLAT [89]. For BLAST and BLAT, we also used standalone versions of these tools with local databases. All chicken proteins were edited under the following assumptions. (1) All splice sites are of the canonical (GT-AG) type. (2) The position of the splice sites, and hence the organization of exons and introns, is in general conserved between chicken and human orthologs. (3) When, in a small region, there were several possible splice sites that fulfilled points 1 and 2, the one that gave an amino acid alignment most similar to the human ortholog was chosen. (4) The end of each predicted coding region was chosen as the first stop codon in the correct frame in the last exon. (5) The start codon was chosen as the methionine in the correct frame in the first exon that gave the best alignment to the human ortholog.
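Assumptions (1) and (4) above lend themselves to a simple mechanical check. The following is a minimal sketch, not the procedure used in this study (which was manual): the function names, the 0-based end-exclusive coordinates, and the toy sequence are all hypothetical, and the stop-codon scan runs over the whole spliced coding sequence rather than only the last exon.

```python
# Hypothetical helpers illustrating assumptions (1) and (4); not the authors' tools.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def is_canonical_intron(genomic, intron_start, intron_end):
    """Return True if genomic[intron_start:intron_end] begins with GT and ends with AG."""
    intron = genomic[intron_start:intron_end].upper()
    return len(intron) >= 4 and intron.startswith("GT") and intron.endswith("AG")


def splice(genomic, exons):
    """Join exon slices (0-based, end-exclusive coordinates) into one coding sequence."""
    return "".join(genomic[start:end] for start, end in exons).upper()


def first_in_frame_stop(cds, frame=0):
    """Return the index of the first in-frame stop codon, or None if there is none."""
    for i in range(frame, len(cds) - 2, 3):
        if cds[i:i + 3] in STOP_CODONS:
            return i
    return None


# Toy example: two exons separated by one GT...AG intron.
genomic = "ATGGCC" + "GTAAGTTTCAG" + "AAATGATTT"
exons = [(0, 6), (17, 26)]

intron_ok = is_canonical_intron(genomic, 6, 17)
cds = splice(genomic, exons)
stop_index = first_in_frame_stop(cds)
# Keep the coding region up to and including the stop codon.
coding_region = cds[:stop_index + 3] if stop_index is not None else cds
```

In practice such a check would only flag candidate exon boundaries for inspection; choosing between several valid splice sites (assumption 3) still depends on the alignment to the human ortholog.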
In a few cases, an exon–intron boundary is clearly different between chicken and human and, in these cases, we attempted to identify that exon by translated alignments between the human ortholog and the smallest possible region in the chicken genome that could contain that exon. All genes were corrected and assembled at the DNA level, and in the final step were translated into an amino acid sequence to ensure that the correct reading frame was maintained along the entire coding region.

Semi-automatic verification of the dataset.

To ensure that all sequences identified in this process were truly identical to the chicken genome, all sequences were aligned against the genome using the Windows version of BLAT 3.0 [89]. The resulting psl file was parsed using a Java program, and only the highest scoring alignment was kept. This table was inspected manually, and all alignments with less than 100% identity were identified and manually inspected in more detail. The majority of these turned out to be alignment errors produced by BLAT, something that occurs relatively frequently [92]. The other sequences were corrected unless (1) the sequence was known before, in which case the GenBank sequence (http://www.ncbi.nlm.nih.gov) was used, or (2) it appeared to be a polymorphic site that changes between assembly versions of the chicken genome.

Phylogenetic analysis.

The chicken GPCRs were first divided into families and groups by BLAST searches of each sequence against the human GPCR dataset. The accession numbers for all human sequences used for the phylogenetic analysis can be found in Dataset S4. The olfactory receptor sequences are as described originally [65], and hence the sequences in the NCBI database vary for some, as noted in Dataset S4. The sequences were categorized based on the family identity of the first five human hits. A sequence was placed in Other if the sequence hit two different families or groups.
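The categorization rule just described can be sketched in a few lines. This is an illustrative reading of the rule, not the authors' code: the function name, the input layout (a list of hit identifier and E-value pairs plus a lookup table from human receptor to family), and the choice to send any disagreement among the top five hits to Other are assumptions.

```python
def classify_by_top_hits(hits, family_of, n_top=5):
    """Assign a family from the best human BLAST hits (hypothetical helper).

    hits:      list of (human_id, evalue) tuples for one chicken sequence
    family_of: dict mapping human_id to its family or group label
    Returns the shared family of the n_top best hits, "Other" if those hits
    span more than one family or group, or None if there are no hits.
    """
    ranked = sorted(hits, key=lambda hit: hit[1])  # lowest E-value first
    families = {family_of[human_id] for human_id, _ in ranked[:n_top]}
    if not families:
        return None
    if len(families) == 1:
        return next(iter(families))
    return "Other"


# Toy lookup table and hit lists; identifiers and E-values are made up.
family_of = {
    "hsA": "Secretin", "hsB": "Secretin", "hsC": "Secretin",
    "hsD": "Adhesion", "hsE": "Frizzled",
}
consistent_hits = [("hsB", 1e-40), ("hsA", 1e-80), ("hsC", 1e-12)]
mixed_hits = [("hsA", 1e-30), ("hsD", 1e-25), ("hsE", 1e-3)]

consistent_label = classify_by_top_hits(consistent_hits, family_of)
mixed_label = classify_by_top_hits(mixed_hits, family_of)
```

A stricter or looser rule (for example, a majority vote among the five hits) would change which sequences end up in Other, so the threshold should be treated as a tunable choice.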
The edited and verified chicken and human GPCRs from each group were combined into a FASTA file and aligned using the UNIX version of ClustalW 1.82 [91]. The default alignment parameters were applied. The alignment was bootstrapped 100 times using SEQBOOT from the Win32 version of the Phylip 3.6 package [93], and the same bootstrapped alignment was used for all subsequent calculations.

For neighbor-joining trees, protein distances were calculated on the bootstrapped alignments using PROTDIST from the Win32 version of the Phylip 3.6 package to obtain, in total, 100 distance matrices. The Jones–Taylor–Thornton matrix was used. Trees were calculated on the distance matrices using NEIGHBOR from the Win32 version of the Phylip 3.6 package, resulting in 100 trees. Majority-rule consensus trees were constructed using CONSENSE from the Win32 version of the Phylip 3.5 package. The trees were plotted using TreeView [94].

Maximum-parsimony trees were calculated from the same bootstrapped alignment as used for distance trees with PROTPARS from the Win32 version of the Phylip 3.6 package. The trees were unrooted and calculated using ordinary parsimony, and the topologies were obtained using the built-in tree-search procedure. Consensus trees were calculated and plotted as described above.

For the maximum-likelihood trees, the topology obtained from the maximum-parsimony or neighbor-joining trees was used as a user-defined tree in TreePuzzle [95], and clock-like branch lengths were estimated in TreePuzzle using the following parameters:
Type of analysis: Tree reconstruction; Tree-search procedure: User-defined trees; Compute clock-like branch lengths: Yes; Location of root: Best Place (automatic search); Parameter estimates: Exact (slow); Parameter-estimation uses: 1st input tree; Type of sequence input data: Amino acids; Model of substitution: VT (Mueller–Vingron Model of Substitution, 2000); Amino acid frequencies: Estimate from dataset; Model of rate heterogeneity: Mixed (one invariable plus eight Gamma rates); Fraction of invariable sites: Estimate from dataset; Gamma distribution parameter alpha: Estimate from dataset; Number of Gamma rate categories: eight.

Global pairwise alignments.

Global pairwise alignments for calculation of percentage identity between two sequences were constructed and scored automatically using a bash script that used ClustalW [91] as the alignment engine and infoalign from the EMBOSS 2.8.0 package [16] for scoring, i.e., calculating the percentage of identical amino acids.

All statistical analysis was performed using MiniTab (http://www.minitab.com). Graphs were plotted using Microsoft Excel (http://www.microsoft.com) and MiniTab.

Dataset S1: The Olfactory Phylogenetic Tree in Standard Newick Format (12 KB TXT)

Dataset S2: Table Indicating Completeness and Protein Name of All Chicken GPCRs Identified in This Study. Classification and Step in which They Were Identified in the Search Procedure Are Included (87 KB PDF)

Dataset S3: Amino Acid Sequences of All Chicken GPCRs Identified in This Study in Standard Fasta Format (132 KB PDF)

Dataset S4: Table with Accession Numbers to All Human Protein Sequences Used in This Analysis (19 KB PDF)

Table S1: Summary of the Process of Identification of Chicken GPCRs and of the Steps in which Receptors from the Respective Groups Were Identified (59 KB PDF)

Acknowledgments

We would like to acknowledge Thora K. Bjarnadottir for substantial contributions to the in-house human GPCR dataset.
Footnotes

Author contributions. MCL, DEG, HBS, and RF conceived and designed the experiments. MCL, ARH, DEG, TPL, and RF performed the experiments. MCL, ARH, DEG, TPL, HBS, and RF analyzed the data and wrote the paper. HBS and RF contributed reagents/materials/analysis tools.

Competing interests. The authors have declared that no competing interests exist.

Funding. RF was supported by the Swedish Brain Foundation (Hjärnfonden). The studies were supported by the Swedish Research Council, Svenska Läkaresällskapet, the Åke Wikberg Foundation, the Lars Hiertas Foundation, the Thurings Foundation, the Novo Nordisk Foundation, and the Magnus Bergwall Foundation.