Conserved Domains and Protein Classification
 
 
 
Search Methods:  Quick Start Guide
 
 

Text Term Search

Retrieve conserved domain records that contain one or more terms of interest (e.g., chloride channel). Enter the terms in the query box at the top of this page or on the Entrez Conserved Domains Database (CDD) home page. See the help document for search tips, including a list of available search fields and examples of their use.
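For scripted use, a comparable term search can be sketched with the Entrez E-utilities ESearch endpoint. The example below only builds the request URL and sends nothing. It assumes the Entrez database name `cdd`; the function name `build_cdd_search_url` and the `retmax` default are illustrative choices and do not come from this guide.

```python
from urllib.parse import urlencode

# Public E-utilities ESearch endpoint.
ESEARCH_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_cdd_search_url(term: str, retmax: int = 20) -> str:
    """Return an ESearch URL that looks up `term` in the CDD Entrez database.

    `retmax` caps the number of record UIDs returned; 20 is an arbitrary
    default chosen for this sketch.
    """
    params = {"db": "cdd", "term": term, "retmax": retmax}
    # urlencode percent-encodes the values and writes spaces as '+'.
    return f"{ESEARCH_BASE}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_cdd_search_url("chloride channel"))
```

Opening the printed URL in a browser, or fetching it with any HTTP client, should return an XML list of matching record UIDs. Consult the E-utilities documentation for usage guidelines before sending requests in bulk.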
 

Protein or Nucleotide Query Sequence

Enter a protein or nucleotide query as an accession or GI number (e.g., AAC50285 or 463989), or as a sequence in FASTA format, to identify the protein's conserved domains and therefore its putative function:
The text box above provides a shortcut to the CD-Search tool, using its default parameters. The help document provides details about database selection and search results. To view or change the default parameters, or to use advanced search options, enter your query directly on the CD-Search home page. (Note: If a sequence of interest is already in the Entrez Protein database, you can simply follow the "Conserved Domains" link for that sequence record to view its pre-computed CD-Search results.)
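The three accepted query forms (accession, GI number, FASTA sequence) can be told apart before submission with a small check. This is a minimal sketch, not part of CD-Search itself: the function name `classify_query` is hypothetical, and it assumes that a FASTA query starts with a ">" definition line and that a GI number consists only of digits.

```python
def classify_query(query: str) -> str:
    """Guess which kind of CD-Search query a string is.

    Returns "fasta", "gi", or "accession". The rules are assumptions made
    for illustration: FASTA input begins with '>', a GI number is all
    digits, and anything else is treated as an accession.
    """
    text = query.strip()
    if not text:
        raise ValueError("query is empty")
    if text.startswith(">"):
        return "fasta"
    if text.isdigit():
        return "gi"
    return "accession"


if __name__ == "__main__":
    for example in ("AAC50285", "463989", ">my_protein\nMKTAYIAKQR"):
        print(repr(example), "->", classify_query(example))
```

Bare sequence data without a ">" line would be labelled as an accession by this sketch, so a real script would need a stricter test for that case.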
 
 

Batch of Protein Sequences

The Batch CD-Search tool allows the computation and download of conserved domain annotation for large sets of protein queries. Input up to 100,000 protein query sequences as a list of sequence identifiers and/or raw sequence data, then download output in a variety of formats (including tab-delimited text files) or view the search results graphically. See the help document for additional details, including information on using Batch CD-Search for scripted data downloads.  
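Because a single Batch CD-Search submission accepts up to 100,000 queries, a larger identifier list has to be divided before upload. The sketch below shows one way to do that. The names `split_into_batches` and `MAX_QUERIES_PER_BATCH` are illustrative, and the limit is the figure quoted in this guide, which may have changed since.

```python
from typing import Iterator, List, Sequence

# Per-submission limit as stated in this guide.
MAX_QUERIES_PER_BATCH = 100_000


def split_into_batches(
    ids: Sequence[str], limit: int = MAX_QUERIES_PER_BATCH
) -> Iterator[List[str]]:
    """Yield consecutive slices of `ids`, each holding at most `limit` items."""
    if limit < 1:
        raise ValueError("limit must be at least 1")
    for start in range(0, len(ids), limit):
        yield list(ids[start:start + limit])


if __name__ == "__main__":
    # Placeholder identifiers, not real accessions.
    queries = [f"query_{i}" for i in range(250_001)]
    for number, batch in enumerate(split_into_batches(queries), start=1):
        print(f"batch {number}: {len(batch)} queries")
```

Each yielded list can then be written to its own file and submitted as a separate job.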
 

Direct fetch via UID

Retrieve a conserved domain record directly from the backend database by entering its unique identifier (UID), in the form of an accession (e.g., cd00400) or PSSM ID (e.g., 79359), in the text box below:
(Note: The "text term search" function also accepts either type of UID, but it first searches the Entrez indices for the UID and then retrieves the record. The "direct fetch via UID" option bypasses the Entrez indices and simply retrieves the specified record.)
 
 

Find proteins with similar domain architectures

Enter a protein query as an accession or GI number (e.g., AAC50285 or 463989), or as a sequence in FASTA format, on the Conserved Domain Architecture Retrieval Tool (CDART) page to find other proteins with similar domain architectures. (Note: If a sequence of interest is already in the Entrez Protein database, you can simply select "Domain Relatives" in that sequence record's "Links" menu to find proteins with similar architecture.)  
 
 


Step-by-step guides showing how to:
 

  Identify the putative function of a protein query sequence with the CD-Search tool  

Thumbnail image of a CD-Search concise display, which shows only the top-scoring conserved domain hits for each region of the query sequence (1CYG_A, Cyclodextrin Glucanotransferase). Click on the image to jump to a larger, annotated version in the CD-Search guide on how to identify the putative function of a protein sequence with CD-Search.



  Identify amino acids involved in binding and catalysis  


Thumbnail image of the small triangles displayed in CD-Search results.  The triangles point to specific residues involved in conserved features, such as binding and catalytic sites, as mapped from a conserved domain to the query protein sequence (NP_081086, mouse DNA mismatch repair protein Mlh1). Click on image to jump to a larger, annotated version in the CD-Search help document.



  Find proteins with similar domain architecture using CDART  

Thumbnail image showing the domain relatives for a protein query sequence (NP_081086, mouse DNA mismatch repair protein Mlh1). Domain relatives are protein sequences that contain one or more of the conserved domains found in the query sequence. Click on the image to open the CDART help document for more information about the tool.

 
 
 Revised 18 November 2014