NCBI News

DART Targets Protein Domains

NCBI has strengthened its suite of protein structural analysis tools by introducing the Domain Architecture Retrieval Tool, or DART. Beginning with a protein sequence, DART facilitates searches of the Entrez protein database for protein domain combinations, or architectures.

DART works by first determining the domain architecture of a protein sequence query. It then displays the domain architectures of other proteins that share at least one domain with the query. Figure 1 shows the initial DART report for E. coli DNA polymerase I, an enzyme with three distinct domains: a 5'→3' exonuclease domain, a proofreading 3'→5' exonuclease domain, and a DNA polymerase domain. These three domains of the query protein are identified at the top of the report graphic. Below the query architecture graphic is a list of other domain architectures that share at least one domain with the query, with links to the proteins that have these architectures.
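The "share at least one domain" rule can be pictured with a small Python sketch. This is not DART's implementation. It assumes an architecture can be modeled as an ordered tuple of domain labels, and the protein names and labels in the toy table are hypothetical placeholders rather than real DART or CDD identifiers.

```python
from typing import Dict, List, Tuple

# An architecture is modeled as the ordered tuple of domain labels along a protein.
Architecture = Tuple[str, ...]

# Toy data for illustration only: names and labels are hypothetical,
# not real DART or CDD identifiers.
PROTEINS: Dict[str, Architecture] = {
    "pol_I_like": ("exo_5to3", "exo_3to5", "dna_pol"),
    "wrn_like": ("exo_3to5", "helicase"),
    "pol_only": ("dna_pol",),
    "unrelated": ("kinase",),
}


def group_by_shared_domain(
    query: Architecture, proteins: Dict[str, Architecture]
) -> Dict[Architecture, List[str]]:
    """Group proteins by architecture, keeping only architectures
    that share at least one domain with the query."""
    query_domains = set(query)
    hits: Dict[Architecture, List[str]] = {}
    for name, arch in proteins.items():
        if query_domains & set(arch):  # non-empty intersection = shared domain
            hits.setdefault(arch, []).append(name)
    return hits


query: Architecture = ("exo_5to3", "exo_3to5", "dna_pol")
for arch, names in group_by_shared_domain(query, PROTEINS).items():
    print(" + ".join(arch), "->", ", ".join(names))
```

In this toy table the "wrn_like" entry is retrieved only because it carries the 3'→5' exonuclease label, while "unrelated" is dropped because it shares nothing with the query.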

Figure 1: DART display for E. coli DNA polymerase I.


Figure 2: List of domains shown in DART display of Figure 1. Checking the box beside a domain constrains subsequent DART searches to seek this domain.


Following the graphic output is a list of domain identifiers and descriptions, as shown in Figure 2. One or more domains can be selected for use as a query for a second round of searching. Hence, the first-round search seeks any of the domains found in the query, while subsequent rounds can be tailored to find only the selected domains.
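The difference between the two rounds can be sketched in Python as follows. This is an illustration under stated assumptions, not DART's actual logic: the toy proteins and domain labels are hypothetical, and the sketch assumes that when several boxes are checked, a hit must contain all of the selected domains (the article does not say how multiple selections are combined).

```python
from typing import Dict, List, Set, Tuple

Architecture = Tuple[str, ...]

# Toy data for illustration only: hypothetical names and labels.
PROTEINS: Dict[str, Architecture] = {
    "pol_I_like": ("exo_5to3", "exo_3to5", "dna_pol"),
    "wrn_like": ("exo_3to5", "helicase"),
    "pol_only": ("dna_pol",),
}


def first_round(query_domains: Set[str], proteins: Dict[str, Architecture]) -> List[str]:
    """First round: keep proteins containing ANY domain found in the query."""
    return [name for name, arch in proteins.items() if query_domains & set(arch)]


def constrained_round(selected: Set[str], proteins: Dict[str, Architecture]) -> List[str]:
    """Later rounds: keep proteins containing ALL checked domains.
    Requiring all of them is an assumption made for this sketch."""
    return [name for name, arch in proteins.items() if selected <= set(arch)]


print(first_round({"exo_5to3", "exo_3to5", "dna_pol"}, PROTEINS))
print(constrained_round({"exo_3to5"}, PROTEINS))
```

Checking only the 3'→5' exonuclease box narrows the toy result to the two entries that carry that label.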

DART displays can also be limited to architectures found in a single organism, taxonomic class, or combination of organisms or classes by using a prunable taxonomic tree.

An interesting observation in the output shown in Figure 1 is that the E. coli DNA polymerase I query picks up the human WRN helicase protein because of a 3'→5' exonuclease domain at the WRN helicase N-terminus. Looking at the structure of the WRN helicase gene using the NCBI Evidence Viewer (see accompanying article, Figure 1), one can see that there is a 5' exon cluster separated from the rest of the gene by an intron. In fact, extracting the protein sequence coded by the exons of this cluster (by selecting and copying from the Evidence Viewer report) and then performing a Conserved Domain Search (www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi) reveals that this 5' cluster of exons codes for the 3'→5' exonuclease domain module. Hence, the modular structure of the gene parallels the modular structure of the protein.
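The taxonomic limit mentioned above can be pictured as a filter over lineages. The Python sketch below is an illustration only: it assumes each hit carries a simplified root-to-species lineage, and the hit names are hypothetical placeholders. How DART itself stores or prunes its tree is not described in this article.

```python
from typing import Dict, List, Set, Tuple

# Each hit maps to a simplified lineage, from the root toward the species.
# Hit names are hypothetical; lineages are abbreviated for illustration.
HITS: Dict[str, Tuple[str, ...]] = {
    "pol_I_like": ("Bacteria", "Proteobacteria", "Escherichia coli"),
    "wrn_like": ("Eukaryota", "Mammalia", "Homo sapiens"),
    "yeast_hit": ("Eukaryota", "Fungi", "Saccharomyces cerevisiae"),
}


def limit_to_taxa(hits: Dict[str, Tuple[str, ...]], keep: Set[str]) -> List[str]:
    """Keep hits whose lineage passes through any selected node,
    so choosing a class keeps every organism beneath it."""
    return [name for name, lineage in hits.items() if keep & set(lineage)]


# A single higher-level node keeps everything under it.
print(limit_to_taxa(HITS, {"Eukaryota"}))
# A combination of individual organisms.
print(limit_to_taxa(HITS, {"Escherichia coli", "Homo sapiens"}))
```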



NCBI News | Fall 2001