DART Targets Protein Domains

NCBI has strengthened its suite of protein structural analysis tools by
introducing the Domain Architecture Retrieval Tool, or DART. Beginning
with a protein sequence, DART facilitates searches of the Entrez protein
database for protein domain combinations, or architectures.
DART works by first determining the domain architecture of a protein sequence
query. It then displays the domain architectures of other proteins that
share at least one domain with the query. Figure 1 shows the initial DART
report for E. coli DNA polymerase I, an enzyme with three distinct
domains: a 5'→3' exonuclease domain, a proofreading 3'→5'
exonuclease domain, and a DNA polymerase domain. These three domains of
the query protein are identified at the top of the report graphic. Below
the query architecture graphic is a list of other domain architectures
that share at least one domain with the query, with links to the proteins
that have these architectures.
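
The first-round search described above can be sketched in a few lines of Python. This is only an illustration of the idea, not DART's actual implementation: the function name `first_round`, the domain labels, and the toy protein names are all hypothetical, and an architecture is assumed to be a simple ordered tuple of domain labels.

```python
from typing import Dict, List, Tuple

# Assumption: an architecture is an ordered tuple of domain labels
# (N-terminus to C-terminus). Real DART data are richer than this.
Architecture = Tuple[str, ...]


def first_round(query: Architecture,
                database: Dict[str, Architecture]) -> Dict[Architecture, List[str]]:
    """Group proteins by architecture, keeping only architectures that
    share at least one domain with the query."""
    query_domains = set(query)
    hits: Dict[Architecture, List[str]] = {}
    for protein, architecture in database.items():
        if query_domains & set(architecture):  # at least one shared domain
            hits.setdefault(architecture, []).append(protein)
    return hits


# Hypothetical toy data, for illustration only.
query = ("5'-3' exonuclease", "3'-5' exonuclease", "DNA polymerase")
toy_database = {
    "protein_A": ("5'-3' exonuclease", "3'-5' exonuclease", "DNA polymerase"),
    "protein_B": ("3'-5' exonuclease", "helicase"),
    "protein_C": ("helicase",),
    "protein_D": ("3'-5' exonuclease", "helicase"),
}

for architecture, proteins in first_round(query, toy_database).items():
    print(" + ".join(architecture), "->", ", ".join(proteins))
```

Grouping by architecture rather than by protein mirrors the report layout, in which each distinct architecture appears once with links to the proteins that have it.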

Figure 1: DART display for E. coli DNA polymerase I.

Figure 2: List of domains shown in the DART display of Figure 1. Checking
the box beside a domain constrains subsequent DART searches to seek this
domain.

Following the graphic output is a list of domain identifiers and descriptions,
as shown in Figure 2. One or more domains can be selected for use as a
query for a second round of searching. Hence, the first-round search seeks
any of the domains found in the query, while subsequent rounds can be
tailored to find selected domains.
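
A second-round search restricted to selected domains might look like the following minimal sketch. It assumes that every checked domain must be present in a hit; the text does not say whether DART requires all selected domains or any one of them, so that choice, along with the name `refine_search` and the toy data, is hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

Architecture = Tuple[str, ...]


def refine_search(selected: Iterable[str],
                  database: Dict[str, Architecture]) -> List[str]:
    """Return proteins whose architecture contains every selected domain.

    Assumption: all selected domains are required. With nothing selected,
    every protein passes the filter.
    """
    required = set(selected)
    return [protein for protein, architecture in database.items()
            if required <= set(architecture)]


# Hypothetical toy data, for illustration only.
toy_proteins = {
    "protein_A": ("5'-3' exonuclease", "3'-5' exonuclease", "DNA polymerase"),
    "protein_B": ("3'-5' exonuclease", "helicase"),
    "protein_C": ("helicase",),
}

print(refine_search(["3'-5' exonuclease"], toy_proteins))
print(refine_search(["3'-5' exonuclease", "helicase"], toy_proteins))
```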
DART displays can also be limited to architectures found in a single organism,
taxonomic class, or combination of organisms or classes by using a prunable
taxonomic tree. An interesting observation in the output shown in Figure
1 is that the E. coli DNA polymerase I query picks up the human
WRN helicase protein due to the presence of a 3'→5'
exonuclease domain at the WRN helicase N-terminus. Looking at the structure
of the WRN helicase gene using the NCBI Evidence Viewer (see accompanying
article, Figure 1), one can see that there is a 5' exon cluster separated
from the rest of the gene by an intron. In fact, extracting the protein
sequence encoded by the exons of this cluster by selecting and copying from
the Evidence Viewer report, then performing a Conserved Domain Search
(www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi),
reveals that this 5' cluster of exons codes for the 3'→5'
exonuclease domain module. Hence, the modular structure of the gene parallels
the modular structure of the protein.
