Retrieving individual structures (MMDB)
One of the biggest questions facing new users as they begin using Cn3D
is how and where to get the type of data that the program reads. Cn3D
intentionally does not read PDB-format files directly, but instead uses
NCBI's MMDB
database. Briefly, MMDB takes data from the Protein
Data Bank, parses each PDB file in order to perform extensive validation
and error correction, and stores the information in a more computer-friendly
format. See MMDB's
homepage to learn more about the motivation for this project, and
for documentation on the options and links in MMDB's structure summary
pages.
There are a variety of ways to retrieve structures from MMDB, both directly
and through links in NCBI's Entrez
service. This page describes some of the typical methods a molecular biologist
might use to find protein structures via different types of queries. We
will use as an example the human PTEN protein (Lee
et al., 1999).
From an Entrez literature search
Structure data is integrated into Entrez, so that literature searches
can lead to MMDB structure summaries. The simplest case is to use Entrez
to search structure annotations by keywords, authors, etc. For example,
go to Entrez,
select "Structure" in the "Search" pull-down menu,
then type "PTEN" in the input box and hit "Go" to
do the search. In this case, the result is trivial, as 1D5R is the only
structure that comes up, with a link to the MMDB summary page.
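The same keyword search can also be scripted. The sketch below is not part of the original walkthrough: it assumes NCBI's E-utilities ESearch endpoint as the programmatic counterpart to the web form, and the function name `structure_search_url` and the `retmax` default are illustrative choices. It only builds the request URL; fetching it should return a list of numeric Structure UIDs rather than PDB codes.

```python
from urllib.parse import urlencode

# Base address of NCBI's E-utilities (assumed here as the scripted
# counterpart to the Entrez web form described above).
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def structure_search_url(term: str, retmax: int = 20) -> str:
    """Build an ESearch URL that queries Entrez's Structure database."""
    params = {"db": "structure", "term": term, "retmax": retmax}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


# Equivalent of selecting "Structure" and typing "PTEN" in the search box.
url = structure_search_url("PTEN")
print(url)
```

Opening the printed URL in a browser, or fetching it with any HTTP client, is the scripted analogue of hitting "Go".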
Structure crosslinks also appear in literature search results. Select "PubMed"
in the "Search" pull-down menu, and use "PTEN structure"
as the query. Further down the resulting list of articles is the crystal structure
paper, "Crystal structure of the PTEN tumor suppressor"
(Lee et al., 1999). Note
that there is a "Structure" crosslink on the right; following
this link leads again to the MMDB
summary page for 1D5R.
From this summary page, select the "Launch Viewer" option,
then click on the "View/Save Structure" button to download the
data and launch Cn3D with the structure - assuming Cn3D is installed
and configured properly as a helper application. You can also use
the "Save File" option to save the downloaded data to disk,
where you can load it into Cn3D manually using the File:Open
dialog. Two windows should appear: the main Cn3D structure window where
the protein is displayed, and a sequence window that shows the protein
chain's amino acid sequence. These should look very much like:
[Figure: the Cn3D structure window and sequence window displaying 1D5R]
If this were a multi-chain protein, there would be several sequences
shown simultaneously in the sequence window.
From an Entrez sequence neighbor
Suppose we didn't already know about the crystal structure, and were
studying diseases linked to PTEN mutations: for example, follow this link
to an article on Cowden disease: Liaw
et al., 1997. Click on the "Protein" crosslink above and
to the right of the abstract, then follow the "O00633" link
to the GenPept
summary of PTEN. On this page is a great deal of information on PTEN
literature and known mutations.
In this case there is no direct link to structure from here, because
the protein used for the crystal structure is not quite the same as the
natural protein that this GenPept report describes. However, one can look
for a sequence with known structure in the list of precomputed GenBank sequence
neighbors of this protein. From the GenPept summary, click on "BLink" in the upper right.
Click the "3D Structures" button at the top of the resulting page, and see
that the only known related structure is indeed 1D5R, chain A. To view
the alignment in Cn3D, click on the small blue dot in the result
line.
From a BLAST search
We can look for a structure based directly on the PTEN sequence by doing
a BLAST search against the PDB. The advantage here is that one can use
any sequence as the query, even a new or proprietary sequence that has
not been deposited in GenBank. Importantly, one can also examine the BLAST
alignment and scores to judge the degree of sequence similarity between
query and structure.
There are many ways to do this search, but for this example let's start
with NCBI's BLAST service. First note in the GenPept summary above that
the accession code for PTEN's amino acid sequence is "O00633"
(the first character is a capital letter O, and the next two are zeros).
Go to the BLAST
query page, and follow the link to "Standard protein-protein
BLAST [blastp]". Then select
"pdb" in the "Database" menu (to search against known
structures), type
"O00633" in the input box, and hit "BLAST!"
to start the search.
The query will be sent to the BLAST queue. After a wait that depends
on server load, the results will be available by
clicking the "Format!" button. Just below the graphical
summary of hits is a list of sequences found. The one at the top with
the best score and E-value, at least at the time this document was written,
is the now-familiar PDB entry 1D5R. Following the "pdb|1D5R|A"
link leads to the GenPept
summary for this structure, from which the "Structure" link
on the right leads ultimately to the MMDB summary for 1D5R.
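The submit-then-format flow above has a scripted analogue in NCBI's BLAST URL API, which this page does not cover. The following is a minimal sketch under stated assumptions: the parameter names (`CMD`, `PROGRAM`, `DATABASE`, `QUERY`, `RID`, `FORMAT_TYPE`) are those of the URL API as I understand it, the database name accepted there may differ from the "pdb" menu label, and `EXAMPLE_RID` is a hypothetical placeholder for the request ID that the server returns after submission. The code only assembles the two requests and does not contact NCBI.

```python
from urllib.parse import urlencode

# Endpoint of NCBI's BLAST URL API (assumption: not mentioned in the text above).
BLAST_URL = "https://blast.ncbi.nlm.nih.gov/Blast.cgi"


def blast_submit_params(query: str, database: str = "pdb") -> dict:
    """Parameters for queueing a blastp search (the "BLAST!" step)."""
    return {"CMD": "Put", "PROGRAM": "blastp", "DATABASE": database, "QUERY": query}


def blast_fetch_params(rid: str) -> dict:
    """Parameters for retrieving finished results (the "Format!" step)."""
    return {"CMD": "Get", "RID": rid, "FORMAT_TYPE": "Text"}


# Step 1: submit the PTEN accession as the query.
submit_body = urlencode(blast_submit_params("O00633"))

# Step 2: poll for results using the request ID from step 1.
# "EXAMPLE_RID" is a hypothetical placeholder, not a real request ID.
fetch_query = urlencode(blast_fetch_params("EXAMPLE_RID"))

print(f"{BLAST_URL}?{submit_body}")
print(f"{BLAST_URL}?{fetch_query}")
```

Keeping submission and retrieval as separate requests mirrors the queue described above: the search runs asynchronously, so a script would need to repeat the second request until results are ready.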
This is a trivial example, since the PDB structure found is of exactly
the protein whose sequence was used as the BLAST query. But this is a
very powerful general method for finding structures of proteins
related to the query closely enough that structural properties can
be inferred by homology. See the Alignment chapter
of this document to learn how to display an alignment of the query sequence
with the protein sequence in Cn3D's sequence window.
From a known PDB identifier
If the PDB identifier for a protein is already known, then the most straightforward
way to find a structure is to query MMDB directly.
Simply enter the four-character PDB code in the MMDB input box, and hit "Go".
In the journal article on the crystallization and structure determination
of PTEN (Lee et al., 1999),
we find that the authors have deposited the structure data in the PDB
with the identifier "1D5R". Putting this in the MMDB query box
leads directly to the MMDB
summary for 1D5R.
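Because PDB codes and sequence accessions such as "O00633" are easy to confuse, a quick format check before querying can help. The sketch below is illustrative only: it assumes the classic PDB identifier format of one digit (1 to 9) followed by three letters or digits, and the helper name `is_pdb_id` is hypothetical. It checks the shape of the string, not whether the entry actually exists in MMDB.

```python
import re

# Classic four-character PDB code: a digit 1-9 followed by three
# alphanumeric characters (assumed format; existence is not checked).
_PDB_ID = re.compile(r"[1-9][A-Za-z0-9]{3}")


def is_pdb_id(text: str) -> bool:
    """Return True if `text` has the shape of a four-character PDB code."""
    return _PDB_ID.fullmatch(text.strip()) is not None


print(is_pdb_id("1D5R"))    # the PTEN structure used in this page
print(is_pdb_id("O00633"))  # a sequence accession, not a PDB code
```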