Demonstration Exercise:
Find the three-dimensional structure of a protein of interest, or of proteins
with similar sequences. View the residues in the active site.
Is there a known (resolved) three-dimensional structure for the protein
encoded by the human MLH1 gene? If not, are there similar protein sequences that
have known structures?
Also, one of the known MLH1 mutations in colon cancer patients (GLY67TRP) is of
particular interest to me. Is that mutation possibly in an active site of the
protein?
Demo:
- try an Entrez Structures search for MLH1
- then try a broader search for colon cancer, in case submitters might
have used a different variation of the protein name
- any hits that show our protein of interest?
- there are relatively few resolved structures, compared to the quantity of
protein sequences available, so the chances of finding the exact structure you are
looking for are slim
- in that case, try some back doors to
find structures for proteins that have similar sequences
- for example:
- start with colon cancer (MLH1) protein from RefSeq (NP_000240)
- retrieve related sequences (select that option from the "Links"
menu for NP_000240)
- limit search results to sequences with corresponding 3-D
structures (select structure links from the display menu near the top of the page)
- in this example, we'll focus on the structure for 1H7U. It was submitted by the same lab that also
submitted 1H7S and 1EAP:
1H7S is the structure of the free protein
1EAP is the structure of the protein when it is bound to ADP
1H7U is the structure of the protein when it is bound to a synthetic form
of ATP
Since 1H7U probably most closely approximates the structure of the active
protein, we will focus on that one.
- display structure record summary for 1H7U and then view it in
Cn3D
- align your original query sequence (NP_000240) to protein chain A of
1H7U (i.e., 1H7U_A). Protein chain B (1H7U_B) has an identical sequence and
can be used instead, if desired; either one works the same way in this
example.
- note that the region of sequence similarity lies in the N-terminal region,
which contains the HATPase_c domain
- our earlier Entrez search for MLH1 sequence variations identified an
important mutation (GLY67TRP) in the N-terminal region of the MLH1 protein
- how can we tell if that mutation might be in the active site of the
HATPase_c domain?
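
One way to approach that question is sketched below in Python. Once the query sequence has been aligned to a structure chain, walk the alignment column by column to find which residue of the chain lines up with the query position of interest (here, position 67), then locate that residue in the structure view. The function name map_query_position and the short toy sequences are hypothetical and made up for illustration; they are not the real NP_000240 or 1H7U_A sequences, and this is not necessarily how Cn3D performs the mapping internally.

```python
def map_query_position(aligned_query, aligned_subject, query_pos,
                       query_start=1, subject_start=1):
    """Map a 1-based query residue number through a gapped pairwise alignment.

    aligned_query / aligned_subject are equal-length strings with '-' for gaps.
    query_start / subject_start are the residue numbers of the first aligned
    residues. Returns (subject_position, subject_residue), or None if the
    query position falls in a gap or outside the aligned region.
    """
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned sequences must have the same length")

    q = query_start - 1    # residue number of the last query residue seen
    s = subject_start - 1  # residue number of the last subject residue seen
    for q_char, s_char in zip(aligned_query, aligned_subject):
        if q_char != "-":
            q += 1
        if s_char != "-":
            s += 1
        if q_char != "-" and q == query_pos:
            # The query residue sits in this column; report its partner.
            return None if s_char == "-" else (s, s_char)
    return None


# Toy alignment (made-up sequences, NOT the real MLH1 / 1H7U alignment).
# Assume the aligned region starts at query residue 64 and subject residue 30.
toy_query = "MKT-GLA"
toy_subject = "M-TAGIA"
print(map_query_position(toy_query, toy_subject, 67,
                         query_start=64, subject_start=30))
```

If the function returns a residue, that position in the structure chain is the one to inspect; if it returns None, the query residue has no counterpart in the resolved structure and the alignment alone cannot answer the question.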