Structures module of the MLA course on Introduction to Molecular Biology Information Resources

Demonstration Exercise:
Find the 3-dimensional structure for a protein of interest or for similar protein sequences. View the residues in the active site.

 
Is there a known (resolved) three-dimensional structure for the protein encoded by the human MLH1 gene? If not, are there similar protein sequences that have known structures?

Also, one of the known MLH1 mutations in colon cancer patients (GLY67TRP) is of particular interest to me. Is that mutation possibly in an active site of the protein?
 

 

Demo:

  • try an Entrez Structures search for MLH1
  • then try a broader search for colon cancer, in case submitters might have used a different variation of the protein name
  • any hits that show our protein of interest?
  • there are relatively few resolved structures, compared to the quantity of protein sequences available, so the chances of finding the exact structure you are looking for are slim
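
The two searches above can also be expressed as NCBI E-utilities ESearch queries against the structure database. The following is only a minimal sketch: the function name `build_esearch_url` and the `retmax` value are illustrative choices, not part of the course material, and the code only builds the URLs without sending any request.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities service.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_esearch_url(term, db="structure", retmax=20):
    """Return an ESearch URL for the query term (hypothetical helper; no request is sent)."""
    params = {"db": db, "term": term, "retmax": retmax}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


# Narrow search by gene name, then the broader search by disease name.
narrow = build_esearch_url("MLH1")
broad = build_esearch_url("colon cancer")
print(narrow)
print(broad)
```

Opening either URL in a browser returns an XML list of matching structure record IDs, which can then be compared with the hits shown on the Entrez web page.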

  • in that case, try some back doors to find structures for proteins that have similar sequences

  • for example:
    • start with colon cancer (MLH1) protein from RefSeq (NP_000240)
    • retrieve related sequences (select that option from the "Links" menu for NP_000240)
    • limit search results to sequences with corresponding 3-D structures (select structure links from the display menu near the top of the page)
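
The same "back door" can be approximated with the E-utilities ELink service, which follows links between Entrez databases. This is a rough sketch under stated assumptions: `build_elink_url` is a hypothetical helper, it is assumed that ELink accepts the accession NP_000240 in place of a numeric ID, and in a real run the second call would use the IDs returned by the first call rather than the original accession. No request is sent here.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities service.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_elink_url(dbfrom, db, uid):
    """Return an ELink URL linking one record in dbfrom to records in db (hypothetical helper)."""
    params = {"dbfrom": dbfrom, "db": db, "id": uid}
    return f"{EUTILS_BASE}/elink.fcgi?{urlencode(params)}"


# Step 1: protein -> protein links (related sequences) for the MLH1 RefSeq record.
related_url = build_elink_url("protein", "protein", "NP_000240")

# Step 2: protein -> structure links. Shown with the same accession for brevity;
# assumption: in practice, pass the IDs of the related sequences from step 1.
structure_url = build_elink_url("protein", "structure", "NP_000240")

print(related_url)
print(structure_url)
```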

  • in this example, we'll focus on the structure for 1H7U. It was submitted by the same lab that also submitted 1H7S and 1EAP:
    • 1H7S is the structure of the free protein
    • 1EAP is the structure of the protein when it is bound to ADP
    • 1H7U is the structure of the protein when it is bound to a synthetic form of ATP
    Since the last of these probably most closely approximates the structure of the active protein, we will focus on that one.

    • display structure record summary for 1H7U and then view it in Cn3D
    • align your original query sequence (NP_000240) to the protein chain A of 1H7U (i.e., 1H7U_A). The 1H7U_B protein chain has an identical sequence and can be used instead, if desired. Either one will work the same in this example.
    • note the region of sequence similarity is in the N-terminal region, which contains the HATPase_c domain
    • our earlier Entrez search for MLH1 sequence variations identified an important mutation (GLY67TRP) in the N-terminal region of the MLH1 protein
    • how can we tell if that mutation might be in the active site of the HATPase_c domain?
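
One way to approach the question is to read across the alignment: find the column that holds residue 67 of the query and see which residue of the structure chain sits in the same column. The sketch below shows only that bookkeeping. The function `map_position` and the short toy sequences are hypothetical (they are not the real MLH1 or 1H7U_A sequences), and it assumes both aligned strings use "-" for gaps. Note that residue numbers stored in a structure record may be offset from a simple count along the chain.

```python
def map_position(aligned_query, aligned_subject, query_pos,
                 query_start=1, subject_start=1):
    """Map a 1-based query residue number onto the aligned subject sequence.

    Returns the subject residue number in the same alignment column, or None
    if the query residue faces a gap or lies outside the aligned region.
    """
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned sequences must have equal length")
    q = query_start - 1    # residues of the query consumed so far
    s = subject_start - 1  # residues of the subject consumed so far
    for q_char, s_char in zip(aligned_query, aligned_subject):
        if q_char != "-":
            q += 1
        if s_char != "-":
            s += 1
        if q_char != "-" and q == query_pos:
            return s if s_char != "-" else None
    return None


# Toy alignment (not real data): query residue 3 pairs with subject residue 2,
# while query residue 2 faces a gap in the subject.
toy_query = "MSG-VAK"
toy_subject = "M-GTVSK"
print(map_position(toy_query, toy_subject, 3))
print(map_position(toy_query, toy_subject, 2))
```

With the real alignment of NP_000240 to 1H7U_A, the mapped residue could then be highlighted in Cn3D to see how close it lies to the bound ATP analog.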

 

    Answer

Revised 11/06/2007