BLAST: User Question and Answer
Problem Summary:

Identify the Conserved Domains in a Protein

  Sample User Question
Analysis/Comments
Flow Chart
Additional Notes
 

Sample User Question

 
What conserved domains are found in the human MLH1 protein sequence below (from NP_000240)? What is the function of each one?
MSFVAGVIRRLDETVVNRIAAGEVIQRPANAIKEMIENCLDAKSTSIQVIVKEGGLKLIQIQDNGTGIRK
EDLDIVCERFTTSKLQSFEDLASISTYGFRGEALASISHVAHVTITTKTADGKCAYRASYSDGKLKAPPK
PCAGNQGTQITVEDLFYNIATRRKALKNPSEEYGKILEVVGRYSVHNAGISFSVKKQGETVADVRTLPNA
STVDNIRSIFGNAVSRELIEIGCEDKTLAFKMNGYISNANYSVKKCIFLLFINHRLVESTSLRKAIETVY
AAYLPKNTHPFLYLSLEISPQNVDVNVHPTKHEVHFLHEESILERVQQHIESKLLGSNSSRMYFTQTLLP
GLAGPSGEMVKSTTSLTSSSTSGSSDKVYAHQMVRTDSREQKLDAFLQPLSKPLSSQPQAIVTEDKTDIS
SGRARQQDEEMLELPAPAEVAAKNQSLEGDTTKGTSEMSEKRGPTSSNPRKRHREDSDVEMVEDDSRKEM
TAACTPRRRIINLTSVLSLQEEINEQGHEVLREMLHNHSFVGCVNPQWALAQHQTKLYLLNTTKLSEELF
YQILIYDFANFGVLRLSEPAPLFDLAMLALDSPESGWTEEDGPKEGLAEYIVEFLKKKAEMLADYFSLEI
DEEGNLIGLPLLIDNYVPPLEGLPIFILRLATEVNWDEEKECFESLSKECAMFYSIRKQYISEESTLSGQ
QSEVPGSIPNSWKWTVEHIVYKALRSHILPPKHFTEDGNILQLANLPDLYKVFERC
   
 

Analysis/Comments

A conserved domain (CD) is a distinct functional and/or structural unit of a protein that has been conserved during evolution. Over evolutionary time, changes at specific positions of the protein's amino acid sequence have occurred in ways that preserve the physico-chemical properties of the original residues, and hence the structural and/or functional properties of that region of the protein.

Identifying the conserved domains that exist in a protein can shed light on the protein's function.
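
The idea that substitutions can preserve the physico-chemical properties of a residue can be sketched in a few lines of Python. The grouping of amino acids below is a simplified assumption made for illustration, and the names PROPERTY_GROUPS, property_of, is_conservative, and fraction_conserved are hypothetical; none of this is the scoring scheme that CDD or CD-Search actually uses.

```python
# Simplified, illustrative partition of the 20 amino acids by side-chain
# property. This grouping is an assumption for the example only; it is
# not the scoring model used by CDD or CD-Search.
PROPERTY_GROUPS = {
    "hydrophobic": set("AVLIMC"),
    "aromatic": set("FWY"),
    "polar": set("STNQ"),
    "positive": set("KRH"),
    "negative": set("DE"),
    "special": set("GP"),
}


def property_of(residue: str) -> str:
    """Return the simplified property class of a one-letter residue code."""
    for name, members in PROPERTY_GROUPS.items():
        if residue.upper() in members:
            return name
    raise ValueError(f"unknown residue: {residue!r}")


def is_conservative(original: str, replacement: str) -> bool:
    """True if both residues fall in the same property class."""
    return property_of(original) == property_of(replacement)


def fraction_conserved(seq_a: str, seq_b: str) -> float:
    """Fraction of positions whose property class is preserved.

    Both sequences must be ungapped, non-empty, and of equal length
    (that is, already aligned position by position).
    """
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    preserved = sum(is_conservative(a, b) for a, b in zip(seq_a, seq_b))
    return preserved / len(seq_a)
```

For example, is_conservative("L", "I") is true because both residues are hydrophobic, while is_conservative("K", "D") is false because the charge is reversed.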

Flow Chart

Each protein sequence in Entrez has been compared against NCBI's Conserved Domain Database (CDD), so you can follow the "Domains" link for a record in Entrez Proteins to see the conserved domains that have been identified in the sequence.

Following the link opens the CDD view; when "Details" is selected, it shows the HATPase and DNA mismatch repair domains. In addition, the grey "MUTL" bar represents the protein family with which NP_000240 is associated. Clicking the graphic for any domain or protein family leads to more detailed information. The "Show Domain Relatives" option uses the Conserved Domain Architecture Retrieval Tool (CDART) to retrieve protein sequences with similar domain architectures.

Alternatively, you can use CD-Search directly to identify the conserved domains in the human MLH1 protein, NP_000240. A sample output figure accompanies this step, but try the search live and explore the various links in the search output.
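
If you want to obtain the NP_000240 sequence programmatically before submitting it to CD-Search, one possible approach is to retrieve the FASTA record through the NCBI E-utilities efetch service. The following is a minimal sketch, not part of the original exercise: the helper names build_fasta_url, parse_fasta, and fetch_protein are hypothetical, and the code assumes the efetch endpoint accepts the db, id, rettype, and retmode parameters shown.

```python
from typing import Tuple
from urllib.parse import urlencode
from urllib.request import urlopen

# NCBI E-utilities efetch endpoint (assumed to be reachable from your network).
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_fasta_url(accession: str) -> str:
    """Build an efetch URL requesting a protein record in FASTA text form."""
    params = {
        "db": "protein",
        "id": accession,
        "rettype": "fasta",
        "retmode": "text",
    }
    return f"{EFETCH_URL}?{urlencode(params)}"


def parse_fasta(text: str) -> Tuple[str, str]:
    """Split a single-record FASTA string into (header, sequence)."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("input does not look like a FASTA record")
    # Drop the leading '>' from the header and join the sequence lines.
    return lines[0][1:], "".join(lines[1:])


def fetch_protein(accession: str) -> Tuple[str, str]:
    """Download one protein record and return (header, sequence).

    Requires network access; not called automatically.
    """
    with urlopen(build_fasta_url(accession), timeout=30) as response:
        return parse_fasta(response.read().decode("utf-8"))


# Example use (requires network access):
#   header, sequence = fetch_protein("NP_000240")
#   print(header)
#   print(len(sequence), "residues")
```

The returned sequence string can then be pasted into the CD-Search query box in place of the accession.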

 
 

Additional Notes

This exercise is also narrated as part of the Entrez tutorial:
  • Geer RC, Sayers EW. 2003. Entrez: making use of its power. Brief Bioinform., 4(2):179-84 (June). PMID: 12846398
    The Entrez Tutorial page provides a brief summary of the article and a link to the full-text PDF file.
Please note that the search results (number of hits) noted in the article reflect the data that were available as of March 2003. The number of search hits will change as the databases grow, but the general search concepts will continue to apply.


Revised 11/05/2007