Non-Human Genomes

Identifying a gene in a eukaryotic DNA sequence via computation


Sample User Question

Is there a defined gene in the region of Homo sapiens chromosome 3 between positions 36994604 and 37053962?
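Before working with the region, it can help to confirm that the FASTA file you were given is the size you expect. The sketch below is only an illustration: it assumes the coordinates in the question are 1-based and inclusive, and the `parse_fasta` and `region_length` helpers and the toy record are hypothetical, not part of the course materials.

```python
def parse_fasta(text):
    """Return (header, sequence) for the first record in FASTA-formatted text."""
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                break  # stop when a second record begins
            header = line[1:]
        elif header is not None:
            chunks.append(line.upper())
    return header, "".join(chunks)


def region_length(start, end):
    """Region length, ASSUMING 1-based inclusive coordinates."""
    return end - start + 1


START, END = 36994604, 37053962
expected = region_length(START, END)
print(f"Expected region length: {expected} bp")

# Hypothetical toy record; substitute the text of the provided FASTA file.
toy = ">chr3_region toy example\nACGTACGT\nNNACGT\n"
header, seq = parse_fasta(toy)
print(header, len(seq))
```

If the sequence length you read from the real file differs from the expected value by one, the coordinate convention is probably different from the one assumed here.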

Step By Step Guide

Yes! However, because this is a eukaryotic sequence, the ORF Finder will produce results that are hard to interpret. Remember, most eukaryotic genes consist of short coding regions (EXONS) with interspersed non-coding DNA (INTRONS), among other features that may "confuse" the ORF Finder (try this with the FASTA file of the region provided on your desktop to see the problem firsthand). Instead, use GeneMachine, which is designed for exactly this purpose, finding eukaryotic genes:

  1. Copy and paste the sequence data from the provided FASTA file of the chromosome region into the Web-based version of GeneMachine at http://genemachine.nhgri.nih.gov/index.cgi, enter a valid e-mail address, and then click "Find Genes".
  2. While waiting for the e-mail with the GeneMachine output, install Sequin on your machine as per the instructions on the GeneMachine site.
  3. The resulting GeneMachine file (which is in ASN.1 format), when opened in Sequin, indicates the existence of a single gene (FIG 2).
  4. BLASTing the sequence within that single gene region against the human genome indicates that the sequence matches MLH1, the human homolog of MutL (FIG 3)…

FIG 2: GeneMachine Results


FIG 3: BLASTn results showing identity of gene
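
To see why a plain ORF scan struggles with intron-containing sequence, the following minimal sketch runs a naive forward-strand ORF search on a toy gene. This is not the algorithm used by the ORF Finder or GeneMachine; the `find_orfs` function and the exon and intron sequences are made up for illustration only.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=3):
    """Naive forward-strand ORF scan.

    Returns (start, end, frame) tuples for ATG...stop stretches.
    Coordinates are 0-based and end-exclusive; the stop codon is included.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens an ORF
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # look for the next ORF in this frame
    return orfs


# Toy gene (hypothetical sequences): two exons separated by one intron.
exon1 = "ATGGCCAAG"
intron = "GTATAGCTTTCAG"  # begins with GT, ends with AG, holds an in-frame stop
exon2 = "GAATTCCCGTGA"

genomic = exon1 + intron + exon2  # what the chromosome region looks like
mrna = exon1 + exon2              # what remains after splicing

print("spliced mRNA ORFs:", find_orfs(mrna))
print("genomic ORFs:     ", find_orfs(genomic))
```

On the spliced sequence the scan recovers the full coding region, but on the genomic sequence it stops at a stop codon inside the intron and never reports the second exon. A real gene with many exons produces many such fragments, which is why a tool that models exons and introns is the better choice here.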


Revised 07/23/2007