Find loci involved in a given biological process

Sample User Question

 
I'd like to retrieve a list of human loci that are involved in DNA mismatch repair.
 

Comments / Analysis

Submitters of sequence records do not always use the same terminology to describe the function of gene products. As a result, searches of the Entrez Nucleotide and Protein databases can return different results depending on which terms are used in the query.

In contrast, Entrez Gene records include Gene Ontology (GO) terms whenever available. A user can search the Gene Ontology field of Entrez Gene to retrieve records that have been annotated with GO terms.

The Gene Ontology (GO) Consortium is a collaborative effort to address the need for consistent descriptions of gene products. The GO collaborators are developing three structured, controlled vocabularies (ontologies) that describe gene products in terms of their associated biological processes, cellular components and molecular functions in a species-independent manner.

There is also an ongoing Gene Ontology Annotation (GOA) project to annotate the gene products of all organisms with a controlled vocabulary of biological processes, molecular functions, and cellular components. The GO terms that appear in Entrez Gene records have been assigned by the GOA project.

Step By Step Guide

  • open Entrez Gene - retrieve all human loci annotated with the GO term "mismatch repair"


    1. select the Preview/Index option beneath the search box.
      At the bottom of the Preview/Index page:
    2. select the Organism field from the pop-up menu and enter human in the text box next to the search field menu.
      Then press the AND button to add that term and search field to the active query at the top of the page
    3. select the Gene Ontology field from the pop-up menu and enter "mismatch repair" (with quotes) in the adjacent text box.
      Press the AND button to add that term to the active query.
    4. press Go
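
The steps above assemble a query one term at a time, joining each new term to the active query with AND. A minimal Python sketch of that assembly follows. The function name `build_query` is hypothetical, and the field tags `orgn` and `GO` are taken from the Boolean query shown under Additional Tips; this only builds the query string and does not contact Entrez.

```python
def build_query(steps):
    """Join (term, field) pairs with AND, mirroring the AND button in Preview/Index."""
    parts = []
    for term, field in steps:
        # Enclose multi-word phrases in quotes, as step 3 does for "mismatch repair".
        if " " in term:
            term = f'"{term}"'
        parts.append(f"{term}[{field}]")
    return " AND ".join(parts)


# Step 2 adds the organism; step 3 adds the GO term.
steps = [("human", "orgn"), ("mismatch repair", "GO")]
query = build_query(steps)
print(query)
```

The resulting string is the text that would be submitted when Go is pressed in step 4.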

Additional Tips

There are four different ways you can search an individual Entrez database, as described in the module slides. The example above demonstrates the advanced #2 method. Below is the equivalent complex Boolean query, plus tips on the use of quotes for phrase searching, searching All Fields vs. the Gene Ontology field, browsing the Gene Ontology field index, and GO term definitions and hierarchical classifications.

Complex Boolean query

The search can be done in a single step by entering the search as a complex Boolean query. For example:

human[orgn] AND "mismatch repair"[GO]
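
The same query string could also be submitted programmatically. The sketch below is an assumption that goes beyond this exercise: it uses the NCBI E-utilities `esearch` endpoint, which is not covered in the original text, and only constructs the request URL without sending it. The helper name `esearch_url` is hypothetical, and whether the `[GO]` tag behaves identically through E-utilities should be checked before relying on it.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Assumed endpoint: NCBI E-utilities esearch (not part of the original exercise).
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def esearch_url(term, db="gene", retmax=20):
    """Build an esearch URL for the given Entrez query; nothing is sent."""
    params = {"db": db, "term": term, "retmax": retmax}
    # urlencode percent-escapes the quotes, brackets, and spaces in the term.
    return f"{ESEARCH}?{urlencode(params)}"


url = esearch_url('human[orgn] AND "mismatch repair"[GO]')
# Decode the URL again to confirm the query survives encoding unchanged.
decoded_term = parse_qs(urlparse(url).query)["term"][0]
print(url)
print(decoded_term)
```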

Use of Quotes for phrase searching in Entrez Gene

Currently, Entrez Gene requires that phrases be enclosed in quotes. For example, a search for:

"mismatch repair"
will ensure that records retrieved by the search contain that exact phrase. In contrast, if the phrase is not enclosed in quotes (and no field specifier is included in the query), Entrez will apply a default Boolean AND between the terms, so the search will be translated to mismatch AND repair. However, if the Gene Ontology field is searched directly, the phrase can be entered with or without quotes.
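
The difference between quoted and unquoted input can be illustrated with a toy model. This is not Entrez's actual query parser; it is a hypothetical `translate` function that reproduces only the behavior described above for queries without a field specifier.

```python
import re


def translate(query):
    """Toy model: quoted phrases stay whole, bare words are joined with AND."""
    # Each token is either a complete quoted phrase or one unquoted word.
    tokens = re.findall(r'"[^"]+"|\S+', query)
    return " AND ".join(tokens)


unquoted = translate("mismatch repair")
quoted = translate('"mismatch repair"')
print(unquoted)  # mismatch AND repair
print(quoted)    # "mismatch repair"
```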

Search All Fields vs. Gene Ontology field

A basic search (without any field restriction) of Entrez Gene for a biological process will search All Fields by default and might therefore retrieve extraneous hits that mention the search terms in a tangential context. Limiting the search to the Gene Ontology field will retrieve only records to which the controlled GO vocabulary has been applied. That will eliminate extraneous hits, but it will also omit records that have not yet been annotated with GO terms.

Compare the results of the following two searches. You can use the History function to see the records that are retrieved by the first search but not the second search, and the browser's "Edit/find in page" function to view the search terms in context within each record:

human[orgn] AND "mismatch repair"[All]
human[orgn] AND "mismatch repair"[GO]
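
The History comparison amounts to a set difference: records in the first result set that are absent from the second. In the History tab this is typically written as something like #1 NOT #2, where the numbers depend on your session. A small sketch with placeholder record identifiers (hypothetical, not real search results):

```python
# Placeholder identifiers standing in for records returned by each search.
all_fields_hits = {"gene_a", "gene_b", "gene_c", "gene_d"}  # "mismatch repair"[All]
go_field_hits = {"gene_a", "gene_b"}                        # "mismatch repair"[GO]

# Records found by the All Fields search but not by the Gene Ontology search.
only_all_fields = sorted(all_fields_hits - go_field_hits)
print(only_all_fields)
```

These are the records worth inspecting with the browser's find function to see where the search terms appear.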

Browse Index of Gene Ontology field

If you are unsure which GO term to enter in step 3, you can use the Index function to browse the index of the Gene Ontology field. Just select that field from the pop-up menu at the bottom of the Preview/Index page and press the Index button. Leave the adjacent text box blank if you'd like to start browsing from the first term in the index, or enter a word stem (e.g., mismat) before pressing the Index button if you want to jump to a specific part of the index.
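
Jumping to a position in an alphabetical index by word stem can be sketched as a binary search over sorted terms. The term list below is a small illustrative sample, not the actual contents of the Gene Ontology field index, and `browse_index` is a hypothetical helper.

```python
from bisect import bisect_left


def browse_index(index_terms, stem="", count=3):
    """Return up to `count` terms starting at the first term >= stem."""
    terms = sorted(index_terms)
    # An empty stem sorts before everything, so browsing starts at the first term.
    start = bisect_left(terms, stem)
    return terms[start:start + count]


# Illustrative sample only; the real index is far larger.
index_terms = ["dna repair", "mismatch repair", "mismatched dna binding", "mitosis"]

from_stem = browse_index(index_terms, "mismat", count=2)
from_start = browse_index(index_terms, count=1)
print(from_stem)
print(from_start)
```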

If you find several GO terms that are of interest and would like to know their definitions and hierarchical classifications, see the last tip below.

GO term definitions and hierarchical classifications

The Gene Ontology field of Entrez Gene shows several terms that might be of interest to the user, such as DNA repair and mismatch repair. If the user is unsure which term(s) to include in the query, it might be helpful to first check the Gene Ontology web site. It has a searchable database of GO terms and displays their definitions as well as text and graphical views showing their classification within the GO hierarchy. Using that site, for example, the user can see that "DNA repair" is a broad term that includes "mismatch repair" and several other repair processes. Which term(s) to include in the query depends on whether the user would like broad or narrow retrieval.
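
The broad-versus-narrow distinction can be pictured as a walk down a small tree. The sketch below uses a deliberately tiny fragment of the hierarchy: "DNA repair" and "mismatch repair" come from the text above, while the two other child terms are added here as assumptions for illustration. The `expand` helper is hypothetical, and the text does not say whether Entrez Gene itself expands a broad term to its children.

```python
# Tiny illustrative fragment of the hierarchy; not the full GO classification.
CHILDREN = {
    "DNA repair": [
        "mismatch repair",
        "base-excision repair",        # assumed example of another repair process
        "nucleotide-excision repair",  # assumed example of another repair process
    ],
}


def expand(term, children=CHILDREN):
    """Return the term followed by all of its descendants in the fragment."""
    result = [term]
    for child in children.get(term, []):
        result.extend(expand(child, children))
    return result


broad = expand("DNA repair")
narrow = expand("mismatch repair")
print(broad)
print(narrow)
```

A broad term covers every process beneath it, while a narrow term covers only itself.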


Revised 08/03/2007