Information Hubs

Restrict nucleotide search results to a specific molecule type


Sample User Question

 
I'd like to retrieve nucleotide sequences for glyceraldehyde 3 phosphate dehydrogenase, but only want mRNAs in the results.
 

Comments / Analysis

Researchers often want to retrieve a specific type of molecule: for example, messenger RNA (mRNA) to study the expressed nucleotide sequence, or genomic DNA to study the intron/exon structure or regulatory regions of a gene. The Entrez Nucleotide database is a repository of many types of nucleotide sequences. The Limits page of the CoreNucleotide database shows a subset of the most commonly requested molecule types, and the index of the Properties field shows the full range of choices.

Step By Step Guide

  1. Entrez Nucleotide - retrieve nucleotide sequence records that contain the term "glyceraldehyde 3 phosphate dehydrogenase"


    • enter glyceraldehyde 3 phosphate dehydrogenase in the search box
      Tip:
      • If you are not sure how you should enter the term (with dashes, without dashes, etc.), you can use the Preview/Index page to browse the index and look at how the terms are represented.
      • Enter a word or word stem, such as "glyceraldehyde", in the search box near the bottom of the page and press the Index button. (For this example, leave the search field set to the default of All Fields.)
      • As you scroll down the index, you will see the various ways in which glyceraldehyde 3 phosphate dehydrogenase has been spelled and misspelled by submitters of sequence records.
      • Select the term(s) of interest from the index and use the AND button to add them to your active query, which is shown in the search box at the top of the page.
    • press Go
    • result: thousands of nucleotide sequence records of various molecule types are retrieved

  2. Limits - use the Limits page to restrict the retrieval to mRNAs


    • select the Limits option beneath the search box
    • select mRNA from the Molecule pop-up menu
    • press Go
    • the search results are now reduced by thousands of records

  3. Continue to narrow the search based on the user's need, if appropriate


    • further limit the search in any way desired by the user, such as by organism, source database (e.g., only records from RefSeq), etc.
    • (see also the module slides on narrowing a search and limiting results to records with desired attributes)
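The narrowing done in steps 2 and 3 can also be written out as query text. The following is a minimal Python sketch, not part of the original exercise: the helper name `narrow_query` is hypothetical, the organism is only an example, and `srcdb_refseq[prop]` is assumed here to be the Properties term for RefSeq records (browse the Properties index to confirm the exact terms available).

```python
def narrow_query(term, molecule=None, organism=None, refseq_only=False):
    """Compose an Entrez query by AND-ing optional limits onto a search term.

    Hypothetical helper for illustration; it only builds text and does not
    contact Entrez.
    """
    parts = [f"{term}[All]"]
    if molecule:
        # Molecule type from the Properties field, e.g. biomol_mRNA[prop]
        parts.append(f"biomol_{molecule}[prop]")
    if organism:
        parts.append(f"{organism}[Organism]")
    if refseq_only:
        # Assumed Properties term for records whose source database is RefSeq
        parts.append("srcdb_refseq[prop]")
    return " AND ".join(parts)


query = narrow_query(
    "glyceraldehyde 3 phosphate dehydrogenase",
    molecule="mRNA",
    organism="Homo sapiens",
    refseq_only=True,
)
print(query)
```

Each optional argument corresponds to one limit from the steps above, so leaving an argument out simply leaves that limit off the query.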

Additional Tips

Complex Boolean query

The sample search was broken up into separate steps for clarity and so you could see how each step further narrows your search results. Once you are familiar with the search techniques, many of the steps can be combined. For example, steps 2 and 3 can be done together. Also, the search can be done in a single step by entering the search as a complex Boolean query. For example:

glyceraldehyde 3 phosphate dehydrogenase[All] AND biomol_mRNA[prop]
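For readers who script their searches, the same Boolean query could be passed to the NCBI E-utilities `esearch` service. The sketch below is an illustration under assumptions, not part of the original exercise: it only builds the request URL (no network call is made), the function name `esearch_url` is hypothetical, and `nuccore` is assumed to be the database name corresponding to CoreNucleotide.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Base address of the E-utilities esearch service
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def esearch_url(query, db="nuccore", retmax=20):
    """Build (but do not send) an esearch URL for an Entrez query string."""
    params = {"db": db, "term": query, "retmax": retmax}
    # urlencode percent-escapes the brackets and turns spaces into '+'
    return f"{ESEARCH}?{urlencode(params)}"


query_text = "glyceraldehyde 3 phosphate dehydrogenase[All] AND biomol_mRNA[prop]"
url = esearch_url(query_text)
print(url)

# Decode the URL again to check that the query survives encoding unchanged
decoded = parse_qs(urlparse(url).query)["term"][0]
print(decoded)
```

Building the URL separately from sending it makes it easy to inspect the encoded query before running a search.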

Limits Page vs. Properties Field

Also, note that the options shown on the Limits page vary by database. For many Entrez databases, the Limits page shows the most commonly used Limits, not an exhaustive list. For example, it shows only a subset of the molecule type and source database choices that are available. To see the others, browse the index of the Properties field by selecting that field from the lower portion of the Preview/Index page and pressing the Index button.

The Limits page also provides only a limited number of functions. For example, it allows you to exclude certain categories of sequence data, such as STSs, TPAs, working draft high throughput genomic sequences, and patents, but it does not allow you to retrieve only those types of sequences. The exercise on how to retrieve only patent sequences provides an example of how to do the latter.
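The difference between excluding a category and retrieving only that category comes down to the Boolean operator placed before the Properties term. Here is a small Python sketch of that idea; the helper `with_category` is hypothetical, and `gbdiv_pat[prop]` is assumed to be the Properties term for patent sequences (check the Properties index for the exact term).

```python
def with_category(term, category, mode="exclude"):
    """Exclude a category (NOT) or retrieve only that category (AND).

    Hypothetical helper for illustration; it only builds query text.
    """
    if mode not in ("exclude", "only"):
        raise ValueError("mode must be 'exclude' or 'only'")
    operator = "NOT" if mode == "exclude" else "AND"
    return f"{term} {operator} {category}"


base = "glyceraldehyde 3 phosphate dehydrogenase[All]"
patents = "gbdiv_pat[prop]"  # assumed Properties term for patent sequences

# What the Limits page offers: leave patent sequences out
print(with_category(base, patents, mode="exclude"))
# What requires a typed query: keep patent sequences only
print(with_category(base, patents, mode="only"))
```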


Revised 07/26/2007