EFetch Utility for Entrez Sequence and other Molecular Biology Databases

EFetch for Sequence and other Molecular Biology Databases
Last Updated: January 11, 2005

EFetch documentation is also available for the Literature and Taxonomy databases.

EFetch:  Retrieves records in the requested format from a list of one or more unique identifiers.


Base URL: http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?


URL parameters: (NOTE: Utility parameters may be case sensitive. Use lowercase characters for all parameters except WebEnv.)
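As a minimal sketch of the note above, the following Python helper appends parameters to the base URL and lowercases every parameter name except WebEnv. The function name `build_efetch_url` is hypothetical and is not part of the E-utilities; the `db`, `id` and `rettype` parameters it is called with are described later on this page.

```python
from urllib.parse import urlencode

# Base URL as given on this page.
BASE_URL = "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?"


def build_efetch_url(**params):
    """Return an EFetch URL (hypothetical helper, not an NCBI API).

    Parameter names are lowercased, except WebEnv, which keeps its
    mixed-case spelling as the note above requires.
    """
    normalized = {}
    for name, value in params.items():
        key = "WebEnv" if name.lower() == "webenv" else name.lower()
        normalized[key] = value
    # urlencode joins the pairs with "&" and escapes unsafe characters.
    return BASE_URL + urlencode(normalized)


print(build_efetch_url(DB="nucleotide", ID=5, RetType="fasta"))
```

Normalizing the names in one place means a caller cannot accidentally send `Db=` or `webenv=`.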


Database (db)
  • db=
  • Current database values:


    Web Environment: History link value previously returned in XML results from ESearch and used with EFetch in place of primary ID result list.

  • WebEnv=WgHmIcDG], etc.
  • Query_key: The value used as the history search number, previously returned in XML results from ESearch or EPost.

  • query_key=6

  • Note: WebEnv is similar to the cookie that is set on a user's computer when accessing PubMed on the web.  If the parameter usehistory=y is included in an ESearch URL, both a WebEnv (cookie string) value and a query_key (history number) value will be returned in the results. Rather than using the retrieved PMIDs in an ESummary URL, you may simply use the WebEnv and query_key values to retrieve the records. WebEnv will change for each ESearch query.
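The history mechanism described above can be sketched in Python as follows. The sketch assumes that the ESearch XML result carries the two values in elements named `WebEnv` and `QueryKey` directly under the root element. The sample XML string, the placeholder WebEnv value and both function names are made up for illustration.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

EFETCH = "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?"


def history_from_esearch(xml_text):
    """Extract (WebEnv, query_key) from ESearch XML output.

    Assumes <WebEnv> and <QueryKey> are children of the root element.
    """
    root = ET.fromstring(xml_text)
    return root.findtext("WebEnv"), root.findtext("QueryKey")


def efetch_url_from_history(db, webenv, query_key, **extra):
    """Build an EFetch URL that uses the history instead of an id list."""
    params = {"db": db, "WebEnv": webenv, "query_key": query_key}
    params.update(extra)
    return EFETCH + urlencode(params)


# Illustrative ESearch result; a real WebEnv string is much longer.
sample = (
    "<eSearchResult><Count>2</Count>"
    "<QueryKey>6</QueryKey><WebEnv>EXAMPLE_WEBENV</WebEnv>"
    "</eSearchResult>"
)
webenv, query_key = history_from_esearch(sample)
print(efetch_url_from_history("nucleotide", webenv, query_key, rettype="fasta"))
```

Because WebEnv changes with each ESearch query, the values should be read from the most recent ESearch result rather than stored and reused.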


    Tool: A string with no internal spaces that identifies the resource which is using Entrez links (e.g., tool=flybase). This argument is used to help NCBI provide better service to third parties generating Entrez queries from programs. As with any query system, it is sometimes possible to ask the same question different ways, with different effects on performance. NCBI requests that developers sending batch requests include a constant 'tool' argument for all requests using the utilities.

  • tool=

  • E-mail Address: If you choose to provide an email address, we will use it to contact you if there are problems with your queries or if we are changing software interfaces that might specifically affect your requests. If you choose not to include an email address we cannot provide specific help to you,  but you can still sign up for utilities-announce to receive general announcements.

  • email=
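One way to keep the tool argument constant across a batch of requests is to merge it into every parameter set in a single place. The Python sketch below is illustrative only: `with_identity` is a hypothetical helper, and the e-mail address is a placeholder.

```python
from urllib.parse import urlencode


def with_identity(params, tool, email=None):
    """Return a copy of params with tool (and optionally email) added.

    The tool string must not contain internal spaces, as stated above.
    """
    if not tool or any(ch.isspace() for ch in tool):
        raise ValueError("tool must be a non-empty string without spaces")
    merged = dict(params)
    merged["tool"] = tool
    if email:
        merged["email"] = email
    return merged


query = with_identity({"db": "protein", "id": 8}, tool="flybase",
                      email="curator@example.org")
print(urlencode(query))
```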

  • Sequence Databases

    Record Identifier: IDs required if WebEnv is not used.

  • id=123,U12345,U12345.1,gb|U12345|
  • Current values:


    Display Numbers:

  • retstart=x  (x= sequential number of the first id retrieved - default=0 which will retrieve the first record)
  • retmax=y  (y= number of items retrieved)
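The two display numbers can be combined to walk through a long result list in batches. The sketch below only computes the retstart and retmax pairs. It assumes the total number of records is already known (for example from an earlier search), and `page_params` is a hypothetical helper name.

```python
def page_params(total, retmax):
    """Yield retstart/retmax pairs that cover `total` records.

    retstart is zero-based, matching the default of 0 for the first record.
    """
    if retmax <= 0:
        raise ValueError("retmax must be positive")
    for retstart in range(0, total, retmax):
        # The last batch may be shorter than retmax.
        yield {"retstart": retstart, "retmax": min(retmax, total - retstart)}


for page in page_params(total=25, retmax=10):
    print(page)
```

Each yielded dictionary can be merged into the other URL parameters of a request.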

  • Sequence Strand, Start, Stop and Complexity Parameters
    strand= what strand of DNA to show (1=plus or 2=minus)
    seq_start= show sequence starting from this base number
    seq_stop= show sequence ending on this base number
    complexity= gi is often a part of a biological blob, containing other gis

    Complexity regulates the display:
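As a hedged illustration of the strand and range parameters, the Python sketch below builds a URL for part of a sequence. It assumes base numbers start at 1 and that seq_stop is not smaller than seq_start. The helper name `subsequence_url` is hypothetical, and complexity is left out because its values are not listed here.

```python
from urllib.parse import urlencode

EFETCH = "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?"


def subsequence_url(db, seq_id, seq_start, seq_stop, strand=1, rettype="fasta"):
    """Build an EFetch URL for bases seq_start..seq_stop on one strand."""
    if strand not in (1, 2):
        raise ValueError("strand must be 1 (plus) or 2 (minus)")
    # Assumption: coordinates are 1-based and the range is not reversed.
    if seq_start < 1 or seq_stop < seq_start:
        raise ValueError("expected 1 <= seq_start <= seq_stop")
    params = {
        "db": db,
        "id": seq_id,
        "rettype": rettype,
        "seq_start": seq_start,
        "seq_stop": seq_stop,
        "strand": strand,
    }
    return EFETCH + urlencode(params)


print(subsequence_url("nucleotide", 5, 1, 9, strand=2))
```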


    Retrieval Mode:

    Current values:


    Retrieval Type:

  • rettype=output types based on database
  • Current values and descriptions:

    rettype               scope                     Description
    native (full record)  all but Gene              Default format for viewing sequences.
    fasta                 sequence only             FASTA view of a sequence.
    gb                    nucleotide sequence only  GenBank view for sequences, constructed sequences will be shown as contigs (by pointing to its parts).
    gbc                   nucleotide sequence only  INSDSeq structured flat file.
    gbwithparts           nucleotide sequence only  GenBank view for sequences, the sequence will always be shown.
    est                   dbEST sequence only       EST Report.
    gss                   dbGSS sequence only       GSS Report.
    gp                    protein sequence only     GenPept view.
    gpc                   protein sequence only     INSDSeq structured flat file.
    seqid                 sequence only             To convert list of gis into list of seqids.
    acc                   sequence only             To convert list of gis into list of accessions.
    chr                   dbSNP only                SNP Chromosome Report.
    flt                   dbSNP only                SNP Flat File report.
    rsr                   dbSNP only                SNP RS Cluster report.
    brief                 dbSNP only                SNP ID list.
    docset                dbSNP only                SNP RS summary.

    Not all Retrieval Modes are possible with all Retrieval Types.

    Sequence Options:
     
    retmode  native  fasta  gb   gbwithparts  est  gss  gp   seqid  acc  gbc  gpc
    xml      x       x*     n/a  n/a          TBI  TBI  n/a  TBI    TBI  x    x
    text     x       x      x*   x*           x*   x*   x*   x      x    n/a  n/a
    html     x       x      x*   x*           x*   x*   x*   x      x    n/a  n/a
    asn.1    x       n/a    n/a  n/a          n/a  n/a  n/a  x      n/a  n/a  n/a

    x   = retrieval mode available
    *   = existence of the mode depends on gi type
    TBI = to be implemented (not yet available)
    n/a = not available
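A client can check a retrieval mode and retrieval type pair against the Sequence Options table before sending a request. The Python sketch below transcribes the table as it appears here and looks up one cell. The names `availability` and `is_available` are hypothetical, and the server's behavior, not this lookup, remains the final authority.

```python
# Columns and rows copied from the Sequence Options table above.
RETTYPES = "native fasta gb gbwithparts est gss gp seqid acc gbc gpc".split()
MATRIX = {
    "xml":   "x x* n/a n/a TBI TBI n/a TBI TBI x x".split(),
    "text":  "x x x* x* x* x* x* x x n/a n/a".split(),
    "html":  "x x x* x* x* x* x* x x n/a n/a".split(),
    "asn.1": "x n/a n/a n/a n/a n/a n/a x n/a n/a n/a".split(),
}


def availability(retmode, rettype):
    """Return the table cell ('x', 'x*', 'TBI' or 'n/a') for the pair."""
    return MATRIX[retmode][RETTYPES.index(rettype)]


def is_available(retmode, rettype):
    """True for 'x' and 'x*'; 'x*' still depends on the gi type."""
    return availability(retmode, rettype) in ("x", "x*")


print(availability("text", "gb"), is_available("asn.1", "fasta"))
```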
     

    Examples:

    http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nucleotide&id=5

    http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nucleotide&id=5&complexity=0&rettype=fasta

    http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nucleotide&id=5&rettype=gb&seq_start=1&seq_stop=9

    http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nucleotide&id=5&rettype=fasta&seq_start=1&seq_stop=9&strand=2

    http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nucleotide&id=5&rettype=gb

    http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=popset&id=12829836&rettype=gp

    http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=protein&id=8&rettype=gp

    Entrez display format GBSeqXML:
    http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nucleotide&id=5&rettype=gb&retmode=xml
    http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=protein&id=8&rettype=gp&retmode=xml

    Entrez display format TinySeqXML:
    http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nucleotide&id=5&rettype=fasta&retmode=xml

    Entrez Gene, full display as xml:
    http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=gene&id=2&retmode=xml

     