Updating Information on GenBank Records

This page describes the formats for submitting updates to GenBank. Following them ensures that your update is processed quickly and correctly; updates sent in an incorrect format will delay processing. Save all update files as plain text. If you are updating multiple records, list all accessions to be updated at the top of your request.

Do not submit a new Sequin file to update an existing record. However, Sequin can be used in Network aware mode to download your publicly available records for update as described below.

Update files in the formats below should be emailed directly to gb-admin@ncbi.nlm.nih.gov or uploaded using UpdateMacroSend.

Update Formats:

Editing Source Information

Send updates to the source information (e.g. strain, cultivar, country, specimen_voucher) in a multi-column tab-delimited table, for example:

acc. num.	strain	country	organism
AYxxxx02	82	USA	Escherichia coli
AYxxxx03	ABC	Canada	Bacillus subtilis
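A table like the one above can be generated from a script instead of typed by hand, which avoids mixing spaces and tabs. The following is a minimal sketch only; the function name build_source_table and the use of "acc. num." as the dictionary key are assumptions made for illustration, not part of any GenBank tool.

```python
import csv
import io


def build_source_table(rows, qualifiers):
    """Return a tab-delimited source-update table as a string.

    rows: list of dicts keyed by "acc. num." plus source qualifier names.
    qualifiers: ordered list of qualifier columns (e.g. strain, country).
    """
    header = ["acc. num."] + list(qualifiers)
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(header)
    for row in rows:
        # Missing qualifiers are written as empty cells.
        writer.writerow([row.get(col, "") for col in header])
    return out.getvalue()


if __name__ == "__main__":
    rows = [
        {"acc. num.": "AYxxxx02", "strain": "82", "country": "USA",
         "organism": "Escherichia coli"},
        {"acc. num.": "AYxxxx03", "strain": "ABC", "country": "Canada",
         "organism": "Bacillus subtilis"},
    ]
    print(build_source_table(rows, ["strain", "country", "organism"]), end="")
```

Write the returned string to a plain text file before sending it.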

Updating Publication Information

Replace any non-ASCII characters (for example, letters with accents or umlauts) with the appropriate unaccented English letters. Send updates to the publication information in a multi-column tab-delimited table, for example:

acc. num.	authors	title
FJxxxx01	John A. Smith	Identification of gene A
FJxxxx02	Xu P. Weng, Jane Doe	Identification of gene B

The complete list of revised author names should be provided in the following format:

given_name middle_initial surname, etc.
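The two requirements above (ASCII-only text and the given_name middle_initial surname order) can be applied with a short script. This is a sketch under stated assumptions: the helper names to_ascii and format_authors are hypothetical, and the folding simply drops diacritics, so "ü" becomes "u". Characters with no decomposition (for example "ø" or "ß") are dropped entirely and should be replaced by hand.

```python
import unicodedata


def to_ascii(text):
    """Drop diacritics by decomposing characters and discarding non-ASCII."""
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


def format_authors(authors):
    """Join (given_name, middle_initial, surname) tuples into one field.

    An empty middle initial is skipped. Names are separated by ", ".
    """
    names = []
    for given, middle, surname in authors:
        parts = [given, middle, surname]
        names.append(" ".join(to_ascii(p) for p in parts if p))
    return ", ".join(names)


if __name__ == "__main__":
    print(format_authors([("John", "A.", "Smith"), ("Jane", "", "Doe")]))
```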

Use only the following publication fields as column headers:

  • authors
  • journal
  • volume
  • issue
  • pages
  • publication date
  • title
  • affiliation
  • department
  • city
  • state
  • publication country
  • street
  • postal code
  • *PMID
  • **class

Not all columns are appropriate for every reference; use only the relevant ones. If the reference has been published, include the complete journal title, not an abbreviation.

*If the publication has a PubMed identifier (PMID), you do not need to supply any of the other publication fields. A table with the accession number and PMID only is sufficient.

**The class descriptor should only be used when the publication status has been changed. This descriptor has a controlled vocabulary and may only include one of the following three class values:

  • unpublished
  • in-press journal
  • journal
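Before sending a publication table, it can help to check the column headers and class values against the lists above. The sketch below is illustrative only: check_publication_table is a hypothetical helper, it treats the first column as the accession number without checking its header text, and it counts rows among non-blank lines.

```python
VALID_FIELDS = {
    "authors", "journal", "volume", "issue", "pages", "publication date",
    "title", "affiliation", "department", "city", "state",
    "publication country", "street", "postal code", "PMID", "class",
}
VALID_CLASSES = {"unpublished", "in-press journal", "journal"}


def check_publication_table(text):
    """Return a list of problems found in a tab-delimited publication table."""
    problems = []
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return ["table is empty"]
    header = [h.strip() for h in lines[0].split("\t")]
    # The first column holds the accession number; the rest must be valid fields.
    for name in header[1:]:
        if name not in VALID_FIELDS:
            problems.append(f"unknown column header: {name!r}")
    for number, line in enumerate(lines[1:], start=2):
        cells = [c.strip() for c in line.split("\t")]
        record = dict(zip(header, cells))
        value = record.get("class", "")
        if value and value not in VALID_CLASSES:
            problems.append(f"line {number}: invalid class {value!r}")
    return problems


if __name__ == "__main__":
    table = "acc. num.\tauthors\tclass\nFJxxxx01\tJohn A. Smith\tjournal\n"
    print(check_publication_table(table))
```

An empty list means no problems were detected by these two checks; it does not guarantee that the update will be accepted.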

Nucleotide Sequence Update

If you are updating the current nucleotide sequence, send the complete new sequence(s) in FASTA format:

>AYxxxx02
cggtaataatggaccttggaccccggcaaagcggagagac
>AYxxxx03
ggaccttggaccccggcaaagcggagagaccggtaataat 

Please do not send a list of nucleotide changes. Do not include non-IUPAC characters within the sequence. Use n's for unknown nucleotides within the sequence.
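A quick check for non-IUPAC characters can catch problems before the file is sent. The following sketch assumes the standard IUPAC nucleotide codes (A, C, G, T, U and the ambiguity codes through N); whether GenBank accepts every one of them is not stated here. The function names read_fasta and non_iupac_positions are hypothetical.

```python
IUPAC_NUCLEOTIDES = set("ACGTURYSWKMBDHVN")


def read_fasta(text):
    """Parse FASTA text into a dict mapping accession to sequence."""
    records = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].split()
            if not header:
                raise ValueError("FASTA header line has no accession")
            current = header[0]
            records[current] = []
        elif current is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            records[current].append(line)
    return {acc: "".join(parts) for acc, parts in records.items()}


def non_iupac_positions(sequence):
    """Return (1-based position, character) pairs for non-IUPAC characters."""
    return [
        (i, ch)
        for i, ch in enumerate(sequence, start=1)
        if ch.upper() not in IUPAC_NUCLEOTIDES
    ]


if __name__ == "__main__":
    fasta = ">AYxxxx02\ncggtaataatggaccttggaccccggcaaagcggagagac\n"
    for acc, seq in read_fasta(fasta).items():
        print(acc, len(seq), non_iupac_positions(seq))
```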

Update features on record without annotation

If you are adding annotation to a record that has none, send us the features in one of the following two formats:

[a] Tab-delimited 5-column Feature table. For example:

>Feature gb|EFxxxxxx|EFxxxxxx
<1	400	gene
			gene	ENO1
<1	30	CDS
70	300
			product	enolase
			note	homodimer
<1	30	mRNA
70	400
			product	enolase
<1	30	exon
			number	1
70	400	exon
			number	2
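To make the layout of the 5-column table concrete: columns one and two hold the start and stop of an interval, column three holds the feature key on the first interval line only, and columns four and five hold a qualifier name and value on lines whose first three columns are empty. The sketch below reads such a table back into a simple structure. It assumes real tab characters separate the columns, and parse_feature_table is a hypothetical name.

```python
def parse_feature_table(text):
    """Parse tab-delimited 5-column feature table text into feature dicts."""
    features = []
    for raw in text.splitlines():
        if not raw.strip() or raw.startswith(">Feature"):
            continue
        cols = raw.split("\t")
        cols += [""] * (5 - len(cols))  # pad short lines to five columns
        start, stop, key, qualifier, value = (c.strip() for c in cols[:5])
        if start and stop:
            if key:
                # A feature key starts a new feature.
                features.append({"key": key, "intervals": [], "qualifiers": []})
            if not features:
                raise ValueError("interval line found before any feature key")
            features[-1]["intervals"].append((start, stop))
        elif qualifier:
            if not features:
                raise ValueError("qualifier line found before any feature")
            features[-1]["qualifiers"].append((qualifier, value))
    return features


if __name__ == "__main__":
    table = (
        ">Feature gb|EFxxxxxx|EFxxxxxx\n"
        "<1\t30\tCDS\n"
        "70\t300\n"
        "\t\t\tproduct\tenolase\n"
    )
    for feature in parse_feature_table(table):
        print(feature)
```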
 
 

[b] Spreadsheet:

accession    Feature    location           product   number    gene   note
EFxxxxxx     CDS        <1..30, 70..300    enolase                    homodimer
             exon       <1..30                       1
             exon       70..400                      2
             mRNA       <1..30, 70..400    enolase   
             gene       <1..400                                 ENO1
 

The first line of the table must be the header line. The first column must be the accession number; the second column must be the Feature type (e.g. CDS); and the third column must be the nucleotide location of the feature. After the first three columns, feature qualifiers may be listed in any order.
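The spreadsheet and the 5-column table carry the same information, so one row of the spreadsheet can be turned into feature-table lines mechanically. This is a rough sketch, not an NCBI tool: location_to_intervals and feature_block are hypothetical names, and only locations written as start..stop pieces separated by commas are handled.

```python
def location_to_intervals(location):
    """Split a location such as '<1..30, 70..300' into (start, stop) pairs."""
    intervals = []
    for part in location.split(","):
        start, sep, stop = part.strip().partition("..")
        if not sep or not start or not stop:
            raise ValueError(f"cannot parse location piece: {part!r}")
        intervals.append((start, stop))
    return intervals


def feature_block(feature, location, qualifiers):
    """Render one spreadsheet row as tab-delimited feature-table lines."""
    lines = []
    for index, (start, stop) in enumerate(location_to_intervals(location)):
        # Only the first interval line carries the feature key.
        fields = [start, stop, feature] if index == 0 else [start, stop]
        lines.append("\t".join(fields))
    for name, value in qualifiers.items():
        # Qualifier lines leave the first three columns empty.
        lines.append("\t\t\t" + name + "\t" + value)
    return lines


if __name__ == "__main__":
    row = feature_block("CDS", "<1..30, 70..300",
                        {"product": "enolase", "note": "homodimer"})
    print("\n".join(row))
```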

Update features on record with annotation

If you are updating many features of a record, let us know, and we can send you a tab-delimited 5-column Feature table with the current annotation for you to edit and return to us. Alternatively, you may send us the revised features in a properly formatted spreadsheet as shown above.

Network Aware Sequin

You can use Network aware Sequin to download your existing, public record from Entrez and make the necessary changes to that file. Network aware Sequin should not be used for simple updates such as publication changes or source information changes. Email the .sqn file containing the updated version to gb-admin@ncbi.nlm.nih.gov or submit it to us via UpdateMacroSend.

Write to the Help Desk

Last updated: 2014-03-05T11:29:36-05:00