BankIt Submission Help: Nucleotide FASTA

For submissions, use a text editor (e.g. WordPad, TextEdit) to prepare a file containing the set of nucleotide sequences in FASTA format.

When using a word processing program (e.g. WordPad, Word, TextEdit), be sure to save your file as "Plain Text" or "Text Document." If you are not sure that the "Save" option in your program will do this, use "Save As..." and choose the file type explicitly.

In WordPad, select "Save As..." from the File menu. In the "Save as type:" pull-down menu, select "Text Document."

In Word, select "Save As..." from the File menu. In the "Save as type:" pull-down menu, select "Plain Text (*.txt)."

Preparing the Nucleotide FASTA File

Each sequence in the set contains a FASTA definition line followed by the raw sequence data.
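As an illustration of this layout, the following is a minimal Python sketch that splits the text of a FASTA file into definition lines and sequences. The function name parse_fasta is hypothetical and not part of BankIt; the sketch assumes that sequence data may be wrapped over several lines and that blank lines between records can be ignored.

```python
def parse_fasta(text):
    """Split FASTA text into (definition_line, sequence) pairs.

    Hypothetical helper for checking a file before upload; it is not
    part of BankIt. The leading ">" is dropped from each definition line.
    """
    records = []
    defline = None
    chunks = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # assumption: blank lines between records are ignored
        if line.startswith(">"):
            # A new definition line closes the previous record, if any.
            if defline is not None:
                records.append((defline, "".join(chunks)))
            defline = line[1:]
            chunks = []
        elif defline is None:
            raise ValueError("sequence data found before the first definition line")
        else:
            chunks.append(line)  # wrapped sequence lines are joined together
    if defline is not None:
        records.append((defline, "".join(chunks)))
    return records
```

Calling parse_fasta on the contents of a prepared file should return one pair per sequence in the set; a record whose sequence comes back empty usually means the definition line and the sequence were not separated by a line break.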

The definition line for each sequence begins with a ">" followed by a Sequence_ID. The Sequence_ID identifies the same specimen in all the steps of a submission. Sequence_IDs must be unique within the set and may not contain spaces.
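The Sequence_ID rules above can be checked before uploading. The following Python sketch is one possible way to do so; the function name check_sequence_ids is hypothetical, and it assumes that the Sequence_ID is the text immediately after ">" up to the first whitespace character, so an ID written with a space would be cut short and typically show up as a duplicate or a missing ID.

```python
def check_sequence_ids(deflines):
    """Return a list of problems found in the given FASTA definition lines.

    Hypothetical pre-submission check, not part of BankIt.
    Assumption: the Sequence_ID is the first whitespace-delimited token
    that directly follows the ">" character.
    """
    problems = []
    seen = set()
    for n, line in enumerate(deflines, start=1):
        if not line.startswith(">"):
            problems.append(f"definition line {n}: does not begin with '>'")
            continue
        rest = line[1:]
        if not rest or rest[0].isspace():
            problems.append(f"definition line {n}: no Sequence_ID directly after '>'")
            continue
        seq_id = rest.split()[0]
        if seq_id in seen:
            problems.append(f"definition line {n}: duplicate Sequence_ID '{seq_id}'")
        seen.add(seq_id)
    return problems
```

An empty list means every definition line starts with ">" and carries a Sequence_ID that is unique within the set.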

For Barcode submissions, the definition line MUST also contain the organism name for the sequence, in addition to the Sequence_ID. Listing the organism name in the FASTA definition line is OPTIONAL for general GenBank submissions, but submitters are encouraged to use this format. If organism names are not included in the FASTA definition lines, they must be provided in the Organism/Modifiers table in the next step.

The organism name must be specified in exactly the format shown in the samples: [organism=Organism name]
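To confirm that a definition line carries the organism name in this format, a small Python sketch such as the following could be used. The names ORGANISM_TAG and organism_from_defline are hypothetical, and the pattern is an assumption about what "exactly the format shown" means: a lowercase "organism=" key inside square brackets, with no nested brackets in the name.

```python
import re

# Assumption: the tag is written as [organism=Name] with a lowercase key
# and no square brackets inside the organism name itself.
ORGANISM_TAG = re.compile(r"\[organism=([^\[\]]+)\]")

def organism_from_defline(defline):
    """Return the organism name from a definition line, or None if absent.

    Hypothetical helper, not part of BankIt.
    """
    match = ORGANISM_TAG.search(defline)
    return match.group(1).strip() if match else None

print(organism_from_defline(">Seq1 [organism=Carpodacus mexicanus] clone 6b"))
# prints: Carpodacus mexicanus
```

A result of None for any record in a Barcode submission indicates that the required organism name is missing or not written in the bracketed form.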

Sample definition line and sequences
>Seq1 [organism=Carpodacus mexicanus] C.mexicanus clone 6b actin (act) mRNA, partial cds
CCTTTATCTAATCTTTGGAGCATGAGCTGGCATAGTTGGAACCGCCCTCAGCCTCCTCATCCGTGCAGAA
CTTGGACAACCTGGAACTCTTCTAGGAGACGACCAAATTTACAATGTAATCGTCACTGCCCACGCCTTCG
TAATAATTTTCTTTATAGTAATACCAATCATGATCGGTGGTTTCGGAAACTGACTAGTCCCACTCATAAT
CGGCGCCCCCGACATAGCATTCCCCCGTATAAACAACATAAGCTTCTGACTACTTCCCCCATCATTTCTT
TTACTTCTAGCATCCTCCACAGTAGAAGCTGGAGCAGGAACAGGGTGAACAGTATATCCCCCTCTCGCTG
GTAACCTAGCCCATGCCGGTGCTTCAGTAGACCTAGCCATCTTCTCCCTCCACTTAGCAGGTGTTTCCTC
TATCCTAGGTGCTATTAACTTTATTACAACCGCCATCAACATAAAACCCCCAACCCTCTCCCAATACCAA
ACCCCCCTATTCGTATGATCAGTCCTTATTACCGCCGTCCTTCTCCTACTCTCTCTCCCAGTCCTCGCTG
CTGGCATTACTATACTACTAACAGACCGAAACCTAAACACTACGTTCTTTGACCCAGCTGGAGGAGGAGA
CCCAGTCCTGTACCAACACCTCTTCTGATTCTTCGGCCATCCAGAAGTCTATATCCTCATTTTAC

>Seq2 [organism=Vireo solitarius] Vireo solitarius isolate A actin (ACT) gene, complete cds
GGTAGGTACCGCCCTAAGNCTCCTAATCCGAGCAGAACTANGCCAACCCGGAGCCCTTCTGGGAGACGAC
CAAATCTACAACGTAGTCGTTACGGCCCACGCCTTCGTAATAATCTTTTTCATAGTAATGCCAATCATAA
TCGGAGGATTCGGGAACTGACTAGTTCCTCTAATGATTGGGGCCCCAGACATAGCATTCCCTCGAATAAA
CAACATAAGCTTTTGACTACTACCACCATCATTCCTACTCCTAATAGCCTCCTCAACAGTAGAAGCAGGA
GCCGGAACCGGATGAACCGTGTACCCACCACTAGCTGGAAACCTGGCCCACGCCGGAGCCTCAGTAGACC
TAGCTATCTTCTCCCTACACCTAGCAGGTATCTCATCCATCCTGGGGGCAATTAACTTCATTACAACAGC
AATCAACATAAAACCACCCGCCCTCTCACAATACCAAACACCACTATTCGTGTGATCCGTCCTAATTACG
GCCGTACTACTCCTACTATCTCTCCCAGTACTAGCCGCCGGTATCACCATGCTACTCACAGACCGCAACC
TCAACACCACCTTCTTTGACCCAGCAGGAGGAGGAGACCCAGTACTATACCAGCACCTATTCTGATTCTT
CGGACACCCAGAAGTCTACATCCTAATTCTC