NCBI News









Sequin Enhancements

Sequin is NCBI's primary tool for submitting and updating GenBank® records. It can handle large genomes, population and phylogenetic sets, single-sequence submissions, Third Party Annotation (TPA) submissions, and updates to records already in GenBank.

Several recent enhancements to Sequin make it easier to submit sequences to GenBank.

TPA submissions require several elements that non-TPA submissions do not, such as an explanation of the experimental evidence for the annotation and the primary accession numbers of the GenBank records on which the TPA is based. Sequin's self-guided tabs direct the submitter to indicate that the submission is a TPA, and a pop-up frame reminds the submitter that, in order to be released, TPA records require a publication that describes the biological experiments used as evidence for the annotation. A box beneath the reminder lets the submitter list the evidence and the experiments supporting the submission. The submitter is then returned to the main menu to continue the submission process. After the sequence file has been imported into Sequin, an "Assembly Tracking" menu appears, which allows entry of the primary accession numbers of the sequences used in the TPA.

The Annotate menu offers an updated definition line generator that places the organelle in which a sequence is located at the end of the definition line.

Sequin also has an enhanced alignment reader. Submitters can now specify which characters within an alignment designate gaps, ambiguities, and match (identical) characters. Different characters can be specified for the Beginning, Middle, and End gap characters. If the alignment used for submission is not valid, Sequin reports errors that indicate the specific problem and suggest possible solutions.
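
To make the role of these character settings concrete, here is a minimal Python sketch of how an alignment row might be normalized once the submitter has declared them. It is an illustration only, not Sequin's implementation: the function name, the parameter names and their defaults are hypothetical, the ambiguity character is assumed to map to the nucleotide code N, and match characters are assumed to refer back to a reference row at the same column.

```python
def normalize_row(row, reference, begin_gap="-", middle_gap="-", end_gap="-",
                  ambiguity="?", match="."):
    """Rewrite one alignment row with '-' for every gap, 'N' for ambiguities,
    and the reference residue wherever a match character appears.

    All names and defaults here are hypothetical, chosen for illustration.
    """
    if len(row) != len(reference):
        raise ValueError("row and reference must have the same length")

    n = len(row)

    # The leading run of beginning-gap characters marks the start of the row.
    start = 0
    while start < n and row[start] in begin_gap:
        start += 1

    # The trailing run of end-gap characters marks the end of the row.
    stop = n
    while stop > start and row[stop - 1] in end_gap:
        stop -= 1

    out = ["-"] * start
    for i in range(start, stop):
        ch = row[i]
        if ch in middle_gap:
            out.append("-")           # internal gap
        elif ch in ambiguity:
            out.append("N")           # assumed nucleotide ambiguity code
        elif ch in match:
            out.append(reference[i])  # identical to the reference column
        else:
            out.append(ch.upper())    # ordinary residue
    out.extend("-" * (n - stop))
    return "".join(out)


if __name__ == "__main__":
    print(normalize_row("~~G..-?T##", "ACGTACGTAC", begin_gap="~", end_gap="#"))
```

Treating the beginning and end gap characters as runs at the two ends of the row is one simple way to tell them apart from internal gaps; a real reader would also need to validate the row and report problems, as described above.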

Once an alignment such as a population or phylogenetic set is loaded, users can view all the nucleotides of all the sequences by selecting Alignment from the Format option. Formerly, Sequin presented a graphical view of the alignment. By using this alignment view and targeting different sequences, users can study the differences among the sequences in the alignment.

Further extending Sequin's batch processing capability is a 'Batch Feature Apply' option, found under the Annotate menu, that allows the annotation of a batch of sequences with global features such as coding regions or source qualifiers. When 'Batch Feature Apply' is selected, Sequin presents a list of feature types that can be applied to all sequences. The user may choose to have each feature span the entire sequence to which it is applied, or may specify the left and right ends of the locations for all features.
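
The two location choices can be sketched in a few lines of Python. This is only an illustration of the idea, not Sequin code: the `Feature` class, the `batch_apply` function and the use of 1-based inclusive coordinates are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class Feature:
    kind: str   # feature type, e.g. "CDS"
    left: int   # 1-based, inclusive (assumed convention)
    right: int  # 1-based, inclusive (assumed convention)


def batch_apply(sequences: Dict[str, str], kind: str,
                span: Optional[Tuple[int, int]] = None) -> Dict[str, Feature]:
    """Attach one feature of the given kind to every sequence in the batch.

    With span=None each feature covers its whole sequence; otherwise the
    same (left, right) ends are used for all sequences.
    """
    features = {}
    for seq_id, seq in sequences.items():
        left, right = (1, len(seq)) if span is None else span
        if not 1 <= left <= right <= len(seq):
            raise ValueError(
                f"{seq_id}: location {left}..{right} is outside 1..{len(seq)}")
        features[seq_id] = Feature(kind, left, right)
    return features


if __name__ == "__main__":
    batch = {"seq1": "ACGTACGT", "seq2": "ACGTAC"}
    print(batch_apply(batch, "CDS"))               # full-length features
    print(batch_apply(batch, "CDS", span=(2, 5)))  # same ends on every sequence
```

Rejecting a location that does not fit a given sequence is a design choice of this sketch; the article does not say how Sequin handles that case.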

Finally, Sequin's robust editing capabilities have been enhanced. The Update Sequence function now allows the submitter to specify the actions to be taken regarding coding regions and references when a sequence is updated. If Update Proteins for Updated Sequences is selected, Sequin attempts to adjust the locations of coding regions on the updated sequence based on an alignment between the old translated protein and the translation of the updated sequence. Options are also available to truncate retranslated proteins at stops, extend retranslated proteins without stops, or extend retranslated proteins without starts. The "Correct CDS" genes function adjusts the corresponding gene span based on the new coding region span.
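
As an example of what truncating a retranslated protein at a stop involves, the Python sketch below cuts the protein at the first stop and recomputes the right end of the coding region. It rests on several assumptions that do not come from Sequin: the function name is hypothetical, the retranslation uses '*' for stop codons, the coding region lies on the plus strand with 1-based coordinates, and the stop codon is counted as part of the coding region.

```python
from typing import Tuple


def truncate_at_stop(protein: str, cds_left: int) -> Tuple[str, int]:
    """Return (truncated protein, new right end of the coding region).

    `protein` is the retranslation of the updated sequence, with '*'
    marking stop codons. `cds_left` is the 1-based position of the first
    base of the coding region on the plus strand (assumed convention).
    """
    stop_index = protein.find("*")
    if stop_index == -1:
        # No stop found: keep the whole protein and the span it covers.
        kept = protein
        codons = len(kept)
    else:
        # Keep residues before the first stop; count the stop codon itself
        # as part of the coding region.
        kept = protein[:stop_index]
        codons = len(kept) + 1
    cds_right = cds_left + 3 * codons - 1
    return kept, cds_right


if __name__ == "__main__":
    print(truncate_at_stop("MKT*LLV", 11))
```

A gene span could then be corrected by setting its ends to match the new coding region span, which is the kind of adjustment the last sentence above describes.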

—MR

 


NCBI News | Fall/Winter 2002 NCBI News: Spring 2003