Third Party Annotation Database Debuts at GenBank

As the amount of publicly available sequence data rapidly increases, third party annotation will become increasingly important. The Third Party Annotation (TPA) database, created by GenBank and its international partners, the DNA Data Bank of Japan (DDBJ) and the European Bioinformatics Institute (EBI), accepts third party annotation of genomic sequence, or computationally derived or predicted sequences. TPA submissions must use sequence data that is already represented in GenBank, and the analysis upon which the annotations are based must appear in a peer-reviewed scientific journal.

Those wishing to add a feature annotation, such as a gene, to an unannotated genomic sequence, or to combine two or more records, such as a set of ESTs, into a longer transcript sequence, can submit their analysis or assembly to the TPA database. Trace data sequences or Whole Genome Shotgun (WGS) sequences may be used as the basis of a TPA submission, but data from secondary sources such as NCBI Reference sequences, or primary data from proprietary databases, may not be used.

Third parties can submit annotations using either Sequin or BankIt. In BankIt, choose “NO” when asked whether the submission is primary data; this initiates the TPA option. Those making TPA submissions via Sequin should say so in their email message to NCBI and provide accession numbers for the primary sequence(s) used in their analysis. Instructions for making TPA submissions are found at:
TPA records can be located with Entrez using the TPA term within the Properties field; for example:
As of November 2002, there were 104 TPA records in the Entrez database. An example of a TPA record is shown below. The “Primary” field shows how the sequence in the TPA record was constructed from existing database sequences. In the case below, four GenBank database sequences were combined to produce the sequence upon which the submission is based. For instance, bases 1 through 503 in the TPA sequence were derived from bases 3 through 505 in GenBank sequence AQ655575.1.
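The coordinate correspondence recorded in the “Primary” field can be illustrated with a short Python sketch. It uses only the one segment quoted above (TPA bases 1 through 503 taken from bases 3 through 505 of AQ655575.1). The `PrimarySegment` class and its `map_position` method are hypothetical names introduced for illustration and are not part of any NCBI tool. The sketch also assumes the segment is ungapped and on the same strand as the primary sequence, which the text does not state.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PrimarySegment:
    """One row of a TPA record's Primary field (hypothetical helper class)."""
    tpa_start: int       # first base of the span in the TPA sequence (1-based)
    tpa_end: int         # last base of the span in the TPA sequence (inclusive)
    accession: str       # primary GenBank sequence the span was derived from
    primary_start: int   # first base of the span in the primary sequence
    primary_end: int     # last base of the span in the primary sequence

    def map_position(self, tpa_pos: int) -> Optional[int]:
        """Return the primary-sequence base for a TPA base, or None if the
        position lies outside this segment.

        Assumes an ungapped, same-strand correspondence (an assumption made
        for this sketch, not something stated in the article).
        """
        if not (self.tpa_start <= tpa_pos <= self.tpa_end):
            return None
        return self.primary_start + (tpa_pos - self.tpa_start)


# The single segment quoted in the article.
segment = PrimarySegment(
    tpa_start=1, tpa_end=503,
    accession="AQ655575.1",
    primary_start=3, primary_end=505,
)

print(segment.map_position(1))    # first base of the segment
print(segment.map_position(503))  # last base of the segment
print(segment.map_position(504))  # outside this segment
```

Both spans cover 503 bases, so each position in this segment is shifted by a constant offset of two between the TPA sequence and the primary sequence.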