A Collaborative Effort
The Trace Archive was established in 2001 as a collaborative effort between NCBI and the European Molecular Biology Laboratory (EMBL/ENSEMBL) to collect raw data produced at sequencing centers around the world. Today, these data are submitted to one of two central processing centers: NCBI or the Wellcome Trust Sanger Centre. The amount of data in the archive has doubled every 10 months since 2001, so that it is now an overwhelming 22 trillion bytes in size, large enough to fill a stack of compact disks 10 stories high. New sequencing technologies promise an even sharper increase in data volume in the future. NCBI works closely with the groups pioneering these new techniques to develop the necessary processing, storage and retrieval technologies in advance of the anticipated data influx.
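To give a feel for what a 10-month doubling time implies, the arithmetic can be sketched in a few lines of Python. This is only an illustration: the function name `projected_size` is hypothetical, and the sketch assumes the doubling period stays fixed at 10 months, whereas the article itself expects growth to become sharper.

```python
def projected_size(current_bytes: float, months_ahead: float,
                   doubling_months: float = 10.0) -> float:
    """Project the archive size, assuming a constant doubling period.

    Assumption (not from the article): growth continues at exactly one
    doubling every `doubling_months` months.
    """
    return current_bytes * 2 ** (months_ahead / doubling_months)


# Starting point taken from the article: about 22 trillion bytes today.
current = 22e12

for months in (10, 20, 30):
    size = projected_size(current, months)
    print(f"{months} months from now: {size / 1e12:.0f} trillion bytes")
```

Under this assumption, three doubling periods (30 months) would multiply the archive eightfold.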
Traces are Pieces of a Puzzle
NCBI’s Trace Archive provides direct access to the raw traces, typically between 300 and 1,000 DNA letters in length.
Researchers can view and evaluate over 850 assemblies, such as that shown in Fig. 1, of trace-derived sequences for influenza virus.
These assemblies are found in the Assembly Archive, a database that builds upon the sequences in the Trace Archive to provide a higher level view.
A Vital Resource in the Fight Against Disease
Sequencing traces are vital to the hunt for polymorphisms in gene sequences that are linked to disease when they occur in human DNA or linked to virulence when they occur in the DNA of a virus. To further support studies of DNA sequence variability, NCBI maintains the core dbSNP database with detailed information for over 25 million genetic variations, predominantly single DNA letter changes called ‘Single Nucleotide Polymorphisms’. The trace data, combined with that of dbSNP, is a boon to medical researchers seeking to gain greater insight into the impact of genetic variation on health. Trace sequences may be searched using MegaBLAST, or via the web-based form at
(see the ‘Mammoth found in Trace Archive’ section of the “Mammoths and Moas. . .” article in this issue.)
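The idea of hunting for single DNA letter changes can be illustrated with a toy comparison of two sequences. This is a minimal sketch, not the method used by NCBI or dbSNP: it assumes the two sequences are already aligned, have equal length, and contain no gaps, and the function name `find_single_letter_differences` and the example sequences are invented for illustration.

```python
def find_single_letter_differences(reference: str, read: str):
    """Return (position, reference_letter, read_letter) for each mismatch.

    Assumptions (for illustration only): both sequences are already
    aligned, equal in length, and free of gaps. Positions are 0-based.
    """
    if len(reference) != len(read):
        raise ValueError("sequences must be the same length")
    return [
        (i, ref_base, read_base)
        for i, (ref_base, read_base) in enumerate(zip(reference, read))
        if ref_base != read_base
    ]


# Invented example sequences differing at a single letter.
reference = "ACGTAC"
read = "ACCTAC"
print(find_single_letter_differences(reference, read))
```

In practice a candidate variation would be judged against many overlapping traces and their quality scores rather than against a single read.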