Environmental Samples Make Big Splash
Whole Genome Shotgun (WGS) sequencing is now being applied to quickly assemble large sets of genomic sequences taken from organisms inhabiting a particular ecological niche. Sequence data collected in this manner provides a snapshot of the genetic diversity existing at a particular locale and is especially important in providing data for organisms that are difficult or impossible to culture in the laboratory. Recently, the Institute for Biological Energy Alternatives sampled water from the Sargasso Sea, one of the best-characterized regions of the world's oceans. The larger of two sets of samples collected produced over 1.3 gigabases of sequence in the form of 1.66 million WGS reads. These reads were assembled into contigs containing about 1 gigabase of non-redundant sequence. In addition, over 1 million protein sequences were derived from the annotation of open reading frames on the genomic sequences. Contigs constructed from the WGS reads, together with the remaining single reads, have been deposited in the WGS division of GenBank under the project accession number AACY01000000. Scaffolds assembled from these contigs are available within the accession ranges CH004436-CH004736 and CH004737-CH236877. The raw sequencing data is available in the Trace Archive.
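The two scaffold accession ranges quoted above can be expanded or counted programmatically, which is handy when building a batch retrieval list. The following Python sketch is illustrative only: the helper names (`parse_accession`, `range_size`, `expand_range`) are hypothetical, and it assumes that every accession within each range is actually assigned to a scaffold, which the text above does not state explicitly.

```python
import re

# Scaffold accession ranges as quoted in the article.
SCAFFOLD_RANGES = [("CH004436", "CH004736"), ("CH004737", "CH236877")]


def parse_accession(accession):
    """Split an accession such as 'CH004436' into (prefix, number, digit width)."""
    match = re.fullmatch(r"([A-Z]+)(\d+)", accession)
    if match is None:
        raise ValueError(f"unrecognized accession format: {accession!r}")
    prefix, digits = match.groups()
    return prefix, int(digits), len(digits)


def range_size(start, end):
    """Count the accessions in an inclusive range with a shared prefix and width."""
    prefix_a, first, width_a = parse_accession(start)
    prefix_b, last, width_b = parse_accession(end)
    if prefix_a != prefix_b or width_a != width_b or last < first:
        raise ValueError(f"invalid accession range: {start}-{end}")
    return last - first + 1


def expand_range(start, end):
    """Yield every accession in the inclusive range, zero-padded to the same width."""
    prefix, first, width = parse_accession(start)
    range_size(start, end)  # validates the range before iterating
    _, last, _ = parse_accession(end)
    for number in range(first, last + 1):
        yield f"{prefix}{number:0{width}d}"


# Total accessions covered by both ranges (an upper bound on the scaffold
# count if some accessions in a range are unused).
total_accessions = sum(range_size(start, end) for start, end in SCAFFOLD_RANGES)
```

Under the stated assumption, the first range covers 301 accessions and the second 232,141, for 232,442 in total. Because `expand_range` is a generator, the larger range can be streamed to a file without holding every accession in memory.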
The Sargasso Sea dataset, along with other environmental sample datasets such as sequences from an acid mine drainage biofilm submitted by the DOE Joint Genome Institute, can be queried using the new “Environmental Samples” BLAST page at:
Environmental sample data can also be searched using two newly created standard BLAST databases, “env_nt” and “env_nr”, for nucleotide and protein sequences respectively. The environmental sample data in these two new databases is no longer included in the “nt” or “nr” BLAST databases, so a search against “nt” or “nr” alone will not return environmental sample sequences.
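Because the environmental data now lives in its own databases, a search script has to name “env_nt” or “env_nr” explicitly. The Python sketch below only builds a command line and does not run it. It assumes the standalone BLAST+ programs `blastn` and `blastp`, which postdate this article, and it assumes that both database names are available on the remote NCBI server; the function name `build_env_blast_command` and the default output file name are hypothetical.

```python
def build_env_blast_command(query_path, is_protein, out_path="env_hits.tsv"):
    """Return an argument list for searching the environmental sample databases.

    Protein queries go to env_nr with blastp; nucleotide queries go to
    env_nt with blastn. Translated searches (e.g. blastx) are not covered.
    """
    if is_protein:
        program, database = "blastp", "env_nr"
    else:
        program, database = "blastn", "env_nt"
    return [
        program,
        "-db", database,      # environmental database, not nt/nr
        "-query", query_path,
        "-remote",            # assumption: run the search on NCBI's servers
        "-outfmt", "6",       # tabular output
        "-out", out_path,
    ]


# Example: command for a nucleotide query held in a FASTA file.
nucleotide_command = build_env_blast_command("query.fasta", is_protein=False)
```

The resulting list can be passed to `subprocess.run` once the query file exists. Building the arguments separately keeps the database choice easy to inspect before anything is submitted.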
Venter, J.C., et al., Environmental Genome Shotgun Sequencing of the Sargasso Sea, Science, 2004 Apr 2;304(5667):66-74. PMID 15001713
Tyson, G.W., et al., Community structure and metabolism through reconstruction of microbial genomes from the environment, Nature, 2004 Mar 4;428(6978):37-43. PMID 14961025