NCBI News




New Genome Builds and Map Viewer Display

Map Viewer Highlights

Four important eukaryotic genome sequence assemblies and annotations are available in the NCBI Map Viewer. Three classical model organisms, the zebrafish, the purple sea urchin Strongylocentrotus purpuratus, and the bread mold Neurospora crassa, are displayed along with the fungal pathogen Cryptococcus neoformans (also called Filobasidiella neoformans). These additions broaden the taxonomic diversity of genomes in the Map Viewer: the zebrafish is the first fish genome, and the sea urchin and Cryptococcus add new animal and fungal phyla (echinoderms and basidiomycetes).

The zebrafish genome build 1.1 is NCBI's assembly and annotation of the version 4 (Zv4) sequence produced by the Zebrafish Genome Project. This is a 5.7X-coverage sequence generated by whole genome shotgun and fingerprinted BAC clone sequencing. Genome features, markers, assembly contigs and components are anchored to the 25 zebrafish chromosomes, and the zebrafish Map Viewer displays them graphically on the genome sequence assembly. Available sequence maps include NCBI contigs (the "Contig" map), the WGS sequences (the "Component" map), and the locations of genes, STSs, ESTs, UniGene clusters (the "ugDr" map), and Gnomon predicted gene models. The current annotation places 26,533 genes and their transcripts onto the zebrafish sequence. The zebrafish Map Viewer continues to display the genetic maps (GAT, HS, MGH, MOP, and ZMAP) and radiation hybrid maps (LN54 and T51) that are maintained in collaboration with ZFIN and members of the zebrafish research community.

Purple sea urchin (Strongylocentrotus purpuratus) build 1.1 is NCBI's annotation of the 6X WGS assembly produced by the Human Genome Sequencing Center at the Baylor College of Medicine. Unlike the zebrafish genome, none of the assembled sea urchin contigs are placed on chromosomes. The same sequence features and components that are available for the zebrafish genome can be displayed on the unplaced contigs in the Map Viewer by searching for markers on the sea urchin Map Viewer page or through the linked sea urchin genome BLAST page. The current build places 20,544 genes and their transcripts on the assembled sequence.

The two new fungal genomes in the Map Viewer are assemblies and annotations provided to the NCBI by the respective sequencing centers. The Neurospora crassa genome (strain OR74A) is a 10X whole genome shotgun sequence produced by the Broad Institute. The sequence is anchored to the seven Neurospora chromosomes. The current annotation contains 10,082 genes and their predicted transcripts. The Cryptococcus genome (Cryptococcus neoformans strain JEC21, serotype D) was produced through the collaboration of The Institute for Genomic Research (TIGR) and Stanford University. This is a 10.5X whole genome shotgun assembly incorporating BAC clone end sequence and is anchored to the 14 Cryptococcus chromosomes. The Cryptococcus build maps 6,617 genes and their predicted transcripts onto the sequence. Available sequence maps in the Map Viewer for both of these fungal genomes include the assembled WGS supercontigs (the "Contig" map), the WGS sequences (the "Component" map), and the locations of genes, STSs, ESTs, and UniGene clusters.

Updated Genomes

The dog genome reported in the previous NCBI News (Volume 14, Issue 1) has been updated to build 2.1, which corresponds to the second version of the boxer genome assembly (CanFam2.0) by the Broad Institute and Agencourt Bioscience. The current NCBI build shows 19,907 genes placed on the whole genome shotgun assembly.

Other access to data

In addition to access through the Map Viewer, sequences of the genome assemblies, transcripts, proteins and gene models for zebrafish, sea urchin, Cryptococcus, Neurospora and the dog are available through the NCBI RefSeq database. The WGS assemblies are also available in GenBank under the following accessions: sea urchin, AAGJ00000000; zebrafish, CAAK00000000; Neurospora, AABX00000000; Cryptococcus, AE017341-AE017356; dog, AAEX02000000. GenBank and RefSeq records can be searched in the Entrez text search system and in NCBI's Web BLAST services, where they are extensively integrated with other resources and databases.
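
For readers who prefer scripted access, the accessions listed above can also be turned into retrieval requests. The following is a minimal Python sketch, not an NCBI-provided tool. It assumes the present-day E-utilities efetch endpoint and the "nuccore" database name, neither of which is mentioned in this article, and the helper names expand_accession_range and efetch_url are hypothetical. It only builds the request URL and does not contact NCBI.

```python
from urllib.parse import urlencode

# Assumption: present-day E-utilities efetch endpoint (not named in the article).
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def expand_accession_range(span):
    """Expand a range such as 'AE017341-AE017356' into individual accessions.

    Assumes both ends share the same letter prefix and digit width.
    """
    first, last = span.split("-")
    prefix = first.rstrip("0123456789")          # leading letters, e.g. 'AE'
    width = len(first) - len(prefix)             # number of digits to pad to
    start = int(first[len(prefix):])
    stop = int(last[len(prefix):])
    return [f"{prefix}{n:0{width}d}" for n in range(start, stop + 1)]


def efetch_url(accessions, db="nuccore", rettype="fasta"):
    """Build (but do not send) an efetch request for the given accessions.

    Assumption: 'nuccore' is the nucleotide database name to query.
    """
    query = {
        "db": db,
        "id": ",".join(accessions),
        "rettype": rettype,
        "retmode": "text",
    }
    return f"{EFETCH}?{urlencode(query)}"


# The Cryptococcus accession range quoted in the article.
cryptococcus = expand_accession_range("AE017341-AE017356")
print(len(cryptococcus), cryptococcus[0], cryptococcus[-1])  # 16 AE017341 AE017356
print(efetch_url(cryptococcus[:2]))
```

The resulting URL can be opened in a browser or passed to any HTTP client. The WGS master accessions (for example AAGJ00000000) identify whole projects rather than single sequences, so this sketch does not attempt to expand them.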

New chimpanzee comparative maps available

The human, mouse, rat and chimpanzee genomes can now be displayed side-by-side in the Map Viewer. This feature is available in the "Maps and Options" dialog box from the Map Viewer display of any of the four mammalian genomes, as shown in Figure 1. The list of available maps can be switched to another mammal by choosing from the "Org" pull-down list, and the selected tracks can then be added to the current display. The corresponding or syntenic regions in other mammalian genomes are determined from gene homology relationships provided by NCBI's HomoloGene. Figure 1 shows the regions surrounding the transcription factor “FOXP2”, popularly known as the “speech and language gene”, for all four mammalian genomes displayed in the human Map Viewer.

Figure 1: Comparative mammalian gene maps. Top: A portion of the "Maps and Options" dialog box showing the organism selection list. Bottom: Map Viewer display of the region surrounding the speech and language gene, FOXP2, in rat, mouse, human and chimpanzee. Lines between maps connect homologous genes as identified in NCBI's HomoloGene resource.

