   Viral Genomes Help

    How to find a particular viral genome
    How to find all reference genomes available for a viral group, order, family, or floating genus
    What is included on the lists of genomes
    How viral genomes are shown
    Additional genome views
    Multicomponent (segmented) genomes
    Global alignments with other complete genomes from public databases
    How to retrieve nucleotide and protein sequences of viral reference genomes
    How to retrieve non-RefSeq (DDBJ/EMBL/GenBank) nucleotide sequences of complete viral genomes
    Where to find Viral COG (VOG) - clusters/groups of related viral proteins

How to find a particular viral genome

Search Entrez Genome: enter a virus name in the search box at the top and click "Go". For help, read the Entrez Help Document. Alternatively, browse the alphabetical list of all viral genomes, which is color-coded by taxonomic grouping.

Complete alphabetical list

How to find all reference genomes available for a viral group, order, family, or floating genus

Enter a taxonomy node name in the following textbox. Then click "Find". Note that abbreviated lineages were used to create viral genome lists; therefore some names may not work in this search.

Alternatively, browse the list of taxonomy groups for which complete genomes are available.

Genomes grouped by taxonomy

Each name in this page is a hyperlink leading to a list of genomes belonging to a corresponding viral group (e.g., "dsDNA viruses"), order (e.g., "Picornavirales"), family (e.g., "Retroviridae"), or floating genus (e.g., "Anellovirus"). The numbers are links to the corresponding NCBI Taxonomy pages.

What is included on the lists of genomes

Each list is a hyperlinked table with the following columns: virus name; source information (strain, isolate, serotype, etc., when available); the number of genome components (segments); the length in nucleotides; the number of encoded proteins; the date of record creation; and the number of other complete sequences for the species (additional complete genomes or genome segments found in DDBJ/EMBL/GenBank). The hyperlinked arrow on every list except the complete alphabetical one leads to the corresponding list of higher taxonomic rank. The lists for major viral groups are color-coded by lower-rank taxonomic groupings, as in the complete alphabetical list.

Complete genomes for the family "Flaviviridae"

How viral genomes are shown

Each virus name shown in italics is hyperlinked to a graphical representation of the genome. The link to the actual genome record is present there in the form of the genome accession number (a number that begins with the letters "NC"). The rightmost column ("Nbrs") leads to global alignments of reference sequences with corresponding other complete genomes or genome segments found in public databases.

Poliovirus genome:

Genome presentation view | Record default view
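When collecting accession numbers from these pages, it can help to check that a string looks like a reference genome accession before using it. The sketch below is illustrative only: it assumes the usual RefSeq form of "NC_" followed by six or nine digits and an optional ".version" suffix, and the function name is hypothetical.

```python
import re

# Assumed pattern: "NC_" + six or nine digits + optional ".version" suffix.
NC_ACCESSION = re.compile(r"NC_(?:\d{6}|\d{9})(?:\.\d+)?")


def is_genome_refseq_accession(text: str) -> bool:
    """Return True if text looks like an NC-prefixed reference genome accession."""
    return NC_ACCESSION.fullmatch(text.strip()) is not None
```

The accession values used to try this out should be copied from the genome record itself; the check only validates the shape of the string, not whether the record exists.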

Additional genome views

To obtain an enhanced graphics view from a Genome record page, change the "Default view" option on the toolbar to "Graphics" and click the "Display" button on the left. This view shows additional important features, such as protein processing products, and allows a detailed view of any part of the genome. Explore the other options to take full advantage of this Web site.

Poliovirus genome: Enhanced graphics view

Multicomponent (segmented) genomes

Viral genomes that comprise more than one RNA or DNA component are called multicomponent or segmented genomes. In the lists of viral genomes, the column "Segm" indicates the number of genomic components (segments). The column "Length" shows the total length of all genome components. Similarly, the column "Protein" contains combined numbers of unprocessed proteins annotated for all components of each genome. Clicking on a multicomponent virus name leads to an additional graphic page showing schematic images of the genome components. Links to displays of individual components are located beneath the scheme.

Multicomponent genome of Influenza A virus
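The way the "Segm", "Length", and "Protein" columns relate to the individual components can be sketched as a simple aggregation. This is only an illustration of the arithmetic described above: the `Segment` class, the `summarize` function, and the segment values are hypothetical and are not taken from any real genome record.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    name: str
    length_nt: int      # length of this component in nucleotides
    protein_count: int  # unprocessed proteins annotated on this component


def summarize(segments):
    """Combine per-segment values into the three list columns."""
    return {
        "Segm": len(segments),
        "Length": sum(s.length_nt for s in segments),
        "Protein": sum(s.protein_count for s in segments),
    }


# Made-up values, for illustration only.
example = [
    Segment("segment 1", 2300, 1),
    Segment("segment 2", 1800, 2),
    Segment("segment 3", 900, 2),
]
summary = summarize(example)
```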

Global alignments with other complete genomes from public databases

To compare a viral reference sequence with other (nearly) complete genomes or corresponding genomic segments for the same species (formerly "Genome neighbors") from DDBJ/EMBL/GenBank, click the hyperlinked number in the "Nbrs" column on a family page. The default view is a graphical representation of pairwise alignments. For multicomponent viruses, the alignments for each genome segment are displayed sequentially.

Global alignment of a reference sequence with other complete sequences (sample)

How to retrieve nucleotide and protein sequences of viral reference genomes

Nucleotide or protein sequences of all viral reference genomes can be retrieved from the corresponding Entrez database via the Entrez Nucleotide or Entrez Protein hyperlinks located in the "All Viral Genomes" section of the blue bar on the left side of the main page and of other informational pages (including this one). To retrieve sequences for a particular virus group, use the "Sequence Info" menu on the corresponding group page. For example, the links on the "Flaviviridae" page bring up the nucleotide or protein sequences belonging exclusively to this virus family. When the desired list of sequences is displayed, select a format (FASTA, ASN.1, XML, etc.) from the option box and click "Display". Then save the result to your local storage device. To narrow the search further, add one or more specific terms to the Entrez query, e.g., "Hepatitis C virus[Organism]" or "(polymerase OR replicase)[Protein Name]" (without the quotes).

As viral reference sequences are also part of the NCBI RefSeq collection, they can be downloaded via the NCBI RefSeq Web page. The direct link "RefSeq FTP" is located on the left side blue bar. Alternatively, one can use NCBI eUtils.
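For scripted retrieval, the Entrez queries above can be passed to the NCBI E-utilities. The following is a minimal sketch that only builds the request URLs and does not contact NCBI. It assumes the public E-utilities base URL and the standard `esearch.fcgi` and `efetch.fcgi` parameters (`db`, `term`, `retmax`, `id`, `rettype`, `retmode`); the function names are hypothetical, and the query combines the two example terms from this page.

```python
from urllib.parse import urlencode

# Assumed base URL of the NCBI E-utilities service.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(db: str, term: str, retmax: int = 20) -> str:
    """Build an ESearch URL that returns record IDs matching an Entrez query."""
    params = {"db": db, "term": term, "retmax": retmax}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


def efetch_url(db: str, ids: list[str], rettype: str = "fasta") -> str:
    """Build an EFetch URL that downloads the given records as plain text."""
    params = {"db": db, "id": ",".join(ids), "rettype": rettype, "retmode": "text"}
    return f"{EUTILS_BASE}/efetch.fcgi?{urlencode(params)}"


# Query assembled from the example terms given in the text above.
query = "Hepatitis C virus[Organism] AND (polymerase OR replicase)[Protein Name]"
search_url = esearch_url("protein", query)
```

The IDs returned by the search request would then be passed to `efetch_url` to download the sequences in FASTA format. This query is not restricted to reference sequences; any additional filter should be checked against the Entrez help before use.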

How to retrieve non-RefSeq (DDBJ/EMBL/GenBank) nucleotide sequences of complete viral genomes

First, retrieve the RefSeq genomes of interest via an Entrez Genome search. Then either choose "Other genomes" in the "Display" option box or use the link "Other genomes for species" located under the "Links" menu to the right of each RefSeq accession found.

Where to find Viral COG (VOG) - clusters/groups of related viral proteins

Use the hyperlinks in the table on the VOG start page, which is accessible from any page via the "Protein clusters" link on the blue sidebar.



Revised: November 2, 2007