Each name on this page is a hyperlink leading to a list of genomes belonging to the corresponding viral
group (e.g., "dsDNA viruses"), order (e.g., "Picornavirales"), family (e.g., "Retroviridae"), or floating genus (e.g., "Anellovirus"). The numbers are links to the corresponding NCBI Taxonomy pages.
What is included on the lists of genomes
Each list is a hyperlinked table with the following columns: virus name, source information (such as strain,
isolate, or serotype, when available), the number of genome components (segments), the length in nucleotides,
the number of encoded proteins, the date of record creation, and the number of other complete sequences for the
species (additional complete genomes or genome segments found in DDBJ/EMBL/GenBank). The hyperlinked arrow on all lists, except the complete alphabetical one, leads to the corresponding list at the next higher taxonomic rank. The lists for major viral groups are color-coded by lower-rank taxonomic groupings, as is the complete alphabetical list.
Complete genomes for the family "Flaviviridae"
Multicomponent (segmented) genomes
Viral genomes that comprise more than one RNA or DNA
component are called multicomponent or segmented genomes. In the lists of viral genomes, the column "Segm" indicates the number of
genomic components (segments). The column "Length" shows the total
length of all genome components. Similarly, the column "Protein"
contains combined numbers of unprocessed proteins annotated for all
components of each genome. Clicking on a multicomponent virus name leads to
an additional graphic page showing schematic images of the genome components.
Links to displays of individual components are located beneath the scheme.
Multicomponent genome of Influenza A virus
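The way the "Segm", "Length", and "Protein" columns summarize a multicomponent genome can be sketched in a few lines of Python. This is only an illustration: the Segment record and the summarize helper are hypothetical names that do not come from the site, and the numbers are invented rather than taken from any real record.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Segment:
    """One genome component (hypothetical record layout)."""
    name: str
    length_nt: int  # length of this component in nucleotides
    proteins: int   # unprocessed proteins annotated on this component


def summarize(segments: List[Segment]) -> Dict[str, int]:
    """Collapse per-segment values into the three list columns."""
    return {
        "Segm": len(segments),                             # number of components
        "Length": sum(s.length_nt for s in segments),      # total length
        "Protein": sum(s.proteins for s in segments),      # combined protein count
    }


# Illustrative numbers only, not taken from a real genome record.
example = [
    Segment("segment 1", 2300, 1),
    Segment("segment 2", 2300, 2),
    Segment("segment 3", 1000, 2),
]
row = summarize(example)
print(row)
```

For a non-segmented genome the same helper would simply report one component, with its own length and protein count.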
Global alignments with other complete genomes from public databases
To compare a viral reference sequence with other complete or nearly complete genomes, or with the
corresponding genomic segments, for the same species (formerly
"Genome neighbors") from DDBJ/EMBL/GenBank, click
on the hyperlinked number in the Nbrs column on a family page. The default view is a graphical
representation of pairwise alignments. For multicomponent viruses, the alignments for each genome segment are displayed sequentially.
Global alignment of a reference sequence with other complete sequences (sample)
How to retrieve nucleotide and protein sequences of viral reference genomes
Nucleotide or protein sequences of all viral reference genomes can be retrieved from the corresponding Entrez database via the Entrez Nucleotide
or Entrez Protein hyperlinks located in the "All Viral Genomes" section of the blue bar on the left side
of the main page and of other informational pages (including this one). To retrieve sequences for a particular virus group, use the
"Sequence Info" menu on the corresponding group page. For example, the links on the "Flaviviridae" page will bring up the nucleotide or protein sequences belonging exclusively to this virus family.
When the desired list of sequences is displayed, select a format (FASTA, ASN.1, XML, etc.) from the option box and click "Display". Then save the
result to your local storage device. To narrow the search further, add one or more specific terms to the Entrez query, e.g., "Hepatitis C
virus[Organism]" or "(polymerase OR replicase)[Protein Name]" (without the quotes).
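As a rough sketch of how such terms fit together, the Python snippet below joins field-tagged clauses into one query string. It assumes the added terms are combined with the Entrez Boolean operator AND; the entrez_term helper is a hypothetical name introduced here, not part of any NCBI tool.

```python
from urllib.parse import quote_plus


def entrez_term(*clauses: str) -> str:
    """Join field-tagged clauses with AND (assumed combination rule)."""
    return " AND ".join(c.strip() for c in clauses if c.strip())


# The two example terms from the text, combined into one query.
term = entrez_term(
    "Hepatitis C virus[Organism]",
    "(polymerase OR replicase)[Protein Name]",
)
print(term)

# URL-encoded form, suitable for placing in a query string.
encoded = quote_plus(term)
print(encoded)
```

The plain string can be pasted into the Entrez search box; the encoded form is only needed when the query is placed in a URL.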
As viral reference sequences are also part of the NCBI RefSeq
collection, they can be downloaded via the NCBI
RefSeq Web page. The direct link "RefSeq FTP" is located on the
blue bar on the left side. Alternatively, one can use NCBI eUtils.
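A minimal sketch of the eUtils route is given below. It builds ESearch and EFetch request URLs with the Python standard library. The base URL and the db, term, retmax, id, rettype, and retmode parameters are standard E-utilities parameters. The helper names, the "srcdb_refseq[PROP]" filter, and the accession NC_004102 (assumed here to be a Hepatitis C virus reference genome) are illustrative choices, not something this page prescribes. The download helper is defined but not called, so running the snippet makes no network request.

```python
from typing import List
from urllib.parse import urlencode
from urllib.request import urlopen

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(db: str, term: str, retmax: int = 20) -> str:
    """URL that searches an Entrez database and returns matching IDs."""
    params = {"db": db, "term": term, "retmax": retmax}
    return f"{EUTILS}/esearch.fcgi?{urlencode(params)}"


def efetch_fasta_url(db: str, ids: List[str]) -> str:
    """URL that fetches the given records as plain-text FASTA."""
    params = {"db": db, "id": ",".join(ids), "rettype": "fasta", "retmode": "text"}
    return f"{EUTILS}/efetch.fcgi?{urlencode(params)}"


def download(url: str) -> str:
    """Fetch a URL and return its body as text (performs a network call)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8")


# Assumed example: RefSeq nucleotide records for the family Flaviviridae.
search = esearch_url("nuccore", "Flaviviridae[Organism] AND srcdb_refseq[PROP]")
# Assumed example accession; replace with IDs returned by the search.
fetch = efetch_fasta_url("nuccore", ["NC_004102"])
print(search)
print(fetch)
```

Calling download(fetch) would return the FASTA text, which can then be written to a local file. For bulk retrieval the RefSeq FTP site is likely the better choice, since E-utilities requests are subject to NCBI usage limits.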