NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.

Coffin JM, Hughes SH, Varmus HE, editors. Retroviruses. Cold Spring Harbor (NY): Cold Spring Harbor Laboratory Press; 1997.


Retroviral Taxonomy, Protein Structures, Sequences, and Genetic Maps

As noted in Appendix 1, the last 15 years have seen a huge accumulation of data regarding retroviruses and the sequence and structure of their genomes and proteins. As an aid to accessing these data and reading the chapters, we present here a digest of useful information: a list of commonly encountered retroviruses; the principal definitions and rules that govern the naming and classification of the viruses, their genes and proteins, and the noncoding regions of their genomes; and information to help the reader access and view retroviral protein and nucleic acid structures.

This Appendix also provides structural maps of representative retroviral genomes from each of the seven retroviral genera and two unclassified groups. These maps illustrate the genetic organization of viruses from each genus and provide a basis for understanding the various gene expression strategies employed by retroviruses. Each map is correlated with specific, newly created nucleic acid database entries, allowing ready access to, and manipulation of, the corresponding sequences.

Retroviral Taxonomy

By long-standing tradition, the discoverer of a virus has the right to name it. Unlike specific names for cellular organisms, there are no hard and fast rules for virus nomenclature. However, with time, a few conventions have developed.

  • The name of the virus generally includes the animal from which the virus was first isolated, along with a distinguishing characteristic, usually a disease associated with infection. However, the name is an identifier and does not necessarily describe the properties of a virus. Thus, human foamy virus (HFV) is now generally thought to be a chimpanzee virus that only rarely infects humans; nevertheless, the name remains unchanged. Similarly, some avian leukosis virus (ALV) strains cause diseases other than leukosis (leukemia), but they still retain the name, unless they have been significantly changed (e.g., by insertion of an oncogene).
  • A virus name does not define its taxonomy but should not confuse the taxonomy. Viruses that belong to one genus should not be given a name that is the same as, or implies relationship to, viruses in other genera. For example, the lentivirus that causes AIDS could not be named HTLV-III (human T-cell leukemia virus III), because this name implied a direct relationship to HTLV-1 and HTLV-2. In accordance with convention, AIDS viruses were named HIV (human immunodeficiency virus) (Coffin et al. 1986).
  • Viruses should not be named after persons. This is a relatively recent convention, and there are numerous older counterexamples (such as Rous sarcoma virus).

As with all viruses, taxonomy of retroviruses is codified by a subcommittee of the International Committee on the Taxonomy of Viruses (ICTV) (Murphy et al. 1995; see also their web site at http://www.ncbi.nlm.nih.gov/ICTV/). (For a review, see Coffin 1992.)

Although retroviruses (family Retroviridae) were at one time divided into three subfamilies (Oncovirinae, Spumavirinae, and Lentivirinae), these taxonomic classifications are no longer used. Instead, retroviruses are currently divided into seven genera, and two more groups (the fish retroviruses and the Gypsy group of endogenous viruses of insects) await classification. Classification is based largely on sequence similarity within the pol gene, but other correlated features, including the presence or absence of additional genes, are sometimes also used. It is important to keep in mind that major biological divisions, such as type of disease caused by a virus, tropism for certain host species or target cell types, virion morphology, or simple versus complex lifestyle, do not correlate with evolutionary relationships of retroviruses and are not a proper basis for their taxonomic classification.

A listing of retroviruses by genera, including commonly used abbreviations and important biological properties, is presented as Table 1.

Table 1. Retroviruses.

Structural and Genetic Features of Retroviral Genomes

The genomes of replication-competent retroviruses range in size from 7 to 12 kb. A list of the lengths of representative retroviral genomes is given in Table 2. The following is a brief description of the RNA and DNA forms of retroviral genomes. More detailed descriptions of their synthesis, structure, and function can be found in specific chapters in this volume.

Table 2. Sizes of Retroviral LTR Components (U3-R-U5) and Genomes.

Regions and Features of Replication-competent Retroviral RNA

Cap Site

The first coded nucleotide at the 5′ end of the viral RNA subunit, presumably the initiating nucleotide in an RNA transcript of proviral DNA; linked to a methylated “cap” ribonucleotide by a 5′-5′ linkage.

R (Repeat)

A short (15–250 nucleotides) sequence repeated at both ends of genomic RNA, whose boundaries are defined by the positions of RNA transcription initiation and polyadenylation; also present twice in viral DNA residing between U3 and U5 in each long terminal repeat (LTR). In a majority of the retroviruses, R contains the polyadenylation signal sequence (AAUAAA).

TAR (Tat-responsive)

In primate lentiviruses, a highly structured region of R that serves as the binding site for the Tat trans-activator.

U5 (Unique 5′)

A sequence (70–250 nucleotides) positioned between R and the primer binding site (PBS). U5 is present once in genomic RNA and twice in viral DNA as part of the LTR.

PBS (Primer-binding Site)

A region (usually 18 nucleotides long and beginning 5′ TGG) adjacent to U5 and complementary to the 3′ terminus of a specific host tRNA species. The PBS is the binding site for a tRNA that functions as the primer for reverse transcriptase to initiate synthesis of the minus (–) strand of viral DNA.
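The complementarity between the PBS and the tRNA primer can be checked with a few lines of Python. This is a minimal sketch, not part of the original Appendix: the function names are hypothetical, and the sequences in the usage note are illustrative inputs that should be verified against a database entry before being relied upon.

```python
def reverse_complement_dna(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def pbs_matches_trna(pbs_dna, trna_3prime_rna):
    """True if the PBS (DNA, 5' to 3') is perfectly complementary to the
    3'-terminal nucleotides of the tRNA (RNA, 5' to 3')."""
    # Write the tRNA in the DNA alphabet so the two strings can be compared.
    trna_as_dna = trna_3prime_rna.upper().replace("U", "T")
    # Only the last len(PBS) nucleotides of the tRNA pair with the PBS.
    trna_tail = trna_as_dna[-len(pbs_dna):]
    return reverse_complement_dna(pbs_dna) == trna_tail
```

For example, `pbs_matches_trna("TGGCGCCCGAACAGGGAC", "GUCCCUGUUCGGGCGCCA")` compares an 18-nucleotide PBS beginning 5′ TGG with a tRNA 3′ terminus ending in CCA. The leading TGG of a PBS is simply the complement of the CCA found at the 3′ end of tRNAs.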

gag

The first of four coding domains found in the genomes of all known replication-competent retroviruses; encodes a polyprotein (Gag) whose cleavage products are the major structural proteins (matrix [MA], capsid [CA], and nucleocapsid [NC]) of the virus core. Spumaviruses are an exception to this general rule: Their Gag proteins are not extensively cleaved.

pro

The second of four coding domains found in the genomes of all known replication-competent retroviruses encoding part of a polyprotein (Gag-Pro or Gag-Pro-Pol) whose cleavage products always include protease (PR) and sometimes dUTPase (DU) (in B-type and D-type viruses).

pol

The third of four coding domains found in all known replication-competent genomes encoding part of a polyprotein (Gag-Pro-Pol), whose cleavage products always include reverse transcriptase (RT) and integrase (IN) and, in some lentiviruses, dUTPase (DU). In spumaviruses, pol is expressed via a spliced mRNA as a Pro-Pol polyprotein.

env

The last of four coding domains found in the genomes of all known replication-competent retroviruses, encoding a polyprotein (Env) whose cleavage products SU (surface) and TM (transmembrane) are the structural proteins of the viral envelope.

SD (Splice Donor Site)

A site (or sites) at which an upstream (5′) portion of viral RNA is joined to a downstream (3′) portion of viral RNA (at the splice acceptor, SA) to form spliced, subgenomic mRNA. Note that the “5′ splice site” has become a commonly used term for cellular RNAs, but it is not recommended for retroviruses due to the complex arrangement of these sites in some viruses.

SA (Splice Acceptor Site)

A site (or sites) at which a downstream (3′) portion of the viral RNA is joined to an upstream (5′) portion (at the splice donor, SD) to form spliced subgenomic mRNA.

RRE (RexRE) (Rev or Rex Response Element)

A highly structured sequence in lentivirus (or HTLV-BLV) RNAs that serves as a binding site for the Rev (or Rex) protein, which aids in the transport of unspliced or partially spliced RNAs from nucleus to cytoplasm.

CTE (Constitutive Transport Element)

A sequence found in D-type and ASLV RNAs that promotes transport of unspliced RNAs from nucleus to cytoplasm in the absence of a virus-coded protein such as Rev or Rex. Similar elements probably exist in other "simple" retroviruses, but have not yet been found.

PPT (Polypurine Tract)

A purine-rich sequence of 7–18 nucleotides located immediately upstream (5′) of U3 that is cleaved during reverse transcription to produce the RNA primer for synthesis of the plus (+) strand of viral DNA. Some retroviruses (such as HIV) may also use one or more internally positioned polypurine tracts to prime plus-strand DNA synthesis at multiple sites.

U3 (Unique 3′)

A sequence of several hundred (∼190–1200) nucleotides positioned between PPT and R near the 3′ end of viral RNA, present once in viral genome RNA and twice in viral DNA as part of the LTR. U3 contains promoter-enhancer sequences that control viral RNA transcription from the 5′ LTR. In some viruses, the U3 region may contain coding sequences.

Poly(A) Tract

A homopolymer of 50–200 adenylic acid residues following the R sequence at the 3′ end of the viral RNA. The poly(A) tract is not encoded in the viral genome and is added posttranscriptionally. A signal for polyadenylation (AAUAAA) is generally present about 15–20 nucleotides (usually, but not always, within R) upstream (5′) of the site of polyadenylation. In viruses of the HTLV-BLV group, the AAUAAA sequence that directs polyadenylation is at a more distant site but can function because of extensive secondary structure (the RexRE sequence) in the intervening region (see Chapter 6).
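The spacing between the AAUAAA signal and the site of polyadenylation can be examined with a short Python sketch. The function names are hypothetical, and the way spacing is counted here (nucleotides lying between the last base of the signal and the nucleotide at the polyadenylation site, using a 0-based index) is an assumption made for illustration, not a convention defined in this Appendix.

```python
def find_signals(rna, signal="AAUAAA"):
    """Return the 0-based start positions of every copy of the signal.
    DNA input is accepted and converted to the RNA alphabet."""
    rna = rna.upper().replace("T", "U")
    hits = []
    start = rna.find(signal)
    while start != -1:
        hits.append(start)
        # Step by one so that overlapping copies are also reported.
        start = rna.find(signal, start + 1)
    return hits


def spacing_to_site(rna, site_index, signal="AAUAAA"):
    """For each signal lying wholly upstream of the polyadenylation site,
    return the number of nucleotides between the signal and the site."""
    spacings = []
    for hit in find_signals(rna, signal):
        signal_end = hit + len(signal)  # index just past the signal
        if signal_end <= site_index:
            spacings.append(site_index - signal_end)
    return spacings
```

A spacing that falls well outside the usual 15–20 nucleotides, as in the HTLV-BLV group, suggests that the intervening sequence should be examined for secondary structure rather than that the signal is nonfunctional.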

Regions and Features of Retroviral DNA

LTR (Long Terminal Repeat)

A region of several hundred (∼300–1800) base pairs composed of U3-R-U5 (5′ to 3′). LTRs are located at both ends of the unintegrated and integrated (proviral) linear DNA. The LTRs are also found in the closed circular forms of retroviral DNA (such circular forms can contain one or two LTRs). In proviral DNA, the 5′ LTR lacks two nucleotides from the 5′ end of U3 and the 3′ LTR lacks two base pairs from the 3′ end of U5. A list of the sizes of the U3-R-U5 regions is given in Table 2.
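The relationship among genomic RNA, linear viral DNA, and the provirus can be summarized in a small Python sketch. The function names are hypothetical, the `internal` argument stands for everything between U5 and U3 (PBS through PPT), and all sequences are written in the DNA alphabet for simplicity.

```python
def genomic_rna(u3, r, u5, internal):
    """Genomic RNA carries R at both ends but U5 and U3 only once:
    R-U5-(PBS ... PPT)-U3-R."""
    return r + u5 + internal + u3 + r


def linear_viral_dna(u3, r, u5, internal):
    """Reverse transcription duplicates U3 and U5, so the linear DNA
    carries a complete LTR (U3-R-U5) at each end."""
    ltr = u3 + r + u5
    return ltr + internal + ltr


def provirus(linear_dna):
    """Integration removes two base pairs from each end of the linear
    DNA: two from the 5' end of U3 in the 5' LTR and two from the 3' end
    of U5 in the 3' LTR."""
    return linear_dna[2:-2]
```

One consequence that is easy to verify with this sketch is that the linear DNA is longer than the genomic RNA (excluding the poly(A) tract) by exactly the combined length of U3 and U5.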

IR (Inverted Repeat; Also Called the att Site)

Short sequences (∼3–25 bp) that form a perfect (or slightly imperfect) inverted repeat at the ends of the LTR and provide recognition sites for integrase. Two base pairs are missing from the IRs in proviral DNA.
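An inverted repeat of this kind can be detected by comparing the first bases of the LTR with the reverse complement of its last bases. The Python sketch below is a minimal illustration under stated assumptions: the function names and the `max_mismatches` parameter are hypothetical, and the search is capped at 25 bp only because that is the upper end of the range quoted above.

```python
def reverse_complement_dna(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def inverted_repeat_length(ltr, max_len=25, max_mismatches=0):
    """Return the largest n (up to max_len) for which the first n bases of
    the LTR and the reverse complement of its last n bases differ at no
    more than max_mismatches positions. Returns 0 if no repeat is found."""
    ltr = ltr.upper()
    # The two arms of the repeat cannot overlap.
    longest = min(max_len, len(ltr) // 2)
    for n in range(longest, 0, -1):
        head = ltr[:n]
        tail = reverse_complement_dna(ltr[-n:])
        mismatches = sum(1 for a, b in zip(head, tail) if a != b)
        if mismatches <= max_mismatches:
            return n
    return 0
```

Setting `max_mismatches` above zero allows slightly imperfect repeats to be scored, at the cost of possibly extending the reported length past the true end of the repeat.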

Retroviral Proteins

The following is a brief description of proteins encoded by retroviral genes. For more detailed descriptions of their synthesis, structure, and function, see Chapter 2.

Gag Proteins

MA (Matrix)

Virion structural protein derived from the amino-terminal domain of the Gag polyprotein. MA is associated with the inside of the virus envelope. Most but not all MA proteins are myristylated at amino acid position 2 (always Gly). The MA of avian sarcoma/leukemia virus (ASLV) is acetylated at the initiating Met.

CA (Capsid)

Principal structural protein of the virion core derived from the central region of the Gag polyprotein.

NC (Nucleocapsid)

Small, basic protein derived from the carboxy-terminal domain of the Gag polyprotein. NC is tightly bound to the viral genomic RNA forming a ribonucleoprotein complex within the core.

Pro Proteins

PR (Protease)

Aspartyl-protease that cleaves the Gag, Gag-Pro, and Gag-Pro-Pol polyproteins to produce viral proteins in their mature forms; in some viruses (e.g., MLV), a carboxy-terminal fragment of Env is removed by PR as well. Retroviral proteases function as homodimers.

DU (dUTPase of Type-B and Type-D Retroviruses)

DU is not essential for viral replication in vitro. The pro-encoded DU is not closely related to the pol-encoded DU that is found in some lentiviruses (see below).

Pol Proteins

RT (Reverse Transcriptase)

A DNA polymerase that can copy RNA or DNA templates and has an integral RNase H activity. RT uses a specific cellular tRNA primer to initiate minus (–)-strand DNA synthesis, and uses the RNase-H-resistant PPT to initiate plus (+)-strand DNA synthesis. In some retroviruses, RT is a monomer, but avian type-C and HIV-1 RTs are heterodimers.

IN (Integrase)

Enzyme responsible for removing two bases from the end of the LTR and inserting the linear double-stranded DNA copy of the retroviral genome into host cell DNA (i.e., proviral formation).

DU (dUTPase of Certain Lentiviruses, Including FIV, EIAV, and CAEV)

A dUTPase is encoded by the pol open reading frames (ORFs). DU is not essential for viral replication in vitro.

Env Proteins

SU (Surface)

Env protein subunit that mediates viral adsorption via binding of specific cell surface receptors. SU is encoded in the amino-terminal region of the env ORF and is translated from spliced subgenomic env mRNA. The amino terminus of SU is defined by a cleavage site that is recognized by a cellular signal peptidase. This cleavage removes the signal peptide from the env precursor polyprotein. The carboxyl terminus of SU is defined by a cellular furin protease cleavage site that separates SU from TM. SU is closely associated with TM by noncovalent interactions and/or by disulfide bonds. All known retroviral SU proteins are modified by N-linked glycosylation.

TM (Transmembrane Envelope)

Integral envelope protein subunit that mediates virus entry by triggering virus–host cell membrane fusion. TM is encoded in the carboxy-terminal region of the env ORF and is released from the env precursor polyprotein by cleavage by a cellular protease (furin). Most, but not all, TM proteins are modified by N-linked glycosylation.

Accessory Proteins

Complex retroviruses use a variety of virally encoded accessory proteins to modulate various aspects of their replication and infectivity. A brief overview of these proteins is given in Chapters 6 and 7. These include:

Lentiviruses

Tat (Trans-activator)

A low-molecular-weight protein that activates transcription by binding to TAR.

Rev

A protein that binds to the RRE and facilitates the transport of unspliced and incompletely spliced mRNAs to the cytoplasm.

Nef (Negative Factor)

A myristylated intracellular protein that reduces the level of CD4 on the cell surface, and also stimulates some infected cells to divide.

Vpr (Viral Protein r)

A protein, found in virions, that causes infected cells to arrest in G2 and may also promote transport of the preintegration complex into the nucleus after reverse transcription.

Vpu (Viral Protein u)

An intracellular protein that causes degradation of newly synthesized CD4 and also provides a function that aids viral assembly and release.

Vif (Virion Infectivity Factor)

A largely intracellular protein that provides an unknown function in some types of infected cells. It aids the production of infectious virions.

Vpx (Viral Protein x)

A protein in some primate lentiviruses that incorporates one of the functions of Vpr (nuclear transport).

HTLV/BLV

Tax (Trans-activator/X Region)

A transcription factor that stimulates expression by binding to specific response elements (TREs) in the U3 region of the LTR.

Rex

A protein that binds to the RexRE region and facilitates the transport of unspliced and incompletely spliced RNAs to the cytoplasm.

Spumaviruses

Tas (Trans-activator of Spumavirus) (also Bel-1)

A trans-activator, analogous, but unrelated, to Tax, that stimulates transcription by binding to specific sites in the U3 portion of the LTR.

Bet (Bel-1 plus Bel-2)

A protein of unknown function, not essential for replication and expressed at high levels in infected cells.

B-type Viruses

Sag (Superantigen)

A cell surface protein that interacts with specific Vβ chains of the T-cell receptor to induce them to send an activation signal to the infected cell. When expressed from an endogenous MMTV provirus, Sag proteins lead to depletion of specific T-cell subsets.

Fish Viruses

ORF-a and ORF-b encode proteins related to cellular cyclins.

Nomenclature for Retroviral Proteins

gag-, pro-, pol-, env-encoded Proteins

All replication-competent retroviruses identified to date contain four coding domains termed gag, pro, pol, and env. These genes are designated based on functional homology and may not always display strong sequence similarity (relatedness) when different viral groups (genera) are compared. A standardized nomenclature for gag, pro, pol, and env gene products originally proposed by Leis et al. (1988) has been generally accepted and supplies the basic rules governing usage. Virion proteins of defined function or location are assigned the two-letter abbreviations discussed above. This convention is used preferentially to any others. Precursors and less well-defined genes are designated according to their apparent molecular weights in thousands (August et al. 1974).

Proteins are designated by a lowercase p, glycoproteins by a gp, and phosphoproteins by a pp placed before the number indicating the approximate molecular weight (e.g., p24, gp120, and pp12).

Precursor proteins are designated by Pr, rather than p, and glycosylated precursor proteins are designated by gPr rather than gp. The coding regions that give rise to the precursor proteins follow the molecular-weight designations (e.g., Pr180 Gag-Pol, gPr80 Env).

Polyproteins not shown to be further processed (such as onc-fusion proteins) are designated by a capital P (as in P140 Gag-Fps).

The name of the virus from which the proteins are derived can be prefixed to the protein designation (e.g., HIV-1 p24, HIV-1 gp120, MLV p15, MLV pp12, and MLV gp70), and the two-letter name appended for additional clarity (as in HIV-1 gp120SU).
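The naming rules above are regular enough to be expressed as a small Python function. This is only a sketch of the convention as described here: the function name, the `kind` keywords, and the argument names are hypothetical, and the molecular weight is passed in thousands, as in the designations themselves.

```python
# Prefix for each class of protein, taken from the rules described above.
PREFIXES = {
    "protein": "p",
    "glycoprotein": "gp",
    "phosphoprotein": "pp",
    "precursor": "Pr",
    "glycosylated_precursor": "gPr",
    "unprocessed_polyprotein": "P",
}


def protein_designation(kind, weight_in_thousands, virus=None,
                        two_letter=None, coding_region=None):
    """Assemble a designation such as 'HIV-1 gp120SU' or 'Pr180 Gag-Pol'."""
    name = f"{PREFIXES[kind]}{weight_in_thousands}"
    if two_letter:
        # Two-letter functional name appended for additional clarity.
        name += two_letter
    if coding_region:
        # Coding regions follow the molecular-weight designation.
        name += f" {coding_region}"
    if virus:
        # The virus of origin can be prefixed to the designation.
        name = f"{virus} {name}"
    return name
```

For instance, `protein_designation("glycoprotein", 120, virus="HIV-1", two_letter="SU")` yields `HIV-1 gp120SU`, and `protein_designation("precursor", 180, coding_region="Gag-Pol")` yields `Pr180 Gag-Pol`.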

This system allows a newly identified polypeptide to be added easily without disturbing the designations already in use, and some physicochemical information is conveyed by the name of the protein. It is not the intent of this scheme to give precise values to molecular weights or to assign exact functions, but to provide a simple and generally applicable nomenclature.

Other Retroviral Proteins

Other proteins encoded by retroviruses, such as accessory proteins and oncogene products, are designated by using the gene name, in Roman font and with the first letter of the gene name uppercase. Thus, the product of the nef gene is called Nef. The same convention is used when referring generically to products of any retroviral gene: Env protein, Gag-Pro-Pol precursor, etc.

Retroviral onc Sequences and Proteins

In addition to gag, pro, pol, and env, the genomes of a number of avian and mammalian type-C retroviruses contain modified cellular sequences called oncogenes. These sequences share important features: They encode a protein (or proteins) unnecessary for viral replication but required for the induction and maintenance of the transformed phenotype of the infected cell. Viral oncogenes are closely related to sequences that occur in the uninfected host cell but are not derived from endogenous proviruses. These viral sequences are generically termed v-onc sequences or genes to distinguish them from their cellular progenitors (c-onc). Because of the importance of distinguishing among the relatively large number of different onc genes, a system for naming individual viral onc genes and for distinguishing them from their cellular homologs (Coffin et al. 1981) is used throughout this volume. In general, the conventions follow those for genes and proteins: Gene names are three letters long, in lowercase italics, and protein names use the same letters, with the first letter capitalized, in Roman font. It is important to remember that oncogene names are trivial mnemonics derived from the name, or some other memorable feature, of the virus(es) in which they were found. Like virus names, they are not intended to codify function or disease specificity. Thus, they do not change despite changing information concerning these properties. The same nomenclature has been adopted for oncogenes not found in viruses. Retroviruses that carry onc genes can be found in the Appendix sections covering the avian and mammalian C-type viruses. Complete tables of virus-encoded and virus-activated cellular oncogenes can be found in Chapter 10.

Structure of Retroviral Proteins and Nucleic Acids

The fact that there is a direct relationship between the structure of the complex macromolecules involved in retroviral replication and their function has long been clear. However, the actual three-dimensional structures of some retroviral components have only recently become available. Conventionally, structural information is stored (in tabular form) as sets of coordinates. The coordinates define the positions, in three dimensions, of the individual elements that make up the structure. There are a variety of programs that can take such sets of coordinates and create an image of the three-dimensional structure of the macromolecule. These programs vary considerably in sophistication and in their requirements for specific operating systems and computer hardware. Until relatively recently, essentially all such programs ran on UNIX machines and most required considerable computing power to run efficiently. Although programs of this type are exceptionally useful, they require a significant commitment (in both time and money) from the user and, as such, do not provide general access to structural information. There are now programs that make it possible for anyone with a simple desktop computer (both Windows and Macintosh operating systems) to view macromolecular structures from any vantage point. Although these (relatively) simple programs lack the power and sophistication of the programs used by researchers specializing in structural problems, they do provide an access to the structure of macromolecules that was previously unavailable. This short section is intended to provide sufficient information to help a novice obtain access to these programs and to the coordinate files used by the programs to erect three-dimensional structures.

Creating Structural Images

To create the image of a structure, both a viewing program and a set of molecular coordinates are needed (where both programs and coordinates can be obtained is explained below). Not all structures have the same level of detail; for some proteins, only the α-carbon backbone coordinates are available; in others the positions of the amino acid side chains are also specified. There are analogous variations in the level of detail for nucleic acid structures.
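To make the idea of a coordinate file concrete, the following Python sketch extracts the α-carbon backbone from the ATOM records of a PDB-format file. It assumes the standard fixed-column layout of that format (atom name in columns 13–16, residue name in 18–20, chain in 22, residue number in 23–26, and x, y, z in columns 31–54); the function name is hypothetical.

```python
def alpha_carbon_trace(pdb_lines):
    """Collect (chain, residue number, residue name, x, y, z) for every
    alpha carbon found in the ATOM records of a PDB-format file."""
    trace = []
    for line in pdb_lines:
        # HETATM records (ligands, ions, water) are deliberately skipped,
        # which also avoids mistaking a calcium ion for an alpha carbon.
        if not line.startswith("ATOM  "):
            continue
        # Columns are fixed width; Python slices are 0-based.
        if line[12:16].strip() != "CA":
            continue
        trace.append((
            line[21],              # chain identifier
            int(line[22:26]),      # residue sequence number
            line[17:20].strip(),   # residue name, e.g. PRO
            float(line[30:38]),    # x coordinate (angstroms)
            float(line[38:46]),    # y coordinate
            float(line[46:54]),    # z coordinate
        ))
    return trace
```

An open file can be passed directly, for example `alpha_carbon_trace(open("structure.pdb"))`, where the file name is a placeholder for a downloaded coordinate file. A structure that contains only the α-carbon backbone will return one entry per residue and nothing else.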

Obtaining Programs

Although there are a number of sites on the World Wide Web (some of which are listed in Table 3; also see Appendix 1) where information on structure is readily available, a good place to start is the Brookhaven Protein Data Bank (PDB): http://www.pdb.bnl.gov/. This is a convenient place to obtain coordinates and can also be used to obtain programs that are useful in the creation and analysis of three-dimensional structures.

Table 3. Protein Structures.

A program that is useful for viewing structures on the kinds of desktop computers available in almost every laboratory (Macs and PCs) is RasMol. The program and documentation are free and can be obtained either through the PDB or through the RasMol home page: http://www.umass.edu/microbio/rasmol. Although RasMol is fairly simple to use, anyone who is unfamiliar with the program would do well to look at the home page, which has, in addition to the program and the manual, considerable nontechnical information and tutorials.

Getting Coordinates

We have listed a number of PDB accession numbers for structures that are relevant for those interested in retroviruses; however, new structures are appearing all the time, and the list that is provided will be out of date before this book is published. Although PDB entry information can be found in the published literature, it is usually more convenient to look in the PDB itself. Since the PDB has a powerful browser, finding the coordinates for useful structures is usually straightforward; however, there are other ways to search for PDB entries, for example, using the Entrez section of the National Center for Biotechnology Information (NCBI): http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=Structure. Once the name of an interesting PDB entry has been identified, the coordinate files can be quickly downloaded from the PDB.

With the coordinates and an appropriate program (like RasMol), it is a simple matter to view (from any vantage point), modify, and print images of three-dimensional structures. Because there are already a number of important retroviral structures available (see Table 3), it is possible to obtain insights into the structure of retroviral proteins and nucleic acids. As more structures (and more sophisticated structural programs) become available, both the usefulness and accessibility of structural information will increase. We strongly suggest that anyone who has a real interest in retroviruses use RasMol (or some other program that can project three-dimensional structures) to view retroviral protein and nucleic acid structures. Not only can such images be manipulated to provide a useful vantage point, but what can be learned from looking at a moveable image (especially a three-dimensional projection) is often much greater than what can be obtained from a static image.

Other Useful Resources

Table 4 lists some other useful structural resources (or other resources that could be useful in the context of structural analyses). Sequence information, and its relationship to structure, has been covered extensively in Appendix 1. In some cases, individual laboratories maintain web sites that contain useful structural information; in particular, there is a specific site for HIV-1 protease (http://www-fbsc.ncifcrf.gov/HIVdb) that contains structural information about both wild-type and drug-resistant forms of HIV-1 protease, PDB entry names, and some useful (and aesthetically pleasing) images.

Table 4. Other Protein Structure Resources.

Viral Genome Maps

Retroviral replication and the expression of retroviral genes are complex processes. Packaging constraints limit the size of retroviral genomes to the order of 8–10 kb, which means that the sequences that make replication and expression possible are confined to a relatively small region. Since the last version of this book was published, it has become increasingly important to be able to examine a segment of any given retroviral genome and recognize one or more of its important features. Although some features and elements can be easily identified by inspection, the analysis of the sequences of retroviral genomes often poses special problems: There can be issues that arise from the sheer volume of sequence information available in the databases, for example, HIV-1 sequences. Fortunately, there are dedicated databases that present this information in useful ways (Myers et al. 1995; see HIV-1 sequences compendium at http://hiv-web.lanl.gov). In most cases, the first difficulty that must be overcome is the identification of a sequence (or sequences) that is (are) most appropriate for the analysis or comparison intended. In defining useful sequences, several factors should be taken into consideration. A substantial percentage of cloned proviruses are replication-defective and a number of the sequences that have been deposited in the databases are derived from such defective clones (which may not be clearly identified as being defective). Even when a sequence is identified as defective, the exact nature of the defect may not be clear, which is the case for several full-length sequences. All but one (RSV) of the rapidly transforming retroviruses are replication-defective, and many of these defective viruses have sustained changes (deletions, insertions, truncations, and nucleotide substitutions) in both the viral genomic and oncogene sequences they contain, in addition to a large deletion associated with the acquisition of the cellular oncogene.

It is also important to recognize that the large majority of retroviral sequences found in the databases represent fragments rather than complete genomes and that there are a number of important and well-studied retroviruses (e.g., SNV and HaSV) for which complete sequence information is not present in the database. Despite the best intentions of those who derive and deposit sequences, as well as those who manage and maintain the databases, many retroviral entries contain errors, are incomplete, or contain outdated information. For example, sites of cleavage of viral proteins were often identified after the primary sequence was deposited; the entries have only rarely been updated. Thus, for some analyses, it has been both necessary and appropriate to assemble composite full-length genomic sequence files using two or more entries of partial, yet high-quality, sequences rather than use a complete sequence entry of lesser quality or with less-detailed information. Similarly, it should be appreciated that some full-length sequences of commonly used clones are composites because more than one viral segment was used to assemble the clone (e.g., the pNL43 clone of HIV-1). Finally, even when complete (and reasonably correct) sequences are present, the conventions used for numbering different entries vary considerably.

The Appendices to this book were put together to help resolve these problems. Appendix 1 contains information about the databases and describes approaches for the comparative and analytical analyses of nucleic acid and protein sequences. Here we provide, as examples, maps of representative retroviral genomes from each of the retroviral genera and several subgroups. These maps are based on existing entries, carefully edited and, in some cases, joined, and renumbered to correspond to a common convention. The sequence files that were used to derive each map have been deposited in the retroviral genome section of the NCBI GenBank database (http://www.ncbi.nlm.nih.gov/retroviruses/), providing new and separate database entries that correspond exactly to each of the maps in this Appendix. An accession number is provided with each map, and related accession numbers are also given when useful (such as those used to compile the full-length entries). In this way, each entry can be updated periodically after the book is printed and these maps become outdated.

Retroviral nucleic acids and proteins present special problems for sequence analysis (see Appendix 1). In some cases, even the most sophisticated of the analytical methods now available do not permit the identification of important sites/sequences or the extent to which elements with identifiable motifs are utilized. For this reason, it is often necessary to resort to direct experimental characterization of the viral proteins and/or nucleic acids. The maps provided here contain information about the positions of most of the common elements found in proviral DNA (LTRs, ORFs), genomic RNA (cap site, U5, PBS, PPT, U3, poly[A]), the major mRNAs (cap site, splice junctions, poly[A]), and the well-characterized polyprotein products (cleavage sites) and mature proteolytic products and their modifications (myristylation, glycosylation, phosphorylation). Special elements present in a particular viral subtype(s) are discussed (briefly) with the individual maps and details given in the appropriate sections of the book. However, some general considerations, even though they may pertain to only a subset of the known retroviruses, are worth discussing in this overview.

Transcriptional Regulatory Elements

Transcription of the genomic RNAs of all retroviruses begins at (and thus defines) the first nucleotide position of R. Most of the transcriptional promoter and enhancer elements that produce genomic RNA are located in the LTR (U3) region. In some cases (e.g., HTLV-1), this major promoter has sequences that bind a viral protein, as well as sequence elements that bind cellular transcription factors. In addition to the major promoter found in the LTR of all proviruses, there are now a few well-documented cases of retroviruses whose DNA genomes contain one or two additional noncanonical promoters.

RNA Elements

Retroviral RNAs contain a number of special RNA elements that are required for viral replication. Some of these elements are readily recognizable by sequence inspection (PBS, PPT, poly[A] signal), and others are easily deduced (R, U3, U5); however, some elements are not obvious by simple inspection (packaging sequences, CTEs, specialized elements such as TAR and the RRE found in lentiviruses).

RNA Splicing

All replication-competent retroviruses produce at least one spliced mRNA (for env); complex retroviruses produce several (for accessory proteins). In presenting mRNA expression, we have taken a relatively conservative view: Only major (abundantly expressed) mRNA species known to encode proteins are shown. For the complex retroviruses (especially HIV), more complex expression patterns can be found in the literature; however, it is unclear which of these complex spliced mRNAs (particularly those whose existence has been demonstrated only by PCR analysis) have biological relevance. Multiple sequences matching consensus splice donor or acceptor motifs may reside within a region that is expected to produce a splice junction, making it difficult to predict splicing patterns by computer analysis alone. Furthermore, several known splice junctions are generated using donor and acceptor sites that are poor matches to the splice consensus motifs. Thus, there may be other functional constraints imposed on the sequences in these regions.

Open Reading Frames (ORFs)

Three factors complicate the correlation of retroviral nucleic acid sequences with their encoded proteins: ribosomal frameshifting, readthrough suppression of termination codons, and splicing, which is, in some cases, quite complex. It should be noted that a number of retroviral ORFs extend 5′ of the frameshift or splice acceptor site. These additional upstream segments do not encode a retroviral protein, and, in generating the drawings, these upstream regions have been omitted even though they are technically part of the ORF. Despite this, many protein database entries were derived as computer-generated translations of full-length ORFs and therefore often contain amino-terminal amino acid sequences that are now known not to exist in the protein. To make matters worse, in most cases, the actual full-length amino acid sequences of the Gag-Pro and Gag-Pro-Pol proteins do not exist in the databases. Rather, translations of pro and pol ORFs have been entered without the appropriate upstream Gag and Gag-Pro amino acid sequences. The real issue, however, is not how the drawings were made or what entries do or do not exist, but that it is necessary to understand the patterns of splicing and translational suppression to correctly derive the corresponding protein sequence from the nucleic acid sequence. A secondary problem derives both from the complexity with which some retroviral RNAs are spliced and from the small size of some of the accessory proteins expressed by complex retroviruses. When is an ORF biologically relevant? Size is not always a reliable guide; although it is true that all large ORFs are “real,” so are some (but not all) small ORFs. In addition, without an understanding of the complete splicing patterns, or the amino acid sequence, it is not possible to predict which small ORFs are expressed together in the same protein.
We have taken a position on ORFs similar to the one we took with mRNAs; only small ORFs with associated functions are shown in the maps presented here. Although this choice may mean overlooking what will ultimately prove to be an interesting protein (e.g., HFV bel3), it is preferable to presenting as important an ORF that, when more is known, will be shown to be uninteresting.
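
The dependence of the protein sequence on the frameshift position can be illustrated with a small Python sketch. It translates a short, invented RNA with and without a –1 frameshift. The sequence, the function name `translate`, and the `shift_after` parameter are hypothetical and chosen only for illustration; the model simply re-reads one nucleotide and ignores slippery sequences, RNA secondary structure, and frameshift efficiency.

```python
from itertools import product

# Standard genetic code, built in TCAG order (codons written as DNA).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq, start, shift_after=None):
    """Translate seq from the 1-based position `start` until a stop codon.

    If `shift_after` is given, it is the 1-based position of the last
    nucleotide read in the original frame; translation then backs up one
    nucleotide (a -1 frameshift) and continues in the new frame.
    """
    seq = seq.upper().replace("U", "T")
    protein = []
    pos = start - 1  # 0-based index of the next unread nucleotide
    while pos + 3 <= len(seq):
        amino_acid = CODON_TABLE[seq[pos:pos + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
        pos += 3
        if shift_after is not None and pos == shift_after:
            pos -= 1            # re-read the last nucleotide of the codon
            shift_after = None  # shift only once
    return "".join(protein)


# Invented 30-nucleotide sequence, not taken from any retrovirus.
TOY_RNA = "AUGGCCAAAUUUUUAGGGAACUGACCUAAG"

unshifted = translate(TOY_RNA, start=1)
shifted = translate(TOY_RNA, start=1, shift_after=15)
print(unshifted, shifted)
```

A conceptual translation of the downstream ORF alone would miss the upstream residues entirely, which is the database problem described above.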

Protein Synthesis and Modification

It is difficult not only to discern from nucleic acid sequences how retroviral polyproteins are made, but also to recognize that complex patterns of proteolytic cleavage are required to produce the mature proteins found in mature virions. Although there are common features in the structure and processing of the Gag, Gag-Pro, and Gag-Pro-Pol polyproteins, there are sufficient idiosyncrasies that it is impossible to predict most maturation cleavage sites with certainty. Not only are there differences in the number and placement of the cleavage sites, but the recognition sites are also relatively diverse, and many retroviral proteins contain sequences that appear to be suitable sites for cleavage by the corresponding retroviral protease yet remain uncleaved. One obvious explanation is that cleavage specificity is determined both by the sequence of the substrate and by the structure of the folded (poly)protein. For this reason, sites that would otherwise be cleaved are protected by being embedded in a highly structured protein. Furthermore, there are several instances where multiple cleavages are known to occur in regions between retroviral proteins. The significance of these cleavages is not known, nor is the function of the products that are generated by these proteolytic cleavages (e.g., ASLV p2a, p2b, p10). Similarly, there are also cases where there appears to be a significant amount of heterogeneity in the cleavage sites of some proteins (e.g., carboxy-terminal cleavage of ASLV CA).

Although Gag, Gag-Pro, and Gag-Pro-Pol are cleaved by viral proteases, the Env protein precursors are cleaved by cellular proteases. Signal peptidases remove the signal sequence (∼15–25 amino acids) from the amino terminus. Here too, the exact cleavage sites are difficult to predict based on sequence. The SU and TM junction is cleaved by furin, and in this case, cleavage site recognition is relatively straightforward (K/R-K/R-X-K/R-X; where X is any amino acid). Various retroviral proteins also undergo other posttranslational modifications. Gag proteins are either acetylated or myristylated at position 1 or 2, respectively, and some Gag proteins and accessory proteins are phosphorylated. SU proteins undergo N-linked glycosylation as do some, but not all, TM proteins. Sequence analysis can be used to quickly identify potential sites of glycosylation (N-X-T-X or N-X-S-X; where X is not P) but obviously cannot predict which of these sites are efficiently utilized, although such sites probably are used in most retroviral Env proteins.
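
The glycosylation motif given above (N-X-T-X or N-X-S-X, where X is not P) lends itself to a simple pattern search. The following Python sketch is one possible way to list candidate sites; the function name `find_sequons` and the test peptide are hypothetical, and, as noted in the text, a match says nothing about whether a site is actually used.

```python
import re
from typing import List

# N, then any residue except P, then S or T, then any residue except P.
# The lookahead lets overlapping motifs be reported independently.
SEQUON = re.compile(r"N(?=[^P][ST][^P])")


def find_sequons(protein: str) -> List[int]:
    """Return 1-based positions of Asn residues in N-X-S/T-X motifs.

    A motif at the extreme carboxyl terminus (no fourth residue) is
    not reported, because the pattern requires all four positions.
    """
    return [match.start() + 1 for match in SEQUON.finditer(protein.upper())]


# Invented peptide: N2 and N10 match; N6 is followed by P and
# N14 has P in the fourth position, so neither is reported.
print(find_sequons("MNGTANPSWNASLNVTP"))
```

Positions are reported relative to the start of whatever sequence is supplied, so numbering from the amino terminus of the precursor requires passing in the full precursor sequence.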

Map Conventions

Maps detailing the genetic organization and features of a representative member (or members) of each of the major retroviral groups are presented in the following sections. Less-detailed maps of several well-studied oncogene-containing retroviruses are also provided. The maps are numbered as RNA genomes, beginning at the site of initiation of transcription of genomic RNA and ending at the site of genomic RNA polyadenylation. Abbreviations and conventions used throughout this Appendix follow. ORF boundaries are defined using the following criteria:

  • The 5′ boundary of an ORF may be defined by the position of any of four elements: (1) the site of translation initiation (AUG, CUG), (2) the splice acceptor site if translation of the protein initiates upstream of a splice junction, (3) the translation frameshift site for appropriate pro, pol, and pro-pol ORFs, and (4) the termination codon suppression site for appropriate pro-pol ORFs.
  • The 3′ boundary of an ORF may be defined by the position of any of four elements: (1) the site of translation termination (UAA, UAG, UGA), (2) a splice donor if translation of the protein continues across a splice junction, (3) the translation frameshift site for appropriate gag, pro, and gag-pro ORFs, and (4) the termination codon suppression site for appropriate gag ORFs.

Features

5′CAP

Present in all retroviral RNAs as a modification of the first coded base of the RNA genome, which is always labeled position 1 of the sequence.

U5, U3, and R

Those portions of the LTR unique to the 5′ (U5) or 3′ (U3) end of the RNA genome, or repeated (R) at both ends of the RNA genome.

PBS

Primer-binding site, which binds the host tRNA that primes minus (–)-strand DNA synthesis.

SD (Splice Donor)

The nucleotide position shown is the last nucleotide of the upstream exon.

gag

Gene encoding the MA, CA, and NC proteins. Certain mammalian type-C genomes also encode a glycosylated form of the gag gene product (gp85), which is initiated further upstream than Pr70.

pro

Gene encoding the viral protease. For most retroviruses, the pro ORF is −1 with respect to the gag ORF and a Gag-Pro polyprotein is expressed by a −1 frameshift during translation. In mammalian type-C retroviruses and WDSV, the pro ORF is in the same reading frame as the gag ORF, separated from it by a termination codon that is suppressed during translation. In avian type-C retroviruses, the pro ORF is in-frame and contiguous with the gag ORF.

pol

Gene encoding the viral reverse transcriptase and integrase proteins (see Chapter 7). In most cases (except spumaviruses), the gene product is synthesized as a 180-kD Gag-Pro-Pol precursor polyprotein. In mammalian type-C, HTLV/BLV, lentiviruses, WDSV, and Gypsy, the pol ORF is in-frame and contiguous with the pro ORF producing equimolar amounts of Pro and Pol proteins. In ASLV, type-B, and type-D retroviruses, the pol ORF is −1 with respect to the pro ORF and is expressed by a −1 frameshift during translation. Spumaviruses express the pol ORF using a spliced subgenomic mRNA (Chapter 7).

SA (Splice Acceptor)

The nucleotide position shown is the first nucleotide of the downstream exon.

env

Gene encoding the envelope protein. The Env polyprotein is synthesized from a spliced, subgenomic mRNA. In most retroviruses, translation initiates at the first AUG triplet following the splice acceptor. In the case of ASLV, translation is initiated at the gag AUG, which is upstream of the env splice junction. The env AUG in the Gypsy retrovirus spans the env mRNA splice junction.

Myristylation (map symbol)

Most, but not all, Gag polyproteins are myristylated at the 2 Gly position.

Acetylation (map symbol)

ASLV (and possibly spumavirus) Gag polyproteins are acetylated at the 1 Met position.

Phosphorylation (map symbol)

Some Gag and accessory proteins, and many onc-encoded proteins, are phosphorylated at Ser, Thr, or Tyr residues.

MA (Matrix Protein)

CA (Capsid Protein)

NC (Nucleocapsid Protein)

PR (Aspartyl-protease)

DU (dUTPase produced by some type-B, type-D, and lentiviruses)

RT (Reverse Transcriptase)

IN (Integrase)

SP (Envelope Signal Peptide)

SU (Envelope Surface Protein)

TM (Envelope Transmembrane Protein)

N-linked glycosylation acceptor site (map symbol)

Acceptor site for N-linked glycosylation (N-X-S-X and N-X-T-X, where X is any amino acid other than proline).

PPT (Polypurine Tract)

This sequence serves as the primer for plus (+)-strand DNA synthesis.

AAA3′

The polyadenylic acid tail of viral genomic and subgenomic mRNAs.

PR cleavage site (map symbol)

Site of precursor polyprotein cleavage by PR. The accompanying amino acid number defines the amino terminus of the mature protein products.

Host protease cleavage site (map symbol)

Site of precursor polyprotein cleavage by host-cell-encoded proteases. The accompanying amino acid number defines the amino terminus of the mature protein products.

Frameshift site (map symbol)

Site of translation frameshifting (–1) at many gag-pro and some pro-pol ORF junctions.

Termination codon suppression site (map symbol)

Site of termination codon suppression at the gag-pro ORF junctions of mammalian type-C retroviruses and WDSV.

Fusion junction (map symbol)

Junction of a fusion of viral protein sequences with an oncogene protein sequence.

Exon-spanning translation (map symbol)

Protein translation spans adjacent exons (ORFs).

Splicing pattern (map symbol)

Splicing patterns that produce subgenomic mRNA.

Deletion (map symbol)

Deletion of X nucleotides in length. In many cases, the deletions appear to involve recombination between short direct repeats. The precise positions of these deletions are thus inherently ambiguous, residing somewhere within the repeats. A number given in parentheses corresponds to the first base following the deletion in the undeleted “parental” virus.

Insertion (map symbol)

Insertion of Y nucleotides in length. A number given in parentheses corresponds to the first base following the insertion in the undeleted “parental” virus.

Numbering

  • The numbering is according to the genome, not the provirus, and base 1 is therefore the first base of R to which the 5′ cap is added during RNA synthesis. To simplify comparisons, all retroviral genomes are numbered based on the genomic RNA structure: from the first nucleotide in R in the upstream (5′) LTR to the last nucleotide in R in the downstream (3′) LTR.
  • All genetic regions (except splice donors and acceptors) are numbered at the first base, never the last; thus, the end of a coding region is numbered as the first base of the termination codon. This allows determination of the exact size of any region by subtraction. Splice donor positions are defined as the last (3′) nucleotide of the upstream exon, and splice acceptor positions are defined as the first (5′) nucleotide of the downstream exon.
  • Initiation codons are indicated AUG, and termination codons are indicated UAA, UAG, or UGA.
  • Defective viral genomes are compared to the most closely related nondefective virus known. Corresponding nucleotide positions of the “parental” virus are shown in parentheses.
  • Numbers to the right of each polyprotein indicate the number of amino acids in the precursor polyprotein.
  • The numbering of the LTRs in the maps is for unintegrated viral DNAs, not proviruses. Proviruses are two bases shorter on each end than the viral DNAs presented in the maps.
  • Reading frames are defined as beginning with the base at the cap site; all reading frames in the genome whose first base has a number evenly divisible by 3 are therefore designated frame 3. If the remainder is 1, they belong to frame 1, and if it is 2, they are defined as frame 2.
  • Protein locations and features (such as glycosylation or myristylation sites) are numbered from the amino terminus of the precursor protein.
  • Cleavage sites are designated by the amino-terminal amino acid position of the downstream (carboxy-terminal) cleavage product.
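
The numbering rules above reduce to simple arithmetic. The Python sketch below shows two such calculations: the size of a coding region obtained by subtraction, and the frame of one ORF relative to another. The function names are hypothetical, and the coordinates are the ORF boundaries listed for MMTV and Mo-MLV in the feature lists of this Appendix; the sketch assumes that each listed start position is the first base of a codon in that ORF.

```python
def coding_length(start: int, stop_codon_start: int) -> int:
    """Nucleotides in a coding region, excluding the termination codon.

    Both arguments are 1-based. Because the end of a coding region is
    numbered at the first base of the termination codon, the size is
    obtained by plain subtraction.
    """
    if stop_codon_start <= start:
        raise ValueError("termination codon must lie downstream of start")
    return stop_codon_start - start


def relative_frame(upstream_start: int, downstream_start: int) -> int:
    """Frame of the downstream ORF relative to the upstream ORF.

    Returns 0 (in-frame), -1, or +1. This relative value does not
    depend on how the three frames themselves are labeled.
    """
    offset = (downstream_start - upstream_start) % 3
    return {0: 0, 1: 1, 2: -1}[offset]


# Mo-MLV gag: initiation codon at 621, termination codon at 2235.
mlv_gag_nt = coding_length(621, 2235)
print(mlv_gag_nt, mlv_gag_nt // 3)

# MMTV: gag starts at 313, pro at 2082, pol at 2891.
print(relative_frame(313, 2082), relative_frame(2082, 2891))

# Mo-MLV: gag starts at 621, pro at 2223, pol at 2598.
print(relative_frame(621, 2223), relative_frame(2223, 2598))
```

For these coordinates the MMTV pro and pol ORFs each come out as −1 relative to the preceding ORF, and the Mo-MLV ORFs come out in-frame, in agreement with the descriptions given with the individual maps.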

Map 1: Type-B Retroviruses (See Map)

Taxonomy

Genus: Mammalian type-B retroviruses
Species: Mouse mammary tumor virus (MMTV)

Morphology

Virions exhibit prominent surface spikes and an eccentric condensed core. Capsid assembly occurs within the cytoplasm prior to transport to, and budding from, the plasma membrane.

Phylogeny

MMTV includes exogenous and endogenous viruses. Exogenous MMTV is transmitted vertically via milk. MMTV is associated with mammary carcinoma and T-cell lymphomas. Insertional activation of the mouse int, inthst, and wnt genes by integration of exogenous MMTV has been reported (Chapter 10). No oncogene-containing members are known.

Representative Example

Mouse Mammary Tumor Virus (MMTV)
NCBI GenBank Genome Accession Number: AF033807

Note: The MMTV reference sequence was generated by modifying the MMTPROCG (M15122) entry such that the sequence begins at the transcription initiation site and ends at the polyadenylation site (i.e., R to R).

Database and Reference Information

Nucleic acid database locus name and accession numbers:
MMTPROCG M15122 (complete provirus sequence)
MMTPROVR D16249 (incomplete provirus sequence)

Protein database locus names and accession numbers:

gag GAG_MMTVB P10258, M15122
MMTPROVR_1 D16249
pro VPRT_MMTVB P10271, M15122
MMTPROVR_2 D16249
pol POL_MMTVB P03365, M15122
MMTPROVR_3 D16249
env ENV_MMTVB P10259, M15122
MMTPROVR_4 D16249
PR73_MMTVB P10260, M15122
sag 1854855
53265

Sequence-related literature references:
Moore et al. (1987)

Features of the Mouse Mammary Tumor Virus

R:(1–15) The R region of MMTV is the smallest (15 nucleotides) reported for known retroviruses.
U5:(16–135)
PBS:(136–153) The primer for minus-strand DNA synthesis is tRNALys3.
gag:(313–2086) Proteolytic processing of Pr77 Gag produces MA, CA, and NC. Several cleavage products (pp21, p3, p8) of unknown function are produced from the region between MA and CA. Pr77 Gag is myristylated at amino acid 2Gly. The NC domain provides the amino-terminal region of DU (see below).
pro:(2082–2892) The pro ORF is expressed by using a –1 frameshift (2082), producing a Gag-Pro polyprotein (Pr110 Gag-Pro). The NC region of gag and the amino-terminal region of the pro ORF encode a p30 protein that exhibits dUTPase activity (Chapter 7). The viral protease is encoded in the carboxy-terminal region of the pro ORF.
pol:(2891–5576) The pol ORF is expressed by using two –1 frameshifts (2082, 2891) producing a Gag-Pro-Pol polyprotein (Pr160 Gag-Pro-Pol); the locations of the amino-terminal cleavage sites producing the mature RT and IN proteins have not been reported. Integration produces a 6-bp duplication of cellular DNA flanking the provirus.
env:(5364–7428) Both the env subunits are glycosylated. The SU subunit contains three N-linked glycosylation sites at amino acids 127, 143, and 297, and the TM amino acid sequence contains two N-linked glycosylation motifs at amino acid positions 498 and 557. The env region is believed to contain an internal transcriptional promoter and start site (6051) that produces a spliced (6143/7305) subgenomic sag mRNA (Chapter 6).
PPT:(7358–7376)
sag:(7378–8338) The sag ORF is located within the U3 region and encodes superantigen proteins (Chapters 8 and 10). The sag-encoded protein undergoes proteolytic processing by a cellular protease; Sag contains two furin cleavage sites. Sag has five sites for N-linked glycosylation, at amino acids 79, 89, 93, 131, and 146.
U3:(7377–8573) The U3 region of MMTV is relatively long (1197 nucleotides) compared to other retroviruses (except spumaviruses). The sag ORF (7378–8338) is located entirely within U3. The carboxy-terminal region of the env ORF also lies within U3; in the provirus structure, the U3 of the 5′ LTR region may contain a transcription initiation site for a spliced subgenomic sag mRNA (Chapter 6).
R:(8574–8588)
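
As a quick consistency check on the feature list above, the sizes of the noncoding regions can be recomputed from their coordinates. The Python sketch below assumes that the ranges given for R, U5, PBS, PPT, and U3 are inclusive, which is suggested by R (1–15) being described as 15 nucleotides long; the variable and function names are hypothetical.

```python
# Coordinates copied from the MMTV feature list (treated as inclusive).
MMTV_REGIONS = [
    ("R (5')", 1, 15),
    ("U5", 16, 135),
    ("PBS", 136, 153),
    ("PPT", 7358, 7376),
    ("U3", 7377, 8573),
    ("R (3')", 8574, 8588),
]
SAG_ORF = (7378, 8338)


def span_length(start: int, end: int) -> int:
    """Number of nucleotides in an inclusive range."""
    return end - start + 1


def is_within(inner, outer) -> bool:
    """True if the inner (start, end) range lies entirely inside outer."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


lengths = {name: span_length(start, end) for name, start, end in MMTV_REGIONS}
for name, size in lengths.items():
    print(f"{name}: {size} nt")

u3_range = next((s, e) for name, s, e in MMTV_REGIONS if name == "U3")
print("sag ORF inside U3:", is_within(SAG_ORF, u3_range))
```

Under this assumption the U3 length agrees with the 1197 nucleotides stated in the text, both copies of R have the same length, and the sag ORF falls entirely within U3.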

Map 2: The Avian Type-C Retroviruses (See Map)

Taxonomy

Genus: Avian type-C retrovirus
Species: Rous sarcoma virus (RSV)
Avian leukosis virus (ALV)
Avian carcinoma virus, Mill-Hill virus 2 (MHV-2)
Avian myeloblastosis virus (AMV)
Avian erythroblastosis virus (AEV)
Avian myelocytomatosis virus 29 (MCV-29)
Fujinami sarcoma virus (FuSV)
Avian sarcoma virus UR2 (UR2SV)
Avian sarcoma virus Y73 (Y73SV)

Morphology

Viral particles exhibit a type-C morphology. Viral assembly and budding occur coordinately at the plasma membrane.

Phylogeny

These viruses have a widespread distribution and include both exogenous and endogenous viruses of chickens and some other birds. Isolates are classified into subgroups (A–J) based on host-cell receptor usage. Distantly related endogenous sequences are found in birds and mammals. Viral infections are associated with malignancies and other diseases such as wasting and osteopetrosis. Many oncogene-containing members of the genus have been isolated (see Table 5). Insertional activation of cellular oncogenes by integration of exogenous ALV has been reported (Chapter 10).

Table 5. Oncogene-Containing Avian Type-C Retroviruses.

Representative Example

Rous Sarcoma Virus (RSV), Prague strain, subgroup C

NCBI GenBank Genome Accession Number: AF033808

Note: The RSV reference sequence was generated by modifying the ALRCG (J02342) entry such that the sequence begins at the transcriptional initiation site and ends at the polyadenylation site (i.e., R to R).

Database and Reference Information

Nucleic acid database locus names and accession numbers:

ALV                         ALVCG     M37980
  HPRS-103JALV103J                    Z46390
RSV                         ALRCG     J02342
  Prague C                  RERSV6    V01197
  Prague C (duck-adapted)   RSVSEQ    X68524
  Schmidt-Ruppin D                    D10652
FuSV                        ACF       J02194
ASV UR2                     ACSUR2CG  M10455
ASV Y73                     ACSY73CG  J02027
ASV CT10                    REASVXX   Y00302
MC29                        REMC29Z   V01174
                            ACMMYCP   J02247
MH2 (E21)                   AC2E21CG  M14008
IC10                        REIC10    X13744
H19 (hamster)               RERSVH19  X15345
AEV (composite sequence)    AEVPDNA   X12707
LPDV (turkey)               MGU09568  U09568

Protein database locus names and accession numbers:

RSVALV
gag ALRCG_4 S35430
GAG_RSVP
pol ALRCG_3 S35435
POL_RSVP
env ALRCG_1 S35427
ENV_RSVP
src ALRCG_5 NA
KSRC_RSVP

Sequence-related literature references:
Schwartz et al. (1983)
Bieth and Darlix (1992)

Features of Rous Sarcoma Virus

R:(1–21)
U5:(22–101)
PBS:(102–119) The primer for minus-strand DNA synthesis is tRNATrp.
gag:(380–2110) Proteolytic processing of Pr76 Gag-Pro produces MA, CA, NC, and PR (see below). Three cleavage products (p2a, p2b, p10) of unknown function are produced from the region between MA and CA. p10 is phosphorylated (pp10). A cleavage product (p1) of unknown function is produced from the region between CA and NC. The carboxyl terminus of CA is variable (476, 55%; 478, 9%; 479, 36%). Pr76 Gag-Pro is acetylated at the 1Met position.
pro:(2111–2483) The pro ORF is in-frame with the gag ORF, and protease is expressed as the carboxy-terminal cleavage product (p15) of Pr76 Gag-Pro.
pol:(2482–5190) The pol ORF is expressed by using a –1 frameshift (2482), producing a Gag-Pro-Pol polyprotein (Pr180 Gag-Pro-Pol). Proteolytic processing produces the mature RT and IN proteins. RT is a heterodimer consisting of p68α and p95β subunits. The p95β subunit is not cleaved at the RT/IN cleavage site and thus contains both the RT (p68α) and IN (p32) domains. Integration produces a six-nucleotide duplication of host-cell DNA flanking the provirus.
env:(380–397, 5078–6863) Translation of env ORF initiates at nucleotide position 380 which also serves as the gag translation initiation site. The first six amino acids of gPr95 Env are encoded by the gag ORF from position 380 to the splice donor site at position 397. The SU protein has 13 glycosylation sites at amino acids 79, 142, 159, 179, 182, 226, 260, 292, 298, 308, 316, 383, and 393. The TM protein (gp37) has three N-linked glycosylation sites at amino acids 455, 503, and 584.
src:(7129–8707) The v-src-encoded protein (pp60 v-Src) is translated from a spliced subgenomic mRNA. v-Src is a tyrosine kinase and is autophosphorylated (pp60 v-Src). pp60 v-Src is membrane-associated via myristylation at the 2Gly position. SH2 and SH3 are regions of homology found in tyrosine kinases that are involved in protein-protein interactions.
PPT:(9047–9057)
U3:(9058–9291)
R:(9292–9312)

Map 3: Avian Myelocytomatosis Virus-29 (MC29) (See Map)

NCBI GenBank Genome Accession Number: AF033809

Note: A complete nucleotide sequence of MC29 has not been deposited in the GenBank or EMBL databases. A partial sequence (ACMMYCP, J02247) was used to construct the map provided here.

MC29 is a replication-defective virus that is a member of the ASLV genus. The entire pol ORF, part of the env ORF, and about half of the gag ORF are absent. In their place is a 1571-bp sequence (v-myc) that is homologous to exons 2 and 3 of the chicken cellular myc (c-myc) gene. A large ORF spans the gag and v-myc sequences and encodes a Gag-Myc fusion protein (P110 Gag-Myc). No other viral proteins are produced. The Gag domain of P110 Gag-Myc is not cleaved. Sites of cleavage of Gag are shown on the map for reference only. The c-Myc protein is a transcriptional activator. The v-Myc protein sequence within P110 Gag-Myc contains several identifiable functional motifs:

PEST: A domain rich in proline (P), glutamic acid (E), serine (S), and threonine (T) residues.
BASIC: A domain rich in positively charged amino acids that functions in DNA binding.
Helix-Loop-Helix: An acidic domain (i.e., rich in negatively charged amino acids) that consists of an unstructured loop flanked by two α-helical regions. This domain functions in transcriptional activation.
Leucine Zipper: This region contains regularly interspersed leucine (L) residues that are positioned on a common face of an α-helix. This region stabilizes homodimer (e.g., myc-myc) and heterodimer (e.g., myc-max) interactions that serve to regulate Myc function.

Table 1 lists several myc-containing viruses. Each virus has acquired its myc sequence via an independent recombination event. Thus, the location of the myc sequence, the size of deletion of the parental viral sequence, and the mode of myc gene expression are all unique. Similarly, the pathogenesis associated with each of these viruses is also distinctive.

Map 4: Fujinami Sarcoma Virus (FuSV) (See Map)

NCBI GenBank Genome Accession Number: AF033810

Note: The FuSV sequence was generated by modifying the ACF (J02194) entry such that the sequence begins at the transcriptional initiation site and ends at the polyadenylation site (i.e., R to R).

FuSV is a replication-defective member of the ASLV genus. The entire pol and env ORFs and much of the gag ORF are absent. Recombination has resulted in an insertion of a 2735-bp v-fps sequence derived from the chicken c-fps gene. A large ORF spans the remaining gag and v-fps sequence and encodes a Gag-Fps fusion protein (P140 Gag-Fps). No other viral proteins are produced, and the Gag domain of P140 Gag-Fps is not cleaved. The sites of cleavage of the Gag precursor are shown on the map for reference only. The c-Fps protein is a tyrosine kinase that functions in intracellular signal transduction. The v-Fps protein sequence within P140 Gag-Fps contains several identifiable functional motifs:

SH2 (src homology 2): The second of three Src protein domains, common in tyrosine kinases, that promote protein-protein interactions.
ATP Binding: The site of ATP binding. ATP serves as the phosphate donor for Fps tyrosine kinase activity.
Active Site: The tyrosine-kinase-active site contains a key aspartic acid residue.
Tyrosine Phosphorylation: Autophosphorylation at amino acid 1073 activates Fps tyrosine kinase activity.

Table 1 shows several fps-containing viruses. It is thought that each virus has acquired its fps sequence via independent recombination events. Thus, the location of the fps sequence, the size of the deletion of the parental viral sequence, and the mode of fps gene expression are all thought to be distinct. Most v-fps-containing viruses are associated with sarcoma development.

Map 5: Type-C Mammalian Retroviruses (See Map)*

Taxonomy

Genus: Murine leukemia virus (MLV)-related retrovirus
Subgenus: Mammalian type-C retroviruses
Species: Murine sarcoma and leukemia viruses (MSV/MLV)
Feline sarcoma and leukemia viruses (FeSV/FeLV)
Gibbon ape leukemia virus (GALV)
Woolly monkey sarcoma virus (WMSV)
Porcine type-C virus
Guinea pig type-C virus

Subgenus: Reticuloendotheliosis viruses (REV)
Species: Avian reticuloendotheliosis virus
Spleen necrosis virus (SNV)

Subgenus: Reptilian type-C viruses
Species: Viper retrovirus
Cornsnake retrovirus

Morphology

Virions exhibit a type-C morphology with a condensed central core and barely visible surface spikes. Capsid assembly and budding occur coordinately at the plasma membrane.

Phylogeny

The viruses are widely distributed, and both endogenous and exogenous viruses exist in a variety of mammals. Many of the viruses in this genus are pathogenic, causing a wide variety of malignancies, immunosuppression, and neurological disorders; many oncogene-containing viruses are known. Insertional activation of cellular oncogenes has been reported (Chapter 10).

Representative Example

Moloney Murine Leukemia Virus (Mo-MLV)

NCBI GenBank Genome Accession Number: AF033811

Note: The MLV reference sequence corresponds to the MLMCG (J02255) entry. The sequence begins at the transcription initiation site and ends at the polyadenylation site (i.e., R to R).

Database and Reference Information

Nucleic acid database locus names and accession numbers:

Protein database locus names and accession numbers:

GAG_MLVMO P03332
POL_MLVMO P03355
ENV_MLVMO P03385

Sequence-related literature references:
Shinnick et al. (1981)
Miller and Verma (1984)

Features of Murine Leukemia Virus

R:(1–68)
U5:(69–145)
PBS:(146–163) The primer for minus-strand DNA synthesis is tRNAPro. Endogenous viruses that use tRNAGln as a primer have been reported.
gag:(621–2235) Proteolytic processing of Pr65 Gag produces MA, CA, and NC (see below). A p12 cleavage product of unknown function is produced from the region between MA and CA. p12 is phosphorylated. Pr65 Gag is myristylated at the 2Gly position; this modification is required for membrane association. MLV produces a glycosylated form of Gag (gPr80) that is initiated upstream of Pr65 Gag at a CUG codon (357). Glycosylation sites are at amino acids 113, 480, and 505. gPr80 Gag is not incorporated into viral particles.
pro:(2223–2597) The pro ORF is in-frame with the gag ORF but is separated by a stop codon at position 2235. The pro ORF is translated as the central region of a Gag-Pro-Pol precursor polyprotein (Pr180 Gag-Pro-Pol). The amino terminus of PR is defined by the NC/PR cleavage and the carboxyl terminus is defined by the PR/RT cleavage.
pol:(2598–5835) The pol ORF is in-frame with the gag and pro ORFs and is translated as the carboxy-terminal region of Pr180 Gag-Pro-Pol. Proteolytic processing produces the mature RT and IN proteins. Integration of MLV produces a 4-bp duplication of flanking cellular DNA; however, the duplication is 5 bp for SNV.
env:(5777–7772) The env ORF is expressed from a spliced subgenomic mRNA and produces a glycosylated precursor, gPr80 Env, that is cleaved by a cellular signal peptidase and cellular furin to produce the mature SU and TM proteins. SU is glycosylated (gp70) but TM is not. SU has seven sites for N-linked glycosylation at amino acids 45, 199, 326, 358, 365, 398, and 434. The TM protein (p15E) is further cleaved by PR to produce p12E and p2E. This processing is required for fusion of the virion envelope membrane and the host-cell plasma membrane during infection.
PPT:(7803–7815)
U3:(7816–8264) U3 contains enhancers as direct repeats (Chapter 6). Enhancer activity is associated with pathogenicity (Chapter 10).
R:(8265–8332)
Table 6. Type-C Mammalian Retroviruses: Complete Genomes.

Table 7. Type-C Mammalian Retroviruses: Vectors.

Table 8. Type-C Mammalian Retroviruses.

Map 6: Abelson Murine Leukemia Virus (Ab-MLV) (See Map)

NCBI GenBank Genome Accession Number: AF033812

Note: The Ab-MLV sequence was generated by modifying the MLAPRO (J02009) entry so that the sequence begins at the transcription initiation site and ends at the polyadenylation site (i.e., R to R).

Ab-MLV is a replication-defective virus that is a member of the mammalian type-C retroviral genus. Ab-MLV is closely related to Mo-MLV but has a substantial deletion that spans the 3′ half of the gag ORF, all of the pol ORF, and almost all of the env ORF. The P120 strain of Ab-MLV has a 3090-bp insertion that is homologous to the mouse c-abl gene. The P120 strain differs by a deletion of 789 bp from the progenitor P160 strain. A large ORF spans the 5′ part of gag and the v-abl sequence. Its translation produces a P120 Gag-Abl fusion protein. Although the Gag MA, pp12, and CA cleavage sites are contained within P120 Gag-Abl, the protein is not cleaved. These sites are shown for reference only. The c-abl gene is a member of the tyrosine kinase gene family and P120 Gag-Abl exhibits tyrosine kinase activity. P120 Gag-Abl is autophosphorylated at Y513. No other viral proteins are produced. Ab-MLV produces leukemias in infected mice.

Map 7: Finkel-Biskis-Reilly Murine Sarcoma Virus (FBR-MSV) (See Map)

NCBI GenBank Genome Accession Number: AF033814

Note: The FBR-MSV sequence was generated by modifying the MSVMUSV (K02712) entry so that the sequence begins at the transcription initiation site and ends at the polyadenylation site (i.e., R to R).

FBR-MSV is a replication-defective virus that is a member of the mammalian type-C retroviral genus. FBR-MSV was derived from a replication-competent virus that has sequence similarity with AKR-MLV. The 3′ half of the gag ORF and all of the pol and env ORFs are deleted. They have been replaced by a 1147-bp insertion that was derived from murine c-fos and c-fox. A large ORF spans the 5′ portion of gag and the v-fos sequences. Translation produces a P75 Gag-Fos fusion protein. The v-fox sequences do not contain a significant ORF. P75 Gag-Fos is phosphorylated at undefined serine residues. No other viral proteins are produced. FBR-MSV induces osteosarcomas in infected animals.

Map 8: Moloney Murine Sarcoma Virus (Mo-MSV) (See Map)

NCBI GenBank Genome Accession Number: AF033813

Note: The Mo-MSV sequence was generated by modifying the MLM124 (J02263) entry such that the sequence begins at the transcriptional initiation site and ends at the polyadenylation site (i.e., R to R).

Mo-MSV is a replication-defective virus that is a member of the mammalian type-C retroviral genus. Mo-MSV was derived from Mo-MLV and has a large deletion spanning most of the pol and env ORFs. In place of the deleted pol and env sequences, Mo-MSV has acquired a 1156-bp sequence that is homologous to the murine c-mos gene. The sites of recombination between the c-mos gene and the Mo-MLV sequence are shown in parentheses. The v-mos ORF is expressed using a spliced subgenomic mRNA that uses the Mo-MLV env splice donor and acceptor sites. v-mos translation is initiated using the env initiation site, and the first five amino acids of v-Mos are derived from the env ORF. Mo-MSV retains the complete Mo-MLV gag ORF and cells infected with the Mo-MSV produce Pr65 Gag in the absence of helper virus. c-Mos is a regulator of meiotic maturation.

Table 1 lists a number of mos-containing viruses. Each is thought to have acquired its mos sequence via independent recombination events. Thus, the precise location of the mos sequence, the size and composition of the deletion of the parental viral sequences, and the mode of mos gene expression are all unique. mos-containing viruses produce various sarcomas in infected animals.

Map 9: Type-D Retroviruses (See Map)

Taxonomy

Genus: Type-D retrovirus
Species: Mason-Pfizer monkey virus (M-PMV)
Simian type-D retrovirus 1 (SRV-1)
Simian type-D retrovirus 2 (SRV-2)
Squirrel monkey retrovirus (SMRV) (simian sarcoma virus)
Baboon type-D retrovirus (SRV-Pc)
Ovine pulmonary adenocarcinoma virus (jaagsiekte sheep retrovirus) (JSRV)
Langur virus (LNGV)

Morphology

Virions exhibit a D-type morphology, lacking prominent surface spikes; viral cores assemble in the cytoplasm before migrating to and budding from the plasma membrane.

Phylogeny

The type-D retroviral genus comprises both endogenous and exogenous viruses of New and Old World primates and sheep. These viruses are associated with immunodeficiencies in primates and pulmonary cancer in sheep. No oncogene-containing viruses are known, and insertional activation of cellular oncogenes has not been reported.

Representative Example

Mason-Pfizer Monkey Virus (M-PMV)

NCBI GenBank Genome Accession Number: AF033815

Note: The M-PMV reference sequence was generated by modifying the SIVMPCG (M12349) entry such that the sequence begins at the transcription initiation site and ends at the polyadenylation site (i.e., R to R).

Database and Reference Information

Nucleic acid database locus names and accession numbers:

M-PMV  SIVMPCG  M12349
SRV-1  SIVRV1CG  M11841

Protein database locus names and accession numbers:

M-PMV:  gag  GAG_MPMV   P07567
        pro  VPRT_MPMV  P07570
        pol  POL_MPMV   P07572
        env  ENV_MPMV   P07575
SRV-1:  gag  GAG_SRV1   P04022
        pro  VPRT_SRV1  P04024
        pol  POL_SRV1   P04025
        env  ENV_SRV1   P04027

Sequence-related literature references:
 M-PMV: Sonigo et al. (1986)
 SRV-1: Power et al. (1986)

Complete Genomes

Retrovirus name  Locus name  Acc. no.
Simian Mason-Pfizer type-D retrovirus (M-MPMV/6A)  SIVMPCG  M12349
Simian SRV-1 type-D retrovirus (SRV-1)  SIVRV1CG  M11841
Simian SRV-2 type-D retrovirus (SRV-2)  SIV2DCG  M16605
Simian sarcoma virus (SMRV-HLB) “squirrel monkey retrovirus”  PCSLTRA  M23385
Jaagsiekte sheep retrovirus “ovine pulmonary adenocarcinoma virus”  JSRCG  M80216

Features of Mason-Pfizer Monkey Virus

R:(1–25)
U5:(26–122)
PBS:(123–139) The primer for minus-strand DNA synthesis is tRNALys1,2.
gag:(269–2240) Proteolytic processing of Pr78 Gag produces MA, CA, and NC. Two cleavage products (p24, p12) of unknown function are produced from the region between MA and CA. A small cleavage product (p4) of unknown function is produced from the carboxyl terminus of Pr78 Gag. Pr78 Gag is myristylated at the 2Gly position.
pro:(2092–3001) The pro ORF is expressed by using a –1 frameshift (2092) producing a Gag-Pro polyprotein precursor (Pr95 Gag-Pro). The 3′region of the gag ORF together with the 5′region of the pro ORF encodes dUTPase (DU). DU is not required for viral replication. PR (p17) is encoded in the 3′region of the pro ORF and is further cleaved to p12 and p5.
pol:(3000–5579) The pol ORF is expressed by using two –1 frameshifts (2092 and 2891) producing a Gag-Pro-Pol polyprotein precursor (Pr180 Gag-Pro-Pol). The location of the cleavage site between RT and IN has not been reported.
env:(5621–7379) The env-encoded proteins are expressed using a spliced subgenomic mRNA. Translation of the env ORF produces a polyprotein precursor (gPr86 Env) that is cleaved to the mature SU and TM proteins by a cellular signal peptidase (SP/SU) and cellular furin (SU/TM). TM (gp22) is further cleaved at its carboxyl terminus by PR producing gp20 and p2. Both the SU and TM subunits are glycosylated. SU has ten predicted sites for N-linked glycosylation at amino acids 120, 237, 264, 276, 291, 304, 318, 324, 339, and 357; TM has one site at amino acid 487.
PPT:(7573–7583)
U3:(7584–7810)
R:(7811–7835)
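
The coordinates in this feature list are 1-based and inclusive, so feature lengths follow directly from the listed ranges. The short Python sketch below is an illustration only: the dictionary and function names are hypothetical, and the LTR length it prints is simply the sum U3 + R + U5 implied by the listed values, not an independently verified figure.

```python
# Coordinates copied from the M-PMV feature list (1-based, inclusive).
MPMV_FEATURES = {
    "R_5prime": (1, 25),
    "U5": (26, 122),
    "PBS": (123, 139),
    "PPT": (7573, 7583),
    "U3": (7584, 7810),
    "R_3prime": (7811, 7835),
}

def feature_length(start, end):
    """Length of a 1-based, inclusive coordinate range."""
    return end - start + 1

lengths = {name: feature_length(*span) for name, span in MPMV_FEATURES.items()}

# In an R-to-R sequence both copies of R should have the same length.
assert lengths["R_5prime"] == lengths["R_3prime"]

# Proviral LTR length implied by the listed coordinates: U3 + R + U5.
ltr_length = lengths["U3"] + lengths["R_5prime"] + lengths["U5"]
print(lengths)
print("LTR length implied by coordinates:", ltr_length)
```

The same check can be applied to any of the other feature lists in this Appendix by swapping in their coordinates.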

Map 10: Spumavirus (See Map)

Taxonomy

Genus: Spumavirus
Species: Human foamy virus (HFV or HSRV)
Simian foamy virus (SFV)
Feline syncytial virus (FSV)
Bovine syncytial virus (BSV)
Hamster syncytial virus (HaSFV)
Sea lion syncytial virus

Morphology

Spumaviruses have a distinctive morphology with prominent surface spikes. Capsid assembly occurs in the cytoplasm. Spumavirus assembly is unusual in that virions acquire their envelope membrane by budding through the endoplasmic reticulum. Spumavirus virions contain significant amounts of double-stranded full-length DNA, suggesting that reverse transcription occurs prior to infection.

Phylogeny

The spumavirus genus consists of exogenous viruses with a wide distribution in mammals. No endogenous members are known, although related human endogenous retrovirus pol gene sequences have been reported. Pathogenicity of spumaviruses has not been demonstrated. No oncogene-containing members are known, and insertional activation of cellular oncogenes has not been reported.

Representative Example

Human Foamy Retrovirus (HFV or HSRV)

NCBI GenBank Genome Accession Number: AF033816

Note: Human foamy virus (HFV) is now believed to be derived by rare zoonotic infection of humans by a chimpanzee spumavirus (CFV). A complete genome was generated by linking the HSPGAGPOL sequence (positions 1-5340) to the HSPENVPOL sequence (positions 7-6754). Sequence corrections reported in Netzer et al. (1993) and Flügel et al. (1990) were included.

Database and Reference Information

Nucleic acid database locus names and accession numbers:

HSPGAGPOL  M19427
HSPENVPOL  M54978
RESPUENV  X05591
RESPULTR  X05591

Protein database locus names and accession numbers:

gag  GAG_FOAMV  P14344
pol  POL_FOAMV  P14350
env  ENV_FOAMV  P14351
tas  S13140
bet  S44249

Sequence-related literature references:
Flügel et al. (1990)
Maurer et al. (1988)
Netzer et al. (1993)

Features of Human Foamy Virus

R:(1–191)
U5:(192–346)
PBS:(347–364) The primer for minus-strand DNA synthesis is tRNALys1,2.
gag:(446–2390) The Pr74 Gag precursor is not cleaved into MA, CA, and NC proteins as it is in other retroviruses. Instead, proteolytic processing occurs near the carboxyl terminus of Pr74 Gag producing p74 and p3 proteins (R. Flügel, pers. comm.). The amino terminus of Pr74 Gag is not myristylated, but based on sequence similarity to the RSV sequence, it may be acetylated.
pro and pol:(2340–5769) The Pr125 Pro-Pol precursor is expressed from a spliced subgenomic mRNA (Yu et al. 1996). Proteolytic processing produces a p85 PR-RT product and a p40 IN protein. It is not known whether p85 PR-RT is processed further into separate PR and RT proteins. The RT/IN cleavage site has not been reported.
env:(5731–8686) Translation of a subgenomic env mRNA produces the gPr130 Env polyprotein precursor. The signal peptide cleavage site is not known. Both SU and TM envelope proteins are glycosylated. SU has 12 predicted sites for N-linked glycosylation at amino acids 21, 105, 137, 179, 282, 307, 342, 387, 401, 419, 524, and 553; TM has three sites at amino acids 799, 805, and 830. The envelope region contains the transcription initiation site (8419) for two subgenomic mRNAs that are used to express the tas and bet ORFs.
tas:(8658–9558) Tas (transcriptional activator of spumavirus) was formerly known as Bel1. Tas is expressed from a singly spliced subgenomic mRNA that is initiated within the env region (8419). The 5′region of the tas ORF also encodes the amino-terminal region of the Bet protein.
bet:(8658–8922, 9224–10,405) The central and carboxy-terminal regions of the Bet protein are encoded in the reading frame formerly known as bel2. Bet is expressed from a doubly spliced subgenomic mRNA that is initiated within the env region (8419). Bet is phosphorylated. It is abundantly expressed but its function is unknown.
PPT:(10,031–10,052)
U3:(10,053–10,968) The U3 regions of spumaviruses are relatively long (∼900–1400 bp). The HFV 3′LTR U3 region of the HSPENVPOL sequence that was used to assemble the composite HFV sequence represented here contains an insertion of 137 bp relative to the corresponding 5′LTR U3 region.
R:(10,969–11,160)

Map 11: Human T-Cell Leukemia Virus (See Map)

Taxonomy

Genus: HTLV/BLV
Species: Human T-cell leukemia virus type 1 (HTLV-1)
Human T-cell leukemia virus type 2 (HTLV-2)
Simian T-cell leukemia virus type 1 (STLV-1)
Bovine leukemia virus (BLV)

Morphology

HTLV/BLV virion morphology is similar to that of type-C viruses.

Phylogeny

Only exogenous viruses of humans, primates, and cattle are known within this genus. Infections are associated with B- and T-cell leukemias and lymphomas and neurological disease. No oncogene-containing members are known and insertional activation of cellular oncogenes has been reported, but only rarely.

Representative Example

Human T-cell Leukemia Virus Type 1 (HTLV-1)

NCBI GenBank Genome Accession Number: AF033817

Note: The HTLV-1 reference sequence was generated by modifying the HTVPRCAR entry such that the sequence begins at the transcription initiation site and ends at the polyadenylation site (i.e., R to R). The HTVPRCAR entry is a partial sequence and ends in the U3 region of the 3′LTR; therefore, a composite sequence (R to R) was created by linking positions 354–8400 and 122–181 of the HTVPRCAR sequence. The nucleotide sequence of an infectious clone of HTLV-1 has not been reported, although an infectious clone of HTLV-1 has been described (pCS-HTLV; Derse et al. [1995]).

Database and Reference Information

Nucleic acid database locus names and accession numbers:

HTLV-1:  HTVPRCAR  D13784
         RE1PROP  J02029
HTLV-2:  HTLV2  M10060
         HL2V2CG  M10060
         HL2VG12GNOM  L11456
         HL2IIENVGA  L20734
STLV-1:  STLVITE4  Z46900

Protein database locus names and accession numbers:

Complete Genomes

     HTVPRCAR  RE1PROP
gag  221867    GAG_HTL1A   P03345
pro  221868    VPRT_HTL1A  P10274
pol  221869    POL_HTL1A   P03362
env  221870    ENV_HTL1A   P03381
tax            TAT_HTL1A   P03409

Sequence-related literature references:
Malik et al. (1988)
Seiki et al. (1983)
Derse et al. (1995)

Features of Human T-cell Leukemia Virus

R:(1–228) All characterized members of the HTLV/BLV genus have relatively long R regions which include the RexRE element.
U5:(229–405)
PBS:(406–423) The primer for minus-strand DNA synthesis is tRNAPro.
gag:(450–1737) Proteolytic processing of Pr55 Gag produces MA, CA, and NC proteins. Pr55 Gag is myristylated at position 2Gly.
pro:(1718–2402) The pro ORF is expressed by using a –1 frameshift (1718) producing a Gag-Pro polyprotein. The cleavage site for PR has not been reported.
pol:(2245–4834) The pol ORF is expressed by using two –1 frameshifts (1718,2245) producing a Gag-Pro-Pol polyprotein. The locations of the amino-terminal cleavage sites producing the mature RT and IN proteins have not been reported. Cleavage of Pr Gag-Pro-Pol at the carboxyl terminus of PR produces a p95 cleavage product containing the RT and IN domains.
env:(4829–6293) The env ORF is expressed from a spliced subgenomic mRNA. Both the SU and TM proteins are glycosylated. SU has four predicted sites for N-linked glycosylation at amino acids 140, 222, 244, and 272; TM has one site at amino acid 404.
rex:(4773–4832, 6951–7458) The Rex protein is produced from a doubly spliced tax/rex mRNA. The larger Rex protein is produced from an ORF that spans exons 2 and 3. Translation initiated in exon 2 (4773) produces pp27 Rex. Translation of the smaller Rex protein is initiated in exon 3 and produces pp21 Rex. Both Rex proteins are phosphorylated.
tax:(4829–4832, 6951–8006) The Tax protein is produced from a doubly spliced tax/rex mRNA. The tax ORF spans exons 2 and 3. Translation, which is initiated in exon 2, proceeds one codon (four nucleotides) before crossing the splice junction to exon 3.
PPT:(7910–7926)
U3:(7927–8279) The U3 region contains the Tax recognition sequences. Tax interacts with cellular factors at these sites in the 5′LTR of the provirus to activate viral transcription.
R:(8280–8507) The polyadenylation signal sequence is not located ∼20 bp upstream from the site of polyadenylation, as in most other retroviruses. Rather, the sequence is located more than 200 bp upstream of the site of polyadenylation and the two are brought into association by RNA secondary structure (see Chapter 6). The RexRE sequence is located in R and this sequence, in the 3′R region of genomic and env RNA, interacts with the Rex protein, increasing the transport of these RNAs to the cytoplasm.
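
The overlapping gag, pro, and pol coordinates listed above can be checked against the stated –1 frameshifts with a little arithmetic. The Python sketch below is illustrative only; it assumes (this is not stated in the text) that each listed start coordinate is the first nucleotide of a codon in that ORF's reading frame, and the helper names are hypothetical.

```python
# ORF coordinates from the HTLV-1 feature list (1-based, inclusive).
ORFS = {"gag": (450, 1737), "pro": (1718, 2402), "pol": (2245, 4834)}

def frame(start):
    """Reading frame (0, 1, or 2), taken from an ORF's start coordinate.

    Assumption: the listed start is the first base of an in-frame codon.
    """
    return start % 3

def overlap(a, b):
    """Number of nucleotides shared by two inclusive ranges (0 if disjoint)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)

for upstream, downstream in [("gag", "pro"), ("pro", "pol")]:
    shift = (frame(ORFS[downstream][0]) - frame(ORFS[upstream][0])) % 3
    # A shift of 2 (mod 3) is equivalent to slipping back one nucleotide (-1).
    print(upstream, "->", downstream,
          "overlap:", overlap(ORFS[upstream], ORFS[downstream]), "nt;",
          "frame change:", "-1" if shift == 2 else shift)
```

Under that assumption, each downstream ORF lies one nucleotide back from the frame of the ORF before it, which is what two successive –1 frameshifts would produce.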

Map 12: Bovine Leukemia Virus (See Map)

Representative Example

Bovine Leukemia Virus (BLV)

NCBI GenBank Genome Accession Number: AF033818

Note: The BLV reference sequence was assembled using the following sequences: BLVGAGA (M10987) 1–4555, BLVENV (K02251) 1–3686, and BLVLTR3 (K01617) 522–700. The reference sequence begins at the transcription initiation site and ends at the polyadenylation site (i.e., R to R).
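
Composite reference sequences of this kind are built by concatenating 1-based, inclusive segments from separate database entries. The following Python sketch shows the slicing involved, using made-up toy strings rather than real viral sequence; `join_segments` and the entry names are hypothetical. It makes no attempt to reconcile overlaps between entries or to apply published sequence corrections.

```python
def join_segments(entries, plan):
    """Concatenate 1-based, inclusive segments taken from named sequences.

    entries: dict mapping a locus name to its sequence string.
    plan:    list of (locus_name, start, end) tuples in 5'-to-3' order.
    """
    parts = []
    for name, start, end in plan:
        seq = entries[name]
        if not (1 <= start <= end <= len(seq)):
            raise ValueError(f"{name}: {start}-{end} outside 1-{len(seq)}")
        parts.append(seq[start - 1:end])  # 1-based inclusive -> 0-based slice
    return "".join(parts)

# Toy stand-ins for database entries (NOT real viral sequence).
toy_entries = {"ENTRY_A": "AAAACCCC", "ENTRY_B": "GGGGTTTT"}
toy_plan = [("ENTRY_A", 1, 6), ("ENTRY_B", 3, 8)]
composite = join_segments(toy_entries, toy_plan)
print(composite)  # AAAACCGGTTTT
```

With real entries, the plan would list the locus names and positions given in the note, and the sequences would come from the corresponding database records.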

Nucleic acid database locus names and accession numbers:

Complete genome:  BLVCG  K02120
5′LTR, gag, pol:  BLVGAGA  M10987
5′LTR, gag, pol, env:  BLVGPE  D00647
env, pX-BL, 3′LTR:  BLVENV  K02251
3′LTR:  BLVLTR3  K01617

Protein database locus names and accession numbers:

gag  S29356
     GAG_BLVJ   P03344
     GAG_BLVAU  P25058
pro  VPRT_BLVJ  P10270
pol  S29358
     POL_BLVJ   P10270
     POL_BLVAU  P25059
env  BLVENV_2   K02251
     ENV_BLVJ   P03380
     ENV_BLVAU  P25057

Sequence-related literature references:
Sagata et al. (1985)
Rice et al. (1984, 1985)

Features of Bovine Leukemia Virus

R:(1–229) Viruses belonging to the HTLV/BLV genus have the longest R regions of all known retroviruses.
U5:(230–322)
PBS:(323–341) The primer for minus-strand DNA synthesis is tRNAPro.
gag:(418–1597) Proteolytic processing of Pr44 Gag produces MA, CA, and NC. p15 MA is further processed to p10 MA, p4, and a small peptide from the region between p10 and p4. The functions of p4 and this small peptide are not known.
pro:(1596–2133) The pro ORF is expressed by using a –1 frameshift (1596) producing a Gag-Pro precursor protein (Pr66 Gag-Pro). Proteolytic processing of Pr66 Gag-Pro releases p14 PR. PR cleavage generates peptides of unknown function from the region between the NC and PR domains and the carboxyl terminus of Pr66 Gag-Pro.
pol:(2132–4667) The pol ORF is expressed by using two –1 frameshifts (1596 and 2132) producing a Gag-Pro-Pol polyprotein (Pr145 Gag-Pro-Pol). The cleavage sites that produce RT and IN have not been reported.
env:(4615–6160) The env ORF is expressed from a spliced subgenomic mRNA producing gPr72 Env. Both SU and TM proteins are glycosylated. SU has eight predicted sites for N-linked glycosylation at amino acids 67, 129, 203, 230, 251, 256, 271, and 287. TM has two predicted sites for N-linked glycosylation at amino acids 351 and 398.
rex:(4615–4665, 7042–7459) The BLV Rex protein is produced from a doubly spliced tax/rex mRNA. The rex ORF spans exons 2 and 3. Rex translation proceeds 17 codons in exon 2 before crossing the splice junction. pp18 Rex is phosphorylated.
tax:(4662–4665, 7042–7965) The BLV Tax protein is produced from a doubly spliced tax/rex mRNA. Translation initiates in the 3′region of exon 2 and proceeds one codon before crossing the splice junction.
PPT:(7968–7979)
U3:(7980–8190) The U3 region contains the Tax recognition sequences. Tax interacts with cellular factors at these sites in the 5′LTR of the provirus to activate viral transcription.
R:(8191–8419) The polyadenylation signal sequence is not located ∼20 bp upstream from the site of polyadenylation, as in most other retroviruses. Rather, the sequence is located more than 200 bp upstream of the site of polyadenylation and the two are brought into association by RNA secondary structure (see Chapter 6). The RexRE sequence is located in R and this sequence, in the 3′R region of genomic and env RNA, interacts with the Rex protein, increasing the transport of these RNAs to the cytoplasm.
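
The codon counts quoted for the exon 2 portions of the BLV rex and tax ORFs can be recovered from their coordinates. The Python sketch below is a simple arithmetic check; the names are hypothetical, and the interpretation of a leftover nucleotide (that it is completed by exon 3 sequence after splicing) is an assumption rather than something stated in the feature list.

```python
# Exon 2 portions of the BLV rex and tax ORFs (1-based, inclusive).
EXON2_SEGMENTS = {"rex": (4615, 4665), "tax": (4662, 4665)}

def codons_before_junction(start, end):
    """Return (complete codons, leftover nucleotides) in a coding segment."""
    return divmod(end - start + 1, 3)

for orf, (start, end) in EXON2_SEGMENTS.items():
    codons, leftover = codons_before_junction(start, end)
    print(f"{orf}: {codons} codon(s) + {leftover} nt before the splice junction")
```

For rex the segment is an exact multiple of three nucleotides; for tax the segment is four nucleotides long, that is, one complete codon plus one nucleotide.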

Map 13: The Fish Retrovirus Group (See Map)

Taxonomy

Complex retroviruses have been isolated from several fish species. The distantly related fish retroviruses have not received formal taxonomic classification and their presentation as a group in this Appendix is not intended to suggest a taxonomic relatedness.

Morphology

Virions are typical C-type particles approximately 150 nm in diameter.

Phylogeny

The distinct genetic organization of the walleye dermal sarcoma virus (WDSV) and snakehead fish retrovirus (SnRV), as deduced from the nucleic acid sequence, appears to exclude these viruses from the seven formally recognized retroviral genera. Phylogenetic analysis indicates that both of these viruses are similar in nucleic acid sequence to the mammalian type-C retroviruses (Chapter 2). However, comparison of the nucleic acid sequences implies a relatively distant relationship, a conclusion reinforced by the presence of additional cyclin-related ORF(s) downstream from the env ORF, and distinct tRNA primer-binding-site specificities (WDSV: tRNAHis; SnRV: tRNAArg1,2). WDSV is unique among all known retroviruses in that it contains an ORF in the region between U5 and gag. The WDSV genome is the largest now known (approximately 12 kb).

Pathogenesis: WDSV infection produces seasonal lesions that arise in winter and disappear in summer.

Representative Example

Walleye Dermal Sarcoma Virus (WDSV)

NCBI GenBank Genome Accession Number: AF033822

Note: The WDSV reference sequence was generated by modifying the TYCGAG (L41838) entry such that the sequence begins at the transcription initiation site and ends at the polyadenylation site (i.e., R to R).

Database and Reference Information

Nucleic acid database locus names and accession numbers:

Walleye dermal sarcoma virus  TYCGAG  L41838
Walleye epidermal hyperplasia virus type 1  AF014792
Walleye epidermal hyperplasia virus type 2  AF014793
Snakehead fish retrovirus  SRU26458  U26458

Protein database locus names and accession numbers:

orf c  TYCGAG_1  L41838
gag    TYCGAG_2  L41838
pol
env    TYCGAG_3  L41838
orf a  TYCGAG_4  L41838
orf b  TYCGAG_5  L41838

Sequence-related literature references:

Features of the Walleye Dermal Sarcoma Virus

R:(1–77)
U5:(78–153)
PBS:(154–171) The primer for minus-strand DNA synthesis is tRNAHis.
c:(430–790) A small ORF capable of encoding a 120-amino-acid protein is located upstream (5′) of the gag ORF.
gag:(800–2545) Translation of the WDSV gag ORF produces a precursor polyprotein that is cleaved by PR to produce the mature MA, CA, and NC proteins. This processing also generates a p20 protein from the region between MA and CA. p20 is further cleaved to generate two p10 peptides of unknown function. The Gag polyprotein is myristylated at the 2Gly position.
pro and pol:(2546–6056) WDSV pro and pol gene expression resembles the mammalian type-C retroviruses in that the pro and pol ORFs are expressed by a nonsense suppression of the gag termination codon (2546), resulting in the production of a Gag-Pro-Pol precursor polyprotein. Only the PR/RT cleavage site has been reported.
env:(5974–9649) The WDSV env ORF is expressed using a singly spliced mRNA. The Env polyprotein is relatively large; it contains about twice the number of amino acids in the Env proteins of other retroviruses. This is due to the large TM domain (gp90). Both the SU and TM proteins are glycosylated. SU has nine predicted sites for N-linked glycosylation at amino acids 50, 95, 109, 134, 148, 232, 254, 304, and 323. TM has sites at amino acids 621, 664, 722, and 771.
a:(9674–10,565) WDSV contains an additional ORF (ORF a) located immediately downstream (3′) from env that potentially encodes a 297-amino-acid protein. Although its function has not been established, the deduced amino acid sequence is similar to members of the cyclin gene family.
b:(10,570–11,488) WDSV contains an additional ORF (ORF b) located between ORF a and U3 of the 3′LTR. ORF b potentially encodes a 306-amino-acid protein of unspecified function. The deduced amino acid sequence of ORF b is also similar to members of the cyclin gene family.
U3:(11,674–12,118)
R:(12,119–12,195)

Map 14: Gypsy Retrovirus (See Map)

Taxonomy

Until recently, the endogenous gypsy sequences of the fruit fly were classified among the retrotransposon group of transposable elements. Gypsy is now known to be a true retrovirus. However, it is yet to be formally classified.

Morphology

Virions have not yet been described.

Phylogeny

Gypsy is an endogenous retrovirus found in fruit flies (Drosophila sp.). It has been detected by its ability to infect fly embryos, causing insertional inactivation of certain genes (see Chapter 8). It represents the first example of a true retrovirus in invertebrates.

Representative Example

Gypsy Retrovirus

NCBI GenBank Genome Accession Number: AF033821

Note: The Gypsy retrovirus reference sequence was generated by modifying the DROGYPF1A (M12927) entry such that the sequence begins at the transcription initiation site and ends at the polyadenylation site (i.e., R to R).

Database and Reference Information

Nucleic acid database locus names and accession numbers:

Drosophila melanogaster:
 gypsy  DROGYPF1A  M12927
Drosophila virilis:
 gypsy  DROGYPSYZ  M38438

Protein database locus names and accession numbers:

Drosophila melanogaster
gag  GAGY_DROME  P10405
pol  POLY_DROME  P10401
env  S52567
Drosophila virilis
gag  S26839
pol  S26840
env  S26841

Sequence-related literature references:
Marlor et al. (1986)
Mizrokhi and Mazo (1991)
Avedisov and Ilyin (1994)
Pélisson et al. (1994)

Features of Gypsy Retrovirus

R:(1–53)
U5:(54–243)
PBS:(244–255) The primer for minus-strand DNA synthesis is tRNALys.
gag:(843–2196) The Gypsy gag does not exhibit identifiable nucleic acid or protein sequence similarity with other retroviral gag ORFs. However, based on size and location, it is believed to encode the viral core proteins. Gag cleavage sites have not been reported. Modification of the amino terminus of the Gag precursor polyprotein has not been reported.
pro and pol:(2144–5231) The Gypsy pro and pol ORFs are expressed as a Gag-Pro-Pol polyprotein that is produced by a –1 frameshift near the carboxyl terminus of the gag ORF (2144). Proteolytic cleavage sites defining the amino-terminal and carboxy-terminal boundaries of the PR, RT, and IN proteins have not been reported.
env:(330–331, 5314–6761) The Gypsy env ORF does not exhibit identifiable nucleic acid or protein sequence similarity with other retroviral env ORFs. However, based on its size, location, and proteolytic processing, it is believed to encode the viral SU and TM proteins. The Gypsy env ORF is expressed from a singly spliced mRNA and produces the gPr66 Env precursor polyprotein. The env translation initiation codon is generated by formation of the splice junction. Both the signal peptide/SU and SU/TM cleavage sites have been identified. Both SU and TM are glycosylated. SU has two predicted sites for N-linked glycosylation at amino acids 47 and 200; TM has one site at amino acid 366. The carboxy-terminal portion of the env ORF lies within the U3 region of the 3′LTR.
PPT:(6740–6749)
U3:(6750–6987) The U3 region of the 3′LTR encodes the carboxy-terminal domain of gPr66 Env.
R:(6988–7040)

Map 15: Human Immunodeficiency Virus Type 1 (See Map)

Taxonomy

Genus: Lentivirus
Species: Primate lentiviruses: Human immunodeficiency virus types 1 and 2* (HIV-1/HIV-2)
Simian immunodeficiency virus
 Chimpanzee (SIVcpz)
 Sooty mangabey (SIVsmm)
 African green monkey (SIVagm)
 Sykes' monkey (SIVsyk)
 Mandrill (SIVmnd)
 Macaque (SIVmac)*
Feline lentiviruses: Feline immunodeficiency virus (FIV)
Bovine lentiviruses: Bovine immunodeficiency virus (BIV)
Ovine lentiviruses: Maedi/Visna virus (MVV); Caprine arthritis encephalitis virus (CAEV)
Equine lentiviruses: Equine infectious anemia virus (EIAV)

Five groups of lentiviruses are recognized on the basis of serology and genome organization, and these groups correspond to the host ranges of the viruses. In addition to the Gag, Pol, and Env proteins, all lentiviruses express at least two regulatory proteins (Tat and Rev). The primate lentiviruses produce other accessory proteins (Nef, Vpr, Vpu, Vpx, Vif); counterparts of these proteins have not been definitively identified in other lentivirus groups. The pol genes of some nonprimate lentiviruses (EIAV, FIV) encode a protein with dUTPase activity (Chapter 4).

Morphology

The cores of mature virions are shaped like truncated cones. Viral particles assemble at, and bud from, the plasma membrane.

Phylogeny

Lentiviruses include exogenous viruses of humans, primates, domestic cats, and a variety of livestock (sheep, cattle, horses). No isolates from rodents have been described. No closely related endogenous viruses have been described. Lentiviruses are the causative agents of a variety of diseases, including immunodeficiencies, neurological degeneration, and arthritis. No oncogene-containing members have been reported and integration is not known to activate cellular oncogenes.

Representative Example

Human Immunodeficiency Virus Type 1 (HIV-1)

NCBI GenBank Genome Accession Number: AF033819

Note: The HIV-1 reference sequence was generated by modifying the HIVHXB2CG (K03455) entry such that the sequence begins at the transcription initiation site and ends at the polyadenylation site (i.e., R to R).

Sequence-related literature references:
Ratner et al. (1987)

Features of Human Immunodeficiency Virus Type 1

R:(1–96) In viral RNA, the R region contains the trans-activator response region (TAR). The TAR RNA sequence forms a stable hairpin structure.
U5:(97–181)
PBS:(182–199) The primer for minus-strand DNA synthesis is tRNALys3.
gag:(336–1836) The gag region encodes Pr55 Gag. Pr55 Gag is myristylated at the 2Gly position. Proteolytic processing cleaves p17 MA, p24 CA, p7 NC, and p2 from Pr55 Gag.
pro:(1637–2099) The pro ORF is translated from unspliced genomic RNA by ribosomal frameshifting at position 1637. Translation produces a Pr160 Gag-Pro-Pol precursor polyprotein. PR cleaves p10 PR from the central region of Pr160 Gag-Pro-Pol. The structure of HIV-1 PR has been extensively characterized (see Table 3 and Chapter 7).
pol:(2102–4640) The pol ORF is in-frame with the pro ORF; pol gene products are synthesized as part of the Pr160 Gag-Pro-Pol polyprotein. RT is a p66/p51 heterodimer, the two subunits of which differ by the presence or absence of most of the RNase H domain. p32 IN is cleaved from the carboxy-terminal region of Pr160 Gag-Pro-Pol. Integration of HIV-1 DNA produces a 5-bp duplication of flanking cellular DNA.
vif:(4587–5163) The p23 Vif protein is translated from a singly spliced mRNA.
vpr:(5105–5339) The p15 Vpr protein is translated from a singly spliced mRNA. Vpr is found at the inner face of the cell membrane in infected cells. Large amounts of Vpr are packaged in virions, although it is not a structural protein.
tat:(5377–5591, 7925–7968) The p14 Tat protein is translated from multiply spliced mRNAs. Tat is localized in the nucleus of infected cells and activates viral transcription by binding to the TAR region of R. Tat is active as a homodimer and is phosphorylated in its carboxy-terminal region, although the function of this modification is not known.
rev:(5516–5591, 7925–8197) The rev ORF spans exons 2 and 3 of a multiply spliced mRNA and encodes the Rev protein (p19). Rev localizes in the nucleus and is phosphorylated on serine residues. Rev binds the RRE (bases 7315–7548) present in intron-containing RNAs, facilitating mRNA transport to the cytoplasm (see Chapter 6).
vpu:(5608–5854) The vpu ORF is expressed from a weak initiation (AUG) codon upstream of the env AUG and encodes p16 Vpu. Vpu is phosphorylated on serine residues and localizes to the plasma cell membrane, but is not found in virions.
env:(5771–8339) The env ORF encodes the gPr160 Env polyprotein precursor that is processed by cellular proteases to generate mature gp120 SU and gp41 TM. The SU protein contains alternating conserved (C) and variable (V) regions (Chapter 3). SU mediates HIV-1 adsorption through the coordinate binding to host cell CD4 molecules and to one of several host cell coreceptors. Although the env genes of HIV-1 have been extensively sequenced, the only known HIV env structure is the coiled-coil region of TM. The SU/TM pair is thought to form trimers on the virion membrane. There are 24 sites for N-linked glycosylation in SU (88, 136, 141, 156, 160, 186, 197, 230, 234, 241, 262, 276, 289, 295, 301, 332, 339, 356, 386, 392, 397, 406, 448, and 463). There are seven glycosylation sites in TM (611, 616, 624, 637, 674, 750, and 816).
nef:(8343–8710) The nef ORF lies immediately downstream from the env ORF and extends into the U3 region of the 3′ LTR. p27 Nef is translated from a multiply spliced subgenomic mRNA. It undergoes posttranslational myristylation at the 2Gly position which helps to localize Nef to the inner aspect of the cell membrane. Nef forms homodimers and is phosphorylated at 15Tyr.
PPT:(8615–8630) The PPT serves as the principal primer for plus-strand DNA synthesis. There is an additional polypurine tract near the middle of the HIV-1 genome that serves as a secondary site for the initiation of plus-strand DNA synthesis (Chapter 4).
U3:(8631–9085)
R:(9086–9181)
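
The feature lists in this Appendix give predicted N-linked glycosylation sites without saying how they were predicted. A conventional approach is to scan the protein sequence for the N-X-S/T sequon, where X is any residue except proline. The Python sketch below does that on a made-up peptide; the function name and the peptide are hypothetical, and reported positions depend on the numbering used (for example, whether the signal peptide is counted).

```python
import re

# N-X-S/T sequon, where X is any residue except proline.
# The lookahead lets overlapping sequons (e.g. NNTS) both be reported.
SEQUON = re.compile(r"N(?=[^P][ST])")

def find_sequons(protein):
    """Return 1-based positions of Asn residues that start an N-X-S/T sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(protein)]

toy_peptide = "MNGSAANPTLLNNTSK"  # made-up peptide, not a viral sequence
print(find_sequons(toy_peptide))  # [2, 12, 13]
```

A sequon is necessary but not sufficient for glycosylation, so a scan of this kind yields candidate sites only.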

LENTIVIRUSES

Nucleic acid database
Virus             Locus name   Acc. no.
HIV-1             HIVHXB2CG    K03455
                  HIVNL43      M19921
                  HIVBRUCG     K02013
                  HIVNY5CG     M38431
                  HIVJRCCSF    M38429
                  HIVSF2CG     K02007
                  HIVMNCG      M17449
HIV-2             HIV2BEN      M30502
                  HIV2D194     J04542
                  HIV2GH1      M30895
                  HIV2ISY      J04498
                  HIV2ROD      M15390
                  HIV2ST       M31113
                  HIV2UC1GNM   L07625
SIV               SIVAGM155    M29975
                  SIVAGM3      M30931
                  SIVAGM677A   M58410
                  SIVAGMAA     M66437
                  SIVCOMGNM    L06042
                  SIVMM239     M33262
                  SIVMM251     M19499
                  SIVMNE       M32741
                  SIVSMMPBJA   M31345
                  SIVSMMPBJB   L03295
FIV               FIVCG        M25381
                  FIVPPR       M36968
                  FIU11820     U11820
BIV               BIM127       M32690
EIAV              EIAVCG       M16575
                  EIACGIP      M87581
                  EIU01866     U01866
Visna             VLVCG        M10608
                  VLVCGA       M51543
                  VLVGAGA      L06906
                  VLVLV1A      M60609
                  VLVLV1B      M60610
CAEV              CAEVCG       M33677
Ovine Lentivirus  OLVCG        M31646
                  OLVSAOMVCG   M34193

LENTIVIRUSES

Protein database
Virus     Gene  Locus name   Acc. no.
HIVHXB2   gag   GAG_HV1H2    P04591
          pol   POL_HV1H2    P04585
          env   ENV_HV1H2    P04578
          vif   VIF_HV1B1    P03401
          vpr   VPR_HV1B1    P05926
          vpu   VPU_HV1H2    P05919
          tat   TAT_HV1H2    P04608
          rev   REV_HV1H2    P04618
          nef   NEF_HV1H2    P04601
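
The locus names in this table appear to follow a PROTEIN_SOURCE pattern, and not all of them share the same suffix. The sketch below groups the genes by that suffix, assuming the part after the underscore identifies the source entry; the names `ROWS` and `by_suffix` are illustrative only.

```python
from collections import defaultdict

# (gene, locus name, accession) rows copied from the table above.
ROWS = [
    ("gag", "GAG_HV1H2", "P04591"),
    ("pol", "POL_HV1H2", "P04585"),
    ("env", "ENV_HV1H2", "P04578"),
    ("vif", "VIF_HV1B1", "P03401"),
    ("vpr", "VPR_HV1B1", "P05926"),
    ("vpu", "VPU_HV1H2", "P05919"),
    ("tat", "TAT_HV1H2", "P04608"),
    ("rev", "REV_HV1H2", "P04618"),
    ("nef", "NEF_HV1H2", "P04601"),
]


def by_suffix(rows):
    """Group gene names by the part of the locus name after the underscore."""
    groups = defaultdict(list)
    for gene, locus, _accession in rows:
        _prefix, suffix = locus.split("_", 1)
        groups[suffix].append(gene)
    return dict(groups)


print(by_suffix(ROWS))
# {'HV1H2': ['gag', 'pol', 'env', 'vpu', 'tat', 'rev', 'nef'], 'HV1B1': ['vif', 'vpr']}
```

Grouping in this way makes it easy to see that the vif and vpr entries carry a different suffix from the other seven, which is worth keeping in mind when comparing these protein records with the nucleic acid entries.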

Footnotes

* For tables of complete genomes, vectors, and oncogene-containing viruses, see Tables 6, 7, and 8.

* HIV-2 and SIVmac were recently derived from SIVsmm.

Copyright © 1997, Cold Spring Harbor Laboratory Press.
Bookshelf ID: NBK19417
