Transcriptional and Posttranscriptional Regulation of HIV-1 Gene Expression
Abstract
Control of HIV-1 gene expression depends on two viral regulatory proteins, Tat and Rev. Tat stimulates transcription elongation by directing the cellular transcriptional elongation factor P-TEFb to nascent RNA polymerases. Rev is required for the transport from the nucleus to the cytoplasm of the unspliced and incompletely spliced mRNAs that encode the structural proteins of the virus. Molecular studies of both proteins have revealed how they interact with the cellular machinery to control transcription from the viral LTR and regulate the levels of spliced and unspliced mRNAs. The regulatory feedback mechanisms driven by HIV-1 Tat and Rev ensure that HIV-1 transcription proceeds through distinct phases. In cells that are not fully activated, limiting levels of Tat and Rev act as potent blocks to premature virus production.
HIV-1 gene expression is controlled by RNA-binding proteins Tat and Rev. They orchestrate complex interactions with the cellular transcription, RNA splicing, and RNA transport machinery, and are important targets for drug discovery.
After integration into the host genome, the HIV-1 provirus acts as a transcription template that is regulated at the transcriptional and posttranscriptional levels. Immediately after infection, HIV-1 produces only short completely spliced mRNAs encoding the viral regulatory proteins Tat and Rev. As the infection proceeds, transcription increases sharply, and larger, incompletely spliced mRNAs are produced. These encode Env and the HIV-1 accessory genes Vif, Vpr, and Vpu. Also synthesized late are the full-length unspliced transcripts which act both as the virion genomic RNA and the mRNA for the Gag-Pol polyprotein (Kim et al. 1989; Pomerantz et al. 1990).
This complex pattern of gene expression is controlled by the regulatory proteins Tat and Rev. Tat activates viral transcription by stimulating elongation from the viral long terminal repeat (LTR). Rev transports the unspliced and incompletely spliced mRNAs encoding the structural proteins from the nucleus to the cytoplasm. In this article, we review our current understanding of how these unique regulatory proteins orchestrate HIV-1 gene expression through their interactions with the cellular transcription, RNA splicing, and RNA transport machinery.
CONTROL OF HIV-1 TRANSCRIPTION BY Tat
Discovery of Transactivation by Tat
In HIV-1, as in all retroviruses, the LTR acts as the viral promoter. The first evidence that gene expression in HIV-1 also requires viral transacting factors came from experiments by Sodroski et al. (1985a,b) who noted that the expression of reporter genes placed under the control of the viral LTR was dependent on a transactivating factor, which they named Tat. Deletion analysis of the viral LTR showed that Tat activity required the transactivation-responsive region (TAR), a regulatory element located downstream from the initiation site for transcription between nucleotides +1 and +59 (Fig. 1A). It quickly became apparent that TAR was not a typical transcription element, because it is only functional when it is placed 3′ to the HIV-1 promoter, and in the correct orientation and position (Muesing et al. 1987). Genetic evidence that TAR functions as a transcribed RNA regulatory signal came from the observation that the TAR RNA sequence forms a highly stable, nuclease-resistant, stem-loop structure; mutations that destabilize the TAR RNA structure abolish Tat-stimulated transcription (Berkhout et al. 1989; Selby et al. 1989).
Tat and its interactions with P-TEFb. (A) Autoregulation of HIV-1 transcription by Tat. Tat binds to the TAR RNA element encoded in the HIV-1 leader sequence and recruits P-TEFb and other elongation factors to the transcription complex. Small changes in initiation efficiency, caused by epigenetic silencing or reductions in NF-κB levels in the cell, reduce Tat levels and inhibit transcription, driving the HIV-1 provirus into latency. Reinitiation by NF-κB stimulates Tat production and restores full transcription efficiency. Thus, positive feedback by Tat results in a bistable switch. (B) Recognition of TAR RNA by Tat and P-TEFb. The diagram on the left shows the bases in TAR that are recognized by Tat in the TAR bulge region and by CycT1 in the TAR loop region (red bases). The structures at right show the conformational changes induced by Tat binding (Aboul-ela et al. 1995). (C) Structure of the Tat:P-TEFb complex. Note that Tat folds on the outer surface of the CycT1 cyclin domain. The amino-terminal “activation” domain of Tat binds to the CDK9 T-loop, a region of the molecule that is essential for its enzymatic activity (Tahirov et al. 2010).
The Tat/TAR RNA Interaction
Dingwall et al. (1989, 1990) showed that Tat is able to specifically recognize TAR RNA and mapped its recognition site to a U-rich bulge near the apex of the TAR RNA stem (Fig. 1B). Detailed analysis of Tat’s interactions with TAR RNA by NMR subsequently revealed that Tat recognition of TAR requires conformational changes in the RNA structure (Fig. 1B). This refolding process involves displacement of the first residue in the bulge (U23) by one of the arginine side chains present in the basic binding domain of the Tat protein, which creates a binding pocket for the arginine side chain in the major groove together with the adjacent G26:C39 base pair (Puglisi et al. 1992; Aboul-ela et al. 1995; Brodsky and Williamson 1997; Davidson et al. 2009).
P-TEFb Is the Essential Cofactor for Tat
Although there is a strict correlation between the ability of TAR RNA to bind to Tat in vitro and the ability of these sequences to support transactivation (Churcher et al. 1993), mutations in the apical loop of the TAR element that do not interfere with Tat binding nevertheless interfere with transactivation (Feng and Holland 1988). To explain this apparent discrepancy, Dingwall et al. (1990) postulated that a cellular cofactor interacts with the TAR RNA loop. This hypothesis raised further questions, such as what is the role and function of the “loop factor” and what is the mechanism by which Tat stimulates gene expression after binding to TAR RNA.
The first direct evidence that Tat might be regulating HIV-1 transcriptional elongation, rather than transcriptional initiation, came from RNase protection experiments performed by Kao et al. (1987). They showed that in the absence of Tat, the majority of RNA polymerases initiating transcription stall near the promoter, whereas in the presence of Tat, there is a dramatic increase in the density of RNA polymerases found downstream from the promoter.
Rice and his colleagues (Herrmann and Rice 1995; Herrmann et al. 1996) showed that a protein kinase complex, which they called TAK (Tat-associated kinase), binds tightly and specifically to Tat. Subsequently, Zhu et al. (1997) cloned the kinase subunit of TAK. This turned out to be the CDK9 kinase, which is a component of a ubiquitous positive-acting elongation factor, P-TEFb (Marshall and Price 1995; Marshall et al. 1996). In parallel, the search for the “loop factor” and other cofactors for Tat also pointed to P-TEFb as a critical cofactor for Tat activation of elongation. Wei et al. (1998) discovered that P-TEFb contains a cyclin component, CycT1, which can form a stable complex with CDK9, Tat, and TAR RNA (see online Movie 1 at www.perspectivesinmedicine.org). Crucially, for a putative “loop factor,” complex formation between Tat, P-TEFb, and TAR requires both the Tat binding site and the loop sequence.
Movie 1.
HIV Tat (violet) bound to Cyclin T1 (green) and CDK9 (yellow). Coordinate file 3MI9.
After these seminal biochemical observations, additional genetic and biochemical evidence showed unequivocally that P-TEFb is required for Tat-mediated transactivation. First, a set of novel CDK9 protein kinase inhibitors were shown to be selective inhibitors of HIV-1 transcription (Mancebo et al. 1997). Second, persuasive genetic evidence showed that CycT1 is essential for Tat activity. Tat is inactive in murine cells because of a single amino acid difference: murine CycT1 carries a tyrosine at position 261, where human CycT1 has a cysteine. Introduction of Y261 into the human CycT1 blocked HIV-1 transactivation in transfected cells, whereas introduction of C261 into the murine CycT1 restored Tat-mediated transactivation (Bieniasz et al. 1998; Fujinaga et al. 1998; Garber et al. 1998; Kwak et al. 1999).
Finally, the crystal structure of a Tat:P-TEFb complex was determined in 2010, the culmination of more than two decades of research on P-TEFb by David Price and his colleagues (Tahirov et al. 2010). The structure shows that Tat forms extensive contacts both with the CycT1 subunit of P-TEFb and with the T-loop of the CDK9 subunit (Fig. 1C).
Transactivation Mechanism
The binding of Tat to P-TEFb induces significant conformational changes in CDK9 that constitutively activate the enzyme (Wei et al. 1998; Isel and Karn 1999; Tahirov et al. 2010). As described in Figure 2A, the transactivation mechanism involves a complex set of phosphorylation events mediated by the Tat-activated P-TEFb that modify both positive and negative cellular elongation factors.
Transactivation mechanism. (A) NF-κB and Tat-activated transcription. Initiation is strongly induced by NF-κB, which acts primarily to remove chromatin restrictions near the promoter through recruitment of histone acetyltransferases. After the transcription through the TAR element, both NELF and the Tat/P-TEFb complex (including CDK9 and CycT1 and the accessory elongation factors including ELL2) are recruited to the elongation complex via binding interactions with TAR RNA. This activates the CDK9 kinase and leads to hyperphosphorylation of the CTD of RNA polymerase II, Spt5, and NELF-E. The phosphorylation of NELF-E leads to its release. The presence of hyperphosphorylated RNAP II and Spt5 allows enhanced transcription of the full HIV-1 genome. (B) Control of P-TEFb by 7SK and Tat. The majority of the P-TEFb in cells is found in a transcriptionally inactive snRNP complex containing 7SK RNA, HEXIM, and the RNA binding proteins MePCE and LARP7. Tat disrupts this complex by displacing HEXIM and forming a stable complex with P-TEFb. Prior to recruitment to the transcription complex, a larger complex is formed between P-TEFb and transcription elongation factors from the mixed lineage leukemia (MLL) family, including ELL2. (Figure is adapted from Karn 2011; reprinted, with permission, from Wolters Kluwer Health © 2011.)
In the absence of Tat, HIV-1 transcription elongation is highly restricted by the negative elongation factor NELF (Yamaguchi et al. 1999; Narita et al. 2003; Zhang et al. 2007). Phosphorylation of NELF-E by P-TEFb forces dissociation of NELF from TAR and releases paused transcription elongation complexes (Fujinaga et al. 2004). Significantly, the NELF-E subunit is able to bind directly to TAR RNA (Yamaguchi et al. 2002; Fujinaga et al. 2004) suggesting that NELF might be recruited to the HIV-1 provirus via its interactions with TAR.
Cell-free transcription studies have shown that Tat:P-TEFb also phosphorylates the RNAP II CTD during elongation (Isel and Karn 1999; Kim et al. 2002). This reaction creates a hyperphosphorylated form of the RNA polymerase that is highly enriched for phosphorylated Ser2 residues in the CTD (Ramanathan et al. 2001; Kim et al. 2002). In addition to targeting RNAP II, P-TEFb is also able to extensively phosphorylate Spt5, a subunit of the DRB sensitivity-inducing factor (DSIF), which carries a CTD homologous to the RNAP II CTD (Ivanov et al. 2000; Bourgeois et al. 2002). Although the unmodified DSIF inhibits elongation (Yamaguchi et al. 2002), phosphorylation of Spt5 separates it from the rest of the complex and converts it into a positive elongation factor that stabilizes transcription complexes at terminator sequences (Bourgeois et al. 2002; Yamada et al. 2006). Thus, Tat and P-TEFb are able to stimulate HIV-1 transcription both through the removal of blocks to elongation imposed by NELF and DSIF and by the enhancement of RNAP II processivity through the phosphorylation of Spt5 and the RNAP II CTD.
Our picture of how Tat and P-TEFb stimulate HIV-1 elongation has recently been refined by two proteomic studies that identified large protein complexes containing Tat, P-TEFb, and the human transcription factors/coactivators AFF4, ENL, AF9, and ELL2 (Fig. 2B) (He et al. 2010; Sobhian et al. 2010). One of these coactivators, the elongation factor ELL2, which was previously shown to enhance transcription elongation by preventing RNAP II backtracking, is critical both for basal HIV-1 transcription and for Tat-mediated transactivation. Thus, any model for the stimulatory effects of P-TEFb on HIV-1 transcription now has to take into account the role of ELL2 and possibly several additional elongation factors.
Regulation of P-TEFb
In actively replicating cells, such as HeLa cells and Jurkat T-cells, P-TEFb activity is tightly regulated and the majority of the enzyme is sequestered into a large inactive 7SK RNP complex comprising 7SK RNA and a series of RNA-binding proteins (Fig. 2B) (Nguyen et al. 2001; Yang et al. 2001). Essential components of the 7SK RNP complex include HEXIM1 or HEXIM2, which inhibit the CDK9 kinase in a 7SK-dependent manner (Yik et al. 2003; Michels et al. 2004), and the 7SK RNA binding proteins LARP7 (He et al. 2008; Krueger et al. 2008) and BCDIN3, also known as MePCE (Jeronimo et al. 2007). The sequestration of P-TEFb in the 7SK RNP complex effectively prevents any basal transcriptional activation by Tat-independent recruitment of P-TEFb to the provirus. Tat overcomes this barrier by disrupting the 7SK RNP complex by competing with HEXIM for CycT1 binding (Barboric et al. 2007; Sedore et al. 2007; Krueger et al. 2010). A recent study suggests that cyclin T1 acetylation also triggers dissociation of HEXIM1 and 7SK RNA from the inactive 7SK snRNP complex and activates the transcriptional activity of P-TEFb (Cho et al. 2009).
In contrast to Jurkat T-cells, both primary resting central memory T-cells (Ramakrishnan et al. 2009) and primary monocytes (Sung and Rice 2009) show highly restricted levels of CycT1. Activation of P-TEFb in these cells therefore requires multiple steps involving both the initial assembly of the 7SK RNP complex and its relocalization to nuclear speckles, where it becomes accessible to Tat and the rest of the transcription machinery.
The LTR as a Promoter
The HIV-1 LTR includes multiple upstream DNA regulatory elements that serve as binding sites for cellular transcription initiation factors (Rittner et al. 1995). The core promoter is a powerful and highly optimized promoter comprised of three tandem SP1 binding sites (Jones et al. 1986), an efficient TATA element (Garcia et al. 1989), and a highly active initiator sequence (Zenzie-Gregory et al. 1993). Each of these elements participates in the cooperative binding of the initiation factor TFIID and its associated TAF cofactors to the TATA element (Rittner et al. 1995). As a result, the HIV-1 LTR is an extremely efficient promoter that is capable of supporting even higher levels of transcription than the adenovirus major late promoter or the CMV immediate early promoter.
In addition to the core promoter, HIV-1 relies on an “enhancer region” that contains two NF-κB binding motifs (Nabel and Baltimore 1987) (see online Movie 2 at www.perspectivesinmedicine.org). Members of both the NF-κB family (Liu et al. 1992) and NFAT (Kinoshita et al. 1998) can bind to the HIV-1 NF-κB motifs. Because their recognition sequences overlap, binding of these factors is mutually exclusive (Chen-Park et al. 2002; Giffin et al. 2003). Binding of NF-κB is more efficient than NFAT because it is enhanced by cooperative interactions with Sp1 (Perkins et al. 1993). Although mutation of the NF-κB sites results in only a modest inhibition of virus growth in most transformed cell lines (Chen et al. 1997), signaling through the viral enhancer is essential to reactivate latent proviruses and support virus replication in primary T-cells, regardless of whether it is stimulated by NF-κB or by NFAT (Alcami et al. 1995; Bosque and Planelles 2008).
Epigenetic Regulation of HIV-1 Transcription
When HIV-1 infects cells, it preferentially integrates into active transcription units that provide a favorable environment for viral transcription (Lewinski et al. 2006). As originally shown by Verdin et al. (1993), proviruses assemble an ordered nucleosomal structure surrounding the promoter. These nucleosomal structures play a crucial role in establishing HIV-1 latency because epigenetic modifications of the provirus restrict transcription initiation (see Siliciano and Greene 2011). Typically, transcription from latent proviruses is restricted by high levels of histone deacetylases (HDACs), deacetylated histones, methylated histones, and DNA methylation (for reviews, see Margolis 2010; Karn 2011).
Control of HIV-1 Replication by Transcriptional Feedback
Because Tat functions as part of a positive regulatory circuit, conditions that restrict transcription initiation will in turn cause a reduction in Tat levels to below threshold levels and therefore result in dramatically reduced HIV-1 transcription and eventually entry into latency (for reviews, see Karn 2011; Siliciano and Greene 2011). Insightful studies by Weinberger et al. (Weinberger et al. 2005; Weinberger and Shenk 2006) and Burnett et al. (2009) have emphasized how stochastic fluctuations in Tat gene expression can act as a molecular switch. Small changes in initiation rates, which can be experimentally mimicked by introducing mutations into the NF-κB and Sp1 binding sites, are able to reduce Tat availability and disproportionately limit HIV-1 transcription, forcing viruses into latency (Burnett et al. 2009). However, the virus remains poised to resume its replication in response to triggers that stimulate transcription initiation and restore Tat levels. This switching mechanism crucially depends on the autoregulation of Tat; when Tat is expressed in trans from an ectopic promoter, HIV-1 proviruses become constitutively active and are unable to enter latency (Pearson et al. 2008).
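The switch-like behavior described above can be illustrated with a deliberately simple deterministic model. The sketch below is a toy, not a model taken from the cited studies: the rate law, the cooperative (Hill-type) feedback term, and every parameter value (`basal_rate`, `v_max`, `k_half`, `hill`, `decay`) are hypothetical choices made only to show how positive feedback can hold a circuit in either a low or a high state. It also ignores the stochastic fluctuations that Weinberger et al. and Burnett et al. emphasize.

```python
def simulate_tat(basal_rate, tat_start=0.0, v_max=10.0, k_half=4.0, hill=2,
                 decay=1.0, dt=0.01, t_end=200.0):
    """Euler integration of dT/dt = basal + v_max*T^n/(K^n + T^n) - decay*T.

    All parameters are illustrative assumptions in arbitrary units:
    basal_rate stands in for Tat-independent initiation (e.g. NF-kB driven),
    the Hill term stands in for Tat-dependent positive feedback.
    """
    tat = tat_start
    for _ in range(int(t_end / dt)):
        feedback = v_max * tat**hill / (k_half**hill + tat**hill)
        tat += dt * (basal_rate + feedback - decay * tat)
    return tat


# Weak initiation: Tat never crosses the feedback threshold (latent-like state).
latent = simulate_tat(basal_rate=0.1)

# Stronger initiation: feedback engages and Tat settles at a high level.
active = simulate_tat(basal_rate=1.0)

# Feedback latches: starting from the high state, Tat stays high in this toy
# model even after initiation drops back to the weak rate.
latched = simulate_tat(basal_rate=0.1, tat_start=active)

print(f"latent={latent:.2f} active={active:.2f} latched={latched:.2f}")
```

In this toy system a tenfold change in the basal rate moves the steady-state Tat level by far more than tenfold, which is the qualitative point made in the text: small changes in initiation can disproportionately change Tat-dependent output.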
CONTROL OF HIV-1 RNA SPLICING, POLYADENYLATION, EXPORT, AND TRANSLATION
Alternative Splicing of HIV-1 mRNA
To produce the full range of mRNAs needed to encode the viral proteins, HIV-1 primary transcripts undergo extensive and complex alternative splicing in the nucleus of infected cells (Fig. 3). Most HIV-1 strains use four different splice donor or 5′ splice sites (5′ss) and eight different acceptor or 3′ splice sites (3′ss) to produce more than 40 different spliced mRNA species in infected cells. These include several incompletely spliced bicistronic mRNA species, which encode both Env and Vpu; incompletely spliced mRNAs for Vif, Vpr, and a truncated 72 aa form of Tat; and completely spliced mRNAs that encode the HIV-1 regulatory proteins Tat, Rev, and Nef. In each of the spliced mRNAs, the 5′ss D1 (sometimes referred to as the “major splice donor”) is spliced to one of the 3′ss. As a result, all HIV-1 mRNAs include the highly structured noncoding exon 1 that extends from the 5′ cap to 5′ss D1.
Locations of splice sites, exons, and splicing elements in the HIV-1 genome. (Top) Schematic diagram of HIV-1 genome. The dark blue rectangles indicate open reading frames and are labeled with the gene names. The LTRs are shown at each edge of the genome: U3-gray, R-black, U5-light blue. Full-length RNA transcripts begin at the 5′-end of the R region of the 5′-LTR (left) and 3′ processing and poly(A) addition takes place at the 3′-end of the R region in the 3′-LTR (right). (Middle) Locations of 5′ss (red bars) and 3′ss (black bars) in the HIV-1 genome. The location of the RRE is shown by the red rectangle. The exons present in the incompletely spliced ∼4-kb and ∼1.8-kb mRNA species corresponding to the HIV-1 genes are shown as cyan rectangles. Noncoding exon 1 is present in all spliced HIV-1 mRNA species. Either both or one of the small noncoding exons 2 and 3 shown are included in a fraction of the mRNA species. The exon compositions of the RNA species are also shown. RNA species designated by an “I” are incompletely spliced mRNA species. Brackets indicate mRNA isoforms containing neither exon 2 nor 3, only exon 2 or 3, or both exons 2 and 3. The locations of the AUG codons used to initiate protein synthesis are shown as purple bars within the exons. (Bottom) Locations of the known splicing regulatory elements in HIV-1. Splicing enhancers are designated by green bars and splicing silencers are designated by red bars. (Figure is adapted from Stoltzfus 2009; reproduced, with permission, from Elsevier © 2009.)
Adding to the complexity of the mRNA species present in infected cells, a few viral mRNA isoforms are also produced by inclusion of exons flanked by 3′ss A1 and 5′ss D2 (exon 2) and/or the exon flanked by 3′ss A2 and 5′ss D3 (exon 3). Exons 2 and 3 do not contain initiator AUG codons and therefore are noncoding.
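The optional inclusion of noncoding exons 2 and 3 can be enumerated explicitly. The short sketch below only lists the possible leader arrangements (exon 1 alone, with exon 2, with exon 3, or with both); the function name `leader_variants` is a hypothetical helper, and the code makes no claim about which arrangements actually occur for any particular mRNA or how abundant they are.

```python
from itertools import combinations

# Exon 1 is present in every spliced HIV-1 mRNA; exons 2 and 3 are optional.
OPTIONAL_NONCODING_EXONS = ("2", "3")


def leader_variants():
    """Return every combination of exon 1 with the optional noncoding exons."""
    variants = []
    for count in range(len(OPTIONAL_NONCODING_EXONS) + 1):
        for included in combinations(OPTIONAL_NONCODING_EXONS, count):
            variants.append(("1",) + included)
    return variants


for variant in leader_variants():
    print("exons " + "-".join(variant))
```

Each of these leader arrangements can in principle precede a given coding exon, which is one reason the number of distinct spliced species is much larger than the number of viral genes.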
RNA splicing is performed while the pre-mRNA is associated with a large complex of cellular factors referred to as the spliceosome (for recent reviews, see Wang and Burge 2008; Chen and Manley 2009). The efficiency of early splicing complex formation is determined by the intrinsic strengths of the 3′ss and downstream 5′ss, and further regulated by a number of cis-acting elements (Fig. 3). Control of splicing in HIV-1 involves exonic splicing enhancers (ESEs) and intronic splicing enhancers (ISEs), which facilitate splice site recognition and are selectively bound by members of the SR (Ser-Arg) protein family. In addition, there are intronic and exonic splicing silencers (ISSs and ESSs, respectively), which repress splicing and are typically bound by specific members of the heterogeneous nuclear ribonucleoprotein family (hnRNPs).
Analysis of HIV-1 mRNA species in virus-infected cells showed that there are striking differences in the relative abundances of the different viral mRNA species (Purcell and Martin 1993). In general, HIV-1 3′ splice sites are relatively inefficient in comparison to constitutive cellular 3′ splice sites (for reviews of HIV-1 splicing, see Stoltzfus and Madsen 2006; Stoltzfus 2009). However, the order of intrinsic splice site strengths (Asang et al. 2008) does not correlate with the observed levels of mRNAs spliced at these 3′ splice sites, implying that the cis-acting splicing elements dominate the splice site selection of HIV-1 mRNAs. For example, the first tat coding exon contains two ESS elements (ESS2 and ESS2p) that specifically repress splicing at 3′ss A3 and reduce the levels of both incompletely and completely spliced tat mRNA (Jacquenet et al. 2001; Amendt et al. 1994). Similarly, splicing at 3′ss A2 is repressed by an ESS element within exon 3 (ESSV), which results in relatively low levels of vpr mRNA (Bilodeau et al. 2001).
By contrast, splicing at the weak 3′ss A4c, A4a, A4b, and A5 sites is greatly facilitated by a guanosine-adenosine-rich ESE (GAR) within exon 5. GAR ESE activity is able to raise the levels of incompletely spliced env/vpu mRNAs and completely spliced nef and rev mRNAs to the point that they become the most abundant spliced mRNA species in the HIV-1-infected cell (Purcell and Martin 1993). The GAR ESE is selectively bound by several SR proteins but the most important player in the function of the element is SF2/ASF (Caputi et al. 2004). Splicing at the relatively weak 3′ss A1, which is required for high vif mRNA expression and inclusion of the noncoding exon 2, is facilitated by several different ESEs (ESE-Vif, ESE M1, and ESE M2) within exon 2 (Kammler et al. 2006; Exline et al. 2008). Mutations of the hnRNP A/B-dependent ESS, ESSV, activated 3′ss A2 and resulted in increased levels of mRNAs containing exon 3 and the incompletely spliced vpr mRNA. This excessive splicing phenotype resulted in a decrease in virus replication (Madsen and Stoltzfus 2005).
Formation of HIV-1 splicing complexes is also affected by the strengths of the various downstream 5′ss, an expected consequence of exon definition (Robberson et al. 1990; Hoffman and Grabowski 1992). Mutations of the nonconsensus D2 site that have enhanced affinity for U1 snRNP result in an excessive splicing phenotype characterized by increased inclusion of exon 2, increased levels of vif mRNA, reduced levels of unspliced viral RNA, and reduced virus production. Conversely, mutations of 5′ss D2 that decrease affinity for U1 snRNP result in decreased inclusion of exon 2 and decreased levels of vif mRNA and Vif protein (Madsen and Stoltzfus 2006; Exline et al. 2008; Mandal et al. 2008). Because of reduced levels of Vif, the replication of these virus mutants exhibits greater sensitivity than wild-type virus to inhibition by the cellular restriction factor APOBEC3G. The dramatic effects of these splicing element mutations show that maximum virus replication requires tight regulation of splicing to balance mRNA and genome RNA production.
HIV-1 Rev and the Control of RNA Export
Unspliced and incompletely spliced transcripts from cellular genes are typically degraded in the nucleus. To circumvent these surveillance mechanisms, HIV-1 and many other retroviruses, including the human T-cell leukemia viruses HTLV-1 and HTLV-2, express regulatory factors that facilitate the transport of intron-containing viral RNA out of the nucleus. The first of these factors to be discovered was the HIV-1 Rev protein, which interacts with a highly structured RNA element in the env gene referred to as the Rev-responsive element (RRE) (Sodroski et al. 1986; Malim et al. 1989). Several other retroviruses, as originally shown for Mason-Pfizer monkey virus (MPMV), dispense with a protein factor and simply encode cis elements, referred to as constitutive transport elements or CTEs, that directly interact with cellular RNA export factors (for reviews of HIV-1 and MPMV RNA export, see Pollard and Malim 1998; Cullen 2003).
Initial studies showed that the ∼9-kb and ∼4-kb HIV-1 mRNA species, which encode the structural proteins Gag, Pol, and Env, require Rev for their transport and expression. On the other hand, the completely spliced ∼1.8-kb mRNAs, which encode Tat, Rev, and Nef, are exported to the cytoplasm in the absence of Rev by an endogenous cellular pathway used by cellular mRNAs. This division of transport mechanisms is achieved because the region of the HIV-1 env gene between 5′ss D4 and 3′ss A7 that contains the RRE is removed in the completely spliced mRNAs (Figs. 3 and 4).
Early and late phases of HIV-1 mRNA expression. Full-length unspliced ∼9-kb, incompletely spliced ∼4-kb mRNA, and completely spliced ∼1.8-kb mRNAs are expressed at both early and late times. (A) In the absence of Rev or when Rev is below the threshold necessary for it to function, the ∼9-kb and ∼4-kb mRNAs are confined to the nucleus and either spliced or degraded. Completely spliced ∼1.8-kb mRNAs are constitutively exported to the cytoplasm and translated to yield Rev, Tat, and Nef. (B) When the levels of Rev (shown as a pink oval) in the nucleus exceed the threshold necessary for function, the ∼9-kb and ∼4-kb mRNAs are exported to the cytoplasm and translated. The Rev-response element (RRE) is shown as a red rectangle. (Figure adapted from Pollard and Malim 1998; reprinted, with permission, from Annual Review of Microbiology © 1998.)
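The export rule that separates the early and late phases can be written as a small decision function. This is a minimal sketch of the logic in the text, not a quantitative model: the `Transcript` class, the `is_exported` helper, and the numeric `rev_threshold` (arbitrary units) are hypothetical names and values introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transcript:
    name: str
    size_kb: float
    has_rre: bool  # the RRE is spliced out of the completely spliced class


TRANSCRIPTS = [
    Transcript("unspliced", 9.0, True),
    Transcript("incompletely spliced", 4.0, True),
    Transcript("completely spliced", 1.8, False),
]


def is_exported(transcript, rev_level, rev_threshold=1.0):
    """RRE-free mRNAs use the cellular pathway; RRE-containing mRNAs need Rev.

    rev_threshold is an assumed placeholder for the Rev level required
    for function; the source gives no numeric value.
    """
    if not transcript.has_rre:
        return True
    return rev_level >= rev_threshold


early = [t.name for t in TRANSCRIPTS if is_exported(t, rev_level=0.0)]
late = [t.name for t in TRANSCRIPTS if is_exported(t, rev_level=2.0)]
print("early:", early)
print("late:", late)
```

With Rev below the assumed threshold only the completely spliced class leaves the nucleus; once Rev exceeds it, all three size classes are exported.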
Rev-regulated transport requires Rev binding to the RRE. The RRE is an elongated stem-loop structure of 351 nt (Malim et al. 1989; Mann et al. 1994; Watts et al. 2009). Rev binds initially to a high affinity site located near the apex of the RRE structure (stem IIB) (Daly et al. 1989; Heaphy et al. 1990). NMR studies have shown that, as in the case of the Tat-TAR interaction, the Rev binding to the high affinity site induces a conformational change that results in the formation of two purine–purine non-Watson-Crick base pairs (Fig. 5A). This change in the structure of the RNA helix allows binding of the Rev ARD to the major groove (Battiste et al. 1996; Daugherty et al. 2008). Binding of Rev to the high affinity RRE site is then followed by binding of additional monomers to the complex (Malim and Cullen 1991; Zapp et al. 1991; Mann et al. 1994). The degree of oligomerization correlates with the ability of Rev to transport RNA (Fig. 5B,C) (Mann et al. 1994). Furthermore, oligomerization of Rev on the RRE is highly cooperative and results in an affinity approximately 500 times higher than Rev binding to the high affinity site alone (Daugherty et al. 2008).
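To give a sense of scale for the roughly 500-fold gain in affinity, the fold change can be converted into an apparent free energy difference using ΔΔG = −RT ln(fold). The calculation below is a back-of-the-envelope sketch that assumes the 500-fold figure can be treated as a ratio of equilibrium association constants and assumes a temperature of 298.15 K; neither assumption comes from the source, and the function name is hypothetical.

```python
import math

GAS_CONSTANT_KCAL = 1.987204e-3  # kcal / (mol * K)


def cooperative_free_energy(fold_increase, temperature_k=298.15):
    """Apparent free energy gain (kcal/mol) for a given fold increase in affinity.

    Assumes fold_increase is a ratio of equilibrium association constants.
    """
    return -GAS_CONSTANT_KCAL * temperature_k * math.log(fold_increase)


ddg = cooperative_free_energy(500.0)
print(f"apparent cooperative stabilization: {ddg:.2f} kcal/mol")
```

Under these assumptions the cooperative assembly contributes a few kcal/mol of additional stabilization, comparable to a small number of extra hydrogen bonds or salt bridges.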
Rev:RRE interactions and the Rev nuclear import/export cycle. (A) Rev binds to the RRE through its arginine-rich domain (ARD). In this model developed by Daugherty et al. (2010), the crystal structures of a Rev dimer are combined with the NMR structures of the Rev high affinity site. Notice the distortion of the RNA helix at the site of Rev binding. (B) Rev oligomerizes on the RRE and forms a complex with Crm1. The full-length RRE folds into an elongated RNA stem-loop structure with the high affinity binding site for Rev at the apex (Mann et al. 1994; Watts et al. 2009). (C) Model for the interactions between Rev and the nuclear export complex containing Crm1 through the Rev nuclear export sequence (NES). The NES is an extended unstructured region emerging from one face of the Rev molecule. The core arginine-rich RNA binding domains interact with the RRE (Daugherty et al. 2010). (D) The Rev nuclear export cycle. Rev and the nuclear export complex containing Crm1 interact with nuclear pore proteins and are exported through nuclear pores to the cytoplasm. Once in the cytoplasm, Ran-GTP is converted to Ran-GDP, which is mediated by RanGAP and RanBP1. Crm1 is then transported back into the nucleus and Rev is released from the RRE. Importin-β binds to Rev through the nuclear localization signal in the ARD and interacts with Ran-GDP to facilitate import through the nuclear pore into the nucleus. In the nucleus, Ran-GDP is converted to Ran-GTP in the presence of RCC1. This releases Rev, which can begin another cycle of RRE-dependent Rev export. (Figure adapted from Pollard and Malim 1998; reprinted, with permission, from Annual Review of Microbiology © 1998.)
Transport of HIV-1 RNA out of the nucleus, as with most cellular proteins and RNAs, occurs via the nuclear pore complexes (NPC) (Fig. 5D) (for review, see Kohler and Hurt 2007). Rev bound to the RRE interacts with the karyopherin family member Crm1 (also referred to as exportin 1) through an ∼10 amino acid leucine-rich nuclear export signal (NES) near the Rev carboxyl terminus. Crm1, like other members of the karyopherin family, binds to cargo in the presence of the GTP-bound form of Ran GTPase. After export to the cytoplasm through the NPC, the bound GTP is hydrolyzed to GDP, a reaction facilitated by the proteins RanGAP (Ran GTPase-activating protein) and RanBP1. This destabilizes the Rev complex and releases factors from the RRE (Fischer et al. 1995). Rev then reenters the nucleus by binding to the nuclear import factor, importin-β (Henderson and Percipalle 1997).
In the culmination of years of effort, recent crystallographic studies have led to the solution of both the amino-terminal structure of the Rev dimer (DiMattia et al. 2010) and an intact Rev dimer (Fig. 5A) (Daugherty et al. 2010). In the Rev dimer, the arginine-rich RNA-binding helices are located at the ends of a V-shaped assembly. This allows the dimer to bind adjacent RNA sites and structurally couples dimerization and RNA recognition. A second protein–protein interface permits Rev oligomers to act as an adaptor to the host export machinery, with viral RNA bound to one face and Crm1 to another. When excess Rev is present, a defined RNP complex of three dimers bound to the RRE is formed (Fig. 5C) (Daugherty et al. 2010).
Nuclear Retention of Unspliced and Incompletely Spliced mRNAs
Rev regulation requires accumulation of a pool of unspliced and incompletely spliced mRNAs in the nucleus. Nuclear retention is achieved both because cellular factors bind to unused HIV-1 3′ss and 5′ss (Chang and Sharp 1989; Borg et al. 1997) and because of cis-acting repressive sequences (CRSs) or instability sequences (INSs). These elements can be introduced into heterologous expression constructs and confer Rev regulation on RNAs produced from these constructs (Schwartz et al. 1992; Najera et al. 1999). A number of different cellular RNA-binding proteins have been implicated in the retention mediated by CRS/INS elements, including poly(A)-binding protein 1 (PABP1), heterogeneous nuclear ribonucleoprotein A1 (hnRNP A1), and the heterodimer of two related proteins, polypyrimidine tract binding protein-associated splicing factor (PSF) and p54(nrb) (Black et al. 1996; Afonina et al. 1997; Najera et al. 1999; Zolotukhin et al. 2003).
Guiding HIV-1 Transcripts through the Cytoplasm
In contrast to cellular mRNAs, the cytoplasmic fate of unspliced HIV-1 RNA appears to be strongly influenced by the choice of RNA export pathway. For example, HIV-1 Gag assembly in murine cells is normally very inefficient. However, changing the RNA nuclear export element used by HIV-1 gag-pol mRNA from the Rev response element to the constitutive transport element (CTE) restored both the trafficking of Gag to cellular membranes and efficient HIV-1 assembly (Swanson et al. 2004). Similarly, HIV-1 assembly was defective in human cells when export of gag-pol mRNA was made dependent on the hepatitis B posttranscriptional element (PRE) (Jin et al. 2009).
3′ Processing and Polyadenylation of HIV-1 RNA
The 3′ processing and polyadenylation of metazoan pre-mRNAs involve recognition of the upstream AAUAAA and downstream GU-rich motifs surrounding the cleavage and poly(A) addition site. The AAUAAA signal is recognized by the cleavage/polyadenylation specificity factor (CPSF), and the GU-rich motif is recognized by CstF. In addition, the cleavage reaction requires mammalian cleavage factors CF1m, CF2m, and poly(A) polymerase (for review, see Colgan and Manley 1997; Millevoi and Vagner 2010).
Like most retroviruses, HIV-1 contains a duplicated set of AAUAAA and GU-rich core elements at the ends of the R sequences found in both the 5′ and 3′ LTRs. HIV-1 uses multiple regulatory elements to direct processing to the 3′ LTR cleavage site. First, the HIV-1 U3 sequence, which is upstream of the 3′ processing signal but not associated with the 5′ processing signal, contains upstream enhancer elements (USEs) that facilitate binding of CPSF and enhance polyadenylation at the 3′ end of the HIV-1 transcripts (Gilmartin et al. 1995). Another USE near the 5′ end of the Nef gene binds the cellular SR protein 9G8, which recruits the 3′ processing factor CF1m and CPSF (Valente et al. 2009). Second, the 5′ and 3′ LTR poly(A) processing sites are embedded in a region of secondary structure called the poly(A) hairpin, located in exon 1 immediately downstream from the TAR hairpin structure. Factors binding to sequences upstream of the AAUAAA site are believed to open up the poly(A) hairpin and allow preferential use of the 3′ LTR poly(A) processing site (Das et al. 1999). Finally, the splicing factor U1 snRNP inhibits use of the 3′ processing and poly(A) site in the 5′ LTR by binding to the adjacent 5′ss D1. Mutations of 5′ss D1 that weaken binding of U1 snRNP allow usage of the normally silent 5′ LTR poly(A) site (Ashe et al. 1995, 2000).
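As a rough illustration of the core signals described above, the following Python sketch scans an RNA sequence for an AAUAAA hexamer followed by a nearby GU-rich run. The function name, the 40-nt downstream window, and the definition of "GU-rich" as at least five consecutive G/U residues containing both bases are assumptions made for illustration only; they are not taken from the studies cited here.

```python
import re

HEXAMER = "AAUAAA"  # upstream poly(A) signal named in the text


def find_polya_candidates(rna, max_gap=40, gu_len=5):
    """Return (hexamer_start, gu_run_start) index pairs (0-based).

    max_gap and gu_len are illustrative thresholds, not measured values.
    """
    rna = rna.upper().replace("T", "U")  # accept DNA-style input
    gu_run = re.compile(r"[GU]{%d,}" % gu_len)
    hits = []
    # A lookahead lets overlapping hexamers be reported as well.
    for match in re.finditer("(?=%s)" % HEXAMER, rna):
        start = match.start()
        window_start = start + len(HEXAMER)
        downstream = rna[window_start:window_start + max_gap]
        for run in gu_run.finditer(downstream):
            # Require both G and U so a plain U tract does not count.
            if "G" in run.group() and "U" in run.group():
                hits.append((start, window_start + run.start()))
                break
    return hits


if __name__ == "__main__":
    toy = "CCAAUAAAGCCACCACCACCGUGUGUUCC"  # made-up sequence
    print(find_polya_candidates(toy))
```

Because the core elements are duplicated in both LTRs, a sequence-only scan of this kind would flag both sites. It cannot capture the USE, poly(A) hairpin, or U1 snRNP effects that bias processing toward the 3′ LTR.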
HIV-1 Translation Initiation
Initiation of translation of eukaryotic mRNAs involves scanning from the 5′ cap until an initiator AUG in an appropriate Kozak consensus sequence is recognized. Because the HIV-1 exon 1 contains multiple highly structured regions including the TAR sequence, the primer binding site, the poly(A) hairpin, and RNA packaging sequences, a typical ribosomal scanning mechanism for translational initiation is precluded. Furthermore, some of the HIV-1 mRNA UTRs contain AUG sequences upstream of the authentic initiator AUG that can interfere with translation initiation at the authentic AUG. Finally, as shown in Figure 1, all HIV-1 env mRNA species are bicistronic and have an upstream vpu open reading frame overlapping the downstream env open reading frame.
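The scanning problem posed by upstream AUGs can be made concrete with a small Python sketch that lists every AUG in a leader sequence and grades its sequence context. The grading rule used here (a purine at position −3 and a G at position +4 relative to the A of the AUG) is a common simplification of the Kozak consensus; the function name, the three labels, and the toy sequence are assumptions for illustration, and RNA structure, which the text identifies as a major obstacle in HIV-1, is ignored entirely.

```python
def classify_augs(mrna):
    """Return (position, context) for each AUG, in 5'-to-3' order.

    Context is 'strong' when both the -3 purine and the +4 G are present,
    'adequate' when only one is present, and 'weak' when neither is.
    """
    mrna = mrna.upper().replace("T", "U")  # accept DNA-style input
    labels = ("weak", "adequate", "strong")
    results = []
    pos = mrna.find("AUG")
    while pos != -1:
        minus3 = mrna[pos - 3] if pos >= 3 else ""
        plus4 = mrna[pos + 3] if pos + 3 < len(mrna) else ""
        score = int(minus3 in ("A", "G")) + int(plus4 == "G")
        results.append((pos, labels[score]))
        pos = mrna.find("AUG", pos + 1)
    return results


if __name__ == "__main__":
    leader = "CCUUUAUGCCCGCCACCAUGGCG"  # made-up leader with one upstream AUG
    for position, context in classify_augs(leader):
        print(position, context)
```

In a simple scanning model, an upstream AUG in a weak context would be bypassed more often than one in a strong context, which is one way of thinking about how much a given upstream AUG might interfere with initiation at the authentic start codon.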
Several mechanisms have been proposed to circumvent these obstacles (for a recent review, see Bolinger and Boris-Lawrie 2009). HIV-1 may include an internal ribosome entry site (IRES), similar to those found in picornaviruses, that permits recognition of the gag initiation codon. Additionally, HIV-1 and other retroviruses contain posttranscriptional elements (PCEs) that bind to cellular RNA binding proteins that can act as enhancers to facilitate translation initiation. Gag translation can be enhanced by RHA, a DEIH helicase (Bolinger et al. 2010), as well as by the RNA binding proteins SRp40 and SRp55 (Swanson et al. 2010). An additional mechanism, which is used to bypass the vpu open reading frame and permit efficient translation at the downstream env AUG, involves 5′ cap-dependent ribosome shunting, in which the scanning ribosome jumps over large regions of the mRNA before recognizing the correct initiation codon (Krummheuer et al. 2007).
HIV-1 Frameshifting
In common with many other retroviruses, HIV-1 has evolved a novel mechanism of programmed frameshifting in which specific sequence and structural signals in the mRNA can specify an mRNA reading frame change during translation (for review, see Brierley and Dos Ramos 2006; Bolinger and Boris-Lawrie 2009). In HIV-1, a −1 shift in the translational reading frame is required to shift from the Gag reading frame to the pro and pol reading frame. This frameshift occurs ∼5% of the time and results in the production of about one Gag-Pro-Pol precursor for every 20 Gag precursors synthesized. Two essential cis-acting sequence elements located ∼200 nt upstream of the Gag termination codon are required for frameshifting. The first is a heptanucleotide “slippery” sequence (UUUUUUA), which is the actual site of slippage during translation. The second is a stem-loop pseudoknot structure located just 3′ to the heptanucleotide sequence that acts as a pause site that increases the time the ribosome is associated with the slippery sequence. Pausing alone appears to be insufficient for frameshifting because the addition of other roadblocks to translation is unable to support frameshifting (Brierley and Dos Ramos 2006).
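The effect of a −1 shift at the slippery sequence can be sketched in a few lines of Python. The sketch assumes that the ribosome decodes both zero-frame codons within the heptanucleotide and then resumes one nucleotide upstream, so that the final A of UUUUUUA is read a second time. The function names and the short test fragment are illustrative assumptions, and the downstream pausing structure is not modeled.

```python
SLIPPERY = "UUUUUUA"  # heptanucleotide slippery sequence named in the text


def split_codons(seq):
    """Split seq into complete triplets, dropping any trailing 1-2 nt."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def frameshift_codons(mrna, slippery=SLIPPERY):
    """Return (zero_frame_codons, minus_one_shifted_codons).

    mrna must start on a zero-frame codon boundary. The slippery site is
    accepted only in the X XXY YYZ register, i.e. when its first
    nucleotide is the last position of a zero-frame codon.
    """
    mrna = mrna.upper().replace("T", "U")
    site = mrna.find(slippery)
    while site != -1 and site % 3 != 2:
        site = mrna.find(slippery, site + 1)
    if site == -1:
        raise ValueError("no slippery site in the expected register")
    end = site + len(slippery)  # falls on a zero-frame codon boundary
    zero_frame = split_codons(mrna)
    # After the slip, reading resumes at the last nucleotide of the site.
    shifted = split_codons(mrna[:end]) + split_codons(mrna[end - 1:])
    return zero_frame, shifted


def unshifted_per_shifted(efficiency):
    """Ratio of unshifted to shifted products for a given slip frequency."""
    return (1.0 - efficiency) / efficiency


if __name__ == "__main__":
    fragment = "AAUUUUUUAGGGAAGAUCUGG"  # short illustrative fragment
    zero, shifted = frameshift_codons(fragment)
    print(" ".join(zero))
    print(" ".join(shifted))
```

With a slip frequency of 0.05, `unshifted_per_shifted` gives a ratio of about 19 to 1, in line with the figure of roughly one Gag-Pro-Pol precursor for every 20 Gag precursors.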
CONCLUSIONS
As summarized above, gene expression in HIV-1 is controlled by the RNA-binding proteins Tat and Rev, which orchestrate complex interactions with the cellular transcription, RNA splicing, and RNA transport machinery. The mechanisms of action of both proteins, which were unprecedented at the time of their discovery, have now illuminated features of transcription elongation control, RNA splicing, and RNA export that were entirely unknown and unexpected.
Although the field has matured and expanded dramatically since the original discovery of Tat and Rev in 1985, an enormous amount still remains to be discovered about their structures and functions. On the structural side, complexes between TAR RNA and the intact P-TEFb molecule and large complexes containing P-TEFb and the RNA polymerase transcription elongation complex have yet to be tackled. The recent discovery of a whole family of additional elongation factors associated with Tat certainly removes any complacency that all the cofactors required for Tat-mediated transactivation have been identified.
A particularly challenging problem is to understand the coupling that occurs among transcriptional elongation, the regulation of splicing, polyadenylation, and RNA export. Evidence that the splicing-associated c-Ski-interacting protein, SKIP, activates both Tat transactivation and HIV-1 splicing provides an intriguing insight into how these diverse events may be coordinated (Bres et al. 2005).
Understanding how the viral mRNP is remodeled during its journey from the nucleus to the cytoplasm will be essential for understanding both HIV-1 RNA translation and RNA packaging into virions. Remarkably, the export pathway used by the RNA affects viral assembly at the plasma membrane (Jin et al. 2009; Sherer et al. 2009), suggesting that the site of translation influences subsequent protein function. In addition, a growing number of host cell nuclear RNA-binding proteins have been identified that bind to HIV-1 RNA or to Rev and may influence its export and cytoplasmic utilization (for reviews, see Cochrane et al. 2006; Cochrane 2009; Suhasini and Reddy 2009).
Finally, in addition to their academic interest, the HIV-1 regulatory proteins remain important targets for drug discovery, especially because Tat and Rev are both required for active viral replication and essential for the emergence of viruses from latency. Now that the structures of both proteins are known, and many of their cofactors have been identified, there is cause for optimism that renewed efforts to develop antiviral compounds will be successful.
Footnotes
Editors: Frederic D. Bushman, Gary J. Nabel, and Ronald Swanstrom
Additional Perspectives on HIV available at www.perspectivesinmedicine.org
REFERENCES
*Reference is also in this collection.





