J Virol. Aug 2006; 80(15): 7765–7768.
PMCID: PMC1563729

Integration of Human Immunodeficiency Virus Type 1 in Untreated Infection Occurs Preferentially within Genes


Previous analyses of human immunodeficiency virus type 1 (HIV-1) integration sites generated in infections in vitro or in patients in whom viral replication was repressed by antiviral therapy have demonstrated a preference for integration within protein-coding genes. We analyzed integration sites in peripheral blood mononuclear cells (PBMCs), spleen, lymph node, and cerebral cortex from patients with untreated HIV-1 infections. The great majority of integration sites in each tissue were within genes. Statistical analyses of the frequencies of integration in genes in PBMCs and lymph tissue demonstrated a strong preference for integration within genes. Although the sample size for brain tissue was too small to demonstrate a clear statistical preference for integration in genes, four of the five integration sites identified in brain were within genes. Taken together, our data indicate that HIV-1 preferentially integrates within genes during untreated infection.

A defining step in the replication cycle of retroviruses is integration of the cDNA copy of the viral genome into the host cell chromosome. Several studies have used the recently completed sequence of the human genome to identify and characterize retroviral integration sites in infections carried out in vitro. These studies have demonstrated a preference for retroviral integration into actively transcribed regions of the genome (reviewed in references 3, 6, and 7). Integration by murine leukemia virus (MLV) occurs preferentially near transcriptional start sites, whereas that of human immunodeficiency virus type 1 (HIV-1) and simian immunodeficiency virus occurs preferentially anywhere within transcription units (9, 12, 14, 17, 20). Integration by avian sarcoma-leukosis virus (ASLV) shows a weak preference for integration into actively transcribed genes, with no preference for transcriptional start sites (14, 15). Analyses of nucleotide sequences surrounding retroviral integration sites revealed symmetry in base preferences for integration by HIV-1 and ASLV, but not MLV, indicating a distinct mechanism in the recognition of host cell DNA by different viral preintegration complexes (PICs) (10, 12). The implication of these studies is that host cell factors that associate with actively transcribed genes may modify chromatin structure or interact with PICs and facilitate integration nearby, and the precise molecular mechanisms of integration are likely to differ between retroviruses. In the case of HIV-1, an association of the viral integrase protein with the transcription factor LEDGF/p75 appears to be capable of directing integration into actively transcribed genes (5).

A potential limitation of the studies summarized above is that the integration sites examined were from in vitro infections. In the case of HIV-1, this is a significant issue, as the physiological state of CD4+ T lymphocytes that are productively infected can differ markedly between in vivo and in vitro conditions. Productive infection in vitro requires T-cell activation, and a number of impediments to the infectious cycle in resting CD4+ T cells, including blocks to reverse transcription and integration, have been documented (18, 21). In contrast, immunohistochemical and in situ hybridization analyses have revealed that considerable amounts of HIV-1 replication occur in vivo in CD4+ T lymphocytes that lack activation markers and therefore appear to be in a resting state (13). Thus, it is possible that the chromatin environment for HIV-1 PICs in vivo may differ from that found in vitro, and this may affect integration site selection.

A recent study examined HIV-1 integration in vivo in infected individuals in whom viremia was suppressed by highly active antiretroviral therapy (HAART) (8). Resting CD4+ T cells harboring transcriptionally silent proviruses were isolated from these patients, and identification of integrants revealed a strong preference for integration within transcription units of protein-coding genes. However, these integration sites were identified in individuals on HAART, and it is not clear how the pattern of integration under conditions of repressed viral replication relates to that generated during untreated infection.

To extend the analysis of HIV-1 integration sites generated in vivo, we examined integrants in tissues from individuals with untreated infections. We examined peripheral blood mononuclear cells (PBMCs) from a set of six individuals prior to antiviral therapy. Additionally, we examined the spleens, lymph nodes, and cerebral cortices of the brains of two deceased patients with no history of antiviral therapy.

PBMCs were obtained from HIV-1-infected patients prior to the initiation of HAART. Informed consent was obtained from these individuals in accordance with Baylor College of Medicine and UT-Houston institutional review boards. Brain cortices, lymph nodes, and spleens from HIV-1-infected donors with no history of antiviral treatment were obtained from the National Disease Research Interchange (Philadelphia, PA). Genomic DNA was isolated from PBMCs with a QIAamp DNA blood minikit (QIAGEN); genomic DNA was isolated from brains, lymph nodes, and spleens with a genomic DNA purification kit (Gentra Systems) according to the manufacturer's protocol.

Our procedure to identify integration sites is a modification of that previously described (8). Genomic DNA preparations were digested with PstI or combinations of SpeI, XbaI, and NheI (SpeI, XbaI, and NheI have compatible ends for ligation). The use of PstI and combinations of SpeI, XbaI, and NheI to clone integration sites reduces the bias that would be introduced with the use of a single restriction enzyme to cleave the flanking cellular DNA. Digested DNAs were serially diluted and self-ligated to generate templates for PCR. Circularized DNA was amplified with the primers LTR-outer (5′-TAACCAGAGAGACCCAGTACAGGC-3′) and Gag-outer (5′-GGTCAGCCAAAATTACCCTATAGTG-3′), followed by nested PCR with LTR-inner (5′-TGGTACTAGCTTGAAGCACCATCCA-3′) and Gag-inner (5′-TGTTAAAAGAGACCATCAATGAGGAAG-3′). Reactions used the Advantage 2 PCR system (Clontech), Advantage GC PCR system (Clontech), and Elongase mix (Invitrogen) and were performed at 94°C for 30 s, 55°C for 45 s, and 68°C for 150 s for 35 cycles. PCR products were examined by Southern blot hybridizations to confirm positive amplifications of HIV-1 sequences. Positive PCR products were ligated to the TA vector (Promega), and bacterial transformations were performed. Bacterial colonies were screened for HIV-1+ plasmid inserts by colony hybridization using a 32P probe to the viral long terminal repeat (LTR). Plasmids containing HIV-1 inserts were sequenced, and the junction of cellular and viral DNA was identified; all junctions analyzed contained cellular sequences precisely joined to the 5′ end of the viral 5′ LTR sequence (5′-TGGAA-3′). Cellular sequences were identified as unique sites by analysis at http://www.ncbi.nlm.nih.gov/BLAST/.
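The statement that SpeI, XbaI, and NheI leave compatible ends can be checked with a short sketch. The recognition sequences and cut positions below are the standard ones for these enzymes; they are not stated in the text. The `Enzyme` class and `compatible` helper are hypothetical names used only for illustration, and the overhang logic assumes palindromic recognition sites.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Enzyme:
    name: str
    site: str  # palindromic recognition sequence, written 5'->3'
    cut: int   # number of top-strand bases 5' of the cut

    def overhang(self) -> tuple:
        """Return (overhang type, single-stranded overhang sequence)."""
        # For a palindromic site the bottom-strand cut mirrors the top-strand cut.
        mirror = len(self.site) - self.cut
        lo, hi = sorted((self.cut, mirror))
        if self.cut < mirror:
            kind = "5'"
        elif self.cut > mirror:
            kind = "3'"
        else:
            kind = "blunt"
        return kind, self.site[lo:hi]


# Standard recognition sites (an assumption; not given in the text).
ENZYMES = [
    Enzyme("SpeI", "ACTAGT", 1),
    Enzyme("XbaI", "TCTAGA", 1),
    Enzyme("NheI", "GCTAGC", 1),
    Enzyme("PstI", "CTGCAG", 5),
]


def compatible(a: Enzyme, b: Enzyme) -> bool:
    """Ends can be ligated when overhang type and sequence are identical."""
    return a.overhang() == b.overhang()


if __name__ == "__main__":
    for enzyme in ENZYMES:
        kind, seq = enzyme.overhang()
        print(f"{enzyme.name}: {kind} overhang {seq}")
```

Under these assumptions SpeI, XbaI, and NheI all leave the same 5′ CTAG overhang, so fragments cut by any combination of them can self-ligate, whereas PstI leaves a 3′ TGCA overhang and is used on its own.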

Integration sites in PBMCs.

We identified 23 HIV-1 integration sites in PBMCs isolated from six individuals prior to initiation of antiviral therapy. We were able to map 21 of these 23 integrants to unique sites in the human genome (Table 1). One integrant not mapped was located within a previously identified human BAC clone (CIT987SK-582J2) that we could not locate in the genome. One integrant could not be mapped to a unique site in the genome because it is located within a conserved exon that is found in each of four related genes in a cluster on chromosome 18 that may have arisen by gene duplication (TCEB3C, TCEB3B, LOC653415, and LOC653420); nevertheless, this integrant was within a protein-coding gene. If we assume that the integrant not mapped to any site in the genome is located in an intergenic region, then 20 of the 23 integrants in PBMCs were within genes. The majority of integrants identified are likely to have been present in infected CD4+ T lymphocytes rather than monocytes or other cell types, as the majority of infected PBMCs are known to be CD4+ T cells (1, 4, 16). In a statistical analysis of the data in Table 1, we treated each integration event as a Bernoulli trial, assumed that integrations are independent events, and used the assumption that one-third of the human genome consists of protein-coding genes (11, 19). Using the exact method for the one-proportion binomial test (by STATA), the locations of integration sites in PBMCs indicate that there is a highly significant preference for integration within genes (P value < 0.001; power, 1.000). There is no apparent preference for the orientation of integration, as 12 integrants are oriented in the same transcriptional direction as the host gene, while eight are oriented in the opposite direction. Examination of distance from the transcriptional start site of these genes revealed no positional bias for integration within the 5′ region of genes, such as occurs for MLV infections in vitro.
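The exact binomial test described above can be sketched in a few lines of standard-library Python. This is an illustration rather than the STATA analysis itself. It assumes a one-sided alternative (more integrations in genes than expected) and the null proportion of one-third used in the text. The orientation check at the end is an extra illustration; no such test is reported in the text.

```python
from math import comb


def binom_upper_tail(k: int, n: int, p0: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p0): the one-sided exact P value."""
    return sum(comb(n, i) * p0**i * (1 - p0) ** (n - i) for i in range(k, n + 1))


# Null hypothesis from the text: one-third of the genome is protein-coding genes.
P_GENE = 1 / 3

# 20 of 23 PBMC integrants were within genes.
pbmc_p = binom_upper_tail(20, 23, P_GENE)

# Illustrative only (not reported in the text): 12 of 20 in-gene integrants
# share the host gene's orientation; two-sided test against a 50:50 split.
orientation_p = min(1.0, 2 * binom_upper_tail(12, 20, 0.5))

if __name__ == "__main__":
    print(f"Gene preference, 20/23: one-sided exact P = {pbmc_p:.2e}")
    print(f"Orientation, 12/20: two-sided exact P = {orientation_p:.3f}")
```

With these inputs the gene-preference P value falls far below 0.001, while the orientation split is indistinguishable from chance.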

Table 1. HIV integration sites in infected PBMCs

Integration sites in spleen and lymph nodes.

We examined integration sites in solid lymphoid tissues, lymph nodes, and spleens from two deceased HIV-infected patients with no history of antiretroviral therapy (Table 2). In tissues from patient A, we were able to identify and map 10 integration sites in infected lymph nodes and two in infected spleen tissue. Nine of the 10 integrants in lymph node were in genes, while both integrants in the spleen were in genes. In tissues from patient B, we identified and mapped three integration sites in lymph nodes, all of which were located in genes. A statistical analysis carried out as described above for PBMCs indicates that for patient A, there was a highly significant preference for integration within genes in lymphoid tissue (lymph node plus spleen), with a P value of <0.001 (power, 0.9998). If integrants in lymph nodes for patients A and B are considered together, there is also a highly significant preference for integration within genes, with a P value of <0.001 (power, 1.000). As with PBMCs, there is no preference in lymphoid tissues for orientation of integration, as integrants were split equally relative to the direction of transcription of cellular genes.
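As a worked illustration of the exact binomial calculation for these counts, the tail probabilities can be written out directly. This assumes a one-sided test against the null proportion of one-third; the text does not state whether the original test was one- or two-sided. The first line covers patient A (11 of 12 lymphoid integrants in genes) and the second covers lymph nodes from patients A and B combined (12 of 13 in genes).

```latex
\begin{align*}
P(X \ge 11 \mid n = 12,\ p_0 = \tfrac{1}{3})
  &= \sum_{k=11}^{12} \binom{12}{k} \left(\tfrac{1}{3}\right)^{k} \left(\tfrac{2}{3}\right)^{12-k}
   = \frac{12 \cdot 2 + 1}{3^{12}} = \frac{25}{531441} \approx 4.7 \times 10^{-5}, \\
P(X \ge 12 \mid n = 13,\ p_0 = \tfrac{1}{3})
  &= \sum_{k=12}^{13} \binom{13}{k} \left(\tfrac{1}{3}\right)^{k} \left(\tfrac{2}{3}\right)^{13-k}
   = \frac{13 \cdot 2 + 1}{3^{13}} = \frac{27}{1594323} \approx 1.7 \times 10^{-5}.
\end{align*}
```

Both values are well below 0.001, consistent with the reported significance.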

Table 2. HIV-1 integration sites in infected lymphoid tissues

Integration sites in brain cerebral cortex.

We identified and mapped five integration sites in infected cerebral cortex from patients A and B (Table 3). Four of the five integrants in infected brain tissue were within genes. We carried out a statistical analysis as described above for these grouped integration sites, and the data suggest a preference for integration within genes, with a P value of 0.045 (power, 0.737). This statistical power is below the 0.8 threshold due to the small sample size, and therefore our data for brain tissue can only suggest a preference for integration within genes. This contrasts with our data for PBMCs and lymph tissues, which demonstrate a clear statistical preference for integration within genes.
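The effect of the small sample on statistical power can be sketched as follows. The settings are assumptions, not details given in the text: a one-sided exact test at a significance level of 0.05, and an alternative proportion equal to the observed four of five. How the original power values were computed is not stated. The function names are hypothetical.

```python
from math import comb


def upper_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def critical_value(n: int, p0: float, alpha: float):
    """Smallest k whose upper-tail probability under p0 is <= alpha, else None."""
    for k in range(n + 1):
        if upper_tail(k, n, p0) <= alpha:
            return k
    return None


N_SITES = 5      # brain integration sites mapped
IN_GENES = 4     # of which within genes
P_NULL = 1 / 3   # genome fraction assumed to be protein-coding genes (from the text)
ALPHA = 0.05     # assumed significance level
P_ALT = IN_GENES / N_SITES  # assumed alternative: the observed proportion

p_value = upper_tail(IN_GENES, N_SITES, P_NULL)
k_crit = critical_value(N_SITES, P_NULL, ALPHA)
power = upper_tail(k_crit, N_SITES, P_ALT) if k_crit is not None else 0.0

if __name__ == "__main__":
    print(f"one-sided exact P = {p_value:.4f}")
    print(f"reject the null when at least {k_crit} of {N_SITES} sites are in genes")
    print(f"power at p = {P_ALT:.1f}: {power:.3f}")
```

Under these assumptions the sketch gives a power of about 0.737, which matches the value reported for brain tissue. With only five sites, at least four must fall within genes before the null hypothesis can be rejected.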

Table 3. HIV-1 integration sites in infected brains

Although only approximately one-third of the human genome contains protein-coding genes (11, 19), we observed a strong preference for HIV-1 integration sites in untreated infections in this coding portion of the genome in PBMCs and lymphoid tissues. Our data for infected brain tissue, although from a limited data set, also suggest a preference for integration within genes. Taken together, our data indicate a clear preference for HIV-1 integration within genes in untreated infections. This finding agrees with previous studies that examined infections in vitro and infections in patients undergoing antiviral therapy (8, 14, 17). Transcriptional profiles performed for in vitro infections have demonstrated that integration strongly favors actively transcribed genes (14, 17). We do not have direct information about the transcriptional activity of the integration sites identified in this study, but it is likely that the genes in which integrants were found were actively expressed in the cells infected in vivo. Although we do not know if the integration sites identified here are those of replication-competent viruses, evidence that most integrated proviruses in circulating CD4+ T lymphocytes are replication competent exists (2). In the case of PBMCs examined in this study, it is therefore likely that the majority of integration sites are those of replication-competent viruses. Finally, the demonstration that HIV-1 integration preferentially targets protein-coding genes suggests that the use of HIV vectors for gene therapy purposes must be viewed with caution, as insertion within genes whose disruption may result in pathology, such as tumor suppressors, appears likely.


We thank Claudia Kozinetz and Hung-Wen Yeh of the Design and Analysis Core of the Baylor-UT-Houston CFAR for statistical analysis.

This work was supported by NIH grants AI35381 (to A.P.R.), AI47725 (to J.T.K.), and P30AI036211 (Baylor-UT-Houston CFAR).


1. Blankson, J. N., D. Persaud, and R. F. Siliciano. 2002. The challenge of viral reservoirs in HIV-1 infection. Annu. Rev. Med. 53:557-593.
2. Brinchmann, J. E., J. Albert, and F. Vartdal. 1991. Few infected CD4+ T cells but a high proportion of replication-competent provirus copies in asymptomatic human immunodeficiency virus type 1 infection. J. Virol. 65:2019-2023.
3. Bushman, F., M. Lewinski, A. Ciuffi, S. Barr, J. Leipzig, S. Hannenhalli, and C. Hoffmann. 2005. Genome-wide analysis of retroviral DNA integration. Nat. Rev. Microbiol. 3:848-858.
4. Chun, T. W., L. Carruth, D. Finzi, X. Shen, J. A. DiGiuseppe, H. Taylor, M. Hermankova, K. Chadwick, J. Margolick, T. C. Quinn, Y. H. Kuo, R. Brookmeyer, M. A. Zeiger, P. Barditch-Crovo, and R. F. Siliciano. 1997. Quantification of latent tissue reservoirs and total body viral load in HIV-1 infection. Nature 387:183-188.
5. Ciuffi, A., M. Llano, E. Poeschla, C. Hoffmann, J. Leipzig, P. Shinn, J. R. Ecker, and F. Bushman. 2005. A role for LEDGF/p75 in targeting HIV DNA integration. Nat. Med. 11:1287-1289.
6. Engelman, A. 2005. The ups and downs of gene expression and retroviral DNA integration. Proc. Natl. Acad. Sci. USA 102:1275-1276.
7. Grandgenett, D. P. 2005. Symmetrical recognition of cellular DNA target sequences during retroviral integration. Proc. Natl. Acad. Sci. USA 102:5903-5904.
8. Han, Y. F., K. Lassen, D. Monie, A. R. Sedaghat, S. Shimoji, X. Liu, T. C. Pierson, J. B. Margolick, R. F. Siliciano, and J. D. Siliciano. 2004. Resting CD4+ T cells from human immunodeficiency virus type 1 (HIV-1)-infected individuals carry integrated HIV-1 genomes within actively transcribed host genes. J. Virol. 78:6122-6133.
9. Hematti, P., B. K. Hong, C. Ferguson, R. Adler, H. Hanawa, S. Sellers, I. E. Holt, C. E. Eckfeldt, Y. Sharma, M. Schmidt, C. von Kalle, D. A. Persons, E. M. Billings, C. M. Verfaillie, A. W. Nienhuis, T. G. Wolfsberg, C. E. Dunbar, and B. Calmels. 2004. Distinct genomic integration of MLV and SIV vectors in primate hematopoietic stem and progenitor cells. PLoS Biol. 2:e423.
10. Holman, A. G., and J. M. Coffin. 2005. Symmetrical base preferences surrounding HIV-1 and avian sarcoma/leukosis virus but not murine leukemia virus integration sites. Proc. Natl. Acad. Sci. USA 102:6103-6107.
11. Lander, E. S., L. M. Linton, B. Birren, C. Nusbaum, M. C. Zody, J. Baldwin, et al. 2001. Initial sequencing and analysis of the human genome. Nature 409:860-921.
12. Lewinski, M. K., D. Bisgrove, P. Shinn, H. Chen, C. Hoffmann, S. Hannenhalli, E. Verdin, C. C. Berry, J. R. Ecker, and F. D. Bushman. 2005. Genome-wide analysis of chromosomal features repressing human immunodeficiency virus transcription. J. Virol. 79:6610-6619.
13. Li, Q. S., L. J. Duan, J. D. Estes, Z. M. Ma, T. Rourke, Y. C. Wang, C. Reilly, J. Carlis, C. J. Miller, and A. T. Haase. 2005. Peak SIV replication in resting memory CD4+ T cells depletes gut lamina propria CD4+ T cells. Nature 434:1148-1152.
14. Mitchell, R. S., B. F. Beitzel, A. R. W. Schroder, P. Shinn, H. M. Chen, C. C. Berry, J. R. Ecker, and F. D. Bushman. 2004. Retroviral DNA integration: ASLV, HIV, and MLV show distinct target site preferences. PLoS Biol. 2:e234.
15. Narezkina, A., K. D. Taganov, S. Litwin, R. Stoyanova, J. Hayashi, C. Seeger, A. M. Skalka, and R. A. Katz. 2004. Genome-wide analyses of avian sarcoma virus integration sites. J. Virol. 78:11656-11663.
16. Schnittman, S. M., M. C. Psallidopoulos, H. C. Lane, L. Thompson, M. Baseler, F. Massari, C. H. Fox, N. P. Salzman, and A. S. Fauci. 1989. The reservoir for HIV-1 in human peripheral blood is a T cell that maintains expression of CD4. Science 245:305-308.
17. Schroder, A. R., P. Shinn, H. Chen, C. Berry, J. R. Ecker, and F. Bushman. 2002. HIV-1 integration in the human genome favors active genes and local hotspots. Cell 110:521-529.
18. Stevenson, M., T. L. Stanwick, M. P. Dempsey, and C. A. Lamonica. 1990. HIV-1 replication is controlled at the level of T cell activation and proviral integration. EMBO J. 9:1551-1560.
19. Venter, J. C., M. D. Adams, E. W. Myers, P. W. Li, R. J. Mural, G. G. Sutton, et al. 2001. The sequence of the human genome. Science 291:1304-1351.
20. Wu, X. L., Y. Li, B. Crise, and S. M. Burgess. 2003. Transcription start regions in the human genome are favored targets for MLV integration. Science 300:1749-1751.
21. Zack, J. A., S. J. Arrigo, S. R. Weitsman, A. S. Go, A. Haislip, and I. S. Chen. 1990. HIV-1 entry into quiescent primary lymphocytes: molecular analysis reveals a labile, latent viral structure. Cell 61:213-222.

Articles from Journal of Virology are provided here courtesy of American Society for Microbiology (ASM)