Medicine (Baltimore). 2015 Oct;94(42):e1865. doi: 10.1097/MD.0000000000001865.

A Generalized Entropy Measure of Within-Host Viral Diversity for Identifying Recent HIV-1 Infections.

Author information

From the Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA (JWW); Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA (OP-L, MP); and Department of Immunology and Infectious Disease, Harvard T.H. Chan School of Public Health, Boston, MA (VN).


There is a need for incidence assays that accurately estimate HIV incidence from cross-sectional specimens. Viral diversity-based assays have shown promise but are not particularly accurate. We hypothesize that certain viral genetic regions are more predictive of recent infection than others, and we aim to improve assay accuracy by using classification algorithms that focus on highly informative regions (HIRs).

We analyzed HIV gag sequences from a cohort in Botswana. Forty-two subjects newly infected with HIV-1 Subtype C were followed through 500 days post-seroconversion. Using sliding window analysis, we screened for genetic regions within gag that best differentiate recent from chronic infections. We used both nonparametric and parametric approaches to evaluate the discriminatory ability of sequence regions. Segmented Shannon Entropy measures of HIRs were aggregated to develop generalized entropy measures intended to improve prediction of recency. Using logistic regression as the basis for our classification algorithm, we evaluated the predictive power of these novel biomarkers and compared them with recently reported viral diversity measures using area under the curve (AUC) analysis.

The change in diversity over time varied across sequence regions within gag. We identified the top 50% of the most informative regions by both nonparametric and parametric approaches. In both cases, HIRs were located in the more variable regions of gag and were less likely to fall in the p24 coding region. Entropy measures based on HIRs outperformed previously reported viral-diversity-based biomarkers. These methods are better suited for population-level estimation of HIV recency.

The patterns of diversification of certain regions within the gag gene are more predictive of recency of infection than others. We expect this result to apply to other HIV genetic regions as well. Focusing on these informative regions, our generalized entropy measure of viral diversity demonstrates the potential for improving accuracy when identifying recent HIV-1 infections.
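The sliding-window entropy screen described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact definition of the segmented Shannon entropy or of the aggregation step, so everything below is an assumption: a window score is taken as the mean per-site Shannon entropy (natural log) across an aligned set of sequences from one specimen, and the generalized measure as the unweighted mean of the scores of the selected HIR windows. The function names, the window and step parameters, and the treatment of gaps as ordinary characters are hypothetical choices made for illustration only.

```python
import math
from collections import Counter


def per_site_entropy(alignment):
    """Shannon entropy (natural log) of each column of an alignment.

    `alignment` is a list of equal-length strings, one per sequence.
    Assumption: every character, including gaps, counts as a state.
    """
    if not alignment:
        raise ValueError("alignment is empty")
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("sequences must be aligned to equal length")

    entropies = []
    for i in range(length):
        counts = Counter(seq[i] for seq in alignment)
        total = sum(counts.values())
        # H = -sum(p * ln p) over the states observed at this site
        h = -sum((n / total) * math.log(n / total) for n in counts.values())
        entropies.append(h)
    return entropies


def window_entropies(alignment, window, step):
    """Mean per-site entropy in each sliding window, as (start, score)."""
    site = per_site_entropy(alignment)
    return [
        (start, sum(site[start:start + window]) / window)
        for start in range(0, len(site) - window + 1, step)
    ]


def generalized_entropy(alignment, hir_starts, window):
    """Unweighted mean of window scores over the chosen HIR windows.

    `hir_starts` lists the start positions of windows assumed to have
    been selected as highly informative in a separate screening step.
    """
    if not hir_starts:
        raise ValueError("no HIR windows given")
    site = per_site_entropy(alignment)
    scores = [sum(site[s:s + window]) / window for s in hir_starts]
    return sum(scores) / len(scores)
```

In this sketch, the screening step that ranks windows by how well they separate recent from chronic infections is left out; a score such as the one returned by `generalized_entropy` would then serve as a predictor in a classifier like the logistic regression mentioned in the abstract.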

[Indexed for MEDLINE]
Free PMC Article
