Antivir Ther. 2007;12(7):1097-106.

Predicting HIV coreceptor usage on the basis of genetic and clinical covariates.

Author information

Max Planck Institute for Informatics, Saarbrücken, Germany.



We compared several statistical learning methods for the prediction of HIV coreceptor use from clonal HIV third hypervariable (V3) loop sequences, and evaluated and improved their effectiveness on clinical samples.


Support vector machines (SVMs), artificial neural networks, position-specific scoring matrices (PSSMs) and mixtures of localized rules were estimated and tested using ten repetitions of ten-fold cross-validation on a clonal dataset consisting of 1,100 matched clonal genotype-phenotype pairs from 332 patients. Different SVMs were also trained and tested on a clinically derived dataset, representing 920 patient samples from British Columbia, Canada. Methods were evaluated using receiver operating characteristic (ROC) curves.
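The evaluation protocol described above can be illustrated with a minimal sketch, which is not the authors' implementation. It generates index splits for repeated ten-fold cross-validation and reads a single operating point off a ROC curve (sensitivity at a required specificity). The function names `repeated_kfold` and `sensitivity_at_specificity` are hypothetical, and the sketch assumes that higher scores indicate CXCR4 usage and that samples are split without grouping by patient, neither of which the abstract specifies.

```python
import random


def repeated_kfold(n_samples, k=10, repeats=10, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) for repeated k-fold CV.

    Assumption: samples are shuffled individually; grouping clones by
    patient is not modelled here.
    """
    rng = random.Random(seed)
    for r in range(repeats):
        idx = list(range(n_samples))
        rng.shuffle(idx)
        for f in range(k):
            test = idx[f::k]                 # every k-th shuffled index
            held_out = set(test)
            train = [i for i in idx if i not in held_out]
            yield r, f, train, test


def sensitivity_at_specificity(scores, labels, target_specificity):
    """Best sensitivity among thresholds with specificity >= target.

    labels: 1 = positive class (assumed: CXCR4-using), 0 = negative.
    A sample is predicted positive when its score >= threshold.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    best = 0.0
    for t in sorted(set(scores)):
        specificity = sum(s < t for s in neg) / len(neg)
        sensitivity = sum(s >= t for s in pos) / len(pos)
        if specificity >= target_specificity:
            best = max(best, sensitivity)
    return best
```

Comparing methods at one fixed specificity, as in the results below, amounts to calling `sensitivity_at_specificity` on each method's held-out scores with the same target value.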


In the clonal analysis, the sensitivity of the 11/25 rule at 92.5% specificity was 59.5%. PSSMs and SVMs increased sensitivity to 71.9% and 76.4%, respectively, at the same specificity (P << 0.05). In clinical samples, the sensitivity of the 11/25 rule and SVM decreased to 25.9% (specificity 93.9%) and 39.8% (specificity 93.5%), respectively. However, the integration of clinical data resulted in a further 2.4-fold increase in sensitivity over the 11/25 rule (63%). Univariate analyses identified 41 V3 mutations significantly associated with coreceptor usage.
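The abstract does not spell out the 11/25 rule used as the baseline. As commonly stated, it predicts CXCR4 usage when a positively charged residue (arginine or lysine) occupies position 11 or 25 of the V3 loop. The sketch below follows that common formulation and assumes an ungapped 35-residue V3 sequence; the function name `rule_11_25` and the example sequences are illustrative and are not taken from the study.

```python
BASIC_RESIDUES = {"R", "K"}  # arginine, lysine (assumed definition of "basic")


def rule_11_25(v3_sequence):
    """Return True if the 11/25 rule predicts CXCR4 usage.

    Assumption: v3_sequence is an ungapped 35-residue V3 loop, so
    1-based positions 11 and 25 map to string indices 10 and 24.
    """
    seq = v3_sequence.upper()
    if len(seq) != 35:
        raise ValueError("expected an ungapped 35-residue V3 loop")
    return seq[10] in BASIC_RESIDUES or seq[24] in BASIC_RESIDUES


# Illustrative sequences only (not study data).
r5_like = "CTRPNNNTRKSIHIGPGRAFYTTGEIIGDIRQAHC"  # S at 11, E at 25
x4_like = "CTRPNNNTRKRIHIGPGRAFYTTGEIIGDIRQAHC"  # R at 11

print(rule_11_25(r5_like))  # False
print(rule_11_25(x4_like))  # True
```

Because the rule inspects only two positions, it cannot use signal elsewhere in the loop, which is consistent with the higher sensitivities reported for the PSSM and SVM approaches.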


For all methods tested, a substantial sensitivity decrease is observed on clinical data, probably owing to the heterogeneity of the viral population in vivo. In response to these complications, we present an SVM-based approach that integrates sequence information with clinical and host data, resulting in improved performance and sensitivity compared with purely sequence-based approaches.

[Indexed for MEDLINE]
