Display Settings:

Format

Send to:

Choose Destination
See comment in PubMed Commons below
Bioinformatics. 2011 May 15;27(10):1384-9. doi: 10.1093/bioinformatics/btr159. Epub 2011 Mar 30.

High-dimensional pharmacogenetic prediction of a continuous trait using machine learning techniques with application to warfarin dose prediction in African Americans.

Author information

  • 1Department of Biostatistics, University of Alabama at Birmingham, Birmingham, AL 35294, USA.

Abstract

MOTIVATION:

With complex traits and diseases having potential genetic contributions of thousands of genetic factors, and with current genotyping arrays consisting of millions of single nucleotide polymorphisms (SNPs), powerful high-dimensional statistical techniques are needed to comprehensively model the genetic variance. Machine learning techniques have many advantages including lack of parametric assumptions, and high power and flexibility.

RESULTS:

We have applied three machine learning approaches: Random Forest Regression (RFR), Boosted Regression Tree (BRT) and Support Vector Regression (SVR) to the prediction of warfarin maintenance dose in a cohort of African Americans. We have developed a multi-step approach that selects SNPs, builds prediction models with different subsets of selected SNPs along with known associated genetic and environmental variables and tests the discovered models in a cross-validation framework. Preliminary results indicate that our modeling approach gives much higher accuracy than previous models for warfarin dose prediction. A model size of 200 SNPs (in addition to the known genetic and environmental variables) gives the best accuracy. The R(2) between the predicted and actual square root of warfarin dose in this model was on average 66.4% for RFR, 57.8% for SVR and 56.9% for BRT. Thus RFR had the best accuracy, but all three techniques achieved better performance than the current published R(2) of 43% in a sample of mixed ethnicity, and 27% in an African American sample. In summary, machine learning approaches for high-dimensional pharmacogenetic prediction, and for prediction of clinical continuous traits of interest, hold great promise and warrant further research.

PMID:
21450715
[PubMed - indexed for MEDLINE]
PMCID:
PMC3087957
Free PMC Article

Images from this publication.See all images (2)Free text

Fig. 1.
Fig. 2.
PubMed Commons home

PubMed Commons

0 comments
How to join PubMed Commons

    Supplemental Content

    Full text links

    Icon for HighWire Icon for PubMed Central
    Loading ...
    Write to the Help Desk