Hum Mutat. 2012 Dec;33(12):1708-18. doi: 10.1002/humu.22161. Epub 2012 Aug 3.
Use of support vector machines for disease risk prediction in genome-wide association studies: concerns and opportunities.
Mittag F, Büchel F, Saad M, Jahn A, Schulte C, Bochdanovits Z, Simón-Sánchez J, Nalls MA, Keller M, Hernandez DG, Gibbs JR, Lesage S, Brice A, Heutink P, Martinez M, Wood NW, Hardy J, Singleton AB, Zell A, Gasser T, Sharma M; International Parkinson’s Disease Genomics Consortium.
Nalls MA, Plagnol V, Hernandez DG, Sharma M, Sheerin UM, Saad M, Simón-Sánchez J, Schulte C, Lesage S, Sveinbjörnsdóttir S, Arepalli S, Barker R, Ben-Shlomo Y, Berendse HW, Berg D, Bhatia K, de Bie RM, Biffi A, Bloem B, Bochdanovits Z, Bonin M, Bras JM, Brockmann K, Brooks J, Burn DJ, Charlesworth G, Chen H, Chinnery PF, Chong S, Clarke CE, Cookson MR, Cooper J, Corvol JC, Counsell C, Damier P, Dartigues JF, Deloukas P, Deuschl G, Dexter DT, van Dijk KD, Dillman A, Durif F, Dürr A, Edkins S, Evans JR, Foltynie T, Gao J, Gardner M, Gibbs J, Goate A, Gray E, Guerreiro R, Gústafsson Ó, Harris C, van Hilten JJ, Hofman A, Hollenbeck A, Holton J, Hu M, Huang X, Hershey MS, Huber H, Hudson G, Hunt SE, Huttenlocher J, Illig T, Jónsson PV, Lambert JC, Langford C, Lees A, Lichtner P, München HZ, Limousin P, Lopez G, Lorenz D, McNeill A, Moorby C, Moore M, Morris HR, Morrison KE, O'Sullivan SS, Pearson J, Perlmutter JS, Pétursson H, Pollak P, Potter S, Ravina B, Revesz T, Riess O, Rivadeneira F, Rizzu P, Ryten M, Sawcer S, Schapira A, Scheffer H, Shaw K, Sidransky E, Smith C, Spencer CC, Stefánsson H, Steinberg S, Stockton JD, Strange A, Talbot K, Tanner CM, Tashakkori-Ghanbaria A, Tison F, Trabzuni D, Traynor BJ, Uitterlinden AG, Velseboer D, Vidailhet M, Walker R, van de Warrenburg B, Wickremaratchi M, Williams N, Williams-Gray CH, Winder-Rhodes S, Stefánsson K, Martinez M, Hardy J, Heutink P, Brice A, Gasser T, Singleton AB, Wood NW.
Source
Center for Bioinformatics Tuebingen (ZBIT), University of Tuebingen, Tübingen, Germany.
Abstract
The success of genome-wide association studies (GWAS) in deciphering the genetic architecture of complex diseases has raised the expectation that individual risk can also be quantified from that genetic architecture. So far, disease risk prediction based on top-validated single-nucleotide polymorphisms (SNPs) has shown little predictive value. Here, we applied a support vector machine (SVM) to Parkinson disease (PD) and type 1 diabetes (T1D) to show that, apart from the magnitude of the effect sizes of risk variants, the heritability of the disease also plays an important role in disease risk prediction. Furthermore, we performed a simulation study to show the role of uncommon variants (frequency 1-5%) as well as rare variants (frequency <1%) in the etiology of complex diseases. Using a cross-validation model, we achieved predictions with an area under the receiver operating characteristic curve (AUC) of ~0.88 for T1D, reflecting its strong heritable component (~90%). This is in contrast to PD, for which we were unable to achieve a satisfactory prediction (AUC ~0.56; heritability ~38%). Our simulations showed that the simultaneous inclusion of uncommon and rare variants in GWAS would eventually make disease risk prediction feasible for complex diseases such as PD. The software used is available at http://www.ra.cs.uni-tuebingen.de/software/MACLEAPS/.
© 2012 Wiley Periodicals, Inc.
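For readers unfamiliar with the AUC values quoted in the abstract, the following minimal sketch shows one standard way to compute the statistic: as the fraction of case/control pairs in which the case receives the higher predicted score, with ties counted as half. It is not taken from the authors' software; the function name `auc` and the toy scores are hypothetical, and the scores are assumed to be any real-valued classifier output (for example, SVM decision values). An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is why ~0.56 for PD is described as unsatisfactory and ~0.88 for T1D as a useful prediction.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: predicted risk scores (higher means more likely a case).
    labels: 1 for cases, 0 for controls.
    Returns the probability that a randomly chosen case is scored
    higher than a randomly chosen control (ties count as 0.5).
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")

    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0   # case correctly ranked above control
            elif case_score == control_score:
                wins += 0.5   # tie counts as half a correct ranking
    return wins / (len(cases) * len(controls))


# Hypothetical toy data: two cases, two controls.
toy_scores = [0.9, 0.4, 0.6, 0.2]
toy_labels = [1, 1, 0, 0]
toy_auc = auc(toy_scores, toy_labels)  # 3 of 4 pairs ranked correctly
```

The pairwise form is quadratic in the number of samples and is meant only to make the definition concrete; for cohorts of GWAS size, a rank-based computation would be the usual choice.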
PMID: 22777693 [PubMed - indexed for MEDLINE]