Neuroimage. 2012 Mar;60(1):59-70. doi: 10.1016/j.neuroimage.2011.11.066. Epub 2011 Dec 1.

Does feature selection improve classification accuracy? Impact of sample size and feature selection on classification using anatomical magnetic resonance images.

Collaborators (237)

Weiner M, Aisen P, Weiner M, Aisen P, Petersen R, Jack CR Jr, Jagust W, Trojanowki JQ, Beckett L, Green RC, Saykin AJ, Morris J, Liu E, Green RC, Montine T, Petersen R, Aisen P, Gamst A, Thomas RG, Donohue M, Walter S, Gessert D, Sather T, Beckett L, Harvey D, Gamst A, Donohue M, Kornak J, Jack CR Jr, Dale A, Bernstein M, Felmlee J, Fox N, Thompson P, Schuff N, Alexander G, DeCarli C, Jagust W, Bandy D, Koeppe RA, Foster N, Reiman EM, Chen K, Mathis C, Morris J, Cairns NJ, Taylor-Reinwald L, Trojanowki JQ, Shaw L, Lee VM, Korecka M, Toga AW, Crawford K, Neu S, Saykin AJ, Foroud TM, Potkin S, Shen L, Kachaturian Z, Frank R, Snyder PJ, Molchan S, Kaye J, Quinn J, Lind B, Dolen S, Schneider LS, Pawluczyk S, Spann BM, Brewer J, Vanderswag H, Heidebrink JL, Lord JL, Petersen R, Johnson K, Doody RS, Villanueva-Meyer J, Chowdhury M, Stern Y, Honig LS, Bell KL, Morris JC, Ances B, Carroll M, Leon S, Mintun MA, Schneider S, Marson D, Griffith R, Clark D, Grossman H, Mitsis E, Romirowsky A, deToledo-Morrell L, Shah RC, Duara R, Varon D, Roberts P, Albert M, Kielb S, Rusinek H, de Leon MJ, Glodzik L, Doraiswamy P, Petrella JR, Coleman R, Arnold SE, Karlawish JH, Wolk D, Smith CD, Jicha G, Hardy P, Lopez OL, Oakley M, Simpson DM, Porsteinsson AP, Goldstein BS, Martin K, Makino KM, Ismail M, Brand C, Mulnard RA, Thai G, Mc-Adams-Ortiz C, Diaz-Arrastia R, Martin-Cook K, DeVous M, Levey AI, Lah JJ, Cellar JS, Burns JM, Anderson HS, Swerdlow RH, Apostolova L, Lu PH, Bartzokis G, Silverman DH, Graff-Radford NR, Parfitt F, Johnson H, Farlow M, Herring S, Hake AM, van Dyck CH, Carson RE, MacAvoy MG, Chertkow H, Bergman H, Hosein C, Black S, Stefanovic B, Caldwell C, Hsiung GY, Feldman H, Assaly M, Kertesz A, Rogers J, Trost D, Bernick C, Munic D, Kerwin D, Mesulam MM, Lipowski K, Wu CK, Johnson N, Sadowsky C, Martinez W, Villena T, Turner RS, Johnson K, Reynolds B, Sperling RA, Johnson KA, Marshall G, Frey M, Rosen A, Tinklenberg J, Sabbagh M, Belden C, Jacobson S, Kowall N, Killiany R, Budson AE, Norbash A, Johnson PL, Obisesan TO, Wolday S, Bwayo SK, Lerner A, Hudson L, Ogrocki P, Fletcher E, Carmichael O, Olichney J, DeCarli C, Kittur S, Borrie M, Lee TY, Bartha R, Johnson S, Asthana S, Carlsson CM, Potkin SG, Preda A, Nguyen D, Tariot P, Fleisher A, Reeder S, Bates V, Capote H, Rainka M, Hendin BA, Scharre DW, Kataki M, Zimmerman EA, Celmins D, Brown AD, Pearlson GD, Blank K, Anderson K, Saykin AJ, Santulli RB, Schwartz ES, Sink KM, Williamson JD, Garg P, Watkins F, Ott BR, Querfurth H, Tremont G, Salloway S, Malloy P, Correia S, Rosen HJ, Miller BL, Mintzer J, Spicer K.

Author information

  • Section on Functional Imaging Methods, Laboratory of Brain and Cognition, NIMH, NIH, Bethesda, USA.

Abstract

A growing number of studies use machine learning approaches to characterize patterns of anatomical difference discernible from neuroimaging data. The high dimensionality of image data often raises the concern that feature selection is needed to obtain optimal accuracy. Among previous studies, most of which used fixed sample sizes, some show greater predictive accuracies with feature selection, whereas others do not. In this study, we compared four common feature selection methods: 1) pre-selected regions of interest (ROIs) based on prior knowledge; 2) univariate t-test filtering; 3) recursive feature elimination (RFE); and 4) t-test filtering constrained by ROIs. The predictive accuracies achieved with different sample sizes, with and without feature selection, were compared statistically. To demonstrate the effect, we used grey matter segmented from the T1-weighted anatomical scans collected by the Alzheimer's Disease Neuroimaging Initiative (ADNI) as the input features to a linear support vector machine classifier. The objective was to characterize the patterns of difference between Alzheimer's disease (AD) patients and cognitively normal subjects, and between mild cognitive impairment (MCI) patients and normal subjects. In addition, we compared the classification accuracies between MCI patients who converted to AD and MCI patients who did not convert within a period of 12 months. Predictive accuracies from the two data-driven feature selection methods (t-test filtering and RFE) were no better than those achieved using whole-brain data. The most accurate characterizations were achieved by using prior knowledge of where to expect neurodegeneration (hippocampus and parahippocampal gyrus). Feature selection can therefore improve classification accuracy, but whether it does depends on the method adopted. In general, larger sample sizes yielded higher accuracies, and the advantage gained from using knowledge from the existing literature diminished.
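
The comparison between t-test filtering and whole-brain classification can be illustrated with a small sketch. This is not the authors' pipeline: it runs on synthetic Gaussian data rather than ADNI images, it substitutes a nearest-centroid classifier for the linear support vector machine so that it needs only the Python standard library, and every function name and parameter (`select_top_k`, `cv_accuracy`, `make_toy_data`, and so on) is hypothetical. It only shows the general pattern in which features are ranked by a two-sample t-statistic on the training folds alone, so the held-out fold never influences the selection.

```python
import math
import random
from statistics import mean, variance


def welch_t(a, b):
    """Welch two-sample t-statistic for two lists of floats."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se if se > 0 else 0.0


def select_top_k(X, y, k):
    """Indices of the k features with the largest |t| between classes 0 and 1."""
    n_features = len(X[0])
    scores = []
    for j in range(n_features):
        g0 = [row[j] for row, label in zip(X, y) if label == 0]
        g1 = [row[j] for row, label in zip(X, y) if label == 1]
        scores.append(abs(welch_t(g0, g1)))
    return sorted(range(n_features), key=lambda j: -scores[j])[:k]


def fit_centroids(X, y, features):
    """Per-class mean vector, restricted to the chosen features."""
    return {
        c: [mean(row[j] for row, label in zip(X, y) if label == c)
            for j in features]
        for c in (0, 1)
    }


def predict(centroids, row, features):
    """Assign the class whose centroid is nearest (stand-in for the SVM)."""
    x = [row[j] for j in features]
    return min(
        centroids,
        key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])),
    )


def cv_accuracy(X, y, k_features=None, n_folds=5):
    """Cross-validated accuracy; k_features=None uses every feature."""
    n = len(X)
    correct = 0
    for fold in range(n_folds):
        test_idx = set(range(fold, n, n_folds))
        X_tr = [X[i] for i in range(n) if i not in test_idx]
        y_tr = [y[i] for i in range(n) if i not in test_idx]
        # Selection sees the training folds only, never the test fold.
        if k_features is None:
            features = list(range(len(X[0])))
        else:
            features = select_top_k(X_tr, y_tr, k_features)
        centroids = fit_centroids(X_tr, y_tr, features)
        correct += sum(
            predict(centroids, X[i], features) == y[i] for i in test_idx
        )
    return correct / n


def make_toy_data(n_per_class=30, n_features=200, n_informative=5,
                  shift=1.0, seed=0):
    """Synthetic data: only the first n_informative features differ by class."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(2 * n_per_class):
        label = i % 2  # alternating labels keep the interleaved folds balanced
        X.append([
            rng.gauss(shift if (label == 1 and j < n_informative) else 0.0, 1.0)
            for j in range(n_features)
        ])
        y.append(label)
    return X, y


if __name__ == "__main__":
    for n_per_class in (10, 30, 60):
        X, y = make_toy_data(n_per_class=n_per_class)
        print(n_per_class,
              "all features:", cv_accuracy(X, y),
              "t-test top 5:", cv_accuracy(X, y, k_features=5))
```

Running the script prints the cross-validated accuracy with and without filtering at three toy sample sizes. The numbers depend on the synthetic settings and say nothing about the ADNI results reported above.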

Copyright © 2011 Elsevier Inc. All rights reserved.

PMID: 22166797 [PubMed - indexed for MEDLINE]