Feature Selection Method Based on Partial Least Squares and Analysis of Traditional Chinese Medicine Data

Comput Math Methods Med. 2019 Jul 1;2019:9580126. doi: 10.1155/2019/9580126. eCollection 2019.

Abstract

The partial least squares method has many advantages in multivariable linear regression, but it does not perform feature selection: it cannot screen for the best feature subset (referred to in this study as the "Gold Standard") or optimize the model. In contrast, the L1 norm yields a sparse representation of the parameters and thereby achieves feature selection. In this study, a feature selection method based on partial least squares is proposed. The new method uses partial least squares to extract the latent variables required for multivariable linear regression and applies an L1 regularization term, which constrains the sum of the absolute values of the regression coefficients. It is then combined with the coordinate descent method, which is iterated to select a better feature subset. The model was applied to traditional Chinese medicine data and to University of California, Irvine (UCI) datasets; the experimental results show that the feature selection method based on partial least squares adapts well to both.
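
To make the sparsity mechanism concrete, the following is a minimal sketch of L1-penalized linear regression solved by cyclic coordinate descent. It illustrates only the L1 and coordinate descent ingredients named in the abstract; it omits the partial least squares latent-variable extraction and should not be read as the authors' algorithm. The function names, the penalty value `lam`, and the toy data are assumptions made for illustration.

```python
def soft_threshold(rho, lam):
    """Soft-thresholding operator: shrink rho toward zero by lam."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Minimise (1/2n)*||y - X b||^2 + lam * sum_j |b_j| by cyclic updates.

    X is a list of rows; columns and y are assumed centred (no intercept).
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X b while b is all zeros
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # a constant-zero column carries no information
            # Covariance of feature j with the residual that excludes
            # feature j's own current contribution.
            rho = sum(
                X[i][j] * (residual[i] + X[i][j] * beta[j]) for i in range(n)
            ) / n
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_bj
    return beta


# Toy data (hypothetical): three orthogonal, centred features.
# y = 3*x0 - 2*x1 + 0.1*x2, so x2 is only weakly related to the response.
X = [
    [1.0, 1.0, 1.0],
    [-1.0, 1.0, -1.0],
    [1.0, -1.0, -1.0],
    [-1.0, -1.0, 1.0],
]
y = [1.1, -5.1, 4.9, -0.9]

beta = lasso_coordinate_descent(X, y, lam=0.5)
selected = [j for j, b in enumerate(beta) if b != 0.0]
print(beta, selected)
```

In this toy case the penalty drives the coefficient of the weak third feature exactly to zero while shrinking the other two, so the nonzero entries of `beta` identify the retained feature subset. A larger `lam` would discard more features; the value here was chosen by hand for the example.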

MeSH terms

  • Algorithms
  • Animals
  • Blood Flow Velocity
  • Breast Neoplasms / epidemiology
  • Databases, Factual
  • Erythrocytes / cytology
  • Female
  • Humans
  • Least-Squares Analysis*
  • Linear Models
  • Machine Learning
  • Medicine, Chinese Traditional / statistics & numerical data*
  • Models, Statistical
  • Multivariate Analysis*
  • Rats
  • Regression Analysis
  • Rheum / metabolism*
  • Shock, Cardiogenic / therapy