Feature Interaction for Streaming Feature Selection

IEEE Trans Neural Netw Learn Syst. 2021 Oct;32(10):4691-4702. doi: 10.1109/TNNLS.2020.3025922. Epub 2021 Oct 5.

Abstract

Traditional feature selection methods assume that all data instances and features are known before learning. However, this is not the case in many real-world applications, where we are more likely to face data streams, feature streams, or both. Feature streams are defined as features that flow in one by one over time, whereas the number of training examples remains fixed. Existing streaming feature selection methods focus on removing irrelevant and redundant features and selecting the most relevant features, but they ignore the interaction between features. A feature might have little correlation with the target concept by itself, but when it is combined with some other features, together they can be strongly correlated with the target concept. In other words, interactive features contribute to the target concept as a whole that is greater than the sum of the individual features. Because most existing streaming feature selection methods treat features individually, it is necessary to consider the interaction between features. In this article, we focus on the problem of feature interaction in feature streams and propose a new streaming feature selection method that can select features that interact with each other, named Streaming Feature Selection considering Feature Interaction (SFS-FI). With the formal definition of feature interaction, we design a new metric named interaction gain that measures the degree of interaction between the newly arriving feature and the selected feature subset. In addition, we analyze and demonstrate the relationship between feature relevance and feature interaction. Extensive experiments conducted on 14 real-world microarray data sets indicate the efficiency of our new method.
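
To make the notion of feature interaction concrete, the following minimal sketch uses a toy XOR target, where each feature alone carries no information about the target but the pair determines it completely. The abstract does not give the formula for interaction gain, so the sketch substitutes the standard information-theoretic interaction information, I({F1, F2}; C) - I(F1; C) - I(F2; C), as a stand-in. The function names and the toy data are illustrative assumptions, not part of SFS-FI.

```python
from collections import Counter
from itertools import product
from math import log2


def entropy(samples):
    """Empirical Shannon entropy (in bits) of a sequence of hashable values."""
    n = len(samples)
    return -sum((c / n) * log2(c / n) for c in Counter(samples).values())


def mutual_information(xs, ys):
    """Empirical mutual information I(X; Y) = H(X) + H(Y) - H(X, Y)."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


def interaction_information(f1, f2, target):
    """I({F1, F2}; C) - I(F1; C) - I(F2; C); a positive value means synergy.

    Assumption: a standard stand-in, not the paper's interaction gain.
    """
    joint = list(zip(f1, f2))
    return (mutual_information(joint, target)
            - mutual_information(f1, target)
            - mutual_information(f2, target))


# Hypothetical toy data: all combinations of two binary features,
# with the target defined as their XOR.
rows = list(product([0, 1], repeat=2))
f1 = [a for a, _ in rows]
f2 = [b for _, b in rows]
target = [a ^ b for a, b in rows]

mi_f1 = mutual_information(f1, target)    # relevance of feature 1 alone
mi_f2 = mutual_information(f2, target)    # relevance of feature 2 alone
synergy = interaction_information(f1, f2, target)

print(f"I(F1; C) = {mi_f1:.3f} bits")
print(f"I(F2; C) = {mi_f2:.3f} bits")
print(f"interaction information = {synergy:.3f} bits")
```

In this toy case each feature has zero individual relevance, yet the pair yields one full bit of information about the target. A selector that scores features one at a time would discard both.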

Publication types

  • Research Support, Non-U.S. Gov't