A novel decision-tree method for structured continuous-label classification

IEEE Trans Cybern. 2013 Dec;43(6):1734-46. doi: 10.1109/TSMCB.2012.2229269.

Abstract

Structured continuous-label classification is a variety of classification in which the label is continuous in the data, but the goal is to classify data into classes that are a set of predefined ranges and can be organized in a hierarchy. In the hierarchy, the ranges at the lower levels are more specific and inherently more difficult to predict, whereas the ranges at the upper levels are less specific and inherently easier to predict. Therefore, both prediction specificity and prediction accuracy must be considered when building a decision tree (DT) from this kind of data. This paper proposes a novel classification algorithm for learning DT classifiers from data with structured continuous labels. This approach considers the distribution of labels throughout the hierarchical structure during the construction of trees without requiring discretization in the preprocessing stage. We compared the results of the proposed method with those of the C4.5 algorithm using eight real data sets. The empirical results indicate that the proposed method outperforms the C4.5 algorithm with regard to prediction accuracy, prediction specificity, and computational complexity.
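As a rough illustration of the label structure described above, the sketch below maps a continuous label onto a hierarchy of predefined ranges and returns every range that contains it, from least to most specific. It is not the algorithm proposed in the paper. The `RangeNode` class, the `label_path` function, and the example ranges over [0, 100) are all hypothetical names and values chosen for illustration, and sibling ranges are assumed to partition their parent.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RangeNode:
    """One class in the hierarchy: a half-open interval [low, high)."""
    low: float
    high: float
    children: List["RangeNode"] = field(default_factory=list)

    def contains(self, value: float) -> bool:
        return self.low <= value < self.high


def label_path(root: RangeNode, value: float) -> List[RangeNode]:
    """Return the ranges containing `value`, from least to most specific."""
    if not root.contains(value):
        raise ValueError(f"{value} lies outside the root range")
    path = [root]
    node = root
    while node.children:
        # Assumption: sibling ranges partition their parent, so exactly
        # one child contains the value.
        node = next(c for c in node.children if c.contains(value))
        path.append(node)
    return path


# Hypothetical two-level hierarchy over a continuous label in [0, 100).
hierarchy = RangeNode(0, 100, [
    RangeNode(0, 50, [RangeNode(0, 25), RangeNode(25, 50)]),
    RangeNode(50, 100, [RangeNode(50, 75), RangeNode(75, 100)]),
])

path = [(n.low, n.high) for n in label_path(hierarchy, 62.0)]
print(path)  # [(0, 100), (50, 100), (50, 75)]
```

A prediction of the range [50, 100) would be correct for labels 62 and 80 alike, whereas the more specific [50, 75) is correct only for 62. This is the trade-off between prediction specificity and prediction accuracy that the abstract refers to.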

MeSH terms

  • Algorithms*
  • Artificial Intelligence*
  • Data Mining / methods*
  • Databases, Factual*
  • Decision Support Techniques*
  • Documentation / methods*
  • Pattern Recognition, Automated / methods*