Pattern Recognit. 2019 Jun;90:232-249. doi: 10.1016/j.patcog.2019.01.036. Epub 2019 Jan 29.

A Random Forests Quantile Classifier for Class Imbalanced Data.

Author information

1. Division of Biostatistics, University of Miami, Miami, FL 33136, USA.

Abstract

Extending previous work on quantile classifiers (q-classifiers), we propose the q*-classifier for the class imbalance problem. The classifier assigns a sample to the minority class if the minority class conditional probability exceeds q*, with 0 < q* < 1, where q* equals the unconditional probability of observing a minority class sample. The motivation for q*-classification stems from a density-based approach and leads to the useful property that the q*-classifier maximizes the sum of the true positive and true negative rates. Moreover, because the procedure can be equivalently expressed as a cost-weighted Bayes classifier, it also minimizes weighted risk. Because of this dual optimization, the q*-classifier can achieve near zero risk in imbalance problems, while simultaneously optimizing true positive and true negative rates. We use random forests to apply q*-classification. This new method, which we call RFQ, is shown to outperform or be competitive with existing techniques with respect to G-mean performance and variable selection. Extensions to the multiclass imbalanced setting are also considered.
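
The thresholding rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' RFQ implementation: the function names `q_star_classify` and `tpr_tnr_gmean` are hypothetical, the probabilities are made-up toy values standing in for estimates that RFQ would obtain from a random forest, and q* is estimated here by the empirical minority-class proportion. The G-mean is taken as the geometric mean of the true positive and true negative rates.

```python
from math import sqrt


def q_star_classify(probs, labels):
    """Threshold minority-class probabilities at q* (hypothetical helper).

    q* is estimated as the empirical proportion of minority samples,
    with the minority class coded as 1 and the majority class as 0.
    """
    q_star = sum(labels) / len(labels)
    preds = [1 if p > q_star else 0 for p in probs]
    return q_star, preds


def tpr_tnr_gmean(labels, preds):
    """Return (true positive rate, true negative rate, G-mean).

    Assumes both classes are present in `labels`.
    """
    pairs = list(zip(labels, preds))
    tp = sum(1 for y, yhat in pairs if y == 1 and yhat == 1)
    tn = sum(1 for y, yhat in pairs if y == 0 and yhat == 0)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    tpr = tp / n_pos
    tnr = tn / n_neg
    return tpr, tnr, sqrt(tpr * tnr)


# Toy data (assumed, not from the paper): 2 minority samples out of 10.
labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
# Stand-in estimates of P(Y = 1 | x); RFQ would obtain these from a forest.
probs = [0.45, 0.30, 0.25, 0.10, 0.05, 0.15, 0.10, 0.05, 0.12, 0.08]

q_star, q_preds = q_star_classify(probs, labels)
# Conventional Bayes rule for comparison: threshold at 0.5.
bayes_preds = [1 if p > 0.5 else 0 for p in probs]

q_metrics = tpr_tnr_gmean(labels, q_preds)
bayes_metrics = tpr_tnr_gmean(labels, bayes_preds)
print("q* =", q_star)
print("q*-rule    (TPR, TNR, G-mean):", q_metrics)
print("0.5-rule   (TPR, TNR, G-mean):", bayes_metrics)
```

On this toy data no estimated probability exceeds 0.5, so the 0.5 threshold assigns every sample to the majority class and its G-mean is zero, whereas thresholding at q* recovers both minority samples at the cost of one false positive.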

KEYWORDS:

Class Imbalance; Minority Class; Random Forests; Response-based Sampling; Weighted Bayes Classifier

PMID: 30765897
PMCID: PMC6370055 [Available on 2020-06-01]
DOI: 10.1016/j.patcog.2019.01.036
