Learning Multi-Instance Deep Discriminative Patterns for Image Classification

IEEE Trans Image Process. 2017 Jul;26(7):3385-3396. doi: 10.1109/TIP.2016.2642781. Epub 2016 Dec 21.

Abstract

An effective and efficient representation is essential for image classification. The most common approach is to extract a set of local descriptors and then aggregate them into a high-dimensional, more semantic feature vector, as in unsupervised bag-of-features models and weakly supervised part-based models. The latter are usually more discriminative than the former because they use information from image labels. In this paper, we propose a weakly supervised strategy that uses multi-instance learning (MIL) to learn discriminative patterns for image representation. Specifically, we extend traditional multi-instance methods to explicitly learn more than one pattern in the positive class and to find the "most positive" instance for each pattern. Furthermore, because the positiveness of an instance is treated as a continuous variable, we can use stochastic gradient descent to maximize the margin between different patterns while respecting the MIL constraints. To make the learned patterns more discriminative, we use local descriptors extracted by deep convolutional neural networks instead of hand-crafted descriptors. Experimental results on several widely used benchmarks (Action 40, Caltech 101, Scene 15, MIT-indoor, SUN 397) show that our method achieves strong performance.
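
The idea of selecting a "most positive" instance per pattern and updating patterns by stochastic gradient descent can be illustrated with a minimal NumPy sketch. This is not the formulation used in the paper: it assumes linear patterns, a plain hinge loss on the top-scoring instance of a bag (in the spirit of classic max-instance MIL), and L2 regularization. The function names `most_positive_instances` and `sgd_step`, and the parameters `lr` and `reg`, are hypothetical and chosen only for illustration.

```python
import numpy as np


def most_positive_instances(bag, patterns):
    """Return, for each pattern, the index and score of its top-scoring instance.

    bag:      (n_instances, dim) array of local descriptors from one image.
    patterns: (n_patterns, dim) array, one linear pattern per row (assumption).
    """
    scores = bag @ patterns.T            # (n_instances, n_patterns)
    best_idx = scores.argmax(axis=0)     # best instance for each pattern
    best_score = scores.max(axis=0)      # that instance's score
    return best_idx, best_score


def sgd_step(bag, label, patterns, k, lr=0.1, reg=1e-3):
    """One subgradient step for pattern k on a single bag.

    label is +1 if the bag should contain pattern k and -1 otherwise.
    The hinge loss is applied only to the bag's most positive instance,
    which is an illustrative simplification of the MIL constraints.
    """
    best_idx, best_score = most_positive_instances(bag, patterns)
    updated = patterns.copy()
    grad = reg * patterns[k]                     # L2 regularization term
    if label * best_score[k] < 1.0:              # margin violated
        grad = grad - label * bag[best_idx[k]]   # hinge subgradient
    updated[k] = patterns[k] - lr * grad
    return updated


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    bag = rng.normal(size=(5, 4))        # 5 descriptors of dimension 4
    patterns = rng.normal(size=(3, 4))   # 3 candidate patterns
    idx, score = most_positive_instances(bag, patterns)
    print("most positive instance per pattern:", idx)
    patterns = sgd_step(bag, label=+1, patterns=patterns, k=0)
```

In a pipeline like the one the abstract describes, the rows of `bag` would be local descriptors taken from a deep convolutional network, and the vector of per-pattern maximum scores could serve as the image representation; both choices are assumptions made for this sketch.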