Accelerated low-rank representation for subspace clustering and semi-supervised classification on large-scale data

Jicong Fan; Zhaoyang Tian; Mingbo Zhao; Tommy W S Chow

doi:10.1016/j.neunet.2018.01.014

Neural Netw. 2018 Apr:100:39-48. doi: 10.1016/j.neunet.2018.01.014. Epub 2018 Feb 2.

Authors

Jicong Fan¹, Zhaoyang Tian², Mingbo Zhao³, Tommy W S Chow⁴

Affiliations

¹ Department of Electronic Engineering, City University of Hong Kong, Tat Chee Avenue, Kowloon, Hong Kong Special Administrative Region. Electronic address: jicongfan2-c@my.cityu.edu.hk.
² Department of Electronic Engineering, City University of Hong Kong, Tat Chee Avenue, Kowloon, Hong Kong Special Administrative Region. Electronic address: zytian3-c@my.cityu.edu.hk.
³ School of Information Science and Technology, Donghua University, Shanghai, PR China. Electronic address: mzhao4@dhu.edu.cn.
⁴ Department of Electronic Engineering, City University of Hong Kong, Tat Chee Avenue, Kowloon, Hong Kong Special Administrative Region. Electronic address: eechow@cityu.edu.hk.

PMID: 29475014
DOI: 10.1016/j.neunet.2018.01.014

Abstract

Scaling low-rank representation (LRR) to large-scale data remains a major research issue, because computing a singular value decomposition (SVD) in each optimization iteration is extremely time-consuming, especially for large matrices. Several methods have been proposed to speed up LRR, but they remain computationally heavy, and their overall representation results were also found to be degraded. In this paper, a novel method called accelerated LRR (ALRR) is proposed for large-scale data. The proposed method integrates matrix factorization with nuclear-norm minimization to find a low-rank representation. The large square matrix of representation coefficients is transformed into a significantly smaller square matrix, on which SVD can be performed efficiently. The size of the transformed matrix does not depend on the number of data points, and the cost of optimizing ALRR is linear in the number of data points. The proposed ALRR is convex, accurate, robust, and efficient for large-scale data. ALRR is compared with state-of-the-art methods in subspace clustering and semi-supervised classification on real image datasets. The results verify the effectiveness and superiority of the proposed ALRR method.
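
The abstract does not state the ALRR formulation, so the following is only a minimal sketch of the general linear-algebra fact that makes this kind of size reduction possible, not the authors' algorithm. It assumes an n x n coefficient matrix that factors as Z = Q C Qᵀ, where Q has orthonormal columns and C is a small d x d matrix. Under that assumption the nuclear norm of Z equals that of C, and singular value thresholding can be done on C alone. The names `Q`, `C`, `n`, `d`, `nuclear_norm`, and `svt` are hypothetical and chosen for illustration.

```python
import numpy as np

def nuclear_norm(M):
    # Sum of singular values of M.
    return np.linalg.svd(M, compute_uv=False).sum()

def svt(M, tau):
    # Singular value thresholding: shrink each singular value by tau, floor at 0.
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

rng = np.random.default_rng(0)
n, d = 300, 10  # assumed sizes: n data points, small factor dimension d

# Q: n x d matrix with orthonormal columns (assumption of this sketch).
Q, _ = np.linalg.qr(rng.standard_normal((n, d)))
C = rng.standard_normal((d, d))  # small d x d matrix
Z = Q @ C @ Q.T                  # large n x n matrix of rank at most d

# The nuclear norm can be computed from the d x d matrix alone.
small = nuclear_norm(C)
large = nuclear_norm(Z)

# Thresholding the small matrix and mapping back matches thresholding Z.
tau = 0.5
Z_from_small = Q @ svt(C, tau) @ Q.T
Z_from_large = svt(Z, tau)

print(abs(small - large) < 1e-8)
print(np.allclose(Z_from_small, Z_from_large, atol=1e-8))
```

In this sketch the SVD runs on a d x d matrix whose size does not depend on n, while the products with Q cost time proportional to n for fixed d, which is consistent with the scaling behavior described in the abstract.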

Keywords: Large-scale data; Low-rank representation; Matrix factorization; Nuclear norm; Semi-supervised classification; Subspace clustering.

MeSH terms

Algorithms
Artificial Intelligence / classification
Cluster Analysis
Learning
Pattern Recognition, Visual / classification*
Statistics as Topic / classification*
Supervised Machine Learning / classification*