Manifold Coarse Graining for Online Semi-Supervised Learning
When labeled data are scarce, Semi-Supervised Learning (SSL) methods use unlabeled data to improve classification. Recently, many SSL methods based on the manifold assumption have been developed for the batch setting. However, when data arrive sequentially and in large quantities, both computation and storage become bottlenecks. In this paper, we present a new semi-supervised coarse graining (CG) algorithm that reduces the number of data points required to preserve the manifold structure. First, an equivalent formulation of Label Propagation (LP) is derived. Then, a novel spectral view of the Harmonic Solution (HS) is proposed. Finally, an algorithm that reduces the number of data points while preserving the manifold structure is provided, together with a theoretical analysis of how it preserves the properties of LP. Experimental results on real-world datasets show that the proposed method outperforms the state-of-the-art coarse graining algorithm in different settings.
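For readers unfamiliar with the two building blocks named above, the following is a minimal sketch of standard iterative label propagation on a small graph. Its fixed point on the unlabeled nodes is the harmonic solution. It is not the coarse graining algorithm proposed in this paper. The function name `propagate_labels`, its parameters, and the five-node chain graph are illustrative assumptions that do not come from the paper.

```python
def propagate_labels(weights, labels, n_iter=1000, tol=1e-10):
    """Iterative label propagation on a weighted graph (illustrative sketch).

    weights: symmetric n x n list of lists of non-negative edge weights.
    labels:  dict mapping a labeled node index to its real-valued label.
    Returns a list of n scores. Labeled nodes stay clamped to their labels;
    each unlabeled node converges to the weighted average of its neighbors,
    which is the defining property of the harmonic solution.
    """
    n = len(weights)
    # Start unlabeled nodes at 0.0 and labeled nodes at their given label.
    f = [labels.get(i, 0.0) for i in range(n)]
    for _ in range(n_iter):
        max_change = 0.0
        for i in range(n):
            if i in labels:
                continue  # labeled nodes are never updated
            degree = sum(weights[i])
            if degree == 0:
                continue  # isolated node: nothing to propagate
            # Replace the score by the weighted mean of the neighbors' scores.
            new_value = sum(weights[i][j] * f[j] for j in range(n)) / degree
            max_change = max(max_change, abs(new_value - f[i]))
            f[i] = new_value
        if max_change < tol:
            break
    return f


# Hypothetical example: a chain 0 - 1 - 2 - 3 - 4 with unit edge weights.
chain = [
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
]
# Only the two end points are labeled.
scores = propagate_labels(chain, {0: 1.0, 4: 0.0})
```

On this chain the scores of the unlabeled nodes interpolate linearly between the two labeled end points, approaching 0.75, 0.5, and 0.25. Every update touches all stored points, which illustrates why computation and storage grow with the number of retained data points in an online setting.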