Nonlinear Unsupervised Feature Learning: How Local Similarities Lead to Global Coding
This paper introduces a novel coding scheme based on the diffusion map framework. The idea is to run a t-step random walk on the data graph to capture the similarity of a data point to the codebook atoms. In doing so, we exploit local similarities extracted from the data structure to obtain a global similarity that takes the nonlinear structure of the data into account. Unlike locality-based and sparse coding methods, the proposed coding varies smoothly with respect to the underlying manifold. We extend this transductive approach to an inductive variant, which is of great interest for large-scale datasets. We also present a method for codebook generation that coarse-grains the data graph with the aim of preserving random walks. Experiments on synthetic and real data sets demonstrate the superiority of the proposed coding scheme over state-of-the-art coding techniques, especially in a semi-supervised setting where the amount of labeled data is small.
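To make the central idea concrete, the following is a minimal sketch of coding by a t-step random walk, written under several assumptions that the abstract does not fix: the data graph is built with a Gaussian kernel of bandwidth `sigma`, the codebook atoms are taken to be a subset of the data points (indexed by `atom_idx`), and the code of a point is read off as its t-step transition probabilities to those atoms. The function names and parameters are illustrative only and may differ from the construction used in the paper.

```python
import numpy as np


def gaussian_affinity(X, sigma=1.0):
    """Dense Gaussian affinity matrix (assumed kernel; other choices are possible)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    d2 = np.maximum(d2, 0.0)  # guard against small negative values from round-off
    return np.exp(-d2 / (2.0 * sigma ** 2))


def random_walk_codes(X, atom_idx, t=3, sigma=1.0):
    """Code each point by its t-step transition probabilities to the atoms.

    Assumption: atoms are data points, so they are nodes of the same graph.
    """
    W = gaussian_affinity(X, sigma)
    P = W / W.sum(axis=1, keepdims=True)   # row-stochastic one-step transition matrix
    Pt = np.linalg.matrix_power(P, t)      # t-step transition probabilities
    return Pt[:, atom_idx]                 # one row of coefficients per data point


# Illustrative usage on random 2-D points with three hypothetical atoms.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 2))
codes_demo = random_walk_codes(X_demo, atom_idx=[0, 10, 20], t=3, sigma=0.5)
```

In this sketch, one step of the walk uses only local similarities, while raising the transition matrix to the power t lets probability mass propagate along the graph, which is one way a global similarity can emerge from local ones. The dense matrix power scales poorly with the number of points, so it is meant only as an illustration of the transductive setting.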