Understanding Sparse Coding
Sparse Coding Series:
- (i) Spatial Pyramid summary
- (ii) Sparse representation of images: summary of ScSPM and LLC
- (iii) Understanding sparse coding
- (iv) Sparse models and structured sparse models
---------------------------------------------------------------------------
The content of this article mainly comes from Kai Yu's tutorial at CVPR 2012. When summarizing ScSPM and LLC earlier, I quoted many figures from that tutorial. The tutorial is actually well written, so this time I will describe it roughly in my own words. Sparse coding was the hot topic a couple of years ago; now the hot topic is deep learning.
1. What is sparse coding?
In 1988, the concept of neural sparse coding was proposed by Mitchison and formally introduced by Rolls of Oxford University. Electrophysiological experiments on the visual cortex of primates and cats, together with the results of related models, show that the representation of complex stimuli in the visual cortex follows the sparse coding principle. Studies show that the fourth layer of area V1 of the primary visual cortex contains about 50 million neurons (corresponding to the basis functions), while the retina and lateral geniculate body responsible for visual perception contain only about 1 million neurons (understood as the output neurons). This suggests that sparse coding is an effective strategy for the distributed representation of neural information. In 1996, Olshausen of UC Berkeley published a paper in Nature showing that the basis functions obtained by sparse coding of natural images resemble the receptive fields of simple cells in V1 (spatial locality, orientation selectivity, and information selectivity).
A typical sparse coding pipeline is divided into a training phase and a coding (testing) phase.
Training: Given some training samples [x1, x2, ..., xM] (each in R^d), learn a dictionary of bases [φ1, φ2, ...] (also in R^d). This can be done with unsupervised methods such as K-means, or with an optimization method (in which case training also yields the codes of the training samples; the optimization loops between a lasso problem and a QP problem);
Coding: With the dictionary now learned, the code of a test sample is obtained by optimization. The classic approach is to solve a lasso problem:
$$ a^{*} = \arg\min_{a}\ \Big\| x - \sum_{i} a_i \varphi_i \Big\|_2^2 + \lambda \|a\|_1 \tag{1} $$
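Below is a minimal sketch of this train/encode split (my own illustration, not the tutorial's code), assuming scikit-learn's DictionaryLearning and SparseCoder as stand-ins: the dictionary is learned from training samples, and codes for new samples are then obtained by solving the lasso in (1).

```python
import numpy as np
from sklearn.decomposition import DictionaryLearning, SparseCoder

rng = np.random.RandomState(0)
X_train = rng.randn(200, 64)   # M = 200 training samples x_i in R^d, d = 64
X_test = rng.randn(10, 64)     # samples to be encoded later

# Training: learn an overcomplete dictionary; the solver alternates between
# updating the codes (a lasso step) and updating the bases.
dico = DictionaryLearning(n_components=128, alpha=1.0, max_iter=30,
                          transform_algorithm='lasso_lars', random_state=0)
dico.fit(X_train)
bases = dico.components_       # the bases [phi_1, ..., phi_K], K = 128 > d

# Coding: with the dictionary fixed, solve the lasso in (1) for each test sample.
coder = SparseCoder(dictionary=bases, transform_algorithm='lasso_lars',
                    transform_alpha=1.0)
codes = coder.transform(X_test)
print(codes.shape, np.mean(codes != 0))  # code dimension and fraction of non-zeros
```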
Self-taught learning uses a large number of unlabeled natural images to train the dictionary, and then encodes the target images with it to obtain their feature codes.
2. Connections to RBMs and autoencoders
Formula (1) (classic sparse coding) has several characteristics:
- the code a is sparse;
- the dimension of a is generally higher than the dimension of x;
- the encoding process a = f(x) is a nonlinear, implicit function of x (i.e., we have no explicit expression for f(x), because the lasso has no closed-form solution);
- the reconstruction process x' = g(a) is a linear, explicit function of a (x' = Σ a_i φ_i); see the sketch below this list.
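As a quick check of the last two points, here is a small illustration (my own, again using scikit-learn's SparseCoder as an assumed stand-in for the lasso coder): the encoder a = f(x) does not obey superposition, while the reconstruction x' = Σ a_i φ_i is a plain linear function of a.

```python
import numpy as np
from sklearn.decomposition import SparseCoder

rng = np.random.RandomState(1)
D = rng.randn(128, 64)                            # 128 bases phi_i in R^64
D /= np.linalg.norm(D, axis=1, keepdims=True)     # unit-norm atoms
coder = SparseCoder(dictionary=D, transform_algorithm='lasso_lars',
                    transform_alpha=0.5)

x1, x2 = rng.randn(64), rng.randn(64)
a1 = coder.transform(x1[None, :])[0]
a2 = coder.transform(x2[None, :])[0]
a12 = coder.transform((x1 + x2)[None, :])[0]

# Encoding is nonlinear and implicit: f(x1 + x2) generally != f(x1) + f(x2).
print(np.allclose(a12, a1 + a2))                  # usually False

# Reconstruction is linear and explicit: x' = sum_i a_i * phi_i = a @ D.
x1_rec = a1 @ D
print(np.linalg.norm(x1 - x1_rec))                # reconstruction error
```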
The characteristics of RBMs and autoencoders are:
- there is an explicit f(x);
- a is not necessarily sparse, but adding a sparsity constraint (as in sparse autoencoders and sparse RBMs) usually gives better results (which further suggests that sparsity helps learning).
Broadly speaking, any encoding a = f(x) that satisfies the following conditions can be called sparse coding:
1) a is sparse and usually has a higher dimension than x;
2) f(x) is a nonlinear mapping; (jiang1st2010's note: this clause is questionable, see the explanation below.)
3) there is a reconstruction process x' = g(a) such that the reconstructed x' is similar to x.
Therefore sparse RBMs, sparse autoencoders, and even VQ can all be regarded as forms of sparse coding. (jiang1st2010's note: the second condition requires f(x) to be a nonlinear mapping, yet the VQ used in SPM is a linear mapping; the reasons can be found here and here. Kai Yu is also an author of the LLC paper, so does he contradict himself? But this is a minor point and not worth dwelling on.)
3. Sparse activations vs. sparse models
We can now use a = f(x) to describe the sparse coding problem. It breaks down into two cases:
1) Sparse model: the parameters of f(x) are sparse
- for example, LASSO: f(x) = <w, x>, where w is required to be sparse. (jiang1st2010's note: f(x) is also linear in this example!)
- this is a feature-selection problem: every x selects the same subset of features;
- a hot research topic.
2) Sparse activation: the output of f(x) is sparse
- that is, a is sparse;
- this is a feature-learning problem: different x activate different subsets of features (see the sketch after this list).
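A hedged sketch of this contrast (my own example, assuming scikit-learn's Lasso and SparseCoder): the first case makes the parameters w sparse, the second makes the codes a sparse.

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.decomposition import SparseCoder

rng = np.random.RandomState(2)
X = rng.randn(100, 64)
y = X[:, :5] @ rng.randn(5) + 0.01 * rng.randn(100)   # only 5 features matter

# 1) Sparse model: f(x) = <w, x> with sparse parameters w -> feature selection;
#    the same subset of input features is used for every x.
model = Lasso(alpha=0.1).fit(X, y)
print("non-zero weights:", np.sum(model.coef_ != 0))

# 2) Sparse activation: the output a = f(x) is sparse -> feature learning;
#    different x activate different subsets of dictionary atoms.
D = rng.randn(128, 64)
D /= np.linalg.norm(D, axis=1, keepdims=True)
coder = SparseCoder(dictionary=D, transform_algorithm='lasso_lars',
                    transform_alpha=0.5)
codes = coder.transform(X[:2])
print("active atoms per sample:",
      [np.flatnonzero(c).tolist() for c in codes])    # generally different sets
```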
4. Sparsity vs. locality
This question has in fact already been discussed here. In short, sparsity does not necessarily imply locality, but locality always implies sparsity. Sparsity alone is not as good as locality, because locality comes with smoothness (i.e., the codes f(x) of nearby x are also nearby), whereas sparsity by itself cannot guarantee smoothness. Smoothness is a better property for classification, so f(x) should be designed such that similar x have similar non-zero dimensions in their codes.
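To make the last sentence concrete, here is a toy locality-based coder (my own simplification in the spirit of LLC, not the paper's algorithm): each x is encoded only over its k nearest dictionary atoms, so nearby inputs tend to share the same non-zero dimensions.

```python
import numpy as np

def local_code(x, D, k=5):
    """Encode x with least squares over its k nearest atoms of dictionary D."""
    idx = np.argsort(np.linalg.norm(D - x, axis=1))[:k]   # k nearest bases (locality)
    sol, _, _, _ = np.linalg.lstsq(D[idx].T, x, rcond=None)
    a = np.zeros(D.shape[0])
    a[idx] = sol
    return a

rng = np.random.RandomState(3)
D = rng.randn(128, 64)
x = rng.randn(64)
x_near = x + 0.01 * rng.randn(64)                          # a nearby sample

a, a_near = local_code(x, D), local_code(x_near, D)
# Nearby inputs select (almost) the same nearest atoms, hence the same support.
print(set(np.flatnonzero(a)) == set(np.flatnonzero(a_near)))  # usually True
```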
The tutorial shows how the learned dictionary bases look under different values of λ:
The point the author wants to make is that the better the classification performance, the more clearly each basis appears to belong to a particular class. But I could not quite see this myself.
5. Hierarchical sparse coding
Figure 3 here illustrates that SIFT itself is a coding + pooling process, so SPM is really a two-layer coding + pooling pipeline. In hierarchical sparse coding (HSC), both coding layers use sparse coding, for example:
The whole HSC starts at the pixel level in its first layer (no hand-crafted SIFT features), and the final codes are formed after two layers of sparse coding. The whole process can be learned from unlabeled data, i.e., self-taught learning. Starting from the pixel level makes it very much like a DNN.
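Here is a toy sketch of the two-layer coding + pooling idea (my own simplification, not the paper's implementation, assuming scikit-learn's DictionaryLearning): layer 1 sparse-codes raw pixel patches and max-pools them locally, and layer 2 sparse-codes the pooled layer-1 descriptors.

```python
import numpy as np
from sklearn.decomposition import DictionaryLearning

rng = np.random.RandomState(4)
patches = rng.rand(500, 8 * 8)                        # raw 8x8 pixel patches

# Layer 1: sparse-code the pixel patches (no hand-crafted SIFT features).
layer1 = DictionaryLearning(n_components=64, alpha=1.0, max_iter=20,
                            transform_algorithm='lasso_lars', random_state=0)
codes1 = layer1.fit_transform(patches)                # (500, 64) sparse codes

# Pooling: max-pool groups of 4 neighbouring patches into one descriptor.
pooled = codes1.reshape(125, 4, 64).max(axis=1)       # (125, 64)

# Layer 2: sparse-code the pooled layer-1 descriptors to form the final codes.
layer2 = DictionaryLearning(n_components=128, alpha=1.0, max_iter=20,
                            transform_algorithm='lasso_lars', random_state=0)
codes2 = layer2.fit_transform(pooled)                 # final HSC-style codes
print(codes2.shape)
```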
As a result, HSC achieves slightly better performance than SIFT + SC.
The tutorial lists some other topics on sparse coding at the end; I do not understand them well, so I will not ramble about them here.
-----------------
jiang1st2010
Please indicate the source when reprinting: http://blog.csdn.net/jwh_bupt/article/details/9902949