Deep Learning Study Notes Series
[Email protected]
http://blog.csdn.net/zouxy09
zouxy09
Version 1.0  2013-04-08
Statement:
1) This Deep Learning study series is compiled from material selflessly shared online by many experts and machine learning researchers. Please see the references for the specific sources; detailed copyright statements should also be taken from the original documents.
2) This article is for academic exchange only and is non-commercial, so the specific references for each part are not listed in exact correspondence. If any part inadvertently infringes on anyone's rights, please forgive me and contact the blogger to have it removed.
3) My knowledge is limited, and mistakes are inevitable in compiling and summarizing this material; I hope readers will kindly point them out. Thank you.
4) Reading this article requires some background in machine learning, computer vision, neural networks, and so on (if you don't have it, that's fine; read on and see how much you can follow, hehe).
5) This is the first version; errors will need to be corrected and content revised over time. I welcome everyone's suggestions. If we each share a little, together we can help advance scientific research (hehe, what a noble goal). Please contact: [Email protected]
Contents:
I. Overview
II. Background
III. The visual mechanism of the human brain
IV. About features
4.1 The granularity of feature representation
4.2 Primary (shallow) feature representation
4.3 Structured feature representation
4.4 How many features do we need?
V. The basic idea of deep learning
VI. Shallow learning and deep learning
VII. Deep learning and neural networks
VIII. The deep learning training process
8.1 The training method of traditional neural networks
8.2 The deep learning training process
IX. Common models and methods of deep learning
9.1 AutoEncoder
9.2 Sparse Coding
9.3 Restricted Boltzmann Machine (RBM)
9.4 Deep Belief Networks (DBN)
9.5 Convolutional Neural Networks (CNN)
X. Summary and outlook
XI. References and deep learning resources
Continued from the previous part.
9.2 Sparse Coding
Suppose we relax the requirement that the output must exactly equal the input, and at the same time borrow the concept of a basis from linear algebra, i.e. O = a1*Φ1 + a2*Φ2 + … + an*Φn, where the Φi are bases and the ai are coefficients. We then arrive at the optimization problem:
Min |I – O|, where I is the input and O is the output.
By solving this optimization problem we obtain the coefficients ai and the bases Φi, which together form another, approximate representation of the input.
They can therefore be used to express the input I, and this representation is learned automatically. If we add an L1 sparsity constraint to the above expression, we get:
Min |I – O| + u*(|a1| + |a2| + … + |an|)
This method is called Sparse Coding. In plain words, a signal is expressed as a linear combination of a set of bases, with the requirement that only a few bases are needed to represent it. "Sparsity" is defined as: only very few non-zero elements, or only very few elements that are significantly larger than zero. Requiring the coefficients ai to be sparse means that, for a given set of input vectors, we want as few coefficients as possible to be significantly larger than zero. There is a reason for choosing sparse components to represent our input data: the vast majority of sensory data, such as natural images, can be represented as a superposition of a small number of basic elements, which in an image can be surfaces or lines. At the same time, this strengthens the analogy with the primary visual cortex (the human brain has a large number of neurons, but for a particular image or edge only very few neurons are excited while the rest remain inhibited).
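As a small illustration of the objective above, here is a minimal Python/NumPy sketch (function and variable names are made up for this note, not part of the original text) that evaluates the reconstruction term |I – O| plus the L1 penalty for a given set of bases and coefficients; |·| is taken here as the Euclidean norm, which is one common reading of the formula:

import numpy as np

def sparse_coding_cost(x, phi, a, u=0.1):
    """Evaluate the sparse-coding objective  |I - O| + u * (|a1| + ... + |an|).

    x   : input signal I, shape (d,)
    phi : matrix whose columns are the bases phi_i, shape (d, n)
    a   : coefficient vector (a1, ..., an), shape (n,)
    u   : weight of the L1 sparsity penalty
    """
    o = phi @ a                                   # O = a1*phi_1 + ... + an*phi_n
    reconstruction_error = np.linalg.norm(x - o)  # |I - O|
    sparsity_penalty = u * np.sum(np.abs(a))      # u * (|a1| + ... + |an|)
    return reconstruction_error + sparsity_penalty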
The sparse coding algorithm is an unsupervised learning method that finds a set of "over-complete" basis vectors to represent the sample data more efficiently. Although techniques such as Principal Component Analysis (PCA) let us conveniently find a set of "complete" basis vectors, what we want here is an "over-complete" set of basis vectors to represent the input vectors (that is, the number of basis vectors is greater than the dimensionality of the input vectors). The advantage of an over-complete basis is that it can capture the structures and patterns hidden in the input data more effectively. However, with an over-complete basis the coefficients ai are no longer uniquely determined by the input vector alone. Therefore, in the sparse coding algorithm we add the criterion of "sparsity" to resolve the degeneracy caused by over-completeness. (For details, see the UFLDL Tutorial on Sparse Coding.)
For example, to build an edge detector at the lowest level of feature extraction from images, the job here is to randomly select some small patches from natural images and, from these patches, learn a set of "bases" that can describe them, namely the 8*8 = 64 bases shown on the right (in the illustration in the original post). Then, given a test patch, we can express it as a linear combination of these bases according to the formula above; the sparse coefficient vector is a, which has 64 dimensions, of which only 3 entries are non-zero, hence "sparse".
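To make the patch example concrete, here is a toy sketch under assumed data (random "bases" standing in for ones actually learned from natural images): 64 flattened 8*8 bases are stacked into a dictionary, and a patch is expressed with a coefficient vector in which only 3 of the 64 entries are non-zero; the indices and values below are arbitrary illustrations:

import numpy as np

rng = np.random.default_rng(0)

# 64 "bases" for 8x8 image patches, each flattened to a 64-dim column.
# In the real procedure these would be learned from random natural-image patches.
basis = rng.standard_normal((64, 64))

# A sparse coefficient vector: 64-dimensional, but only 3 entries are non-zero.
a = np.zeros(64)
a[[5, 17, 42]] = [0.9, -0.4, 0.6]

# The patch is represented as a linear combination of just those 3 bases.
patch = basis @ a
print(np.count_nonzero(a), "non-zero coefficients out of", a.size)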
A question may arise here: why is the bottom layer an edge detector? And what are the upper layers? A simple explanation makes it clear: the bottom layer is an edge detector because edges in different directions can describe the whole image, so edges in different directions are naturally the bases of the image... and the combinations of one layer's bases become the bases of the layer above, and so on (this is what we discussed in Part IV above).
Sparse coding consists of two stages:
1) Training stage: given a series of sample images [x1, x2, …], we learn a set of bases [Φ1, Φ2, …], i.e. the dictionary.
Sparse coding is a variant of the k-means algorithm, and its training process is similar (the idea of the EM algorithm: if the objective function to be optimized involves two sets of variables, say L(W, B), we can first fix W and adjust B to minimize L, then fix B and adjust W to minimize L, iterating alternately and continually pushing L toward its minimum. For the EM algorithm, see my blog post "A gentle explanation from maximum likelihood to the EM algorithm").
The training process is an iterative one. As stated above, we alternately adjust a and Φ so that the objective function (the L1-regularized reconstruction cost given above, summed over all training samples) becomes as small as possible.
Each iteration is divided into two steps:
a) Fix the dictionary Φ[k], then adjust a[k] so that the expression above, i.e. the objective function, is minimized (this is a LASSO problem).
b) Then fix a[k] and adjust Φ[k] so that the expression above, i.e. the objective function, is minimized (this is a convex QP problem).
Keep iterating until convergence. This yields a set of bases that represent this series of x well, i.e. the dictionary.
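Below is a rough sketch of this alternating loop, assuming the L1-regularized least-squares form of the objective above; it uses scikit-learn's Lasso for step a) and an ordinary least-squares update (one simple way to handle the convex step b)), with the bases renormalized each round to keep the over-complete problem from degenerating. All names and parameter values are illustrative, not the original author's code:

import numpy as np
from sklearn.linear_model import Lasso

def train_dictionary(X, n_bases=64, n_iter=20, alpha=0.1, seed=0):
    """Alternating minimization for sparse coding (illustrative sketch only).

    X : training samples as columns, shape (d, n_samples)
    Returns the dictionary Phi (d, n_bases) and the codes A (n_bases, n_samples).
    """
    rng = np.random.default_rng(seed)
    d, n_samples = X.shape
    Phi = rng.standard_normal((d, n_bases))
    Phi /= np.linalg.norm(Phi, axis=0)                # start with unit-norm bases

    A = np.zeros((n_bases, n_samples))
    lasso = Lasso(alpha=alpha, fit_intercept=False, max_iter=5000)

    for _ in range(n_iter):
        # a) Fix the dictionary Phi, adjust the codes A (one LASSO problem per sample).
        for j in range(n_samples):
            lasso.fit(Phi, X[:, j])
            A[:, j] = lasso.coef_

        # b) Fix the codes A, adjust the dictionary Phi (least-squares update).
        Phi = np.linalg.lstsq(A.T, X.T, rcond=None)[0].T
        Phi /= np.linalg.norm(Phi, axis=0) + 1e-12    # renormalize to avoid degeneracy

    return Phi, A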
2) Coding stage: given a new image x, use the dictionary obtained above and solve a LASSO problem to obtain the sparse vector a. This sparse vector is the sparse representation of the input vector x.
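The coding stage reuses only step a): given the trained dictionary and a new input x, solve a single LASSO problem. Again a hedged sketch with scikit-learn; the helper name and parameters are made up for illustration:

from sklearn.linear_model import Lasso

def encode(x, Phi, alpha=0.1):
    """Coding stage: given a trained dictionary Phi (d, n_bases) and a new
    input x (d,), solve one LASSO problem to obtain its sparse code a."""
    lasso = Lasso(alpha=alpha, fit_intercept=False, max_iter=5000)
    lasso.fit(Phi, x)
    return lasso.coef_   # the sparse vector a: the representation of x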
For example:
To be continued:
Deep Learning Study Notes Series (V)