Deep Learning Learning Notes Series (IV)


Deep Learning Learning Notes Series

[Email protected]

http://blog.csdn.net/zouxy09

Zouxy

Version 1.0 2013-04-08

Statement:

1) This Deep Learning series collects material generously shared online by leading experts and machine learning researchers. Please see the references for the specific sources; version statements for each part can also be found in the original literature.

2) This article is for academic exchange only and is non-commercial, so the references for each specific part are not matched in detail. If any part inadvertently infringes on anyone's interests, please forgive the oversight and contact the blogger for deletion.

3) My knowledge is limited, so errors are inevitable in a summary like this; corrections from more experienced readers are welcome. Thank you.

4) Reading this article requires some background in machine learning, computer vision, neural networks, and so on (if you lack it, that's fine; you should still be able to follow along).

5) This is the first version; any errors will be revised over time, and suggestions are very welcome. If we each share a little, together we can advance scientific research (a noble goal indeed). Please contact: [Email protected]

Contents:

I. Overview

II. Background

III. The Visual Mechanism of the Human Brain

IV. About Features

4.1 The granularity of feature representation

4.2 Primary (shallow) feature representation

4.3 Structured feature representation

4.4 How many features are needed?

V. The Basic Idea of Deep Learning

VI. Shallow Learning and Deep Learning

VII. Deep Learning and Neural Networks

VIII. The Deep Learning Training Process

8.1 Training methods of traditional neural networks

8.2 The deep learning training process

IX. Common Models and Methods of Deep Learning

9.1 AutoEncoder

9.2 Sparse Coding

9.3 Restricted Boltzmann Machine (RBM)

9.4 Deep Belief Networks

9.5 Convolutional Neural Networks

X. Summary and Outlook

XI. References and Deep Learning Resources

(Continued from the previous part.)

IX. Common Models and Methods of Deep Learning

9.1 AutoEncoder

The simplest approach to deep learning exploits the layered structure of artificial neural networks (ANNs). Given a neural network, suppose we require its output to equal its input, and then train it to adjust its parameters, obtaining the weights of each layer. Naturally, we then have several different representations of the input I (each layer yields one representation), and these representations are the features. An AutoEncoder is a neural network that reproduces its input signal as faithfully as possible. To achieve this, the AutoEncoder must capture the most important factors that represent the input data, much as PCA finds the principal components that represent the original information.

The specific process is described as follows:

1) Given unlabeled data, learn features with unsupervised learning:

In conventional neural networks, as in the first diagram, the input samples are labeled, i.e. (input, target), so we adjust the parameters of the preceding layers according to the difference between the current output and the target (label) until the network converges. But now we only have unlabeled data, as in the diagram on the right. So where does the error come from?

For example, if we feed the input into an encoder, we obtain a code, which is a representation of the input. But how do we know this code actually represents the input? We add a decoder, which produces an output; if this output is very similar to the original input signal (ideally identical), we have good reason to believe the code is reliable. So, by adjusting the parameters of the encoder and decoder to minimize the reconstruction error, we obtain the first representation of the input signal: the code. Because there is no labeled data, the error comes directly from comparing the reconstruction with the original input.
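To make this concrete, here is a minimal sketch of such an encoder/decoder pair trained on reconstruction error. It is not from the original article: it uses PyTorch (which postdates this post), and the layer sizes, sigmoid activations, learning rate, and random stand-in data are all illustrative assumptions.

```python
import torch
import torch.nn as nn

input_dim, code_dim = 784, 128  # e.g. flattened 28x28 images; sizes are assumptions

encoder = nn.Sequential(nn.Linear(input_dim, code_dim), nn.Sigmoid())
decoder = nn.Sequential(nn.Linear(code_dim, input_dim), nn.Sigmoid())

opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)

x = torch.rand(64, input_dim)   # stand-in for a batch of unlabeled data

for step in range(100):
    code = encoder(x)           # the learned representation ("code")
    recon = decoder(code)       # the decoder tries to reproduce the input
    loss = nn.functional.mse_loss(recon, x)  # reconstruction error vs. the input itself
    opt.zero_grad()
    loss.backward()
    opt.step()
```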

2) Generate features with the encoder, then use them to train the next layer, proceeding layer by layer:

We now have the code of the first layer, and since the reconstruction error is minimal, we can believe that this code is a good expression of the original input signal, or, stretching the point, that it is identical to the original signal (a different expression of the same thing). Training the second layer is no different from training the first: we take the output code of the first layer as the input signal of the second layer, again minimize the reconstruction error, and obtain the parameters of the second layer along with the code for the second layer's input, which is the second expression of the original input. The remaining layers are produced the same way (while training a given layer, the parameters of the preceding layers stay fixed, and their decoders are no longer needed and can be discarded).
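A sketch of this greedy layer-wise procedure under the same assumptions (PyTorch; illustrative depth and layer sizes): each trained encoder's code becomes the next layer's input, earlier layers stay fixed, and the decoders are discarded.

```python
import torch
import torch.nn as nn

def train_layer(x, code_dim, steps=100):
    """Train one AutoEncoder layer on x; keep the encoder, discard the decoder."""
    enc = nn.Sequential(nn.Linear(x.shape[1], code_dim), nn.Sigmoid())
    dec = nn.Sequential(nn.Linear(code_dim, x.shape[1]), nn.Sigmoid())
    opt = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), lr=1e-3)
    for _ in range(steps):
        loss = nn.functional.mse_loss(dec(enc(x)), x)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return enc

x = torch.rand(64, 784)              # stand-in for unlabeled data
encoders = []
for code_dim in (256, 128, 64):      # three layers; sizes are assumptions
    enc = train_layer(x, code_dim)
    encoders.append(enc)
    with torch.no_grad():            # earlier layers are fixed from here on
        x = enc(x)                   # this layer's code feeds the next layer
```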

3) Supervised fine-tuning:

Through the above method, we can obtain many layers. (As for how many layers, i.e. how much depth, are needed, there is currently no principled way to decide; it must be determined by experiment.) Each layer provides a different representation of the original input. Of course, we believe that the more abstract the representation, the better, as in the human visual system.

At this point, the AutoEncoder cannot yet be used for classification, because it has not learned how to link an input to a class; it has only learned how to reconstruct or reproduce its input. In other words, it has only learned a feature that represents the input well, one that captures what is most important in the original input signal. To perform classification, we can add a classifier (e.g. logistic regression, an SVM, etc.) on top of the topmost coding layer of the AutoEncoder, and then train it with the standard supervised training method for multi-layer neural networks (gradient descent).

That is, we feed the feature code of the final layer into the classifier and fine-tune it with labeled samples via supervised learning. There are two options. One is to adjust only the classifier (the black part in the figure):

The other is to fine-tune the entire system with labeled samples (this works best when there is enough data; this is end-to-end learning):

Once supervised training is complete, the network can be used for classification. The top layer of the network acts as a linear classifier, and we can then replace it with a better-performing classifier.
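Here is a sketch of this fine-tuning stage, again with assumed sizes and a softmax (logistic-regression-style) classifier on the topmost code layer; the classifier_only flag switches between the two options above (adjust only the classifier vs. fine-tune end to end).

```python
import torch
import torch.nn as nn

# Pretrained encoder stack (initialized fresh here for brevity; in practice
# these weights would come from the layer-wise pretraining sketched above).
features = nn.Sequential(
    nn.Linear(784, 256), nn.Sigmoid(),
    nn.Linear(256, 64), nn.Sigmoid(),
)
classifier = nn.Linear(64, 10)       # 10 classes assumed
model = nn.Sequential(features, classifier)

def fine_tune(x, y, classifier_only, steps=100):
    # Option 1 updates only the classifier; option 2 updates the whole stack.
    params = classifier.parameters() if classifier_only else model.parameters()
    opt = torch.optim.Adam(params, lr=1e-4)
    for _ in range(steps):
        loss = nn.functional.cross_entropy(model(x), y)
        opt.zero_grad()
        loss.backward()
        opt.step()

x = torch.rand(64, 784)              # stand-in for a labeled fine-tuning set
y = torch.randint(0, 10, (64,))
fine_tune(x, y, classifier_only=True)    # option 1: train only the classifier
fine_tune(x, y, classifier_only=False)   # option 2: end-to-end fine-tuning
```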

Studies have found that augmenting the original features with these automatically learned features can greatly improve accuracy, even outperforming the current best classification algorithms!

There are several variants of the AutoEncoder; here is a brief introduction to two of them:

Sparse AutoEncoder:

Of course, we can keep adding constraints to obtain new deep learning methods. For example, if we add an L1 regularization constraint to the AutoEncoder (L1 mainly constrains most of the nodes in each layer to be 0, with only a few nonzero; this is the origin of the name "sparse"), we obtain the Sparse AutoEncoder method.

In effect, we limit the expression code to be as sparse as possible, because sparse expressions are often more effective than other expressions (the human brain seems to work similarly: a given input stimulates only certain neurons, while most other neurons are suppressed).
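A sketch of the sparse variant under the same assumptions: the only change from the plain AutoEncoder above is an L1 penalty on the code, whose weight lam is an illustrative value.

```python
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Linear(784, 128), nn.Sigmoid())
decoder = nn.Sequential(nn.Linear(128, 784), nn.Sigmoid())
opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)
lam = 1e-3                            # sparsity weight (assumed value)

x = torch.rand(64, 784)               # stand-in for unlabeled data
for _ in range(100):
    code = encoder(x)
    sparsity = code.abs().mean()      # L1 term: drives most code units toward 0
    loss = nn.functional.mse_loss(decoder(code), x) + lam * sparsity
    opt.zero_grad()
    loss.backward()
    opt.step()
```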

Denoising AutoEncoder:

The Denoising AutoEncoder (DA) builds on the AutoEncoder by adding noise to the training data, so the AutoEncoder must learn to remove this noise and recover the true, uncorrupted input. This forces the encoder to learn a more robust expression of the input signal, which is why a DA generalizes better than an ordinary encoder. A DA can be trained with the gradient descent algorithm.
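A sketch of a Denoising AutoEncoder under the same assumptions, using masking noise (randomly zeroing a fraction of the inputs, an assumed corruption choice): the encoder sees the corrupted input, but the reconstruction error is measured against the clean one.

```python
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Linear(784, 128), nn.Sigmoid())
decoder = nn.Sequential(nn.Linear(128, 784), nn.Sigmoid())
opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)

x = torch.rand(64, 784)                              # stand-in for clean data
for _ in range(100):
    noisy = x * (torch.rand_like(x) > 0.3).float()   # randomly zero ~30% of inputs
    recon = decoder(encoder(noisy))                  # encode the corrupted input...
    loss = nn.functional.mse_loss(recon, x)          # ...but reconstruct the clean one
    opt.zero_grad()
    loss.backward()
    opt.step()
```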

(To be continued.)
