Notes on convolutional neural networks

This is an article from 2006, but many parts of it are still worth a look.

I. Summary

These notes cover the main points of the CNN feedforward pass and backpropagation pass; the key part is the explanation of the BP derivation for the convolutional and pooling layers.

II. The classical BP algorithm

In the forward pass, the thing to pay attention to is data normalization: normalizing the training data to zero mean and unit variance improves gradient descent because it prevents the units from saturating prematurely. This was mainly a drawback of the early sigmoid and tanh activation functions (when the input is too large or too small, the gradient is tiny); now that we have the two workhorses ReLU and batch normalization, the vanishing-gradient problem is largely under control.
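
As a small illustration of that normalization step, here is a minimal sketch (my own illustrative code, not from the paper; `X` is assumed to hold one training sample per row):

```python
import numpy as np

def standardize(X, eps=1e-8):
    """Normalize training data to zero mean and unit variance, per feature."""
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    return (X - mean) / (std + eps)  # eps guards against constant features
```

Keeping the inputs in this range keeps sigmoid/tanh units away from their flat, saturated regions, which is exactly the premature-saturation issue mentioned above.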

Then comes the BP algorithm itself. The part of the paper that is a little hard to follow is the derivation of Equation 5; the explanation below is quoted (translated) from http://www.cnblogs.com/shouhuxianjian/p/4529202.html:

the "error" in the network where we need the back propagation can be thought of as "sensitivity" to each unit with biased disturbances. Other words:

$\frac{\partial E}{\partial b} = \frac{\partial E}{\partial u}\frac{\partial u}{\partial b} = \delta$    (Equation 4)

Since $\partial u / \partial b = 1$, the sensitivity with respect to the bias is simply the derivative of the error with respect to a unit's total input. Backpropagating this from a higher layer to a lower one gives:

$\delta^{\ell} = \left(W^{\ell+1}\right)^{T} \delta^{\ell+1} \circ f'(u^{\ell})$    (Equation 5)

The left-hand side is the derivative of the error with respect to the input $u^{\ell}$ of layer $\ell$: to obtain it, the sensitivities of layer $\ell+1$ are propagated back through the weights $W^{\ell+1}$ and then multiplied by the derivative of the activation function. If you already have a good grasp of the BP algorithm, this should be easy to follow.

The "O" here means that it is multiplied by the original. For the error function in Equation 2, the sensitivity of the output layer neurons is as follows:

$\delta^{L} = f'(u^{L}) \circ (y^{n} - t^{n})$    (Equation 6)

Finally, the delta rule for updating the weights into a given neuron is just a copy of the neuron's inputs, scaled by the neuron's delta (in fact, the product of the two, as in Equation 7 below). In vector form, this is the outer product of the input vector (the output of the previous layer) and the sensitivity vector:

$\frac{\partial E}{\partial W^{\ell}} = x^{\ell-1} \left(\delta^{\ell}\right)^{T}$    (Equation 7)

$\Delta W^{\ell} = -\eta \frac{\partial E}{\partial W^{\ell}}$    (Equation 8)
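
To make the quoted derivation concrete, here is a minimal NumPy sketch of one training step for a two-layer sigmoid network, written directly from Equations 4 to 8 (the layer sizes, learning rate, and variable names are my own illustrative choices, not taken from the paper):

```python
import numpy as np

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

def bp_step(x0, t, W1, b1, W2, b2, eta=0.1):
    """One forward/backward pass for a 2-layer fully connected net (Eqs. 4-8)."""
    # Forward pass: u^l = W^l x^(l-1) + b^l,  x^l = f(u^l)
    u1 = W1 @ x0 + b1; x1 = sigmoid(u1)
    u2 = W2 @ x1 + b2; y = sigmoid(u2)

    # Eq. 6: output sensitivity delta^L = f'(u^L) o (y - t); for the sigmoid, f' = f(1 - f)
    d2 = y * (1.0 - y) * (y - t)
    # Eq. 5: delta^l = (W^(l+1))^T delta^(l+1) o f'(u^l)
    d1 = (W2.T @ d2) * x1 * (1.0 - x1)

    # Eq. 7: dE/dW^l is the outer product of the layer input and the sensitivities
    # (written here as delta x^T so it matches W's (out, in) shape).
    # Eq. 4: dE/db^l = delta^l.  Eq. 8: Delta W = -eta * dE/dW.
    W2 -= eta * np.outer(d2, x1); b2 -= eta * d2
    W1 -= eta * np.outer(d1, x0); b1 -= eta * d1
    return y
```

For example, `bp_step(np.random.randn(3), np.array([0.0, 1.0]), np.random.randn(4, 3), np.zeros(4), np.random.randn(2, 4), np.zeros(2))` performs one update in place.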

III. CNN

The advantage of sub-sampling is that it reduces computation while gradually building up invariance to larger-scale spatial and configural changes. The latter point reads awkwardly; I take it to mean something like robustness to translation and scaling.
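
For concreteness, a minimal sketch of 2x2 mean pooling (my own example; if I recall the paper correctly, its sub-sampling layer also applies a trainable coefficient and a bias, which are omitted here):

```python
import numpy as np

def mean_pool(x, k=2):
    """Down-sample a 2-D feature map by averaging non-overlapping k x k blocks."""
    h, w = x.shape
    h, w = h - h % k, w - w % k              # drop rows/cols that do not fill a block
    blocks = x[:h, :w].reshape(h // k, k, w // k, k)
    return blocks.mean(axis=(1, 3))          # each output dimension shrinks by a factor of k
```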

1. Calculate gradients

Here the paper is rather unhelpful: it just lays out the formulas with little explanation. The backpropagation through the convolutional and pooling layers is nevertheless quite important, so I recommend the blog post http://www.cnblogs.com/tornadomeet/p/3468450.html; its treatment of the CNN BP algorithm is very good, and once you have worked through its four questions you will have a solid understanding of CNN backpropagation.

The harder part of that post to understand is this diagram:

That is, the convolution kernel is rotated 180 degrees and then slid from left to right and top to bottom, computing each value. This differs from the convolution in the feedforward pass, which is actually a correlation operation: it is a convolution in form only, whereas this is orthodox convolution.
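
That point can be checked in a few lines of SciPy (a sketch; the array names are mine): the feedforward "convolution" is really a cross-correlation, and it coincides with true convolution once the kernel is rotated 180 degrees:

```python
import numpy as np
from scipy.signal import convolve2d, correlate2d

rng = np.random.default_rng(0)
x = rng.standard_normal((5, 5))   # input feature map
k = rng.standard_normal((3, 3))   # convolution kernel

# What a CNN's feedforward pass computes is a cross-correlation ...
feedforward = correlate2d(x, k, mode="valid")
# ... which equals orthodox convolution with the kernel rotated 180 degrees.
orthodox = convolve2d(x, np.rot90(k, 2), mode="valid")

assert np.allclose(feedforward, orthodox)
```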

Four problems need to be worked through here. First, the error sensitivities of the output layer; for these, just follow the derivation. Second, the error sensitivities of a convolutional layer whose next layer is a pooling layer: because the output is smaller than the input, the gradient cannot be passed back as in the traditional BP algorithm, so how to obtain the convolutional layer's sensitivities takes some thought. The third problem is the pooling layer whose next layer is a convolutional layer: to get the pooling layer's sensitivities we rely on the convolution kernel and the convolutional layer's sensitivities, and again the size mismatch has to be handled. The last problem is the convolutional layer's own weights: once the output sensitivities are known, how do we get the gradient of W? A correlation operation is enough. Loosely speaking, the weight between unit i in layer l and unit j in layer l+1 gets a gradient equal to the sensitivity of unit j in layer l+1 multiplied by the input of unit i in layer l, and since convolution accumulates over many positions, the BP step likewise reduces to a correlation.
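
Here is a minimal sketch of problems 2 to 4 under simplifying assumptions (a single feature map, a stride-1 'valid' cross-correlation in the forward pass, mean pooling, and the activation-derivative factors left out; the function names are my own):

```python
import numpy as np
from scipy.signal import convolve2d, correlate2d

def conv_delta_from_pool(delta_pool, k=2):
    """Problem 2: convolutional layer followed by a k x k mean-pooling layer.
    Each pooled delta is spread evenly back over its k x k block (upsampling)."""
    return np.kron(delta_pool, np.ones((k, k))) / (k * k)

def pool_delta_from_conv(delta_conv, kernel):
    """Problem 3: pooling layer followed by a convolutional layer.
    A 'full' convolution of the conv-layer deltas with the kernel; sliding the
    180-degree-rotated kernel over the zero-padded delta map, as in the linked
    diagram, computes exactly this."""
    return convolve2d(delta_conv, kernel, mode="full")

def kernel_gradient(layer_input, delta_conv):
    """Problem 4: gradient of the kernel itself.
    A 'valid' correlation of the layer's input with the conv-layer deltas, i.e.
    each weight accumulates delta times input over all positions."""
    return correlate2d(layer_input, delta_conv, mode="valid")
```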

IV. Combining feature maps

The idea at the end of the paper of learning combinations of several feature maps is also easy to understand. My question is: since feature maps are already produced as multi-channel outputs anyway, why do they need to be combined? Many well-known later networks do not seem to use this trick either; was it just an experiment?
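
For reference, my reading of that combination idea (a hedged sketch, not the paper's code): each output map is a weighted combination of convolutions of the input maps, with the weights pushed through a softmax so they stay positive and sum to one.

```python
import numpy as np
from scipy.signal import correlate2d

def combine_feature_maps(input_maps, kernels, c):
    """Form one output map as a learned combination of convolved input maps.
    input_maps: list of 2-D arrays; kernels: one kernel per input map;
    c: unconstrained weights, softmaxed into positive coefficients summing to one."""
    alpha = np.exp(c - c.max())
    alpha = alpha / alpha.sum()   # softmax over the combination weights
    convolved = [correlate2d(x, k, mode="valid") for x, k in zip(input_maps, kernels)]
    return sum(a * y for a, y in zip(alpha, convolved))
```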
