Paper note "ImageNet Classification with Deep Convolutional Neural Networks"

Source: Internet
Author: User

I. Summary

Notes on a must-read CNN paper, written up as far as I understand it.

II. Structure

1. ReLU. Benefits: (1) training is faster than with tanh or sigmoid, and the derivative needed for backpropagation is trivial to compute.

(2) Because it does not saturate for positive inputs, the vanishing-gradient problem is largely avoided.

As long as the learning rate is kept under control, ReLU can be said to beat the earlier activation functions, and it also helps us train deeper networks.

Further improvements to ReLU are still being studied; they are worth looking into if you are interested.
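The two ReLU points above can be checked numerically. The following is a minimal NumPy sketch, not code from the paper; the function names and sample inputs are my own choices for illustration. It compares the derivatives of ReLU, sigmoid, and tanh at a few points, including large-magnitude inputs where the saturating functions have almost no gradient.

```python
import numpy as np

def relu(x):
    # ReLU keeps positive inputs and zeroes out the rest.
    return np.maximum(0.0, x)

def relu_grad(x):
    # Derivative is 1 for positive inputs, 0 otherwise (0 chosen at x == 0).
    return (x > 0).astype(x.dtype)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)

def tanh_grad(x):
    return 1.0 - np.tanh(x) ** 2

# Hypothetical sample inputs, including two far from zero.
x = np.array([-10.0, -1.0, 0.5, 10.0])
print("relu grad:   ", relu_grad(x))
print("sigmoid grad:", sigmoid_grad(x))
print("tanh grad:   ", tanh_grad(x))
```

At x = 10 the sigmoid and tanh derivatives are close to zero, while the ReLU derivative is still exactly 1. The ReLU derivative is also just a comparison, with no exponentials to evaluate.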

2. Parallel training across GPUs was a very good idea at the time, but the author's trick is a bit ... Although the conclusion was reached through cross-validation, I do not think it is very significant for future research.

3. Local response normalization also looks like a trick. I have not worked with it yet, so I will come back to it later.

4. Overlapping pooling is another technique that I have not heard much about in later work.
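For readers who have not met overlapping pooling, here is a small illustrative sketch in NumPy. It is a 1-D toy and not the paper's implementation; `max_pool_1d` and the sample array are hypothetical. Pooling overlaps when the window size is larger than the stride; the paper uses a window of 3 with a stride of 2.

```python
import numpy as np

def max_pool_1d(x, size, stride):
    """Max-pool a 1-D array with the given window size and stride."""
    n_out = (len(x) - size) // stride + 1
    return np.array([x[i * stride : i * stride + size].max() for i in range(n_out)])

# Hypothetical input signal.
x = np.array([1, 5, 2, 4, 9, 3, 0])

# size == stride: windows do not share any inputs.
non_overlapping = max_pool_1d(x, size=2, stride=2)
# size > stride: adjacent windows share one input.
overlapping = max_pool_1d(x, size=3, stride=2)

print(non_overlapping)  # [5 4 9]
print(overlapping)      # [5 9 9]
```

Both settings produce the same number of outputs here, but in the overlapping case the value 9 is seen by two neighbouring windows.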

III. Reducing overfitting

1. Data augmentation

(1) Image transformation

This is a very good, very common, and very practical method.

Start from a large original image A. Scale it so that the short edge is 256 pixels to get B, then take the central 256*256 square of B to get C. Training samples are random 224*224 patches extracted from C, together with the horizontal reflections of those patches. This increases the number of samples by a factor of 2048 (32 * 32 crop offsets * 2 for the reflections), which lets us train a larger network.
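The crop-and-reflect step can be sketched in a few lines of NumPy. This is an illustration under my own assumptions, not the paper's code: `random_crop_and_flip` is a hypothetical helper, and the image is assumed to be an array of shape (height, width, channels) that has already been resized and center-cropped to 256*256.

```python
import numpy as np

def random_crop_and_flip(image, crop=224, rng=None):
    """Return a random crop x crop patch of an HxWxC image,
    reflected horizontally with probability 0.5."""
    rng = np.random.default_rng() if rng is None else rng
    h, w = image.shape[:2]
    # Offsets 0..(h - crop - 1): 32 per axis for 256 -> 224,
    # which matches the 32 * 32 * 2 = 2048 count quoted above.
    top = rng.integers(0, max(h - crop, 1))
    left = rng.integers(0, max(w - crop, 1))
    patch = image[top:top + crop, left:left + crop]
    if rng.random() < 0.5:
        patch = patch[:, ::-1]  # reverse the width axis
    return patch

# Hypothetical stand-in for the 256*256 center crop C.
image_c = np.zeros((256, 256, 3), dtype=np.float32)
sample = random_crop_and_flip(image_c, rng=np.random.default_rng(0))
print(sample.shape)  # (224, 224, 3)
```

Because the patch is drawn fresh every time an image is used, the augmented samples never need to be stored on disk.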

(2) Adjusting the RGB values

The idea is to run PCA on the three color channels to get the principal components, then add jitter along those components: each component is scaled by a random variable drawn from a Gaussian distribution with mean zero and standard deviation 0.1. This gives us similar but still meaningful data.
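A rough NumPy sketch of this color jitter follows. It reflects my reading of the description, not the paper's code: `rgb_pca` and `pca_color_jitter` are hypothetical names, pixel values are assumed to be floats, and the PCA here is computed on a single toy image, whereas the paper computes it over the RGB pixels of the whole training set.

```python
import numpy as np

def rgb_pca(pixels):
    """Eigen-decompose the 3x3 covariance of an (N, 3) array of RGB pixels."""
    cov = np.cov(pixels, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)  # columns of eigvecs are the components
    return eigvals, eigvecs

def pca_color_jitter(image, eigvals, eigvecs, sigma=0.1, rng=None):
    """Add the same random shift along the principal color axes to every pixel."""
    rng = np.random.default_rng() if rng is None else rng
    alpha = rng.normal(0.0, sigma, size=3)  # one draw per component
    shift = eigvecs @ (alpha * eigvals)     # sum_i p_i * alpha_i * lambda_i
    return image + shift                    # broadcasts over height and width

# Toy example: random image with values in [0, 1).
rng = np.random.default_rng(0)
img = rng.random((4, 4, 3))
vals, vecs = rgb_pca(img.reshape(-1, 3))
jittered = pca_color_jitter(img, vals, vecs, rng=rng)
print(jittered.shape)
```

The shift is a single RGB vector per image, so the change reads as a slight variation in lighting and color rather than per-pixel noise.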

2. Dropout

This is also a pretty cool technique. By randomly deactivating neurons with some probability during training, it achieves the effect of combining many models (the structure differs on each pass, but the parameters are shared), so there is no need to spend a lot of time training multiple networks.
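A minimal NumPy sketch of the idea, assuming the form described in the paper: each hidden activation is set to zero with probability 0.5 during training, and at test time all neurons are kept and their outputs are multiplied by 0.5. The function names are my own, and many modern libraries use the "inverted" variant that rescales during training instead.

```python
import numpy as np

def dropout_train(activations, p_drop=0.5, rng=None):
    """Zero each activation independently with probability p_drop."""
    rng = np.random.default_rng() if rng is None else rng
    mask = rng.random(activations.shape) >= p_drop  # True = neuron is kept
    return activations * mask

def dropout_test(activations, p_drop=0.5):
    """Keep every neuron but scale outputs by the keep probability."""
    return activations * (1.0 - p_drop)

# Hypothetical layer output: 10000 activations, all equal to 1.
h = np.ones(10000)
train_out = dropout_train(h, rng=np.random.default_rng(0))
test_out = dropout_test(h)
print(train_out.mean(), test_out.mean())
```

Each training pass samples a different mask, which is the "different structure, shared parameters" effect described above; the test-time scaling keeps the expected activation the same as during training.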

IV. Thoughts

Here are a few questions worth taking away and thinking about.

1. The two GPUs run in basically the same environment, yet the convolution kernels they learn are completely different. Why?

2. Again a question about the network structure: why can this ...

V. Summary

To be honest, I did not learn much that was new from reading this paper. That is not because the paper is bad but because it is so good: most later CNN research has adopted its ideas, so much of it feels like déjà vu. Still, as a foundational CNN paper, it is really worth reading!
