Theoretical background: Deep Learning: 41 (Dropout simple understanding), Deep Learning (22): dropout shallow understanding and implementation, and the paper "Improving neural networks by preventing co-adaptation of feature detectors".
There is not much to add here; the two blog posts cited above already explain dropout very clearly, so I go straight to the experiment.
Notes:
1. During the testing phase, the model computes the hidden-layer output with the "mean network": in effect, each hidden node's output is halved (when the dropout ratio is p = 50%) before being propagated forward to the output layer, for the following reason:
At test time, we use the "mean network" that contains all of the hidden units but with their outgoing weights halved to compensate for the fact that twice as many of them are active.
That is, at test time twice as many neurons are active as during training (during training only about half of them are active when p = 50%), so each neuron's outgoing weights are halved to compensate.
Of course, this compensation can also be done at training time, by scaling the activations x up (dividing them by 1 - p), instead of at test time, by scaling the weights down (multiplying them by the probability that a unit is retained), as the sketch below illustrates.
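A minimal MATLAB sketch of the two schemes (this is not the toolbox code; p, x, mask and the other variable names are made up for illustration, with p denoting the dropout fraction as above):

    % Minimal illustration of the two equivalent compensation schemes.
    p = 0.5;                          % dropout fraction (probability of dropping a unit)
    x = rand(1, 100);                 % hypothetical hidden-layer activations
    mask = rand(size(x)) > p;         % 1 = keep the unit, 0 = drop it

    % Scheme A ("inverted dropout"): rescale at training time,
    % do nothing special at test time.
    x_train_A = (x .* mask) / (1 - p);
    x_test_A  = x;

    % Scheme B (as in the quoted paper): no rescaling at training time,
    % scale the kept activations (equivalently, the outgoing weights)
    % by the retention probability 1 - p at test time.
    x_train_B = x .* mask;
    x_test_B  = x * (1 - p);

    % In expectation over the random mask, x_train_A matches x_test_A and
    % x_train_B matches x_test_B, which is all the compensation is meant to do.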
2. One point in Deep Learning: 41 (Dropout simple understanding) is easy to misunderstand:
nn.dropoutFraction in Deep Learning: 41 (Dropout simple understanding) and level in Deep Learning (22): dropout shallow understanding and implementation both denote the probability that a neuron is dropped, whereas the probability p in the paper "Dropout: A Simple Way to Prevent Neural Networks from Overfitting" denotes the probability that a neuron is retained (i.e., not dropped). That is: p = 1 - dropoutFraction = retain_prob = 1 - level. Without keeping this in mind, it is easy to read the code in Deep Learning: 41 (Dropout simple understanding) as disagreeing with the paper "Dropout: A Simple Way to Prevent Neural Networks from Overfitting", when in fact the two are the same.
Accordingly, the paper "Dropout: A Simple Way to Prevent Neural Networks from Overfitting" already notes:
After zeroing out the activations of some neurons as above, the vector x1, ..., x1000 also has to be rescaled, i.e., multiplied by 1/(1-p). If x1, ..., x1000 are not rescaled after zeroing during training, then the weights must be rescaled at test time instead:
3. The paper clearly says to rescale the weights at test time, so why does the code instead do the following:
%dropout
if(nn.dropoutFraction > 0)
    if(nn.testing)
        % test time: scale the activations by the retention probability
        nn.a{i} = nn.a{i} .* (1 - nn.dropoutFraction);
    else
        % training time: drop each unit with probability dropoutFraction
        nn.dropOutMask{i} = (rand(size(nn.a{i})) > nn.dropoutFraction);
        nn.a{i} = nn.a{i} .* nn.dropOutMask{i};
    end
end
That is, why is it equally valid to multiply the activation values by the retention probability p = 1 - dropoutFraction?
Answer: because the pre-activation of the next layer is z = w1*x1 + w2*x2 + ... + wn*xn + b, and
(p*w1)*x1 + ... + (p*wn)*xn = w1*(p*x1) + ... + wn*(p*xn),
so multiplying the weights W by p gives exactly the same z as multiplying the activations x by p.
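A quick numerical check of this equivalence in MATLAB (the sizes and values below are arbitrary; p here is the retention probability as in the paper):

    % Scaling the weights by p gives the same pre-activation z as scaling the inputs by p.
    p = 0.5;                     % retention probability
    W = randn(10, 100);          % arbitrary weight matrix
    x = rand(100, 1);            % arbitrary activation vector
    b = randn(10, 1);            % bias

    z_scaled_weights     = (p * W) * x + b;
    z_scaled_activations = W * (p * x) + b;

    max(abs(z_scaled_weights - z_scaled_activations))   % ~0 up to round-off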
4. In the experiment in Deep Learning: 41 (Dropout simple understanding), what does d in the following code mean?
Code in the backpropagation function nnbp.m:
if(nn.dropoutFraction > 0)
    d{i} = d{i} .* [ones(size(d{i},1),1) nn.dropOutMask{i}];
end
Answer: d is the error term (delta) from backpropagation. It is multiplied by the same dropout mask used in the forward pass, so dropped-out units receive no gradient; the prepended column of ones leaves the bias column of d{i} untouched. A small sketch follows below.
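A made-up example of what that masking does to the error term (the mask and delta values are arbitrary and stand in for one row of d{i} without its bias column):

    % Dropped-out units (mask == 0) produced no output in the forward pass,
    % so their error term is zeroed as well and they receive no gradient.
    mask  = [1 0 1 1 0];             % hypothetical dropout mask for one example
    delta = [0.2 -0.5 0.1 0.4 0.3];  % hypothetical backpropagated error

    delta_masked = delta .* mask;    % = [0.2 0 0.1 0.4 0]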
5. Advantages and disadvantages of dropout:
Advantage: it helps prevent overfitting, especially when training data is scarce.
Disadvantage: training takes longer, although test time is unaffected.
Some MATLAB functions
1. Using rng in MATLAB to replace rand('seed',sd), randn('seed',sd) and rand('state',sd): a plain-language explanation (see the sketch below).
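For reference, a minimal example of seeding with rng (sd is an arbitrary seed value; the legacy calls appear only in comments):

    sd = 42;          % arbitrary seed

    % Legacy style (deprecated):
    %   rand('seed', sd);  randn('seed', sd);  rand('state', sd);

    % Modern replacement: one call seeds the shared generator used by
    % rand, randn and randi.
    rng(sd);
    a = rand(1, 3);

    rng(sd);          % re-seeding with the same value reproduces the draws
    b = rand(1, 3);
    isequal(a, b)     % true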
Experiment
My experiment simply repeats the one in Deep Learning: 41 (Dropout simple understanding); the results are the same, see that blog post for details.
References:
Dropout: A Simple Way to Prevent Neural Networks from Overfitting [Paper] [BibTeX] [Code]
ImageNet Classification with Deep Convolutional Neural Networks
Improving Neural Networks with Dropout
Deep Learning 23: dropout understanding: reading the paper "Improving neural networks by preventing co-adaptation of feature detectors"