Deep Learning 23: Dropout Understanding - Reading the Paper "Improving neural networks by preventing co-adaptation of feature detectors"


Theoretical background: Deep Learning: 41 (A simple understanding of dropout), Deep Learning (22): A shallow understanding and implementation of dropout, and the paper "Improving neural networks by preventing co-adaptation of feature detectors".

There is not much to add here: the two blog posts cited above already explain dropout very clearly, so I went straight to reproducing the experiment. A few notes first.

Notes:

1. At test time the model uses the "mean network": the hidden-layer outputs (equivalently, their outgoing weights) are halved (when the dropout fraction is 50%) before being propagated to the next layer. The reason:

At test time, we use the "mean network" that contains all of the hidden units but with their outgoing weights halved to compensate for the fact that twice as many of them are active.

In other words, at test time all hidden units are active, which is twice as many as during training (when half of them were dropped), so each unit's outgoing weights are halved to keep the expected input to the next layer unchanged.

This compensation can be done either at training time, by magnifying the retained activations x (dividing them by the retain probability, i.e. by 1 - dropoutFraction), or at test time, by shrinking the weights (multiplying them by the retain probability).
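To make the equivalence concrete, here is a minimal MATLAB sketch (my own illustration, not the blog's code; p, a and mask are made-up names), written with the paper's convention that p is the retain probability:

    % Minimal sketch comparing the two compensation schemes (retain probability p).
    p = 0.5;                          % probability that a unit is kept
    a = rand(1, 1000);                % hypothetical hidden-layer activations
    mask = rand(size(a)) < p;         % 1 = keep the unit, 0 = drop it

    % Scheme A: plain dropout -- drop at training time, scale by p at test time.
    a_train_A = a .* mask;            % training forward pass
    a_test_A  = a * p;                % test: all units active, scaled by p

    % Scheme B: inverted dropout -- rescale at training time, do nothing at test.
    a_train_B = (a .* mask) / p;      % training: divide kept units by p
    a_test_B  = a;                    % test: no scaling needed

In both schemes the expected contribution of a unit during training matches its contribution at test time, which is the whole point of the compensation.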

2. One thing in Deep Learning: 41 (A simple understanding of dropout) is easy to misunderstand:

The nn.dropoutFraction in the Deep Learning: 41 (A simple understanding of dropout) experiment and the level in the Deep Learning (22): A shallow understanding and implementation of dropout experiment both refer to the probability that a neuron is dropped, whereas the probability p in the paper "Dropout: A Simple Way to Prevent Neural Networks from Overfitting" refers to the probability that a neuron is retained (i.e. not dropped). That is: p = 1 - dropoutFraction = retain_prob = 1 - level. Without realizing this, it is easy to conclude from the code in the Deep Learning: 41 experiment that it disagrees with the paper, when in fact the two are consistent.

This is also why the paper "Dropout: A Simple Way to Prevent Neural Networks from Overfitting" points out:

After masking some neurons as above so that their activations become 0, we also need to rescale the vector x1, ..., x1000, i.e. multiply it by 1/(1 - level), where level is the dropout fraction. If the retained x1, ..., x1000 are not rescaled during training, then the weights must be rescaled at test time instead.
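As a sketch of that test-time alternative (again my own notation, not the toolbox code; level, W and W_test are made-up names), scaling the learned weights by the retain probability plays the same role as the training-time rescale:

    % Minimal sketch: weights learned with plain dropout, rescaled for testing.
    level  = 0.5;                 % dropoutFraction: probability of dropping a unit
    W      = randn(200, 100);     % hypothetical weight matrix learned with dropout
    W_test = (1 - level) * W;     % "mean network" weights used at test time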

3. The paper clearly says that the weights should be scaled at test time, so why does the code look like this:

    %dropout
    if(nn.dropoutFraction > 0)
        if(nn.testing)
            nn.a{i} = nn.a{i} .* (1 - nn.dropoutFraction);
        else
            nn.dropOutMask{i} = (rand(size(nn.a{i})) > nn.dropoutFraction);
            nn.a{i} = nn.a{i} .* nn.dropOutMask{i};
        end
    end

That is, why is it the activation value, rather than the weights, that gets multiplied by (1 - dropoutFraction) at test time?

Answer: Because for a fully connected layer we have z = W·a + b.

From this formula, multiplying the weights W by p produces exactly the same pre-activation z as multiplying the activations a by p.
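For reference, a short derivation in standard fully-connected-layer notation (not taken from the blog or the paper) of why the two scalings are interchangeable:

    z_j \;=\; \sum_i w_{ji}\, a_i \;+\; b_j
    \qquad\Longrightarrow\qquad
    \sum_i w_{ji}\,(p\, a_i) \;+\; b_j \;=\; \sum_i (p\, w_{ji})\, a_i \;+\; b_j

The bias term is not scaled, which is consistent with the toolbox code above, where only the hidden activations are masked and rescaled.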

4. In the Deep Learning: 41 (A simple understanding of dropout) experiment, what does d mean in the following code?

Code in the backpropagation function nnbp.m:

    if(nn.dropoutFraction > 0)
        d{i} = d{i} .* [ones(size(d{i},1),1) nn.dropOutMask{i}];
    end

Answer: d is the error term (delta) of that layer. The prepended column of ones leaves the bias unit's delta untouched, since the bias unit is never dropped; multiplying by the same dropOutMask used in the forward pass zeroes the error flowing through the dropped units.
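A minimal toy sketch (my own variable names, not nnbp.m; the bias column is omitted for simplicity) of how the forward-pass mask is reused on the deltas:

    % Forward pass: draw and apply the dropout mask.
    level = 0.5;                      % dropoutFraction: probability of dropping
    a     = rand(10, 100);            % hypothetical activations, 10 samples
    mask  = rand(size(a)) > level;    % 1 = keep, 0 = drop
    a     = a .* mask;                % dropped units output 0

    % Backward pass: reuse the same mask on the deltas, so dropped units
    % receive no error signal and contribute no gradient for this batch.
    d = randn(10, 100);               % hypothetical incoming deltas
    d = d .* mask;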

5. Advantages and disadvantages of dropout:

Advantage: it helps prevent overfitting, especially when training data is scarce.

Disadvantage: training time increases, although test time is unaffected.

Some MATLAB functions

1. A plain-language explanation of using rng in MATLAB to replace rand('seed',sd), randn('seed',sd), and rand('state',sd)
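For completeness, a minimal sketch of the seeding syntaxes compared there (sd is an arbitrary integer seed; use one style or the other within a session, not both):

    sd = 0;
    % Legacy style, as seen in older toolbox code:
    %   rand('seed', sd);  randn('seed', sd);  rand('state', sd);
    % Recommended modern replacement (seeds rand, randn and randi together):
    rng(sd);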

Experiment

My experiment simply repeats the one in Deep Learning: 41 (A simple understanding of dropout); the results are the same, so see that blog post for the details.

References:

Dropout: A Simple Way to Prevent Neural Networks from Overfitting

ImageNet Classification with Deep Convolutional Neural Networks

Improving Neural Networks with Dropout

