dropout_layer.cpp (preventing overfitting)

Last Update:2018-07-26 Source: Internet

Author: User



The dropout layer prevents overfitting during training. In conventional training, every node in a layer takes part in the update on each iteration, so the whole network is trained every time. With a dropout layer, the nodes of the layer are instead sampled at random, each one kept with some probability (the retention probability; note that Caffe's dropout_ratio parameter is the opposite quantity, the probability of dropping a node). Only the sampled nodes take part in the update, and the resulting subnetwork is the target network for that iteration. Because a random subset of nodes is inactive on each iteration, the network cannot rely on features that only work in one fixed combination with others. It is pushed to learn general patterns rather than quirks of particular training samples, which improves the robustness of the trained model.
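The sampling scheme described above can be sketched in plain C++ without any Caffe dependency. This is only an illustration under stated assumptions: `DropoutForward` and its parameters are hypothetical names, `drop_ratio` plays the role of Caffe's `dropout_ratio`, and the standard library's `std::bernoulli_distribution` stands in for Caffe's own random number routine.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper, not part of Caffe: applies dropout to `input`.
// `drop_ratio` is the probability that a unit is dropped.
// `mask` receives 1 for every kept unit and 0 for every dropped unit.
std::vector<float> DropoutForward(const std::vector<float>& input,
                                  float drop_ratio, bool train,
                                  std::mt19937& rng,
                                  std::vector<unsigned int>* mask) {
  assert(drop_ratio > 0.f && drop_ratio < 1.f);
  mask->assign(input.size(), 1u);
  if (!train) {
    // Test time: every unit is used, so the layer is an identity mapping.
    return input;
  }
  std::vector<float> output(input.size());
  // Kept units are scaled up so the expected output equals the input.
  const float scale = 1.f / (1.f - drop_ratio);
  std::bernoulli_distribution keep(1.0 - drop_ratio);
  for (std::size_t i = 0; i < input.size(); ++i) {
    (*mask)[i] = keep(rng) ? 1u : 0u;
    output[i] = input[i] * (*mask)[i] * scale;
  }
  return output;
}
```

With `drop_ratio = 0.5`, roughly half of the outputs are zero on each training call and the rest are doubled; a different random subset is chosen on every call.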

Below are my notes from reading the dropout layer source. If you spot an error, please point it out.

dropout_layer.hpp:

#ifndef CAFFE_DROPOUT_LAYER_HPP_
#define CAFFE_DROPOUT_LAYER_HPP_

#include <vector>

#include "caffe/blob.hpp"
#include "caffe/layer.hpp"
#include "caffe/proto/caffe.pb.h"

#include "caffe/layers/neuron_layer.hpp"

namespace caffe {

/**
 * @brief During training only, sets a random portion of @f$x@f$ to 0, adjusting
 *        the rest of the vector magnitude accordingly.
 *
 * @param bottom input Blob vector (length 1)
 *   -# @f$ (N \times C \times H \times W) @f$
 *      the inputs @f$ x @f$
 * @param top output Blob vector (length 1)
 *   -# @f$ (N \times C \times H \times W) @f$
 *      the computed outputs @f$ y = |x| @f$
 */
/* DropoutLayer inherits from NeuronLayer */
template <typename Dtype>
class DropoutLayer : public NeuronLayer<Dtype> {
 public:
  /**
   * @param param provides DropoutParameter dropout_param,
   *     with DropoutLayer options:
   *   - dropout_ratio (\b optional, default 0.5).
   *     Sets the probability @f$ p @f$ that any given unit is dropped.
   */
  /* Constructor */
  explicit DropoutLayer(const LayerParameter& param)
      : NeuronLayer<Dtype>(param) {}
  /* Layer setup function */
  virtual void LayerSetUp(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top);
  /* Allocates memory and reshapes the input and output blobs */
  virtual void Reshape(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top);

  /* Returns the type of the current layer */
  virtual inline const char* type() const { return "Dropout"; }

 protected:
  /**
   * @param bottom input Blob vector (length 1)
   *   -# @f$ (N \times C \times H \times W) @f$
   *      the inputs @f$ x @f$
   * @param top output Blob vector (length 1)
   *   -# @f$ (N \times C \times H \times W) @f$
   *      the computed outputs. At training time, we have @f$
   *      y_{\mbox{train}} = \left\{
   *         \begin{array}{ll}
   *            \frac{x}{1 - p} & \mbox{if } u > p \\
   *            0 & \mbox{otherwise}
   *         \end{array} \right.
   *      @f$, where @f$ u \sim U(0, 1) @f$ is generated independently for each
   *      input at each iteration. At test time, we simply have
   *      @f$ y_{\mbox{test}} = \mathbb{E}[y_{\mbox{train}}] = x @f$.
   */
  /* CPU forward propagation */
  virtual void Forward_cpu(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top);
  /* GPU forward propagation */
  virtual void Forward_gpu(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top);
  /* CPU backward propagation */
  virtual void Backward_cpu(const vector<Blob<Dtype>*>& top,
      const vector<bool>& propagate_down, const vector<Blob<Dtype>*>& bottom);
  /* GPU backward propagation */
  virtual void Backward_gpu(const vector<Blob<Dtype>*>& top,
      const vector<bool>& propagate_down, const vector<Blob<Dtype>*>& bottom);

  /// when divided by UINT_MAX, the randomly generated values @f$u\sim U(0,1)@f$
  /* Blob that holds the Bernoulli-distributed random numbers (the mask) */
  Blob<unsigned int> rand_vec_;
  /// the probability @f$ p @f$ of dropping any input
  /* A dropped input is not used in the current training iteration */
  Dtype threshold_;
  /// the scale for undropped inputs at train time @f$ 1 / (1 - p) @f$
  /* scale_ == 1 / (1 - threshold_) */
  Dtype scale_;
  /* Not used in the CPU code shown in this article; as far as I can tell it
     is only needed by the GPU implementation in dropout_layer.cu */
  unsigned int uint_thres_;
};

}  // namespace caffe

#endif  // CAFFE_DROPOUT_LAYER_HPP_
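The header comment states that the test-time output equals the expectation of the training-time output. A short derivation, using only the definitions in that comment (drop probability p, so a unit is kept with probability 1 - p), shows why the scale factor 1/(1 - p) is the right choice:

```latex
\begin{align*}
\mathbb{E}[y_{\text{train}}]
  &= \Pr(u > p)\cdot\frac{x}{1-p} + \Pr(u \le p)\cdot 0 \\
  &= (1-p)\cdot\frac{x}{1-p} + p\cdot 0 \\
  &= x = y_{\text{test}}
\end{align*}
```

For example, with the default p = 0.5 each kept input is doubled, and since half of the inputs are kept on average, the mean output is unchanged.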


dropout_layer.cpp:

// TODO (sergeyk): effect should not be dependent on phase. wasted memcpy.

#include <vector>

#include "caffe/layers/dropout_layer.hpp"
#include "caffe/util/math_functions.hpp"

namespace caffe {

/* Sets up the dropout layer; first calls NeuronLayer to do the basic setup */
template <typename Dtype>
void DropoutLayer<Dtype>::LayerSetUp(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top) {
  NeuronLayer<Dtype>::LayerSetUp(bottom, top);
  /* dropout_ratio from the protobuf definition: the probability threshold_
     with which an input is dropped. Each input is sampled independently, so
     every input is dropped with probability threshold_. */
  threshold_ = this->layer_param_.dropout_param().dropout_ratio();
  DCHECK(threshold_ > 0.);
  DCHECK(threshold_ < 1.);
  /* (1. - threshold_) is the probability that an input is kept */
  scale_ = 1. / (1. - threshold_);
  /* Does not appear to be used in this file */
  uint_thres_ = static_cast<unsigned int>(UINT_MAX * threshold_);
}

/* Reshaping and memory allocation; again, first calls NeuronLayer::Reshape
   to reshape the top and bottom blobs */
template <typename Dtype>
void DropoutLayer<Dtype>::Reshape(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top) {
  NeuronLayer<Dtype>::Reshape(bottom, top);
  // Set up the cache for random number generation
  // ReshapeLike does not work because rand_vec_ is of Dtype uint
  /* The layer allocates separate memory for rand_vec_, which stores the
     Bernoulli-distributed random numbers */
  rand_vec_.Reshape(bottom[0]->shape());
}

/* Forward propagation of the dropout layer */
template <typename Dtype>
void DropoutLayer<Dtype>::Forward_cpu(const vector<Blob<Dtype>*>& bottom,
    const vector<Blob<Dtype>*>& top) {
  /* Data of the previous layer (input) */
  const Dtype* bottom_data = bottom[0]->cpu_data();
  /* Data passed to the next layer (output) */
  Dtype* top_data = top[0]->mutable_cpu_data();
  /* Memory for the Bernoulli-distributed random numbers */
  unsigned int* mask = rand_vec_.mutable_cpu_data();
  /* Number of elements in the input blob */
  const int count = bottom[0]->count();
  if (this->phase_ == TRAIN) {  /* Training phase */
    // Create random numbers
    /* Generate Bernoulli random numbers: 1 with probability 1 - threshold_ */
    caffe_rng_bernoulli(count, 1. - threshold_, mask);
    /* Compute each output value: kept inputs are scaled, dropped ones are 0 */
    for (int i = 0; i < count; ++i) {
      top_data[i] = bottom_data[i] * mask[i] * scale_;
    }
  } else {
    /* In the test phase every input is passed through unchanged */
    caffe_copy(bottom[0]->count(), bottom_data, top_data);
  }
}

/* Backward propagation of the dropout layer */
template <typename Dtype>
void DropoutLayer<Dtype>::Backward_cpu(const vector<Blob<Dtype>*>& top,
    const vector<bool>& propagate_down,  /* whether to propagate to each bottom */
    const vector<Blob<Dtype>*>& bottom) {
  if (propagate_down[0]) {  /* Only if the gradient must be propagated down */
    /* Gradient coming from the next layer (input of the backward pass) */
    const Dtype* top_diff = top[0]->cpu_diff();
    /* Gradient for the previous layer (output of the backward pass) */
    Dtype* bottom_diff = bottom[0]->mutable_cpu_diff();
    if (this->phase_ == TRAIN) {  /* Training phase */
      /* The same Bernoulli mask that was generated in the forward pass */
      const unsigned int* mask = rand_vec_.cpu_data();
      /* Number of elements in the input blob */
      const int count = bottom[0]->count();
      for (int i = 0; i < count; ++i) {
        /* Propagated gradient */
        bottom_diff[i] = top_diff[i] * mask[i] * scale_;
      }
    } else {
      /* Outside training, copy the gradient through unchanged */
      caffe_copy(top[0]->count(), top_diff, bottom_diff);
    }
  }
}

#ifdef CPU_ONLY
STUB_GPU(DropoutLayer);
#endif

INSTANTIATE_CLASS(DropoutLayer);
REGISTER_LAYER_CLASS(Dropout);

}  // namespace caffe
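The key point of the backward pass is that it reuses the mask drawn in the forward pass, so the gradient reaches only the inputs that were kept. The following standalone sketch illustrates this outside of Caffe. `DropoutState` and `DropoutBackward` are hypothetical names introduced for this example; the struct simply stands in for the `rand_vec_` and `scale_` members of the layer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the state a dropout layer keeps between
// its forward and backward passes.
struct DropoutState {
  std::vector<unsigned int> mask;  // 1 = kept, 0 = dropped in the forward pass
  float scale;                     // 1 / (1 - drop ratio)
};

// Backward pass: during training the incoming gradient is masked and scaled
// exactly like the forward output; otherwise it is passed through unchanged.
std::vector<float> DropoutBackward(const std::vector<float>& top_diff,
                                   const DropoutState& state, bool train) {
  assert(top_diff.size() == state.mask.size());
  std::vector<float> bottom_diff(top_diff.size());
  for (std::size_t i = 0; i < top_diff.size(); ++i) {
    bottom_diff[i] =
        train ? top_diff[i] * state.mask[i] * state.scale : top_diff[i];
  }
  return bottom_diff;
}
```

For a mask of {1, 0, 1} and a scale of 2, an incoming gradient of {0.5, 0.5, -1.0} becomes {1.0, 0.0, -2.0} during training: the dropped unit in the middle receives no gradient at all.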

