Here we summarize three weight initialization methods. The first two are the most common; the third is the newest. The post is written in English for clearer expression (it was originally meant for non-Chinese readers); additions and corrections are welcome.
Please respect the original work. If you repost, cite the source: http://blog.csdn.net/tangwei2014
1. Gaussian
Weights are randomly drawn from a Gaussian distribution with a fixed mean (e.g., 0) and a fixed standard deviation (e.g., 0.01).
This is the most common initialization method in deep learning.
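As a minimal sketch of the Gaussian scheme, the snippet below fills a weight matrix with independent draws from a normal distribution. The function name `gaussian_init`, the layer sizes (64 x 128), and the seed are illustrative assumptions, not part of any framework; the mean 0 and standard deviation 0.01 are the example values mentioned above.

```python
import random


def gaussian_init(n_out, n_in, mean=0.0, std=0.01, seed=None):
    """Return an n_out x n_in matrix of i.i.d. N(mean, std^2) weights.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    return [[rng.gauss(mean, std) for _ in range(n_in)] for _ in range(n_out)]


# Example: a layer with 128 inputs and 64 outputs (sizes are arbitrary).
weights = gaussian_init(n_out=64, n_in=128, seed=0)

# Check that the empirical statistics are close to the requested ones.
flat = [w for row in weights for w in row]
sample_mean = sum(flat) / len(flat)
sample_std = (sum((w - sample_mean) ** 2 for w in flat) / len(flat)) ** 0.5
print(len(weights), len(weights[0]), sample_mean, sample_std)
```

Note that the standard deviation here is fixed and does not depend on the layer size, which is exactly what the two methods below change.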
2. Xavier
This method proposes adopting a properly scaled uniform or Gaussian distribution for initialization.
In Caffe (an open framework for deep learning) [2], the Xavier method initializes the weights in the network by drawing them from a distribution with zero mean and a specific variance,
Var(W) = 1 / n_in
where W is the initialization distribution for the neuron in question, and n_in is the number of neurons feeding into it. The distribution used is typically Gaussian or uniform.
Glorot & Bengio's paper [1] originally recommended using
Var(W) = 2 / (n_in + n_out)
where n_out is the number of neurons the result is fed to.
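The two variance rules can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names (`xavier_variance`, `xavier_gaussian`, `xavier_uniform`, `sample_variance`), the `use_fan_out` switch, and the layer sizes are hypothetical and are not Caffe's API. For the uniform case, the sketch relies on the fact that U(-a, a) has variance a^2 / 3, so a = sqrt(3 * Var(W)).

```python
import math
import random


def xavier_variance(n_in, n_out=None):
    """Target variance: 1 / n_in, or 2 / (n_in + n_out) if n_out is given."""
    if n_out is None:
        return 1.0 / n_in
    return 2.0 / (n_in + n_out)


def xavier_gaussian(n_in, n_out, use_fan_out=False, seed=None):
    """Zero-mean Gaussian weights with the Xavier variance."""
    rng = random.Random(seed)
    var = xavier_variance(n_in, n_out if use_fan_out else None)
    std = math.sqrt(var)
    return [[rng.gauss(0.0, std) for _ in range(n_in)] for _ in range(n_out)]


def xavier_uniform(n_in, n_out, use_fan_out=False, seed=None):
    """Zero-mean uniform weights with the Xavier variance."""
    rng = random.Random(seed)
    var = xavier_variance(n_in, n_out if use_fan_out else None)
    limit = math.sqrt(3.0 * var)  # U(-a, a) has variance a^2 / 3
    return [[rng.uniform(-limit, limit) for _ in range(n_in)] for _ in range(n_out)]


def sample_variance(matrix):
    """Empirical variance over all entries of a nested-list matrix."""
    flat = [w for row in matrix for w in row]
    mean = sum(flat) / len(flat)
    return sum((w - mean) ** 2 for w in flat) / len(flat)


# Var(W) = 1 / n_in with a Gaussian distribution.
w_gauss = xavier_gaussian(n_in=256, n_out=128, seed=0)
# Var(W) = 2 / (n_in + n_out) with a uniform distribution.
w_unif = xavier_uniform(n_in=256, n_out=128, use_fan_out=True, seed=0)

print(sample_variance(w_gauss), 1.0 / 256)
print(sample_variance(w_unif), 2.0 / (256 + 128))
```

Each printed pair shows the empirical variance next to its target, so the two numbers on a line should be close to each other.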
References:
[1] X. Glorot and Y. Bengio. Understanding the difficulty of training deep feedforward neural networks. In International Conference on Artificial Intelligence and Statistics, pages 249–256, 2010.
[2] Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long, R. Girshick, S. Guadarrama, and T. Darrell. Caffe: Convolutional architecture for fast feature embedding. arXiv:1408.5093, 2014.
3. MSRA
This method was proposed to make it possible to train extremely deep rectified models directly from scratch [1].
In this method, weights are initialized with a zero-mean Gaussian distribution whose standard deviation (std) is
std = sqrt(2 / (k_l^2 * d_{l-1}))
where k_l is the spatial filter size in layer l and d_{l-1} is the number of filters in layer l-1.
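A small sketch of this rule for a convolutional layer follows. The helper names (`msra_std`, `msra_init`), the nested-list weight layout (output filters, input filters, height, width), and the layer sizes are assumptions made for illustration; only the std formula comes from the text above.

```python
import math
import random


def msra_std(k, d_prev):
    """std = sqrt(2 / (k^2 * d_prev)) for filter size k and d_prev input filters."""
    return math.sqrt(2.0 / (k * k * d_prev))


def msra_init(d_out, d_prev, k, seed=None):
    """Return conv weights of shape d_out x d_prev x k x k as nested lists.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    std = msra_std(k, d_prev)
    return [
        [[[rng.gauss(0.0, std) for _ in range(k)] for _ in range(k)]
         for _ in range(d_prev)]
        for _ in range(d_out)
    ]


# Example: 3x3 filters, 64 filters in the previous layer, 32 in this one.
conv_weights = msra_init(d_out=32, d_prev=64, k=3, seed=0)

# Compare the empirical std with the target value.
flat = [w for f in conv_weights for c in f for row in c for w in row]
mean = sum(flat) / len(flat)
empirical_std = (sum((w - mean) ** 2 for w in flat) / len(flat)) ** 0.5
print(empirical_std, msra_std(3, 64))
```

With k = 3 and 64 input filters, the target std is sqrt(2 / 576), roughly 0.059, which is noticeably larger than the fixed 0.01 used in the plain Gaussian example.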
Reference:
[1] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification. Technical report, arXiv, Feb. 2015.
Copyright notice: This is an original article by the blogger and may not be reproduced without the blogger's permission.
Summary: Different Methods for Weight Initialization in Deep Learning