Summary: Different Methods for Weight Initialization in Deep Learning


Here we summarize three weight initialization methods. The first two are the more common ones; the last is the most recent. The descriptions are in English to keep the wording fluent (they were originally written for non-Chinese readers). Additions and corrections are welcome.

Please respect the original work. If you repost this article, cite the source: http://blog.csdn.net/tangwei2014


1. Gaussian


Weights are randomly drawn from a Gaussian distribution with a fixed mean (e.g., 0) and a fixed standard deviation (e.g., 0.01).

This is the most common initialization method in deep learning.
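
As a minimal sketch of the description above, the following NumPy snippet draws a weight matrix from a Gaussian with mean 0 and standard deviation 0.01. The function name `gaussian_init`, the `seed` parameter, and the layer shape are illustrative assumptions, not part of any particular framework.

```python
import numpy as np

def gaussian_init(shape, mean=0.0, std=0.01, seed=0):
    """Draw weights i.i.d. from a Gaussian with fixed mean and std.

    `gaussian_init` and `seed` are hypothetical names used for illustration.
    """
    rng = np.random.default_rng(seed)
    return rng.normal(loc=mean, scale=std, size=shape)

# Example: a fully connected layer with 128 inputs and 256 outputs.
W = gaussian_init((256, 128))
print(W.shape, W.mean(), W.std())
```

Note that the standard deviation here does not depend on the layer size, which is the main difference from the two methods that follow.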


2. Xavier


This method proposes adopting a properly scaled uniform or Gaussian distribution for initialization.

Caffe (an open framework for deep learning) [2] initializes the weights in a network by drawing them from a distribution with zero mean and a specific variance,

$$\mathrm{Var}(W) = \frac{1}{n_{in}}$$

where $W$ is the initialization distribution for the neuron in question, and $n_{in}$ is the number of neurons feeding into it. The distribution used is typically Gaussian or uniform.

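Glorot & Bengio's paper [1] originally recommended using

$$\mathrm{Var}(W) = \frac{2}{n_{in} + n_{out}}$$

where $n_{in}$ is the number of neurons feeding into the neuron in question and $n_{out}$ is the number of neurons the result is fed to.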
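
The two variance rules can be sketched in NumPy as follows. This is an illustration under stated assumptions, not Caffe's actual implementation: the function name `xavier_init`, the labels `"fan_in"` and `"fan_avg"`, and the `seed` parameter are hypothetical. For the uniform case, the sketch uses the fact that a uniform distribution on [-a, a] has variance a^2 / 3, so a = sqrt(3 * Var(W)).

```python
import numpy as np

def xavier_init(n_in, n_out, variant="fan_in", distribution="gaussian", seed=0):
    """Zero-mean weights of shape (n_out, n_in) with a scaled variance.

    variant="fan_in":  Var(W) = 1 / n_in            (the single-sided rule)
    variant="fan_avg": Var(W) = 2 / (n_in + n_out)  (Glorot & Bengio's rule)
    The variant labels are hypothetical names for this sketch.
    """
    rng = np.random.default_rng(seed)
    if variant == "fan_in":
        var = 1.0 / n_in
    elif variant == "fan_avg":
        var = 2.0 / (n_in + n_out)
    else:
        raise ValueError(f"unknown variant: {variant}")

    if distribution == "gaussian":
        return rng.normal(0.0, np.sqrt(var), size=(n_out, n_in))
    if distribution == "uniform":
        limit = np.sqrt(3.0 * var)  # U(-a, a) has variance a^2 / 3
        return rng.uniform(-limit, limit, size=(n_out, n_in))
    raise ValueError(f"unknown distribution: {distribution}")

W = xavier_init(400, 200, variant="fan_avg", distribution="uniform")
print(W.var(), 2.0 / (400 + 200))  # empirical vs. target variance
```

When n_in equals n_out the two rules coincide, since 2 / (n_in + n_out) reduces to 1 / n_in.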

Reference:

[1] X. Glorot and Y. Bengio. Understanding the difficulty of training deep feedforward neural networks. In International Conference on Artificial Intelligence and Statistics, pages 249–256, 2010.

[2] Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long, R. Girshick, S. Guadarrama, and T. Darrell. Caffe: Convolutional architecture for fast feature embedding. arXiv:1408.5093, 2014.


3. MSRA


This method was proposed to make it possible to train extremely deep rectified models directly from scratch [1].

In this method, weights are initialized with a zero-mean Gaussian distribution whose standard deviation is

$$\sqrt{\frac{2}{k_l^2 \, d_{l-1}}}$$

where $k_l$ is the spatial filter size in layer $l$ and $d_{l-1}$ is the number of filters in layer $l-1$.
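
A minimal NumPy sketch of this rule for a convolutional layer is shown below. The function name `msra_init`, the argument names, the `seed` parameter, and the (d_out, d_in, k, k) filter layout are assumptions made for illustration; here `k` plays the role of the spatial filter size in layer l and `d_in` the number of filters in layer l-1.

```python
import numpy as np

def msra_init(k, d_in, d_out, seed=0):
    """Zero-mean Gaussian filters with std = sqrt(2 / (k^2 * d_in)).

    k     : spatial filter size of the current layer
    d_in  : number of filters in the previous layer (input channels)
    d_out : number of filters in the current layer
    All names are hypothetical and chosen for this sketch.
    """
    rng = np.random.default_rng(seed)
    std = np.sqrt(2.0 / (k * k * d_in))
    return rng.normal(0.0, std, size=(d_out, d_in, k, k))

# Example: 3x3 filters, 64 input channels, 128 output channels.
W = msra_init(k=3, d_in=64, d_out=128)
print(W.shape, W.std(), np.sqrt(2.0 / (3 * 3 * 64)))  # empirical vs. target std
```

The quantity k^2 * d_in is the number of inputs each output unit sees, so the rule has the same form as the 1 / n_in variance above with an extra factor of 2.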

Reference:

[1] K. He, X. Zhang, S. Ren, and J. Sun. Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification. Technical report, arXiv, Feb. 2015.

Copyright notice: This is an original article by the blog author and may not be reproduced without the author's permission.
