On understanding the residual network ResNet

Source: Internet
Author: User

Deep Residual Learning for Image Recognition is a famous paper.

After reading everyone's views at http://www.jianshu.com/p/e58437f39f65, I also want to talk about my own understanding after reading the paper.



Network depth is a major factor in the performance of deep convolutional neural networks, but researchers found that as networks get deeper, training results actually get worse. This is not overfitting, because overfitting would mean good results on the training set and poor results on the test set, whereas the degradation of deep networks shows up on the training set itself. And the phenomenon gets worse with depth. This is counterintuitive, because a deeper network could in principle be obtained from a well-trained shallow network by stacking identity transformations on top of it, so it should be able to do at least as well. Evidently the deep network fails to learn these identity transformations on its own. Therefore, ResNet was proposed.
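To make that construction argument concrete, here is a minimal PyTorch sketch (my own illustration, not from the paper): stacking identity layers on top of a shallow network yields a deeper network that computes exactly the same function, so in principle the deeper network should never do worse on the training set.

```python
# A minimal sketch (my own illustration, not code from the paper): a deeper network
# built by stacking identity layers on a shallow one computes the same function.
import torch
import torch.nn as nn

shallow = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 10))

# "Deepen" the network with extra layers that are exact identity mappings.
deep = nn.Sequential(shallow, nn.Identity(), nn.Identity())

x = torch.randn(4, 16)
assert torch.allclose(shallow(x), deep(x))  # identical outputs, despite more layers
```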

The network is composed of many such blocks; each block has the structure shown in the figure below. Adding a shortcut connection is, functionally, adding an identity transformation.
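As a concrete version of that block, here is a minimal PyTorch sketch (my own simplified illustration, assuming two 3×3 convolutions and an identity shortcut, similar to the paper's basic block; the real blocks also use batch normalization and, when shapes change, a projection shortcut). The output is F(x) + x.

```python
# A simplified residual block sketch: the residual branch F is two 3x3 convolutions,
# and the shortcut adds the input x back to its output, so the block computes F(x) + x.
import torch
import torch.nn as nn
import torch.nn.functional as F

class BasicResidualBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, x):
        residual = self.conv2(F.relu(self.conv1(x)))  # this is F(x)
        return F.relu(residual + x)                   # H(x) = F(x) + x

block = BasicResidualBlock(channels=8)
y = block(torch.randn(1, 8, 32, 32))  # the shortcut requires matching shapes
```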



From the forward-propagation point of view, introducing the identity transformation makes adjustments to the network parameters have a larger effect. Here I quote a particularly good answer (http://www.jianshu.com/p/e58437f39f65):

"F is the network map before summation, H is the network mapping from input to summation." For example, to map 5 to 5.1, then the introduction of residuals is F ' (5) = 5.1, after the introduction of residuals is H (5) =5.1, H (5) =f (5) +5, F (5) = 0.1. Here the F ' and F both represent network parameter mappings, and the mapping of residuals is more sensitive to the change of output . For example, s output from 5.1 to 5.2, mapping f ' output increased 1/51=2%, and for residual structure output from 5.1 to 5.2, map f is from 0.1 to 0.2, increased by 100%. Obviously the latter output change to the weight adjustment effect is bigger, therefore the effect is better. The idea of residuals is to remove the same main part, so as to highlight small changes, see residual network My first reaction is the differential amplifier.

I think this friend's answer is very vivid.
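A tiny numeric check of the percentages in that quote (just the arithmetic written out):

```python
# The arithmetic from the quoted example: the target output moves from 5.1 to 5.2.
x = 5.0
plain_old, plain_new = 5.1, 5.2          # plain mapping F'(x) must produce these
res_old, res_new = 5.1 - x, 5.2 - x      # residual mapping F(x) = H(x) - x

print((plain_new - plain_old) / plain_old)  # ~0.02  -> about a 2% relative change
print((res_new - res_old) / res_old)        # ~1.0   -> about a 100% relative change
```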

As for why it is more sensitive, I think the backward-propagation view of the same phenomenon is that the vanishing-gradient problem is alleviated. The gradient is used to update the weight parameters so that the network fits better, and it involves the error term; the error term is essentially the sensitivity of the network's loss value (as I understand it). So with the added shortcut connection, looking at backward propagation, the error term is propagated directly to the earlier layers and added in, which alleviates the shrinking of the gradient and thereby the vanishing-gradient problem.
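A minimal autograd sketch of that point (my own illustration, not code from the paper): for H(x) = F(x) + x, the gradient with respect to x is dF/dx + 1, so even when the gradient through the residual branch is tiny, the shortcut passes the error signal straight through to the earlier layer.

```python
# The shortcut contributes a direct "+1" term to the gradient, so the error signal
# reaches earlier layers even when the gradient through the residual branch is small.
import torch

x = torch.tensor(5.0, requires_grad=True)
w = torch.tensor(0.01)        # a toy residual branch F(x) = w * x with a tiny slope

F_out = w * x                 # dF/dx = 0.01
H_plain = F_out               # no shortcut
H_res = F_out + x             # with shortcut: H(x) = F(x) + x

grad_plain = torch.autograd.grad(H_plain, x, retain_graph=True)[0]
grad_res = torch.autograd.grad(H_res, x)[0]

print(grad_plain.item())      # 0.01 -> nearly vanished
print(grad_res.item())        # 1.01 -> the identity path keeps the gradient alive
```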

