Neural Networks and Deep Learning (5.1): Why deep neural networks are difficult to train

Source: Internet
Author: User

In a deep network, different layers can learn at very different speeds. For example, the later layers may learn well while the earlier layers stall during training and learn almost nothing. The opposite can also happen: the earlier layers learn well and the later layers stall.

The cause is an inherent instability in learning algorithms based on gradient descent, which can stall learning in either the earlier or the later layers.

The vanishing gradient problem

In some deep neural networks, the gradient tends to get smaller as it is propagated backward through the hidden layers, so the earlier hidden layers learn more slowly than the later ones. This is the vanishing gradient problem.

In the other case, the gradient in the earlier hidden layers becomes very large, so the earlier hidden layers learn much faster than the later ones. This is called the exploding gradient problem.

In other words, the gradient in deep neural networks is unstable: in the earlier layers it tends either to vanish or to explode.
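Both behaviors can be reproduced on a toy model. The following is a minimal sketch, not taken from the original notes: it assumes a chain of five layers with a single sigmoid neuron each, and the function name `bias_gradients`, the input value, and the weight and bias settings are all illustrative choices. It backpropagates through the chain and reports the size of the gradient of the output with respect to each layer's bias.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def bias_gradients(weights, biases, x):
    """Return |d(output)/d(b_l)| for each layer l of a chain of single sigmoid neurons."""
    # Forward pass: a_l = sigmoid(w_l * a_{l-1} + b_l), keeping sigmoid'(z_l) per layer.
    a = x
    sigma_primes = []
    for w, b in zip(weights, biases):
        a = sigmoid(w * a + b)
        sigma_primes.append(a * (1.0 - a))  # sigmoid'(z) = a * (1 - a)

    # Backward pass: walk from the last layer to the first.
    grads = [0.0] * len(weights)
    delta = 1.0  # d(output)/d(a_L)
    for l in reversed(range(len(weights))):
        delta *= sigma_primes[l]  # now d(output)/d(z_l), which equals d(output)/d(b_l)
        grads[l] = abs(delta)
        delta *= weights[l]       # now d(output)/d(a_{l-1})
    return grads

# Assumed setting 1: moderate weights, so the gradient shrinks toward the first layer.
vanishing = bias_gradients([1.0] * 5, [0.0] * 5, x=0.5)

# Assumed setting 2: large weights, with biases chosen so that every z_l is 0
# (where sigmoid' is largest), so the gradient grows toward the first layer.
exploding = bias_gradients([100.0] * 5, [-50.0] * 5, x=0.5)

print("vanishing:", vanishing)
print("exploding:", exploding)
```

In both lists, index 0 is the first layer and index 4 is the last. With the moderate weights the entries get smaller toward index 0, and with the large weights they get larger.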

Causes of the unstable gradient problem

The gradient in an earlier layer is a product of terms from all the later layers. When there are many layers, this product is inherently unstable: unless the terms happen to balance out, it shrinks or grows with depth.
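To make the product explicit, consider a simplified case, which is an assumption made here for illustration: a chain of L layers with one neuron each, where z_j = w_j a_{j-1} + b_j, a_j = \sigma(z_j), and C is the cost. Applying the chain rule from the output back to the first layer gives:

```latex
\frac{\partial C}{\partial b_1}
  = \sigma'(z_1) \left( \prod_{j=2}^{L} w_j \, \sigma'(z_j) \right)
    \frac{\partial C}{\partial a_L}
```

Each additional layer contributes one more factor w_j \sigma'(z_j). If these factors are mostly smaller than 1 in magnitude the product shrinks with depth, and if they are mostly larger than 1 it grows.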

In networks of sigmoid neurons, the gradient in the earlier layers is generally found to vanish exponentially with depth.
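One way to see where the exponential rate comes from: the derivative of the sigmoid never exceeds 1/4. The sketch below checks this numerically and then computes the resulting upper bound on how much a gradient can survive after passing back through n layers. The function `gradient_bound` and the assumption that every weight satisfies |w| <= 1 are illustrative choices, not something stated in the original notes.

```python
import math

def sigmoid_prime(z):
    s = 1.0 / (1.0 + math.exp(-z))
    return s * (1.0 - s)

# Scan z over [-10, 10] in steps of 0.01; the largest slope occurs at z = 0.
max_slope = max(sigmoid_prime(k / 100.0) for k in range(-1000, 1001))

def gradient_bound(num_layers, max_weight=1.0):
    """Upper bound on the product of num_layers factors |w * sigmoid'(z)|,
    assuming every weight satisfies |w| <= max_weight."""
    return (max_weight * max_slope) ** num_layers

for n in (1, 5, 10, 20):
    print(n, gradient_bound(n))
```

Under the assumed weight limit, each layer multiplies the bound by at most 1/4, so the bound falls geometrically as layers are added. With larger weights the bound no longer forces the gradient to shrink, which is the regime where exploding gradients become possible.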
