Neural Networks and Deep Learning series, article 15: The backpropagation algorithm


Source: Michael Nielsen's "Neural Networks and Deep Learning". Click "Read the original" at the end of this article to view the original English text.

Translator for this section: Wang Yuxuan, undergraduate at HIT SCIR

Disclaimer: If you wish to reprint, please contact [email protected]; reproduction without authorization is prohibited.

    1. Using neural networks to recognize handwritten digits

    2. How the backpropagation algorithm works

      • Warm up: a fast matrix-based method for computing a neural network's output

      • The two assumptions we need about the cost function

      • The Hadamard product

      • The four fundamental equations behind backpropagation

      • Proof of the four fundamental equations (optional)

      • The backpropagation algorithm

      • The code for backpropagation

      • In what sense is backpropagation a fast algorithm?

      • Backpropagation: the big picture

    3. Improving the way neural networks learn

    4. A visual proof that neural networks can compute any function

    5. Why deep neural networks are hard to train

    6. Deep learning
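
As a reminder, this installment builds directly on the previous one. In Nielsen's notation, where $\delta^l$ denotes the error in layer $l$ and $\odot$ denotes the Hadamard product, the four fundamental equations of backpropagation are:

$$\delta^L = \nabla_a C \odot \sigma'(z^L) \qquad \text{(BP1)}$$

$$\delta^l = ((w^{l+1})^T \delta^{l+1}) \odot \sigma'(z^l) \qquad \text{(BP2)}$$

$$\frac{\partial C}{\partial b^l_j} = \delta^l_j \qquad \text{(BP3)}$$

$$\frac{\partial C}{\partial w^l_{jk}} = a^{l-1}_k \delta^l_j \qquad \text{(BP4)}$$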

The backpropagation equations give us a way of computing the gradient of the cost function. Let's write the algorithm out explicitly:

    1. Input $x$: set the corresponding activation $a^1$ for the input layer.

    2. Feedforward: for each $l = 2, 3, \ldots, L$, compute $z^l = w^l a^{l-1} + b^l$ and $a^l = \sigma(z^l)$.

    3. Output error $\delta^L$: compute the vector $\delta^L = \nabla_a C \odot \sigma'(z^L)$.

    4. Backpropagate the error: for each $l = L-1, L-2, \ldots, 2$, compute $\delta^l = ((w^{l+1})^T \delta^{l+1}) \odot \sigma'(z^l)$.

    5. Output: the gradient of the cost function is given by $\frac{\partial C}{\partial w^l_{jk}} = a^{l-1}_k \delta^l_j$ and $\frac{\partial C}{\partial b^l_j} = \delta^l_j$.

Examining the algorithm, you can see why it is called backpropagation: starting from the final layer, we compute the error vectors $\delta^l$ backwards. It may seem strange that we move through the network in reverse. But if you recall the proof of the backpropagation equations, the backward movement is a consequence of the fact that the cost is a function of the outputs of the network. To understand how the cost varies with earlier weights and biases, we must repeatedly apply the chain rule, working backwards through the layers to obtain usable expressions.
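
To make the five steps above concrete, here is a minimal NumPy sketch of backpropagation for a single training example. This is an illustrative sketch, not the book's actual `backprop` method (which appears in the next installment): it assumes the quadratic cost, so that $\nabla_a C = a^L - y$, and it represents the network by lists `weights` and `biases` with one entry per layer after the input.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime(z):
    return sigmoid(z) * (1.0 - sigmoid(z))

def backprop(x, y, weights, biases):
    """Return (nabla_b, nabla_w): per-layer gradients of the cost C_x.

    `x` and `y` are column vectors; a quadratic cost is assumed,
    so the output-layer gradient nabla_a C is simply (a^L - y)."""
    # 1. Input: set the activation for the input layer.
    activation = x
    activations = [x]   # the activations a^l, layer by layer
    zs = []             # the weighted inputs z^l, layer by layer

    # 2. Feedforward: z^l = w^l a^{l-1} + b^l and a^l = sigma(z^l).
    for w, b in zip(weights, biases):
        z = np.dot(w, activation) + b
        zs.append(z)
        activation = sigmoid(z)
        activations.append(activation)

    # 3. Output error: delta^L = nabla_a C (.) sigma'(z^L).
    delta = (activations[-1] - y) * sigmoid_prime(zs[-1])
    nabla_b = [np.zeros(b.shape) for b in biases]
    nabla_w = [np.zeros(w.shape) for w in weights]
    nabla_b[-1] = delta
    nabla_w[-1] = np.dot(delta, activations[-2].T)

    # 4. Backpropagate: delta^l = ((w^{l+1})^T delta^{l+1}) (.) sigma'(z^l).
    for l in range(2, len(weights) + 1):
        delta = np.dot(weights[-l + 1].T, delta) * sigmoid_prime(zs[-l])
        nabla_b[-l] = delta
        nabla_w[-l] = np.dot(delta, activations[-l - 1].T)

    # 5. Output: nabla_w and nabla_b together form the gradient of C_x.
    return nabla_b, nabla_w
```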

Exercises
    • Backpropagation with a single modified neuron

      Suppose we modify a single neuron in a feedforward network so that its output is given by $f(\sum_j w_j x_j + b)$, where $f$ is some function other than the sigmoid. How should we modify the backpropagation algorithm in this case?

    • Backpropagation with linear neurons

      Suppose we replace the usual nonlinear $\sigma$ function with $\sigma(z) = z$ throughout the network. Rewrite the backpropagation algorithm for this case.

As described above, the backpropagation algorithm computes the gradient of the cost function for a single training example. In practice, backpropagation is usually combined with a learning algorithm such as stochastic gradient descent, in which we need to compute the gradient for many training examples. In particular, given a mini-batch of $m$ training examples, the following algorithm applies a gradient-descent learning step based on that mini-batch:

    1. Input a set of training examples.

    2. For each training example $x$: set the corresponding input activation $a^{x,1}$, and perform the following steps:

      1. Feedforward: for each $l = 2, 3, \ldots, L$, compute $z^{x,l} = w^l a^{x,l-1} + b^l$ and $a^{x,l} = \sigma(z^{x,l})$.

      2. Output error $\delta^{x,L}$: compute the vector $\delta^{x,L} = \nabla_a C_x \odot \sigma'(z^{x,L})$.

      3. Backpropagate the error: for each $l = L-1, L-2, \ldots, 2$, compute $\delta^{x,l} = ((w^{l+1})^T \delta^{x,l+1}) \odot \sigma'(z^{x,l})$.

    3. Gradient descent: for each $l = L, L-1, \ldots, 2$, update the weights according to the rule $w^l \rightarrow w^l - \frac{\eta}{m} \sum_x \delta^{x,l} (a^{x,l-1})^T$ and the biases according to the rule $b^l \rightarrow b^l - \frac{\eta}{m} \sum_x \delta^{x,l}$.

Of course, to implement stochastic gradient descent in practice you also need an outer loop that generates the mini-batches of training examples, and an outer loop that steps through multiple epochs of training. These are omitted here for brevity.
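
Continuing the sketch above (again, not the book's actual code), the mini-batch update can be written as follows, assuming the `backprop` function from the earlier sketch, a learning rate `eta`, and a mini-batch given as a list of `(x, y)` pairs:

```python
def update_mini_batch(mini_batch, weights, biases, eta):
    """Apply one gradient-descent step using the examples in
    `mini_batch`, a list of (x, y) pairs. Returns the updated
    (weights, biases)."""
    nabla_b = [np.zeros(b.shape) for b in biases]
    nabla_w = [np.zeros(w.shape) for w in weights]
    # Accumulate the gradient over the mini-batch via backpropagation.
    for x, y in mini_batch:
        delta_nabla_b, delta_nabla_w = backprop(x, y, weights, biases)
        nabla_b = [nb + dnb for nb, dnb in zip(nabla_b, delta_nabla_b)]
        nabla_w = [nw + dnw for nw, dnw in zip(nabla_w, delta_nabla_w)]
    # Gradient-descent step: w -> w - (eta/m) sum_x grad_w C_x, and
    # likewise for the biases.
    m = len(mini_batch)
    weights = [w - (eta / m) * nw for w, nw in zip(weights, nabla_w)]
    biases = [b - (eta / m) * nb for b, nb in zip(biases, nabla_b)]
    return weights, biases
```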

In the next section we will introduce "The code for backpropagation", so stay tuned!

      • "Hit Scir" public number

      • Editorial office: Guo Jiang, Li Jiaqi, Xu June, Li Zhongyang, Hulin Lin

      • Editor of the issue: Li Zhongyang
