A detailed explanation of BP neural network derivation process

Source: Internet
Author: User

The BP algorithm is one of the most effective learning methods for multilayer neural networks. Its main characteristics are forward propagation of the signal and backward propagation of the error: by continually adjusting the network's weights, the network's final output is brought as close as possible to the desired output, achieving the purpose of training.

First, the structure of a multilayer neural network and its description

(Figure: a typical multilayer neural network.)

A multilayer neural network usually consists of L layers of neurons, where the 1st layer is called the input layer, the last layer (layer L) is called the output layer, and the remaining layers (layers 2 through L-1) are called hidden layers.

Let the input vector be:

\[ \vec{x} = [x_1 \quad x_2 \quad \ldots \quad x_i \quad \ldots \quad x_m], \qquad i = 1, 2, \ldots, m \]

The output vector is:

\( \vec{y} = [y_1 \quad y_2 \quad \ldots \quad y_k \quad \ldots \quad y_n], \qquad k = 1, 2, \ldots, n \)

The outputs of the neurons in the \( l \)-th hidden layer form the vector:

\[ h^{(l)} = [h_1^{(l)} \quad h_2^{(l)} \quad \ldots \quad h_j^{(l)} \quad \ldots \quad h_{s_l}^{(l)}], \qquad j = 1, 2, \ldots, s_l \]

where \( s_l \) is the number of neurons in layer \( l \).

Let \( w_{ij}^{(l)} \) be the connection weight from neuron \( j \) in layer \( l-1 \) to neuron \( i \) in layer \( l \), and \( b_i^{(l)} \) the bias of neuron \( i \) in layer \( l \). Then:

\[ h_i^{(l)} = f\!\left(z_i^{(l)}\right), \qquad z_i^{(l)} = \sum_{j=1}^{s_{l-1}} w_{ij}^{(l)} h_j^{(l-1)} + b_i^{(l)} \]

where \( z_i^{(l)} \) is the input of neuron \( i \) in layer \( l \), and \( f(\cdot) \) is the neuron's activation function. Multilayer neural networks usually use nonlinear rather than linear activation functions, because a multilayer network built from linear activation functions is essentially a composition of linear functions, and the result is still a linear function.
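As a sketch in plain Python, a single layer's computation \( h_i^{(l)} = f(z_i^{(l)}) \) can be written as follows. The sigmoid activation (discussed in the next section) and all numeric values here are illustrative assumptions, not taken from the article:

```python
import math

def sigmoid(x):
    """An example activation function f."""
    return 1.0 / (1.0 + math.exp(-x))

def layer_forward(h_prev, W, b):
    """Compute z_i = sum_j W[i][j] * h_prev[j] + b[i] and h_i = f(z_i)."""
    z = [sum(w_ij * h_j for w_ij, h_j in zip(row, h_prev)) + b_i
         for row, b_i in zip(W, b)]
    return [sigmoid(z_i) for z_i in z], z

# Toy layer: 2 inputs feeding 3 neurons (made-up weights and biases)
W = [[0.1, 0.2], [0.3, -0.1], [0.0, 0.5]]
b = [0.0, 0.1, -0.2]
h, z = layer_forward([1.0, 0.5], W, b)
print(h)   # three activations, each in the sigmoid's (0, 1) range
```

Each row of `W` holds the incoming weights of one neuron, matching the indexing \( w_{ij}^{(l)} \) above.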

Second, activation functions

BP neural networks typically use the following two types of nonlinear activation functions:

The first is called the sigmoid function (or logistic function), and the second is the hyperbolic tangent function.

The sigmoid function is \( f(x) = \frac{1}{1 + e^{-x}} \). As its graph shows, its output varies in the range (0, 1), and its derivative is \( f'(x) = f(x)\left(1 - f(x)\right) \).

The hyperbolic tangent function is \( f(x) = \tanh(x) = \frac{e^x - e^{-x}}{e^x + e^{-x}} \). As its graph shows, its output varies in the range (-1, 1), and its derivative is \( f'(x) = 1 - f(x)^2 \).
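A quick numerical check (a sketch, not part of the original article) confirms both derivative formulas by central differences at a few sample points:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Verify sigmoid'(x) = sigmoid(x) * (1 - sigmoid(x)) and
# tanh'(x) = 1 - tanh(x)**2 against numerical derivatives.
eps = 1e-6
for x in [-2.0, -0.5, 0.0, 1.0, 3.0]:
    num_ds = (sigmoid(x + eps) - sigmoid(x - eps)) / (2 * eps)
    num_dt = (math.tanh(x + eps) - math.tanh(x - eps)) / (2 * eps)
    assert abs(num_ds - sigmoid(x) * (1 - sigmoid(x))) < 1e-8
    assert abs(num_dt - (1 - math.tanh(x) ** 2)) < 1e-8
print("derivative identities verified numerically")
```

These closed-form derivatives are what makes back-propagation cheap: each can be computed from the activation value alone.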

Third, BP algorithm derivation process

Suppose we have \( m \) training samples, where each input has a corresponding desired output. The BP algorithm optimizes the weights and biases of the neurons in each layer so that the output of the neural network is as close as possible to the desired output, achieving the purpose of training (or learning).

The batch update method is used for the given \( m \) training samples, and the error function is defined as:

\[ E = \frac{1}{m} \sum_{i=1}^{m} E(i) \]

where \( E(i) \) is the training error of a single sample:

\[ E(i) = \frac{1}{2} \sum_{k=1}^{n} \left( d_k(i) - y_k(i) \right)^2 \]

in which \( d_k(i) \) and \( y_k(i) \) are the desired and actual outputs of the \( k \)-th output neuron for sample \( i \).

So

\[ E = \frac{1}{2m} \sum_{i=1}^{m} \sum_{k=1}^{n} \left( d_k(i) - y_k(i) \right)^2 \]

Each iteration of the BP algorithm updates the weights and biases as follows:

\[ w_{ij}^{(l)} \leftarrow w_{ij}^{(l)} - \eta \, \frac{\partial E}{\partial w_{ij}^{(l)}}, \qquad b_i^{(l)} \leftarrow b_i^{(l)} - \eta \, \frac{\partial E}{\partial b_i^{(l)}} \]

where \( \eta \) is the learning rate, whose value lies in the range (0, 1). The key to the BP algorithm is how to compute these partial derivatives.
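The update rule can be seen in action on a toy one-parameter problem. The quadratic error function and all numeric values below are illustrative, not from the article:

```python
# Minimize the toy error E(w) = (w - 3)^2, whose derivative is dE/dw = 2*(w - 3).
eta = 0.1       # learning rate, chosen in (0, 1)
w = 0.0
for _ in range(100):
    grad = 2.0 * (w - 3.0)   # partial derivative of E with respect to w
    w = w - eta * grad       # the update rule: w <- w - eta * dE/dw
print(round(w, 4))           # converges toward the minimum at w = 3
```

Each step moves \( w \) against the gradient; a larger \( \eta \) converges faster but can overshoot.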

For a single training sample, compute the derivative of the error with respect to an output-layer weight using the chain rule:

\[ \frac{\partial E(i)}{\partial w_{kj}^{(L)}} = \frac{\partial E(i)}{\partial y_k} \cdot \frac{\partial y_k}{\partial z_k^{(L)}} \cdot \frac{\partial z_k^{(L)}}{\partial w_{kj}^{(L)}} \]

That is,

\[ \frac{\partial E(i)}{\partial w_{kj}^{(L)}} = -\left( d_k - y_k \right) f'\!\left(z_k^{(L)}\right) h_j^{(L-1)} \]

Similarly, for the bias,

\[ \frac{\partial E(i)}{\partial b_k^{(L)}} = -\left( d_k - y_k \right) f'\!\left(z_k^{(L)}\right) \]

Let:

\[ \delta_k^{(L)} = -\left( d_k - y_k \right) f'\!\left(z_k^{(L)}\right) \]

Then

\[ \frac{\partial E(i)}{\partial w_{kj}^{(L)}} = \delta_k^{(L)} h_j^{(L-1)}, \qquad \frac{\partial E(i)}{\partial b_k^{(L)}} = \delta_k^{(L)} \]

For the hidden layer \( L-1 \):

\[ \frac{\partial E(i)}{\partial w_{ji}^{(L-1)}} = \sum_{k=1}^{n} \frac{\partial E(i)}{\partial z_k^{(L)}} \cdot \frac{\partial z_k^{(L)}}{\partial h_j^{(L-1)}} \cdot \frac{\partial h_j^{(L-1)}}{\partial w_{ji}^{(L-1)}} \]

Because

\[ z_k^{(L)} = \sum_{j=1}^{s_{L-1}} w_{kj}^{(L)} h_j^{(L-1)} + b_k^{(L)}, \qquad h_j^{(L-1)} = f\!\left(z_j^{(L-1)}\right) \]

So

\[ \frac{\partial E(i)}{\partial w_{ji}^{(L-1)}} = \left( \sum_{k=1}^{n} \delta_k^{(L)} w_{kj}^{(L)} \right) f'\!\left(z_j^{(L-1)}\right) h_i^{(L-2)} \]

Similarly,

\[ \frac{\partial E(i)}{\partial b_j^{(L-1)}} = \left( \sum_{k=1}^{n} \delta_k^{(L)} w_{kj}^{(L)} \right) f'\!\left(z_j^{(L-1)}\right) \]

Let:

\[ \delta_j^{(L-1)} = f'\!\left(z_j^{(L-1)}\right) \sum_{k=1}^{n} w_{kj}^{(L)} \delta_k^{(L)} \]

Then

\[ \frac{\partial E(i)}{\partial w_{ji}^{(L-1)}} = \delta_j^{(L-1)} h_i^{(L-2)}, \qquad \frac{\partial E(i)}{\partial b_j^{(L-1)}} = \delta_j^{(L-1)} \]

From the above it can be deduced that, for any layer \( l \) (\( 2 \le l \le L \)), the derivatives with respect to its weights and biases can be expressed as:

\[ \frac{\partial E(i)}{\partial w_{ij}^{(l)}} = \delta_i^{(l)} h_j^{(l-1)}, \qquad \frac{\partial E(i)}{\partial b_i^{(l)}} = \delta_i^{(l)} \]

where

\[ \delta_i^{(l)} = \begin{cases} -\left( d_i - y_i \right) f'\!\left(z_i^{(l)}\right), & l = L \\[4pt] f'\!\left(z_i^{(l)}\right) \displaystyle\sum_{k=1}^{s_{l+1}} w_{ki}^{(l+1)} \delta_k^{(l+1)}, & 2 \le l < L \end{cases} \]

with \( h^{(1)} \) taken to be the input vector \( \vec{x} \).
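A minimal sketch of this delta recursion for a tiny two-layer network, written in plain Python and checked against a numerical gradient. All sizes and numeric values are illustrative assumptions:

```python
import math

def f(x):                       # sigmoid activation
    return 1.0 / (1.0 + math.exp(-x))

def fprime(x):                  # f'(x) = f(x) * (1 - f(x))
    s = f(x)
    return s * (1.0 - s)

def forward(x, W1, b1, W2, b2):
    """Forward pass through one hidden layer and the output layer."""
    z1 = [sum(W1[i][j] * x[j] for j in range(len(x))) + b1[i]
          for i in range(len(b1))]
    h1 = [f(z) for z in z1]
    z2 = [sum(W2[k][j] * h1[j] for j in range(len(h1))) + b2[k]
          for k in range(len(b2))]
    y = [f(z) for z in z2]
    return z1, h1, z2, y

def backprop(x, d, W1, b1, W2, b2):
    """Gradients of E = 1/2 * sum_k (d_k - y_k)^2 via the delta recursion."""
    z1, h1, z2, y = forward(x, W1, b1, W2, b2)
    # Output-layer deltas: delta_k = -(d_k - y_k) * f'(z_k)
    delta2 = [-(d[k] - y[k]) * fprime(z2[k]) for k in range(len(y))]
    # Hidden-layer deltas: delta_j = f'(z1_j) * sum_k W2[k][j] * delta2[k]
    delta1 = [fprime(z1[j]) * sum(W2[k][j] * delta2[k] for k in range(len(delta2)))
              for j in range(len(h1))]
    dW2 = [[delta2[k] * h1[j] for j in range(len(h1))] for k in range(len(delta2))]
    dW1 = [[delta1[i] * x[j] for j in range(len(x))] for i in range(len(delta1))]
    return dW1, delta1, dW2, delta2   # bias gradients equal the deltas

# Illustrative toy values (not from the article): 2 inputs, 2 hidden, 1 output
W1, b1 = [[0.15, 0.20], [0.25, 0.30]], [0.35, 0.35]
W2, b2 = [[0.40, 0.45]], [0.60]
x, d = [0.05, 0.10], [0.01]
dW1, db1, dW2, db2 = backprop(x, d, W1, b1, W2, b2)

# Sanity check: compare one analytic gradient with a central difference
def E(W1_):
    _, _, _, y = forward(x, W1_, b1, W2, b2)
    return 0.5 * sum((d[k] - y[k]) ** 2 for k in range(len(d)))

eps = 1e-6
W1p = [row[:] for row in W1]; W1p[0][0] += eps
W1m = [row[:] for row in W1]; W1m[0][0] -= eps
numerical = (E(W1p) - E(W1m)) / (2 * eps)
print(abs(numerical - dW1[0][0]) < 1e-6)
```

Note how the gradient of every weight is just its layer's delta times the activation feeding into it, exactly as in the compact expressions above.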

Fourth, BP algorithm process description

The weights and biases of the neural network are updated using the batch update method:

    1. For all layers \( l \), set \( \Delta W^{(l)} := 0 \) and \( \Delta b^{(l)} := 0 \), where \( \Delta W^{(l)} \) and \( \Delta b^{(l)} \) are an all-zero matrix and an all-zero vector, respectively;
    2. For i = 1:m ,
      1. Using the back-propagation algorithm, compute the gradients \( \nabla_{W^{(l)}} E(i) \) and \( \nabla_{b^{(l)}} E(i) \) of each layer's weights and biases;
      2. Compute \( \Delta W^{(l)} := \Delta W^{(l)} + \nabla_{W^{(l)}} E(i) \);
      3. Compute \( \Delta b^{(l)} := \Delta b^{(l)} + \nabla_{b^{(l)}} E(i) \).
    3. Update the weights and biases:
      1. Compute \( W^{(l)} := W^{(l)} - \frac{\eta}{m} \Delta W^{(l)} \);
      2. Compute \( b^{(l)} := b^{(l)} - \frac{\eta}{m} \Delta b^{(l)} \).
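The batch procedure above can be sketched for a one-neuron network learning the OR function. The network size, training data, learning rate, and epoch count are all illustrative assumptions, not from the article:

```python
import math, random

def f(x):
    return 1.0 / (1.0 + math.exp(-x))

random.seed(0)
W = [[random.uniform(-1, 1) for _ in range(2)]]   # 2 inputs -> 1 output neuron
b = [0.0]
# Training data: the OR function (an illustrative choice)
samples = [([0.0, 0.0], [0.0]), ([0.0, 1.0], [1.0]),
           ([1.0, 0.0], [1.0]), ([1.0, 1.0], [1.0])]
eta, m = 0.5, len(samples)

for epoch in range(10000):
    dW, db = [[0.0, 0.0]], [0.0]          # step 1: zero the accumulators
    for x, d in samples:                  # step 2: loop over the m samples
        z = sum(W[0][j] * x[j] for j in range(2)) + b[0]
        y = f(z)
        delta = -(d[0] - y) * y * (1.0 - y)   # dE(i)/dz for the output neuron
        for j in range(2):
            dW[0][j] += delta * x[j]      # accumulate dE(i)/dW
        db[0] += delta                    # accumulate dE(i)/db
    for j in range(2):                    # step 3: averaged batch update
        W[0][j] -= eta * dW[0][j] / m
    b[0] -= eta * db[0] / m

outputs = [f(sum(W[0][j] * x[j] for j in range(2)) + b[0]) for x, _ in samples]
print([round(o, 2) for o in outputs])    # first output near 0, the rest near 1
```

Because all per-sample gradients are summed before a single averaged update, this is the batch variant described above rather than sample-by-sample (online) updating.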
