A detailed explanation of BP neural network derivation process

Source: Internet
Author: User

The BP algorithm is one of the most effective learning methods for multilayer neural networks. Its main characteristics are forward propagation of the signal and backward propagation of the error: by continually adjusting the network's weights, the network's final output is brought as close as possible to the desired output, achieving the purpose of training.

First, the structure of a multilayer neural network and its description

(Figure: a typical multilayer neural network.)

A multilayer neural network usually consists of L layers of neurons, where the 1st layer is called the input layer, the last layer (layer L) is called the output layer, and the remaining layers (layers 2 through L-1) are called hidden layers.

Let the input vector be:

\[ \vec{x} = [x_1 \quad x_2 \quad \ldots \quad x_i \quad \ldots \quad x_m], \qquad i = 1, 2, \ldots, m \]

The output vector is:

\( \vec{y} = [y_1 \quad y_2 \quad \ldots \quad y_k \quad \ldots \quad y_n], \qquad k = 1, 2, \ldots, n \)

The outputs of the neurons in the \( l \)-th hidden layer form the vector:

\[ h^{(l)} = [h_1^{(l)} \quad h_2^{(l)} \quad \ldots \quad h_j^{(l)} \quad \ldots \quad h_{s_l}^{(l)}], \qquad j = 1, 2, \ldots, s_l \]

where \( s_l \) is the number of neurons in layer \( l \).

Let \( w_{ij}^{(l)} \) be the connection weight from neuron \( j \) in layer \( l-1 \) to neuron \( i \) in layer \( l \), and \( b_i^{(l)} \) the bias of neuron \( i \) in layer \( l \). Then:

\[ h_i^{(l)} = f\!\left(z_i^{(l)}\right), \qquad z_i^{(l)} = \sum_{j=1}^{s_{l-1}} w_{ij}^{(l)} h_j^{(l-1)} + b_i^{(l)} \]

where \( z_i^{(l)} \) is the input of neuron \( i \) in layer \( l \), and \( f(\cdot) \) is the neuron's activation function. Multilayer neural networks usually use nonlinear rather than linear activation functions, because a multilayer network built from linear activation functions is essentially a composition of linear functions, and the result is still a linear function.
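As a sketch in plain Python, a single layer's computation \( h_i^{(l)} = f(z_i^{(l)}) \) can be written as follows. The sigmoid activation (discussed in the next section) and all numeric values here are illustrative assumptions, not taken from the article:

```python
import math

def sigmoid(x):
    """An example activation function f."""
    return 1.0 / (1.0 + math.exp(-x))

def layer_forward(h_prev, W, b):
    """Compute z_i = sum_j W[i][j] * h_prev[j] + b[i] and h_i = f(z_i)."""
    z = [sum(w_ij * h_j for w_ij, h_j in zip(row, h_prev)) + b_i
         for row, b_i in zip(W, b)]
    return [sigmoid(z_i) for z_i in z], z

# Toy layer: 2 inputs feeding 3 neurons (made-up weights and biases)
W = [[0.1, 0.2], [0.3, -0.1], [0.0, 0.5]]
b = [0.0, 0.1, -0.2]
h, z = layer_forward([1.0, 0.5], W, b)
print(h)   # three activations, each in the sigmoid's (0, 1) range
```

Each row of `W` holds the incoming weights of one neuron, matching the indexing \( w_{ij}^{(l)} \) above.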

Second, activation functions

BP neural networks typically use the following two types of nonlinear activation functions:

The first is called the sigmoid function (or logistic function), and the second is the hyperbolic tangent function.

The sigmoid function is \( f(x) = \frac{1}{1 + e^{-x}} \). As its graph shows, its output varies in the range (0, 1), and its derivative is \( f'(x) = f(x)\left(1 - f(x)\right) \).

The hyperbolic tangent function is \( f(x) = \tanh(x) = \frac{e^x - e^{-x}}{e^x + e^{-x}} \). As its graph shows, its output varies in the range (-1, 1), and its derivative is \( f'(x) = 1 - f(x)^2 \).
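A quick numerical check (a sketch, not part of the original article) confirms both derivative formulas by central differences at a few sample points:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Verify sigmoid'(x) = sigmoid(x) * (1 - sigmoid(x)) and
# tanh'(x) = 1 - tanh(x)**2 against numerical derivatives.
eps = 1e-6
for x in [-2.0, -0.5, 0.0, 1.0, 3.0]:
    num_ds = (sigmoid(x + eps) - sigmoid(x - eps)) / (2 * eps)
    num_dt = (math.tanh(x + eps) - math.tanh(x - eps)) / (2 * eps)
    assert abs(num_ds - sigmoid(x) * (1 - sigmoid(x))) < 1e-8
    assert abs(num_dt - (1 - math.tanh(x) ** 2)) < 1e-8
print("derivative identities verified numerically")
```

These closed-form derivatives are what makes back-propagation cheap: each can be computed from the activation value alone.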

Third, BP algorithm derivation process

Suppose we have \( m \) training samples, where each input has a corresponding desired output. The BP algorithm optimizes the weights and biases of the neurons in each layer so that the output of the neural network is as close as possible to the desired output, achieving the purpose of training (or learning).

The batch update method is used for the given \( m \) training samples, and the error function is defined as:

\[ E = \frac{1}{m} \sum_{i=1}^{m} E(i) \]

where \( E(i) \) is the training error of a single sample:

\[ E(i) = \frac{1}{2} \sum_{k=1}^{n} \left( d_k(i) - y_k(i) \right)^2 \]

in which \( d_k(i) \) and \( y_k(i) \) are the desired and actual outputs of the \( k \)-th output neuron for sample \( i \).

So

\[ E = \frac{1}{2m} \sum_{i=1}^{m} \sum_{k=1}^{n} \left( d_k(i) - y_k(i) \right)^2 \]

Each iteration of the BP algorithm updates the weights and biases as follows:

\[ w_{ij}^{(l)} \leftarrow w_{ij}^{(l)} - \eta \, \frac{\partial E}{\partial w_{ij}^{(l)}}, \qquad b_i^{(l)} \leftarrow b_i^{(l)} - \eta \, \frac{\partial E}{\partial b_i^{(l)}} \]

where \( \eta \) is the learning rate, whose value lies in the range (0, 1). The key to the BP algorithm is how to compute these partial derivatives.
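The update rule can be seen in action on a toy one-parameter problem. The quadratic error function and all numeric values below are illustrative, not from the article:

```python
# Minimize the toy error E(w) = (w - 3)^2, whose derivative is dE/dw = 2*(w - 3).
eta = 0.1       # learning rate, chosen in (0, 1)
w = 0.0
for _ in range(100):
    grad = 2.0 * (w - 3.0)   # partial derivative of E with respect to w
    w = w - eta * grad       # the update rule: w <- w - eta * dE/dw
print(round(w, 4))           # converges toward the minimum at w = 3
```

Each step moves \( w \) against the gradient; a larger \( \eta \) converges faster but can overshoot.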

For a single training sample, compute the derivative of the error with respect to an output-layer weight using the chain rule:

\[ \frac{\partial E(i)}{\partial w_{kj}^{(L)}} = \frac{\partial E(i)}{\partial y_k} \cdot \frac{\partial y_k}{\partial z_k^{(L)}} \cdot \frac{\partial z_k^{(L)}}{\partial w_{kj}^{(L)}} \]

That is,

\[ \frac{\partial E(i)}{\partial w_{kj}^{(L)}} = -\left( d_k - y_k \right) f'\!\left(z_k^{(L)}\right) h_j^{(L-1)} \]

Similarly, for the bias,

\[ \frac{\partial E(i)}{\partial b_k^{(L)}} = -\left( d_k - y_k \right) f'\!\left(z_k^{(L)}\right) \]

Let:

\[ \delta_k^{(L)} = -\left( d_k - y_k \right) f'\!\left(z_k^{(L)}\right) \]

Then

\[ \frac{\partial E(i)}{\partial w_{kj}^{(L)}} = \delta_k^{(L)} h_j^{(L-1)}, \qquad \frac{\partial E(i)}{\partial b_k^{(L)}} = \delta_k^{(L)} \]

For the hidden layer \( L-1 \):

\[ \frac{\partial E(i)}{\partial w_{ji}^{(L-1)}} = \sum_{k=1}^{n} \frac{\partial E(i)}{\partial z_k^{(L)}} \cdot \frac{\partial z_k^{(L)}}{\partial h_j^{(L-1)}} \cdot \frac{\partial h_j^{(L-1)}}{\partial w_{ji}^{(L-1)}} \]

Because

\[ z_k^{(L)} = \sum_{j=1}^{s_{L-1}} w_{kj}^{(L)} h_j^{(L-1)} + b_k^{(L)}, \qquad h_j^{(L-1)} = f\!\left(z_j^{(L-1)}\right) \]

So

\[ \frac{\partial E(i)}{\partial w_{ji}^{(L-1)}} = \left( \sum_{k=1}^{n} \delta_k^{(L)} w_{kj}^{(L)} \right) f'\!\left(z_j^{(L-1)}\right) h_i^{(L-2)} \]

Similarly,

\[ \frac{\partial E(i)}{\partial b_j^{(L-1)}} = \left( \sum_{k=1}^{n} \delta_k^{(L)} w_{kj}^{(L)} \right) f'\!\left(z_j^{(L-1)}\right) \]

Let:

\[ \delta_j^{(L-1)} = f'\!\left(z_j^{(L-1)}\right) \sum_{k=1}^{n} w_{kj}^{(L)} \delta_k^{(L)} \]

Then

\[ \frac{\partial E(i)}{\partial w_{ji}^{(L-1)}} = \delta_j^{(L-1)} h_i^{(L-2)}, \qquad \frac{\partial E(i)}{\partial b_j^{(L-1)}} = \delta_j^{(L-1)} \]

From the above it can be deduced that, for any layer \( l \) (\( 2 \le l \le L \)), the derivatives with respect to its weights and biases can be expressed as:

\[ \frac{\partial E(i)}{\partial w_{ij}^{(l)}} = \delta_i^{(l)} h_j^{(l-1)}, \qquad \frac{\partial E(i)}{\partial b_i^{(l)}} = \delta_i^{(l)} \]

where

\[ \delta_i^{(l)} = \begin{cases} -\left( d_i - y_i \right) f'\!\left(z_i^{(l)}\right), & l = L \\[4pt] f'\!\left(z_i^{(l)}\right) \displaystyle\sum_{k=1}^{s_{l+1}} w_{ki}^{(l+1)} \delta_k^{(l+1)}, & 2 \le l < L \end{cases} \]

with \( h^{(1)} \) taken to be the input vector \( \vec{x} \).
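A minimal sketch of this delta recursion for a tiny two-layer network, written in plain Python and checked against a numerical gradient. All sizes and numeric values are illustrative assumptions:

```python
import math

def f(x):                       # sigmoid activation
    return 1.0 / (1.0 + math.exp(-x))

def fprime(x):                  # f'(x) = f(x) * (1 - f(x))
    s = f(x)
    return s * (1.0 - s)

def forward(x, W1, b1, W2, b2):
    """Forward pass through one hidden layer and the output layer."""
    z1 = [sum(W1[i][j] * x[j] for j in range(len(x))) + b1[i]
          for i in range(len(b1))]
    h1 = [f(z) for z in z1]
    z2 = [sum(W2[k][j] * h1[j] for j in range(len(h1))) + b2[k]
          for k in range(len(b2))]
    y = [f(z) for z in z2]
    return z1, h1, z2, y

def backprop(x, d, W1, b1, W2, b2):
    """Gradients of E = 1/2 * sum_k (d_k - y_k)^2 via the delta recursion."""
    z1, h1, z2, y = forward(x, W1, b1, W2, b2)
    # Output-layer deltas: delta_k = -(d_k - y_k) * f'(z_k)
    delta2 = [-(d[k] - y[k]) * fprime(z2[k]) for k in range(len(y))]
    # Hidden-layer deltas: delta_j = f'(z1_j) * sum_k W2[k][j] * delta2[k]
    delta1 = [fprime(z1[j]) * sum(W2[k][j] * delta2[k] for k in range(len(delta2)))
              for j in range(len(h1))]
    dW2 = [[delta2[k] * h1[j] for j in range(len(h1))] for k in range(len(delta2))]
    dW1 = [[delta1[i] * x[j] for j in range(len(x))] for i in range(len(delta1))]
    return dW1, delta1, dW2, delta2   # bias gradients equal the deltas

# Illustrative toy values (not from the article): 2 inputs, 2 hidden, 1 output
W1, b1 = [[0.15, 0.20], [0.25, 0.30]], [0.35, 0.35]
W2, b2 = [[0.40, 0.45]], [0.60]
x, d = [0.05, 0.10], [0.01]
dW1, db1, dW2, db2 = backprop(x, d, W1, b1, W2, b2)

# Sanity check: compare one analytic gradient with a central difference
def E(W1_):
    _, _, _, y = forward(x, W1_, b1, W2, b2)
    return 0.5 * sum((d[k] - y[k]) ** 2 for k in range(len(d)))

eps = 1e-6
W1p = [row[:] for row in W1]; W1p[0][0] += eps
W1m = [row[:] for row in W1]; W1m[0][0] -= eps
numerical = (E(W1p) - E(W1m)) / (2 * eps)
print(abs(numerical - dW1[0][0]) < 1e-6)
```

Note how the gradient of every weight is just its layer's delta times the activation feeding into it, exactly as in the compact expressions above.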

Fourth, BP algorithm process description

The weights and biases of the neural network are updated using the batch update method:

    1. For all layers \( l \), set \( \Delta W^{(l)} := 0 \) and \( \Delta b^{(l)} := 0 \), where \( \Delta W^{(l)} \) and \( \Delta b^{(l)} \) are an all-zero matrix and an all-zero vector, respectively;
    2. For i = 1:m ,
      1. Using the back-propagation algorithm, compute the gradients \( \nabla_{W^{(l)}} E(i) \) and \( \nabla_{b^{(l)}} E(i) \) of each layer's weights and biases;
      2. Compute \( \Delta W^{(l)} := \Delta W^{(l)} + \nabla_{W^{(l)}} E(i) \);
      3. Compute \( \Delta b^{(l)} := \Delta b^{(l)} + \nabla_{b^{(l)}} E(i) \).
    3. Update the weights and biases:
      1. Compute \( W^{(l)} := W^{(l)} - \frac{\eta}{m} \Delta W^{(l)} \);
      2. Compute \( b^{(l)} := b^{(l)} - \frac{\eta}{m} \Delta b^{(l)} \).
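The batch procedure above can be sketched for a one-neuron network learning the OR function. The network size, training data, learning rate, and epoch count are all illustrative assumptions, not from the article:

```python
import math, random

def f(x):
    return 1.0 / (1.0 + math.exp(-x))

random.seed(0)
W = [[random.uniform(-1, 1) for _ in range(2)]]   # 2 inputs -> 1 output neuron
b = [0.0]
# Training data: the OR function (an illustrative choice)
samples = [([0.0, 0.0], [0.0]), ([0.0, 1.0], [1.0]),
           ([1.0, 0.0], [1.0]), ([1.0, 1.0], [1.0])]
eta, m = 0.5, len(samples)

for epoch in range(10000):
    dW, db = [[0.0, 0.0]], [0.0]          # step 1: zero the accumulators
    for x, d in samples:                  # step 2: loop over the m samples
        z = sum(W[0][j] * x[j] for j in range(2)) + b[0]
        y = f(z)
        delta = -(d[0] - y) * y * (1.0 - y)   # dE(i)/dz for the output neuron
        for j in range(2):
            dW[0][j] += delta * x[j]      # accumulate dE(i)/dW
        db[0] += delta                    # accumulate dE(i)/db
    for j in range(2):                    # step 3: averaged batch update
        W[0][j] -= eta * dW[0][j] / m
    b[0] -= eta * db[0] / m

outputs = [f(sum(W[0][j] * x[j] for j in range(2)) + b[0]) for x, _ in samples]
print([round(o, 2) for o in outputs])    # first output near 0, the rest near 1
```

Because all per-sample gradients are summed before a single averaged update, this is the batch variant described above rather than sample-by-sample (online) updating.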
