Deep Learning Foundations--Neural Networks--the BP (Backpropagation) Algorithm


BP algorithm:

  1. It is a supervised learning algorithm, often used to train multilayer perceptrons.

2. The activation function used by each artificial neuron (i.e., node) must be differentiable.

(Activation function: the functional relationship between the input and output of a single neuron is called the activation function.)

(If no activation function is used, each layer of the neural network performs only a linear transformation, and stacking multiple layers still yields nothing more than a linear transformation of the input. Because a linear model's expressive power is limited, the activation function is introduced to add a nonlinear factor.)

  

The following two figures show a neural network without an activation function and a neural network with an activation function:

[Figure: network without an activation function vs. network with an activation function]

The difference after adding a nonlinear activation function: instead of splitting the plane with a piecewise linear approximation of a smooth curve, the plane can be split with a genuinely smooth curve.
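To make the collapse concrete, here is a minimal sketch (NumPy; the shapes and names are illustrative, not from the original post) showing that two stacked linear layers are exactly one linear layer, while a ReLU between them breaks the equivalence:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))    # a batch of 4 inputs with 3 features each
W1 = rng.normal(size=(3, 5))   # first linear layer (biases omitted for brevity)
W2 = rng.normal(size=(5, 2))   # second linear layer

# Two stacked linear layers...
two_linear = (x @ W1) @ W2
# ...equal a single linear layer whose weight matrix is W1 @ W2.
one_linear = x @ (W1 @ W2)
print(np.allclose(two_linear, one_linear))   # True: no expressive power gained

# With a nonlinearity (ReLU) in between, the collapse no longer holds.
relu = lambda z: np.maximum(z, 0.0)
print(np.allclose(relu(x @ W1) @ W2, one_linear))   # False in general
```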

  

  3. This algorithm is particularly suitable for training feedforward neural networks.

Feedforward neural network: a kind of artificial neural network in which each neuron receives input from the previous layer and passes its output to the next layer, from the input layer through to the output layer. There is no feedback anywhere in the network, so it can be represented by a directed acyclic graph. By the number of layers, it can be divided into single-layer and multilayer feedforward neural networks. Common feedforward neural networks include the perceptron, the BP (back propagation) network, the RBF (radial basis function) network, etc.
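As a small illustration of this structure (a NumPy sketch; the layer sizes and function names are my own, not from the original post), here is a forward pass through a 3-4-2 multilayer feedforward network, where each layer consumes only the previous layer's output:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, weights, biases):
    """Propagate x layer by layer: input -> hidden -> output, no feedback."""
    a = x
    for W, b in zip(weights, biases):
        a = sigmoid(a @ W + b)   # each layer feeds only the next one
    return a

rng = np.random.default_rng(1)
weights = [rng.normal(size=(3, 4)), rng.normal(size=(4, 2))]  # 3-4-2 network
biases = [np.zeros(4), np.zeros(2)]
print(forward(rng.normal(size=(1, 3)), weights, biases))
```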

The commonly seen neural network is of this feedforward kind. During computation, the signal is passed forward from the input to the output.

So where does the feedback come from?

When the network is not yet well trained, the output will differ from what we expect. That deviation is propagated backward from the output, layer by layer; this is the feedback. The feedback is used to obtain partial derivatives, the partial derivatives are used for gradient descent, and gradient descent seeks the minimum of the cost function, so that the error between the expected and actual output becomes as small as possible. (With a differently defined cost function, feedback of this kind may not be needed.)
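For intuition about that last chain (partial derivative, then gradient descent, then the minimum of the cost), here is a toy example of my own, not from the original post: gradient descent on the one-parameter cost J(w) = (w - 3)^2:

```python
# Minimize J(w) = (w - 3)**2 by gradient descent.
w, lr = 0.0, 0.1           # initial weight and learning rate
for _ in range(50):
    grad = 2 * (w - 3)     # dJ/dw: the direction in which the cost grows
    w -= lr * grad         # step against the gradient to shrink the cost
print(w)                   # converges toward 3, the minimum of J
```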

4. It works mainly by iterating two phases (propagation and weight update) until the network's response to the input reaches the predetermined target range.

  Propagation:

The propagation phase in each iteration consists of two steps:

1. (Forward propagation phase) The training input is fed into the network to obtain the activation response.

2. (Backpropagation phase) The difference between the activation response and the target output corresponding to the training input is computed, yielding the response errors of the output layer and the hidden layers.

      

Weight update:

The weight on each synapse is updated as follows:

1. Multiply the input activation by the response error to obtain the gradient of the weight.

2. Multiply this gradient by a proportion (this ratio affects the speed and quality of training and is therefore called the 'training factor', i.e., the learning rate). The direction of the gradient indicates the direction in which the error grows, so the weight must be updated in the opposite direction: the scaled gradient is negated and added to the weight, thereby reducing the error contributed by that weight. A complete sketch of both phases follows below.
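Putting both phases together, here is a minimal one-hidden-layer training loop (a NumPy sketch under my own choices of sigmoid activation, squared-error cost, and layer sizes; it is not code from the original post):

```python
import numpy as np

rng = np.random.default_rng(42)
x = rng.normal(size=(1, 2))   # one training input
y = np.array([[1.0]])         # its target output
W1 = rng.normal(size=(2, 3))  # input -> hidden weights
W2 = rng.normal(size=(3, 1))  # hidden -> output weights
lr = 0.5                      # the "training factor" (learning rate)

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

for _ in range(1000):
    # Forward propagation phase: obtain the activation response.
    h = sigmoid(x @ W1)       # hidden-layer activation
    out = sigmoid(h @ W2)     # output-layer activation

    # Backpropagation phase: difference between response and target output,
    # propagated backward to get each layer's response error.
    err_out = (out - y) * out * (1 - out)       # output-layer error
    err_hid = (err_out @ W2.T) * h * (1 - h)    # hidden-layer error

    # Weight update: input activation times response error gives the gradient;
    # scale it by the learning rate and step in the opposite direction.
    W2 -= lr * (h.T @ err_out)
    W1 -= lr * (x.T @ err_hid)

print(sigmoid(sigmoid(x @ W1) @ W2))   # output moves toward the target 1.0
```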

In summary:

  The idea of the backpropagation algorithm is as follows. Given a training sample, we first perform a "forward pass" to compute every activation value in the network, including the output value. Then, for each node i in layer l, we compute its "residual" δ_i^(l), which measures how much influence that node has on the error of the final output. For an output node, we can directly compute the difference between the activation value the network produces and the actual target value, and define that gap as δ_i^(n_l) (where layer n_l denotes the output layer). For a hidden unit, we compute its residual as a weighted average of the residuals of the nodes in layer l+1 that take a_i^(l) as input.
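Written out (a standard formulation consistent with the paragraph above; the symbols z^(l) for the pre-activation input of layer l, a^(l) for its activation, f for the activation function, and W^(l) for its weights are my notation, not the original post's): for the output layer n_l,

$$\delta^{(n_l)} = -\left(y - a^{(n_l)}\right) \odot f'\!\left(z^{(n_l)}\right),$$

for a hidden layer l, the residual is the weighted combination of the next layer's residuals,

$$\delta^{(l)} = \left( (W^{(l)})^{T} \delta^{(l+1)} \right) \odot f'\!\left(z^{(l)}\right),$$

and the weight gradient used for the update is

$$\nabla_{W^{(l)}} J = \delta^{(l+1)} \left(a^{(l)}\right)^{T}.$$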

Important reference: http://www.cnblogs.com/Crysaty/p/6126321.html
