A plain-language explanation of the BP neural network

Source: Internet
Author: User
Tags: IBM developerWorks

The BP (backpropagation) neural network is one of the most widely used neural networks. To me, a neural network is essentially a high-end interpolation technique. Good implementation tutorials include:

    1. MATLAB toolbox version (easy to use, but not helpful for understanding the principle): Talk about ANN (2): BP neural network;
    2. MATLAB from-scratch version (implemented from first principles, without the Neural Network Toolbox): Easy-to-learn machine learning algorithms: the BP neural network;
    3. C++ from-scratch version (implemented from first principles): BP neural network principle and C++ practice
Of these three articles, the 2nd and 3rd are suited to understanding the principle. The detailed mathematical derivations are already there, so I will not repeat them. Below are some of the mistakes I encountered in the implementation process, and a summary:

The overall process of a neural network, briefly:

Simply put, we have a set of known input vectors (each possibly multidimensional). One vector is fed in at a time, and each dimension of the vector becomes one node in the input layer.

Each dimension's value then sends a portion of itself to the hidden layer (weighted by the connection weights and passed through a transfer function of your choosing). When writing the program, an obvious question arises: what should the initial weights be? In fact, random initialization in (-1, 1) is fine; the weights will be gradually corrected later.

In this way each hidden-layer node also obtains its own value, which is passed on to the output nodes in the same fashion. Each output node's value corresponds to one dimension of the output vector.

At this point we have completed one forward pass; the direction is: input layer => output layer.
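The forward pass described above can be sketched in a few lines of NumPy. This is an illustrative sketch, not the code from the linked tutorials; the layer sizes and the sigmoid transfer function are my own assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

n_in, n_hidden, n_out = 3, 4, 2  # illustrative layer sizes

# First-time weights: uniform random in (-1, 1), refined later by training.
W1 = rng.uniform(-1, 1, size=(n_in, n_hidden))   # input -> hidden
W2 = rng.uniform(-1, 1, size=(n_hidden, n_out))  # hidden -> output

def sigmoid(z):
    # One common transfer-function choice; others are possible.
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.5, -0.2, 0.8])   # one input sample; one node per dimension

h = sigmoid(x @ W1)              # hidden-layer node values
y = sigmoid(h @ W2)              # output-layer node values; one per output dimension
```

Each `@` here is the "weighted aggregation" of the previous layer's values, and `sigmoid` squashes the result into (0, 1).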


Anyone familiar with neural networks knows that using them involves two phases: training and testing. Consider training first. After each forward pass there is a difference between the output layer's values and the true values; record this difference as the error. We then pass this error back to the hidden-layer nodes according to the formula (see the tutorials linked above; different transfer functions give different formulas).

What is this error for? Remember the random weights between the layers? The error is used to correct those weights; the detailed derivation is in the linked tutorials. The weights between the input layer and the hidden layer are corrected in the same way, and our viewpoint arrives back at the input layer.

At this point we have completed one backward pass; the direction is: input layer <= output layer.


That finishes the first sample: one forward pass plus one backward pass. What next? Process the second sample, and add its error to the error accumulated so far.

Then the third sample, adding its error; and so on up to the nth sample, adding its error. Once all the samples have been processed, check whether the accumulated error is below an acceptable threshold (this value is set freely according to the actual situation). If not, start the next round, namely:

Reset the error to 0; process the first sample and add its error; the second sample, add its error; and so on up to the nth sample, adding its error. Then check again whether the accumulated error is below the acceptable threshold. If not, start yet another round.

Once the accumulated error meets the requirement, training is done. Now comes testing: feed in a test sample, placing each dimension's value into the corresponding input node, and run one forward pass (a test sample has only inputs, so there is no known output to check against). The resulting output values are our predictions.
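The round-by-round loop above can be sketched as follows. Here I use the XOR mapping as a toy training set and add bias terms so the network can actually fit it; the layer sizes, learning rate, and threshold are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# toy training set: the XOR mapping, a classic small BP exercise
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [0]], dtype=float)

W1 = rng.uniform(-1, 1, size=(2, 4)); b1 = rng.uniform(-1, 1, size=4)
W2 = rng.uniform(-1, 1, size=(4, 1)); b2 = rng.uniform(-1, 1, size=1)
lr, threshold = 0.5, 0.01

history = []
for epoch in range(20000):
    total_error = 0.0                           # reset the error each round
    for x, t in zip(X, T):
        h = sigmoid(x @ W1 + b1)                # forward pass
        y = sigmoid(h @ W2 + b2)
        total_error += 0.5 * np.sum((y - t) ** 2)   # accumulate over samples
        d_out = (y - t) * y * (1 - y)           # backward pass
        d_hid = (d_out @ W2.T) * h * (1 - h)
        W2 -= lr * np.outer(h, d_out); b2 -= lr * d_out
        W1 -= lr * np.outer(x, d_hid); b1 -= lr * d_hid
    history.append(total_error)
    if total_error < threshold:                 # acceptable? stop training
        break
```

After training, a test sample is just one more forward pass through the learned `W1, b1, W2, b2`.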


The magic of neural networks lies in several places:

    1. Choice of transfer function: the weighted aggregation of the previous layer's node values is the input to the transfer function, and the function's output is the current node's value. Should this transfer function be linear? Quadratic? A function of one variable, or of two? Different choices have different properties and suit different scenarios.
    2. Number of hidden-layer nodes: largely chosen by trial and error. Too many and training is slow; too few and the network cannot carry enough information.
    3. Number of hidden layers: although introductions usually show only one hidden layer, multiple layers are possible. Multiple layers have their own advantages; deep learning, for instance, generally uses seven or more.
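To make point 1 concrete, here are a few common transfer-function choices side by side. This is a sketch of typical options, not an exhaustive list.

```python
import numpy as np

def linear(z):
    # identity; often used on output layers for regression-style targets
    return z

def sigmoid(z):
    # squashes to (0, 1); the classic choice in BP tutorials
    return 1.0 / (1.0 + np.exp(-z))

def tanh(z):
    # squashes to (-1, 1); a zero-centered alternative
    return np.tanh(z)

z = np.array([-2.0, 0.0, 2.0])
s, th, li = sigmoid(z), tanh(z), linear(z)
```

Swapping one of these in changes both the node values produced in the forward pass and the derivative used in the backward pass.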
Since it is so easy to understand, why do errors still crop up in the implementation? Let me list a few mistakes I ran into:

    1. Input nodes: is each dimension of a vector one node, or each sample vector one node? This is especially easy to misunderstand when samples are one-dimensional. It is in fact wrong to think that one sample corresponds to one input node; each dimension of a sample corresponds to one input node.
    2. It is wrong to treat each sample's error in isolation, adjusting to the optimum one sample at a time. The error should be viewed from the standpoint of the entire training set: accumulate the error over all training samples, and then cycle through all the samples multiple times.
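Mistake 1 can be stated as a shape check. The numbers below are purely illustrative: with 5 samples of 3 dimensions each, the input layer has 3 nodes (one per dimension), and the 5 samples mean 5 separate forward passes, not 5 input nodes.

```python
import numpy as np

# 5 samples, each with 3 dimensions (illustrative data)
samples = np.random.default_rng(3).uniform(-1, 1, size=(5, 3))

n_input_nodes = samples.shape[1]     # 3: one input node per dimension
n_forward_passes = samples.shape[0]  # 5: one forward pass per sample
```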

Finally, I recommend three relevant articles on IBM developerWorks that explain the principles very clearly:

    1. Introduction to neural networks: pattern learning with the backpropagation algorithm
    2. AI Java tank robot series: neural networks, part 1
    3. AI Java tank robot series: neural networks, part 2

