A plain-language explanation of the BP neural network

Source: Internet
Author: User
Tags: IBM developerWorks

The BP (backpropagation) neural network is one of the most widely used neural networks. To me, a neural network is essentially a high-end interpolation technique. Good implementation tutorials include:

    1. MATLAB toolbox version (easy to use, but not helpful for understanding the principle): Talk about ANN (2): BP neural network;
    2. MATLAB from-scratch version (implemented from first principles, without the Neural Network Toolbox): Easy-to-learn machine learning algorithms: the BP neural network;
    3. C++ from-scratch version (implemented from first principles): BP neural network principle and C++ practice
Of these three articles, the 2nd and 3rd are suited to understanding the principle. The detailed mathematical derivations are already there, so I will not repeat them. Below are some of the mistakes I encountered in the implementation process, and a summary:

The overall process of a neural network, briefly:

Simply put, we have a set of known input vectors (each possibly multidimensional). One vector is fed in at a time, and each dimension of the vector becomes one node in the input layer.

Each dimension's value then sends a portion of itself to the hidden layer (weighted by the connection weights and passed through a transfer function of your choosing). When writing the program, an obvious question arises: what should the initial weights be? In fact, random initialization in (-1, 1) is fine; the weights will be gradually corrected later.

In this way each hidden-layer node also obtains its own value, which is passed on to the output nodes in the same fashion. Each output node's value corresponds to one dimension of the output vector.

At this point we have completed one forward pass; the direction is: input layer => output layer.
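The forward pass described above can be sketched in a few lines of NumPy. This is an illustrative sketch, not the code from the linked tutorials; the layer sizes and the sigmoid transfer function are my own assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

n_in, n_hidden, n_out = 3, 4, 2  # illustrative layer sizes

# First-time weights: uniform random in (-1, 1), refined later by training.
W1 = rng.uniform(-1, 1, size=(n_in, n_hidden))   # input -> hidden
W2 = rng.uniform(-1, 1, size=(n_hidden, n_out))  # hidden -> output

def sigmoid(z):
    # One common transfer-function choice; others are possible.
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.5, -0.2, 0.8])   # one input sample; one node per dimension

h = sigmoid(x @ W1)              # hidden-layer node values
y = sigmoid(h @ W2)              # output-layer node values; one per output dimension
```

Each `@` here is the "weighted aggregation" of the previous layer's values, and `sigmoid` squashes the result into (0, 1).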


Anyone familiar with neural networks knows that using them involves two phases: training and testing. Consider training first. After each forward pass there is a difference between the output layer's values and the true values; record this difference as the error. We then pass this error back to the hidden-layer nodes according to the formula (see the tutorials linked above; different transfer functions give different formulas).

What is this error for? Remember the random weights between the layers? The error is used to correct those weights; the detailed derivation is in the linked tutorials. The weights between the input layer and the hidden layer are corrected in the same way, and our viewpoint arrives back at the input layer.

At this point we have completed one backward pass; the direction is: input layer <= output layer.


That finishes the first sample: one forward pass plus one backward pass. What next? Process the second sample, and add its error to the error accumulated so far.

Then the third sample, adding its error; and so on up to the nth sample, adding its error. Once all the samples have been processed, check whether the accumulated error is below an acceptable threshold (this value is set freely according to the actual situation). If not, start the next round, namely:

Reset the error to 0; process the first sample and add its error; the second sample, add its error; and so on up to the nth sample, adding its error. Then check again whether the accumulated error is below the acceptable threshold. If not, start yet another round.

Once the accumulated error meets the requirement, training is done. Now comes testing: feed in a test sample, placing each dimension's value into the corresponding input node, and run one forward pass (a test sample has only inputs, so there is no known output to check against). The resulting output values are our predictions.
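The round-by-round loop above can be sketched as follows. Here I use the XOR mapping as a toy training set and add bias terms so the network can actually fit it; the layer sizes, learning rate, and threshold are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# toy training set: the XOR mapping, a classic small BP exercise
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [0]], dtype=float)

W1 = rng.uniform(-1, 1, size=(2, 4)); b1 = rng.uniform(-1, 1, size=4)
W2 = rng.uniform(-1, 1, size=(4, 1)); b2 = rng.uniform(-1, 1, size=1)
lr, threshold = 0.5, 0.01

history = []
for epoch in range(20000):
    total_error = 0.0                           # reset the error each round
    for x, t in zip(X, T):
        h = sigmoid(x @ W1 + b1)                # forward pass
        y = sigmoid(h @ W2 + b2)
        total_error += 0.5 * np.sum((y - t) ** 2)   # accumulate over samples
        d_out = (y - t) * y * (1 - y)           # backward pass
        d_hid = (d_out @ W2.T) * h * (1 - h)
        W2 -= lr * np.outer(h, d_out); b2 -= lr * d_out
        W1 -= lr * np.outer(x, d_hid); b1 -= lr * d_hid
    history.append(total_error)
    if total_error < threshold:                 # acceptable? stop training
        break
```

After training, a test sample is just one more forward pass through the learned `W1, b1, W2, b2`.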


The magic of neural networks lies in several places:

    1. Choice of transfer function: the weighted aggregation of the previous layer's node values is the input to the transfer function, and the function's output is the current node's value. Should this transfer function be linear? Quadratic? A function of one variable, or of two? Different choices have different properties and suit different scenarios.
    2. Number of hidden-layer nodes: largely chosen by trial and error. Too many and training is slow; too few and the network cannot carry enough information.
    3. Number of hidden layers: although introductions usually show only one hidden layer, multiple layers are possible. Multiple layers have their own advantages; deep learning, for instance, generally uses seven or more.
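To make point 1 concrete, here are a few common transfer-function choices side by side. This is a sketch of typical options, not an exhaustive list.

```python
import numpy as np

def linear(z):
    # identity; often used on output layers for regression-style targets
    return z

def sigmoid(z):
    # squashes to (0, 1); the classic choice in BP tutorials
    return 1.0 / (1.0 + np.exp(-z))

def tanh(z):
    # squashes to (-1, 1); a zero-centered alternative
    return np.tanh(z)

z = np.array([-2.0, 0.0, 2.0])
s, th, li = sigmoid(z), tanh(z), linear(z)
```

Swapping one of these in changes both the node values produced in the forward pass and the derivative used in the backward pass.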
Since it is so easy to understand, why do errors still crop up in the implementation? Let me list a few mistakes I ran into:

    1. Input nodes: is each dimension of a vector one node, or each sample vector one node? This is especially easy to misunderstand when samples are one-dimensional. It is in fact wrong to think that one sample corresponds to one input node; each dimension of a sample corresponds to one input node.
    2. It is wrong to treat each sample's error in isolation, adjusting to the optimum one sample at a time. The error should be viewed from the standpoint of the entire training set: accumulate the error over all training samples, and then cycle through all the samples multiple times.
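Mistake 1 can be stated as a shape check. The numbers below are purely illustrative: with 5 samples of 3 dimensions each, the input layer has 3 nodes (one per dimension), and the 5 samples mean 5 separate forward passes, not 5 input nodes.

```python
import numpy as np

# 5 samples, each with 3 dimensions (illustrative data)
samples = np.random.default_rng(3).uniform(-1, 1, size=(5, 3))

n_input_nodes = samples.shape[1]     # 3: one input node per dimension
n_forward_passes = samples.shape[0]  # 5: one forward pass per sample
```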

Finally, I recommend three relevant articles on IBM developerWorks that explain the principles very clearly:

    1. Introduction to neural networks: pattern learning with the backpropagation algorithm
    2. AI Java tank robot series: neural networks, part 1
    3. AI Java tank robot series: neural networks, part 2

