Realization of a BP Neural Network from Scratch in C++


This article is reproduced from http://blog.csdn.net/ironyoung/article/details/49455343



BP (backpropagation) neural network
Understood simply, a neural network is a high-powered fitting technique. There are plenty of tutorials, but in my view Stanford's materials are enough, and good Chinese translations of them are available:

  • Introduction to artificial neural networks, a translation of the Stanford tutorial: "Neural Networks" (UFLDL)
  • The BP principle, a translation of the Stanford tutorial: "Backpropagation Algorithm" (UFLDL)
  • Online open-course notes: the "Neural Networks" topic of Andrew Ng's Machine Learning course

These three articles already contain the detailed mathematical derivations, so I will not repeat them here. Below are some notes on the main points, and on the mistakes I ran into during implementation.

The process of a neural network
Put simply: we have a set of known input vectors (each possibly multidimensional). Each time one vector is read, every feature dimension becomes one input node of the input layer in the figure above. The value of each dimension is then handed on to the hidden layer, each hidden node receiving a share according to a weight. What should the weights be the first time you write the program? Randomizing them in (-1, 1) (not in 0~1) works fine; they will be corrected step by step later. Each hidden-layer node thus obtains its own value, which in the same way is multiplied by weights and passed through an activation function (chosen according to the task: classification problems generally use the sigmoid function, numerical fitting generally uses the purelin function, and there are more specialized choices), and finally handed to the output nodes. The value of each output node corresponds to one feature dimension of the output vector.
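Written out in symbols (my notation, not the original article's): if $x_i$ is the value of input node $i$, $w_{ij}$ the weight from input node $i$ to hidden node $j$, and $b_j$ the hidden node's bias, then with the sigmoid activation a hidden node's value is

$$ h_j = \sigma\Big(\sum_i w_{ij}\, x_i + b_j\Big), \qquad \sigma(z) = \frac{1}{1 + e^{-z}} $$

and the output nodes are computed from the hidden values in exactly the same way.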

At this point we have completed one forward pass, in the direction: input layer ⇒ output layer. Anyone familiar with neural networks knows that using one involves two phases, "training" and "testing". Consider training first. After each forward pass there is a difference between the output-layer values and the true values; that difference is the delta (δ). We pass this error back to the hidden-layer nodes as a parameter, according to the formulas. What is the error used for? Remember the random weights between the layers? They are exactly what the error corrects. Likewise, to correct the weights between the input layer and the hidden layer, our viewpoint moves back until it reaches the input layer.

At this point we have completed one backward pass, in the direction: input layer ⇐ output layer.
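For reference, the quantities being passed back (following the UFLDL backpropagation tutorial linked above; the notation is mine): with the squared error $E = \frac{1}{2}\sum_k (t_k - y_k)^2$ between outputs $y_k$ and targets $t_k$, and sigmoid activations,

$$ \delta_k^{out} = (y_k - t_k)\, y_k (1 - y_k), \qquad \delta_j^{hid} = \Big(\sum_k w_{jk}\, \delta_k^{out}\Big)\, h_j (1 - h_j) $$

and each weight and bias is then corrected with the learning rate $\alpha$:

$$ w_{jk} \leftarrow w_{jk} - \alpha\, \delta_k^{out}\, h_j, \qquad b_k \leftarrow b_k - \alpha\, \delta_k^{out} $$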

The first sample is now finished: one forward pass plus one backward pass.
What happens next? The second sample goes through, and its error is added to the previous sample's error; then the third sample, adding its error; ... then the n-th sample, adding its error. Once all samples have been passed through, check whether the error sum is below the threshold (set freely according to the actual situation). If it is not, run the whole sample set again: clear the error to 0, pass the first sample and add its error, the second sample and add its error, ..., the n-th sample and add its error, then compare the error sum with the threshold again... When the error sum finally drops below the threshold, training is done.
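In symbols, one pass over all $N$ samples accumulates the per-sample errors, and training stops once

$$ E = \sum_{n=1}^{N} E_n < \varepsilon $$

where $E_n$ is the output error of sample $n$ (for example the squared error above) and $\varepsilon$ is the freely chosen threshold.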

At this point, feed in a test sample: the value of each feature dimension goes into the corresponding input-layer node, one forward pass is run, and the resulting output values are our prediction.

Points that are easy to get wrong
Since all this is so easy to understand, why do errors still creep in during implementation? Here are a few that I encountered:

  • Input nodes: is there one node per feature dimension of each sample, or one node per sample? Thinking that each sample corresponds to one input node is wrong; the answer is one input node per feature dimension.
  • Bias is essential. The bias is a numerical offset that is not affected by the neurons of the previous layer: after a neuron sums up the information from the previous layer, the offset is added before the result goes into the activation function. This is explained in the tutorials linked at the beginning. For example, when learning the XOR problem with both input nodes equal to 0, without a bias every hidden-layer node ends up with the same value, which leads to a symmetry-failure problem.
  • How many hidden layers the network has, how many neurons each hidden layer has, and the learning rate all need to be tuned by experiment. There is no definitive recipe; just make sure that from cycle to cycle the sample error sum trends downward.
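As a concrete illustration of the XOR case just mentioned, here is a hedged sketch (not part of the original article) of how the BpNet class declared in the header below might be driven. The threshold value and the reuse of the same group for prediction are arbitrary choices for the example, and C++11 brace initialization is assumed.

#include "BPnet.h"

int main()
{
    // the four XOR samples: two input features per sample, one correct output each
    vector<sample> xorGroup(4);
    xorGroup[0].in = { 0, 0 };  xorGroup[0].out = { 0 };
    xorGroup[1].in = { 0, 1 };  xorGroup[1].out = { 1 };
    xorGroup[2].in = { 1, 0 };  xorGroup[2].out = { 1 };
    xorGroup[3].in = { 1, 1 };  xorGroup[3].out = { 0 };

    BpNet net;                        // 2 input nodes, 1 hidden layer of 4 nodes, 1 output node
    net.training(xorGroup, 0.0001);   // cycle through the samples until the error sum drops below the threshold
    net.predict(xorGroup);            // one forward pass per sample to obtain the predicted outputs
    return 0;
}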

C++ implementation code. First, the header file (BPnet.h):

#pragma once
#include <iostream>
#include <cmath>
#include <vector>
#include <stdlib.h>
#include <time.h>
using namespace std;

#define innode       2      // number of input nodes
#define hidenode     4      // number of nodes per hidden layer
#define hidelayer    1      // number of hidden layers
#define outnode      1      // number of output nodes
#define learningRate 0.9    // learning rate, alpha

// --- random number generator in (-1, 1) ---
inline double get_11Random()
{
    return ((2.0 * (double)rand() / RAND_MAX) - 1);
}

// --- sigmoid function ---
inline double sigmoid(double x)
{
    double ans = 1 / (1 + exp(-x));
    return ans;
}

// --- Input layer node. Contains:
// 1. value:     fixed input value;
// 2. weight:    one weight towards each node of the first hidden layer;
// 3. wDeltaSum: accumulated weight deltas towards each node of the first hidden layer
typedef struct inputNode
{
    double value;
    vector<double> weight, wDeltaSum;
} inputNode;

// --- Output layer node. Contains:
// 1. value:     current node value;
// 2. delta:     delta between the value and the correct output value;
// 3. rightout:  correct output value;
// 4. bias:      offset;
// 5. bDeltaSum: accumulated bias deltas, one per node
typedef struct outputNode
{
    double value, delta, rightout, bias, bDeltaSum;
} outputNode;

// --- Hidden layer node. Contains:
// 1. value:     current node value;
// 2. delta:     delta derived by backpropagation;
// 3. bias:      offset;
// 4. bDeltaSum: accumulated bias deltas, one per node;
// 5. weight:    one weight towards each node of the next layer (hidden/output);
// 6. wDeltaSum: accumulated weight deltas towards each node of the next layer (hidden/output)
typedef struct hiddenNode
{
    double value, delta, bias, bDeltaSum;
    vector<double> weight, wDeltaSum;
} hiddenNode;

// --- a single sample ---
typedef struct sample
{
    vector<double> in, out;
} sample;

// --- BP neural network ---
class BpNet
{
public:
    BpNet();                                                        // constructor
    void forwardPropagationEpoc();                                  // forward propagation of a single sample
    void backPropagationEpoc();                                     // backward propagation of a single sample

    void training(vector<sample> sampleGroup, double threshold);    // update weight, bias
    void predict(vector<sample>& testGroup);                        // neural network prediction

    void setInput(vector<double> sampleIn);                         // set the learning sample input
    void setOutput(vector<double> sampleOut);                       // set the learning sample output

public:
    double error;
    inputNode*  inputLayer[innode];                   // input layer (only one layer)
    outputNode* outputLayer[outnode];                 // output layer (only one layer)
    hiddenNode* hiddenLayer[hidelayer][hidenode];     // hidden layers (may be more than one)
};
Next is the source file of the BP neural network:
#include "BPnet.h" using namespace std;        Bpnet::bpnet () {srand (unsigned) time (NULL));                      Random number seed error = 100.F;
        Error initial value, maximum can//Initialize input layer for (int i = 0; i < Innode i++) {Inputlayer[i] = new Inputnode ();
            for (int j = 0; J < Hidenode; J + +) {Inputlayer[i]->weight.push_back (Get_11random ());
        Inputlayer[i]->wdeltasum.push_back (0.F);
            }//Initialize the hidden layer for (int i = 0; i < Hidelayer; i++) {if (i = = hidelayer-1) {
                for (int j = 0; J < Hidenode J + +) {Hiddenlayer[i][j] = new Hiddennode ();
                Hiddenlayer[i][j]->bias = Get_11random (); for (int k = 0; k < Outnode; k++) {Hiddenlayer[i][j]->weight.push_back (get_11r
                    Andom ());
                Hiddenlayer[i][j]->wdeltasum.push_back (0.F);
  }
            }      else {for (int j = 0; J < Hidenode; J + +) {Hiddenlayer I
                [j] = new Hiddennode ();
                Hiddenlayer[i][j]->bias = Get_11random ();
            for (int k = 0; k < Hidenode; k++) {Hiddenlayer[i][j]->weight.push_back (Get_11random ());} 
        }}///Initialize output layer for (int i = 0; i < Outnode i++) {Outputlayer[i] = new Outputnode ();
    Outputlayer[i]->bias = Get_11random (); } void Bpnet::forwardpropagationepoc () {//forward propagation on hidden layer for (int i = 0; i < Hidelaye R i++) {if (i = = 0) {for (int j = 0; J < Hidenode + +) {D
                ouble sum = 0.f; for (int k = 0; k < Innode; k++) {sum + + Inputlayer[k]->value * inputlayer[k]-
                >weight[j];
   Sum + + hiddenlayer[i][j]->bias;             Hiddenlayer[i][j]->value = sigmoid (sum); } else {for (int j = 0; J < Hidenode; J + +) {Double
                sum = 0.F; for (int k = 0; k < Hidenode; k++) {sum + + Hiddenlayer[i-1][k]->value * Hidden
                layer[i-1][k]->weight[j];
                Sum + + hiddenlayer[i][j]->bias;
            Hiddenlayer[i][j]->value = sigmoid (sum);  '}}//forward propagation on output layer for (int i =
