C++ Implementation of a BP Neural Network

0 Preface

Neural networks always seemed rather mysterious to me. Having recently studied them, and the BP neural network in particular, in some depth, I have summarized my experience below in the hope that it helps those who come later.
Neural networks are widely used in machine learning, in fields such as function approximation, pattern recognition, classification, data compression, and data mining. "Neural network" is itself a fairly broad concept; by network structure, the main categories include: multilayer feedforward neural networks, radial basis function (RBF) networks, Adaptive Resonance Theory (ART) networks, self-organizing map (SOM) networks, cascade-correlation networks, Elman networks, Boltzmann machines, restricted Boltzmann machines, and so on.
The figure below shows the network structures that have been most popular in recent years:

Today we introduce the BP neural network; to be precise, the BP algorithm is an algorithm for training multilayer feedforward neural networks, and it is very widely used.

1 Basic Concepts

1.1 Neuron Model

The neural networks discussed in machine learning originate from biological neural networks; the term here in fact refers to the intersection of "neural networks" and "machine learning". The simplest model is the M-P neuron model, shown in the following diagram:


A neuron receives input signals from n other neurons in the form of a weighted sum, compares that sum with the neuron's threshold, and then processes the result with an activation function to produce the neuron's output.
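Written as a formula (the standard statement of the M-P model; x_i are the inputs, w_i the connection weights, θ the threshold, and f the activation function discussed next):

    y = f\left( \sum_{i=1}^{n} w_i x_i - \theta \right)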

1.2 Common activation functions

The activation function processes the weighted sum of all the signals transmitted by the other neurons, producing the neuron's output.

The following diagram shows commonly used activation functions. The simplest is the step function; it is simple and idealized, but its properties are the worst (discontinuous and non-smooth), so in practice the sigmoid function is used most often.
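As a small illustration, a sketch of the two functions in C++ (this snippet is independent of the project code later; the parameterized form with constants A and B matches the Sigmoid() used in Bp.cpp below):

    #include <cmath>

    // step function: outputs 0 or 1; discontinuous at 0 and not differentiable there
    double step(double x) { return x >= 0.0 ? 1.0 : 0.0; }

    // standard sigmoid: smooth and differentiable everywhere, output in (0, 1)
    double sigmoid(double x) { return 1.0 / (1.0 + std::exp(-x)); }

    // parameterized variant A / (1 + e^(-x/B)), the form used in Bp.cpp below
    double sigmoidAB(double x, double A, double B) { return A / (1.0 + std::exp(-x / B)); }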

1.3 Feedforward Neural Networks

The precise definition of a multilayer feedforward neural network: the neurons in each layer are fully connected to the neurons in the next layer, there are no connections between neurons within the same layer, and there are no cross-layer connections. A classical feedforward neural network is shown in the following figure.


(As an aside: when a neural network has more and more hidden layers, reaching 8 or 9, it becomes a deep learning model. I have seen a paper whose network structure had as many as 128 layers. More on this point later.)

2 Standard BP Algorithm

2.0 About Gradients

First, be aware that the gradient direction of a multivariate function is the direction in which the function value increases most steeply. Specialized to a univariate function, the gradient direction lies along the tangent of the curve, taking the upward direction of the tangent as the gradient direction. For a bivariate or general multivariate function, the gradient is the vector of partial derivatives of f with respect to each variable; the direction of that vector is the gradient direction, and its magnitude is the magnitude of the gradient.
Gradient descent (the steepest descent method) is used to find the minimum (or maximum) of an expression and belongs to unconstrained optimization. Its basic idea is quite simple: suppose we want the minimum of a function f. First select an initial point, then generate the next point along the gradient line; here that is the opposite direction of the gradient (because we want a minimum; for a maximum we would move in the same direction as the gradient), as shown in the following illustration:
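In symbols, the standard update rule of gradient descent (η is the learning rate, the same role it plays in the BP weight updates later):

    x_{t+1} = x_t - \eta \nabla f(x_t)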

2.1 Neural Network learning process

Under the stimulation of externally supplied training samples, the neural network continuously changes its connection weights so that the network's output moves ever closer to the desired output. A few key points:
(1) The learning process can be summarized as: signals propagate forward, and errors propagate backward (detailed in section 2.2 below).
(2) The nature of learning: dynamic adjustment of the connection weights and of the thresholds of all functional neurons.
Note: the learning of weights and thresholds can be unified into the learning of weights alone, by treating each threshold as the weight of a "dummy node" with a fixed input of -1.0, as shown in the following figure:

(3) The weight adjustment rule: the rule according to which the connection weights between neurons are changed during learning. (In the BP algorithm, the weight adjustment follows a gradient descent strategy, described in detail below.)
The learning process of the BP network is shown in the following illustration:

(Found through a Baidu Wenku search; it serves to illustrate the point.)

2.2 Weight Adjustment Strategy

First, the learning of this neural network belongs to supervised learning. For each input sample, forward propagation is performed (input layer → hidden layer → output layer); after the output is computed, if the error does not meet expectations, the error is propagated back (output layer → hidden layer → input layer), and all weights and thresholds are adjusted by the gradient descent strategy.

Note: the E_k above is the error computed on the k-th sample; as you can see, each iteration of the standard BP algorithm updates with respect to a single sample only.
The adjustment formulas for the weights and thresholds are as follows:

The detailed derivation of these formulas is shown in the following diagram:
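Since the formula images are not reproduced in this copy, here is a standard textbook form of these updates for one hidden layer with sigmoid activations (the notation is an assumption of mine, not the author's: ŷ_j is the j-th output with target y_j, b_h the h-th hidden output, x_i the i-th input, w_hj and θ_j the hidden-to-output weights and thresholds, v_ih and γ_h the input-to-hidden ones, and η the learning rate):

    g_j = \hat{y}_j (1 - \hat{y}_j)(y_j - \hat{y}_j)
    \Delta w_{hj} = \eta g_j b_h,      \Delta \theta_j = -\eta g_j
    e_h = b_h (1 - b_h) \sum_j w_{hj} g_j
    \Delta v_{ih} = \eta e_h x_i,      \Delta \gamma_h = -\eta e_h

Note that the project code below uses a slightly different convention (thresholds are added rather than subtracted) and the parameterized sigmoid A/(1 + e^{-x/B}).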

2.3 Summary of the BP Neural Network

(1) BP neural networks are generally used for classification or approximation problems.
For classification, the activation function is usually the sigmoid function or a hard-limit function; for function approximation, the output layer nodes use a linear function.
(2) A BP neural network can be trained with incremental learning or batch learning.
- Incremental learning requires the input patterns to be sufficiently random and is more sensitive to noise in the input patterns; that is, it trains poorly on drastically changing inputs, but it is suitable for online processing.
- Batch learning has no input-ordering problem and is stable, but it is only suitable for offline processing.
(3) How to determine the number of hidden layers and the number of nodes per hidden layer.
The number of hidden layer nodes cannot be fixed in advance, so how should it be set appropriately? (It does affect the performance of the neural network.)
There is an empirical formula for determining the number of hidden layer nodes (where h is the number of hidden layer nodes, m the number of input layer nodes, n the number of output layer nodes, and a an adjustment constant):
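The formula image itself is missing from this copy; a commonly cited form consistent with the variables above (and with GetNums() in the code below, which effectively takes a = 5) is:

    h = \sqrt{m + n} + a,    a \in [1, 10]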

2.4 Defects of the Standard BP Algorithm

(1) It easily falls into a local minimum and may fail to reach the global optimum.
(This follows from using the gradient descent method.) If there is only one local minimum, it is also the global minimum; with several local minima, the one found is not necessarily global. This places demands on the initial weights and thresholds: their randomness must be good enough, which can be achieved through multiple random initializations.
(2) Many training iterations make learning inefficient and convergence slow.
Each update targets only a single sample, so the updates for different samples may "cancel" each other out.
(3) The overfitting problem.
With continued training the training error becomes very low, but the test error may rise (poor generalization performance).
Mitigation strategies:
1. "Early stopping": the samples are divided into a training set and a validation set; the training set is used to compute gradients and update the weights and thresholds, while the validation set is used to estimate the error. When the training-set error decreases but the validation-set error increases, training stops, and the weights and thresholds with the minimum validation-set error are returned.
2. "Regularization": a term describing the complexity of the network is added to the error objective; the parameter λ in it is commonly determined by cross-validation.

2.5 Improvements to the BP Algorithm

(1) The accumulated BP algorithm.
Objective: minimize the accumulated error over the whole training set, rather than targeting one specific sample.
(The update strategy is adjusted accordingly.)

(2) Improving the BP algorithm with the momentum method.
(The standard BP learning process oscillates easily and converges slowly.)
A momentum term is added to accelerate the convergence of the algorithm, giving the following formula:

α is the momentum coefficient, usually 0 < α < 0.9.
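The formula image is missing here; a standard form of the momentum update (with Δw(t) the weight change at step t and ∂E/∂w the current gradient) is:

    \Delta w(t) = -\eta \frac{\partial E}{\partial w} + \alpha \Delta w(t-1)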

(3) Adaptive adjustment of learning rate η
The basic guiding idea of the adjustment: while learning is converging, increase η to shorten the learning time; once η is too large to converge (i.e., the process oscillates), reduce η promptly until convergence resumes. A sketch is given below.
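As an illustration only, a minimal sketch of such a heuristic (this helper is not part of the project code below, and the factors 1.05 and 0.7 are arbitrary example values):

    // Grow eta while the error keeps falling; shrink it sharply once the
    // error rises, which indicates oscillation.
    double adaptEta(double eta, double prevError, double currError)
    {
        if (currError < prevError)
            return eta * 1.05;   // still converging: speed up a little
        else
            return eta * 0.7;    // oscillating: back off promptly
    }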

3 Construction and C++ Implementation

Experimental platform: VS2013.
Files included in the project:

The project workflow is shown in the following illustration:

(1) Bp.h

#ifndef _BP_H_
#define _BP_H_

#include <vector>

// parameter settings
#define LAYER   3        // three-layer neural network
#define NUM     10       // maximum number of nodes per layer

#define A       30.0
#define B       10.0     // A and B are parameters of the sigmoid function
#define ITERS   1000     // maximum number of training iterations
#define ETA_W   0.0035   // weight adjustment rate
#define ETA_B   0.001    // threshold adjustment rate
#define ERROR   0.002    // allowable error for a single sample
#define ACCU    0.005    // allowable error per iteration over all samples

// types
#define TYPE    double
#define Vector  std::vector

struct Data
{
    Vector<TYPE> x;      // input attributes
    Vector<TYPE> y;      // output attributes
};

class BP
{
public:
    void GetData(const Vector<Data>);
    void Train();
    Vector<TYPE> ForeCast(const Vector<TYPE>);
    void ForcastFromFile(BP* &);

    void ReadFile(const char* inputFileName, int m, int n);
    void ReadTestFile(const char* inputFileName, int m, int n);
    void WriteToFile(const char* outputFileName);

private:
    void InitNetwork();             // initialize the network
    void GetNums();                 // get the numbers of input, output and hidden layer nodes
    void ForwardTransfer();         // forward propagation subprocess
    void ReverseTransfer(int);      // back propagation subprocess
    void CalcDelta(int);            // compute the adjustment amounts for w and b
    void UpdateNetwork();           // update the weights and thresholds
    TYPE GetError(int);             // compute the error of a single sample
    TYPE GetAccu();                 // compute the accuracy over all samples
    TYPE Sigmoid(const TYPE);       // compute the value of the sigmoid function
    void Split(char* buffer, Vector<TYPE>& vec);

private:
    int in_num;                     // number of input layer nodes
    int ou_num;                     // number of output layer nodes
    int hd_num;                     // number of hidden layer nodes
    Vector<Data> data;              // sample data
    Vector<Vector<TYPE> > testdata; // test data
    Vector<Vector<TYPE> > result;   // test results
    int rowLen;                     // number of training samples
    int restRowLen;                 // number of test samples

    TYPE w[LAYER][NUM][NUM];        // weights of the BP network
    TYPE b[LAYER][NUM];             // thresholds of the BP network nodes
    TYPE x[LAYER][NUM];             // output value of each neuron after the sigmoid transform
                                    // (for the input layer, the raw input value)
    TYPE d[LAYER][NUM];             // delta values of the delta learning rule:
                                    // Wij(t+1) = Wij(t) + alpha * (Yj - Aj(t)) * Oi(t)
};
#endif // _BP_H_

(2) Bp.cpp

#include <string.h>
#include <stdio.h>
#include <math.h>
#include <assert.h>
#include <cstdlib>
#include <fstream>
#include <iostream>
using namespace std;

#include "Bp.h"

// get all training sample data
void BP::GetData(const Vector<Data> _data)
{
    data = _data;
}

// split a comma-separated line of text into numeric values
void BP::Split(char* buffer, Vector<TYPE>& vec)
{
    char* p = strtok(buffer, ",\t");
    while (p != NULL)
    {
        vec.push_back(atof(p));
        p = strtok(NULL, ",\t\n");
    }
}

// read the training data file; m input attributes, n output attributes per line
void BP::ReadFile(const char* inputFileName, int m, int n)
{
    FILE* pFile = fopen(inputFileName, "r");
    if (!pFile)
    {
        printf("Open file %s failed...\n", inputFileName);
        exit(0);
    }
    // init data set
    char* buffer = new char[100];
    Vector<TYPE> temp;
    while (fgets(buffer, 100, pFile))
    {
        Data t;
        temp.clear();
        Split(buffer, temp);
        for (int i = 0; i < (int)temp.size(); i++)
        {
            if (i < m)
                t.x.push_back(temp[i]);
            else
                t.y.push_back(temp[i]);
        }
        data.push_back(t);
    }
    // init rowLen
    rowLen = data.size();
}

// read the test data file
void BP::ReadTestFile(const char* inputFileName, int m, int n)
{
    FILE* pFile = fopen(inputFileName, "r");
    if (!pFile)
    {
        printf("Open file %s failed...\n", inputFileName);
        exit(0);
    }
    // init data set
    char* buffer = new char[100];
    Vector<TYPE> temp;
    while (fgets(buffer, 100, pFile))
    {
        Vector<TYPE> t;
        temp.clear();
        Split(buffer, temp);
        for (int i = 0; i < (int)temp.size(); i++)
            t.push_back(temp[i]);
        testdata.push_back(t);
    }
    restRowLen = testdata.size();
}

// write the test inputs and the forecast results to a file
void BP::WriteToFile(const char* outputFileName)
{
    ofstream fout;
    fout.open(outputFileName);
    if (!fout)
    {
        cout << "file Result.txt open failed" << endl;
        exit(0);
    }
    Vector<Vector<TYPE> >::iterator it = testdata.begin();
    Vector<Vector<TYPE> >::iterator itx = result.begin();
    while (it != testdata.end())
    {
        Vector<TYPE>::iterator itt = (*it).begin();
        Vector<TYPE>::iterator ittx = (*itx).begin();
        while (itt != (*it).end())
        {
            fout << (*itt) << ",";
            itt++;
        }
        fout << "\t";
        while (ittx != (*itx).end())
        {
            fout << (*ittx) << ",";
            ittx++;
        }
        it++;
        itx++;
        fout << "\n";
    }
}

// start training
void BP::Train()
{
    printf("Begin to train the BP network!\n");
    GetNums();
    InitNetwork();
    int num = data.size();

    for (int iter = 0; iter <= ITERS; iter++)
    {
        for (int cnt = 0; cnt < num; cnt++)
        {
            // assign the input values to the first-layer nodes
            for (int i = 0; i < in_num; i++)
                x[0][i] = data.at(cnt).x[i];

            while (1)
            {
                ForwardTransfer();
                if (GetError(cnt) < ERROR)   // break out once the error of this single sample is small enough
                    break;
                ReverseTransfer(cnt);
            }
        }
        printf("This is the %d-th training of the network!\n", iter);

        TYPE accu = GetAccu();   // mean squared error over all samples for this round
        printf("All samples' accuracy is %lf\n", accu);
        if (accu < ACCU)
            break;
    }
    printf("The BP network training ends!\n");
}

// predict the output values with the trained network
Vector<TYPE> BP::ForeCast(const Vector<TYPE> data)
{
    int n = data.size();
    assert(n == in_num);
    for (int i = 0; i < in_num; i++)
        x[0][i] = data[i];
    ForwardTransfer();
    Vector<TYPE> v;
    for (int i = 0; i < ou_num; i++)
        v.push_back(x[2][i]);
    return v;
}

// forecast every row of the test data
void BP::ForcastFromFile(BP* &pBp)
{
    Vector<Vector<TYPE> >::iterator it = testdata.begin();
    Vector<TYPE> ou;
    while (it != testdata.end())
    {
        ou = pBp->ForeCast(*it);
        result.push_back(ou);
        it++;
    }
}

// get the numbers of network nodes
void BP::GetNums()
{
    in_num = data[0].x.size();                          // number of input layer nodes
    ou_num = data[0].y.size();                          // number of output layer nodes
    hd_num = (int)sqrt((in_num + ou_num) * 1.0) + 5;    // number of hidden layer nodes
    if (hd_num > NUM)
        hd_num = NUM;                                   // must not exceed the maximum setting
}

// initialize the network
void BP::InitNetwork()
{
    memset(w, 0, sizeof(w));   // weights and thresholds are initialized to 0;
    memset(b, 0, sizeof(b));   // random values could also be used
}

// forward propagation of the working signal
void BP::ForwardTransfer()
{
    // compute the output value of each hidden layer node
    for (int j = 0; j < hd_num; j++)
    {
        TYPE t = 0;
        for (int i = 0; i < in_num; i++)
            t += w[1][i][j] * x[0][i];
        t += b[1][j];
        x[1][j] = Sigmoid(t);
    }
    // compute the output value of each output layer node
    for (int j = 0; j < ou_num; j++)
    {
        TYPE t = 0;
        for (int i = 0; i < hd_num; i++)
            t += w[2][i][j] * x[1][i];
        t += b[2][j];
        x[2][j] = Sigmoid(t);
    }
}

// compute the error of a single sample
TYPE BP::GetError(int cnt)
{
    TYPE ans = 0;
    for (int i = 0; i < ou_num; i++)
        ans += 0.5 * (x[2][i] - data.at(cnt).y[i]) * (x[2][i] - data.at(cnt).y[i]);
    return ans;
}

// back propagation of the error signal
void BP::ReverseTransfer(int cnt)
{
    CalcDelta(cnt);
    UpdateNetwork();
}

// compute the accuracy over all samples
TYPE BP::GetAccu()
{
    TYPE ans = 0;
    int num = data.size();
    for (int i = 0; i < num; i++)
    {
        int m = data.at(i).x.size();
        for (int j = 0; j < m; j++)
            x[0][j] = data.at(i).x[j];
        ForwardTransfer();
        int n = data.at(i).y.size();   // dimension of this sample's output
        for (int j = 0; j < n; j++)    // accumulate the squared error of the i-th sample
            ans += 0.5 * (x[2][j] - data.at(i).y[j]) * (x[2][j] - data.at(i).y[j]);
    }
    return ans / num;                  // mean squared error
}

// compute the adjustment amounts
void BP::CalcDelta(int cnt)
{
    // compute the delta values of the output layer
    for (int i = 0; i < ou_num; i++)
        d[2][i] = (x[2][i] - data.at(cnt).y[i]) * x[2][i] * (A - x[2][i]) / (A * B);
    // compute the delta values of the hidden layer
    for (int i = 0; i < hd_num; i++)
    {
        TYPE t = 0;
        for (int j = 0; j < ou_num; j++)
            t += w[2][i][j] * d[2][j];
        d[1][i] = t * x[1][i] * (A - x[1][i]) / (A * B);
    }
}

// adjust the BP network according to the computed adjustment amounts
void BP::UpdateNetwork()
{
    // adjust the weights and thresholds between the hidden layer and the output layer
    for (int i = 0; i < hd_num; i++)
        for (int j = 0; j < ou_num; j++)
            w[2][i][j] -= ETA_W * d[2][j] * x[1][i];
    for (int i = 0; i < ou_num; i++)
        b[2][i] -= ETA_B * d[2][i];

    // adjust the weights and thresholds between the input layer and the hidden layer
    for (int i = 0; i < in_num; i++)
        for (int j = 0; j < hd_num; j++)
            w[1][i][j] -= ETA_W * d[1][j] * x[0][i];
    for (int i = 0; i < hd_num; i++)
        b[1][i] -= ETA_B * d[1][i];
}

// compute the value of the sigmoid function
TYPE BP::Sigmoid(const TYPE x)
{
    return A / (1 + exp(-x / B));
}
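One derivation step worth making explicit: the factor x·(A − x)/(A·B) applied in CalcDelta() is the derivative of the Sigmoid() defined above, expressed through the node's stored output. For f(x) = A / (1 + e^{-x/B}) with output y = f(x),

    f'(x) = \frac{y \, (A - y)}{A \cdot B}

which is why backpropagation here only needs the activations that ForwardTransfer() has already stored in x[][].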
 

(3) Test.cpp

#include <iostream>
#include <string.h>
#include <stdio.h>
using namespace std;

#include "Bp.h"

int main()
{
    unsigned int id, od;    // input/output dimensions of the sample data
    int select = 0;
    BP* bp = new BP();
    const char* inputdataname  = "exercisedata.txt";  // training data file name
    const char* testdataname   = "Testdata.txt";      // test data file name
    const char* outputdataname = "Result.txt";        // output file name

    printf("Please input the sample input dimension and output dimension:\n");
    scanf("%u%u", &id, &od);
    bp->ReadFile(inputdataname, id, od);

    // train
    bp->Train();
    // test
    printf("\n******************************************************\n");
    printf("* 1. Test with the data in the test file   2. Enter data from the console *\n");
    printf("******************************************************\n");
    