Study on BP neural network algorithm


The BP (back-propagation) network was proposed in 1986 by a team of scientists led by Rumelhart and McClelland. It is a multi-layer feedforward network trained by the error back-propagation algorithm, and is one of the most widely used neural network models. A BP network can learn and store a large number of input-output pattern mappings without the mathematical equations describing those mappings being specified in advance.

An example of the structure of such a network is described below.

The topology of a BP neural network consists of an input layer, one or more hidden layers, and an output layer. The number of input-layer neurons is determined by the dimensionality of the sample attributes, and the number of output-layer neurons by the number of sample categories. The number of hidden layers and the number of neurons in each are chosen by the user. Each layer consists of several neurons, and each neuron carries a bias (threshold) that shifts its activity. The arcs in the network represent the weights connecting neurons in one layer to neurons in the next. Each neuron has an input and an output; for input-layer neurons, both are simply the attribute values of the training sample.

For a hidden-layer or output-layer unit j, the net input is

I_j = Σ_i w_ij * O_i + θ_j

where w_ij is the weight of the connection from unit i in the previous layer to unit j, O_i is the output of unit i in the previous layer, and θ_j is the bias (threshold) of unit j.

The output of a neuron in the network is computed by an activation function, which symbolically represents the activity of the unit. The activation function is usually the sigmoid (logistic) function, so the output of unit j is:

O_j = 1 / (1 + e^(-I_j))
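As a minimal sketch of the two formulas above (the function and variable names here are illustrative, not from the original), the net input and sigmoid output of a single unit can be computed as:

```python
import math

def unit_output(weights, inputs, bias):
    """Net input I_j = sum_i w_ij * O_i + theta_j, then sigmoid output O_j."""
    net = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-net))

# Example: two inputs feeding one hidden unit
o = unit_output([0.5, -0.3], [1.0, 0.0], bias=0.1)
```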

In addition, a neural network has a learning rate l, which usually takes a value between 0 and 1 and helps the search find a global minimum. If the learning rate is too small, learning proceeds very slowly; if it is too large, the weights may oscillate between inadequate solutions.

Having explained the basic elements of a neural network, let's look at the learning process of the BP algorithm:

BP-train() {
    initialize the network's weights and biases (thresholds);
    while (the termination condition is not met) {
        for each training sample X in the samples {
            // propagate the inputs forward
            for each hidden- or output-layer unit j {
                I_j = Σ_i w_ij * O_i + θ_j;   // compute the net input of unit j relative to the previous layer i
                O_j = 1 / (1 + e^(-I_j));     // compute the output of unit j
            }
            // back-propagate the error
            for each output-layer unit j {
                Err_j = O_j * (1 - O_j) * (T_j - O_j);   // compute the error
            }
            for each hidden-layer unit j, from the last hidden layer to the first {
                Err_j = O_j * (1 - O_j) * Σ_k Err_k * w_jk;   // k ranges over the units in the layer after j
            }
            for each weight w_ij in the network {
                Δw_ij = l * Err_j * O_i;   // weight increment
                w_ij = w_ij + Δw_ij;       // weight update
            }
            for each bias θ_j in the network {
                Δθ_j = l * Err_j;      // bias increment
                θ_j = θ_j + Δθ_j;      // bias update
            }
        }
    }
}
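The pseudocode above can be sketched as a small, runnable Python implementation with one hidden layer. The network size, learning rate, initialization range, and the XOR-style data below are illustrative assumptions, not from the original:

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train_bp(samples, n_hidden=3, lr=0.5, epochs=2000, seed=1):
    """Minimal single-hidden-layer, single-output BP trainer (sketch)."""
    random.seed(seed)
    n_in = len(samples[0][0])
    # initialize weights and biases randomly in [-0.5, 0.5]
    w_h = [[random.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    b_h = [random.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    w_o = [random.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    b_o = random.uniform(-0.5, 0.5)

    for _ in range(epochs):
        for x, t in samples:
            # propagate the inputs forward
            h = [sigmoid(sum(w * xi for w, xi in zip(w_h[j], x)) + b_h[j])
                 for j in range(n_hidden)]
            o = sigmoid(sum(w * hj for w, hj in zip(w_o, h)) + b_o)
            # back-propagate the error
            err_o = o * (1 - o) * (t - o)                  # output-layer error
            err_h = [h[j] * (1 - h[j]) * err_o * w_o[j]    # hidden-layer error
                     for j in range(n_hidden)]
            # update weights and biases
            for j in range(n_hidden):
                w_o[j] += lr * err_o * h[j]
                for i in range(n_in):
                    w_h[j][i] += lr * err_h[j] * x[i]
                b_h[j] += lr * err_h[j]
            b_o += lr * err_o

    def predict(x):
        h = [sigmoid(sum(w * xi for w, xi in zip(w_h[j], x)) + b_h[j])
             for j in range(n_hidden)]
        return sigmoid(sum(w * hj for w, hj in zip(w_o, h)) + b_o)

    return predict

# Illustrative data: the XOR problem
xor = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]
predict = train_bp(xor)
```

Note that this version updates the weights after every sample (incremental updating); a batch variant would accumulate the increments over all samples before applying them.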

The basic flow of the algorithm is:

1. Initialize the network weights and neuron biases (the simplest way is random initialization).

2. Forward propagation: compute the input and output of the hidden-layer and output-layer neurons, layer by layer, according to the formulas above.

3. Back propagation: correct the weights and biases according to the formulas above,

until the termination condition is met.

There are several points to be explained in the algorithm:

1. About Err_j, the error of neuron j.

For an output-layer neuron, Err_j = O_j * (1 - O_j) * (T_j - O_j), where O_j is the actual output of unit j and T_j is the true output of j, known from the class label of the given training sample.

For a hidden-layer neuron, Err_j = O_j * (1 - O_j) * Σ_k Err_k * w_jk, where w_jk is the weight of the connection from unit j to unit k in the next higher layer and Err_k is the error of unit k.

The weight increment is Δw_ij = l * Err_j * O_i, the bias increment is Δθ_j = l * Err_j, where l is the learning rate.

These formulas are derived by gradient descent, with the goal of minimizing the mean squared error of the output units:

E = (1/2) Σ_p Σ_{j=1..m} (T_pj - O_pj)²

where p ranges over the total number of samples, m is the number of output-layer neurons, T_pj is the target output for sample p, and O_pj is the neural network's output.

The idea of gradient descent is to take the derivative of E with respect to each weight and move in the direction of the negative gradient.

For the output layer:

Err_j = O_j * (1 - O_j) * (T_j - O_j)

where O_j * (1 - O_j) is the derivative of the sigmoid, since ∂O_j/∂I_j = O_j(1 - O_j).

For hidden layers:

Err_j = O_j * (1 - O_j) * Σ_k Err_k * w_jk

This is the formula for calculating the error of the hidden layer.
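As a worked derivation (notation follows the formulas above, for a single sample), the output-layer error and the weight increment follow from the chain rule:

```latex
E = \tfrac{1}{2}\sum_{j}(T_j - O_j)^2, \qquad O_j = \sigma(I_j) = \frac{1}{1+e^{-I_j}}

Err_j = -\frac{\partial E}{\partial I_j}
      = -\frac{\partial E}{\partial O_j}\cdot\frac{\partial O_j}{\partial I_j}
      = (T_j - O_j)\,O_j(1-O_j)

\Delta w_{ij} = -l\,\frac{\partial E}{\partial w_{ij}}
              = -l\,\frac{\partial E}{\partial I_j}\cdot\frac{\partial I_j}{\partial w_{ij}}
              = l\,Err_j\,O_i
```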

2. The termination condition can take many forms:

§ All weight changes in the previous epoch are smaller than a specified threshold.

§ The percentage of samples misclassified in the previous epoch is below a certain threshold.

§ The number of epochs exceeds a pre-specified limit.

§ The mean squared error between the neural network's output and the target output is below a certain threshold.

In general, the last termination condition gives the highest accuracy.

In actually using a BP neural network, several practical issues arise:

1. Sample processing. For the output, suppose there are only two classes, coded 0 and 1; since the sigmoid output only reaches 0 or 1 as its input tends to negative or positive infinity, the condition can be relaxed: an output > 0.9 is treated as 1 and an output < 0.1 as 0. The input samples also need to be normalized.
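As an illustrative sketch of input normalization (min-max scaling is one common choice; the original does not specify a scheme, and the function name is hypothetical):

```python
def min_max_normalize(samples):
    """Scale each attribute column to [0, 1] (min-max normalization)."""
    n_attrs = len(samples[0])
    lo = [min(s[i] for s in samples) for i in range(n_attrs)]
    hi = [max(s[i] for s in samples) for i in range(n_attrs)]
    return [
        [(s[i] - lo[i]) / (hi[i] - lo[i]) if hi[i] > lo[i] else 0.0
         for i in range(n_attrs)]
        for s in samples
    ]

data = [[2.0, 10.0], [4.0, 30.0], [6.0, 20.0]]
scaled = min_max_normalize(data)
```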

2. Choice of network structure. This mainly refers to the number of hidden layers and the number of neurons, which determine the network size; network size is closely related to learning performance. A large network requires more computation and may overfit, while too small a network may underfit.

3. Choice of initial weights and biases. The initial values affect the learning results, so it is important to choose suitable ones.

4. Incremental versus batch learning. The algorithm and mathematical derivation above are presented in terms of batch learning. Batch learning is suitable for offline learning and has stable learning behavior; incremental learning is used for online learning, is sensitive to noise in the input samples, and is unsuitable for drastically changing input patterns.

5. There are other options for the activation function and the error function.

In general, the BP algorithm offers many choices, and for specific training data there is often considerable room for optimization.
