Deep Learning: Deriving the Forward Propagation and Backpropagation Algorithms in Neural Networks

Source: Internet
Author: User
Keywords: neural networks, three-layer neural networks, forward propagation, backpropagation

1. Neural network

Here is a diagram of a common neural network:

This is the basic structure of a common three-layer neural network: Layer L1 is the input layer, Layer L2 is the hidden layer, and Layer L3 is the output layer. When we feed in data such as x1, x2, x3, the hidden layer computes and transforms it, and the network produces an output. When the input and the output are the same, the model is an autoencoder; when the input and output differ, we have what we usually call an artificial neural network.

2. How to calculate the propagation

First, we build a simple network as an example:

In this network there are three layers:

First layer (input layer): contains neurons i1 and i2; intercept: b1; weights: w1, w2, w3, w4.

Second layer (hidden layer): contains neurons h1 and h2; intercept: b2; weights: w5, w6, w7, w8.

Third layer (output layer): contains neurons o1 and o2.

We use sigmoid as the activation function

Suppose the input data is i1 = 0.02, i2 = 0.04, the intercepts are b1 = 0.4 and b2 = 0.7, and the expected outputs are o1 = 0.5 and o2 = 0.9.

The unknowns are the weights w1, w2, w3, w4, w5, w6, w7, w8.

Our aim is to find the weight values w1, w2, w3, ..., w8 that produce the desired outputs o1 = 0.5 and o2 = 0.9.

First, we pick some initial values for w1, w2, w3, ..., w8, and then obtain the best weights by iterative calculation.

The initial values of the weights:

w1=0.25

w2=0.25

w3=0.15

w4=0.20

w5=0.30

w6=0.35

w7=0.40

w8=0.35
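
For readers who want to follow along, here is a minimal Python sketch of this setup; the variable names are my own and simply mirror the symbols above.

    # Inputs, intercepts (biases), targets, and initial weights from the example.
    i1, i2 = 0.02, 0.04
    b1, b2 = 0.4, 0.7
    target_o1, target_o2 = 0.5, 0.9

    w1, w2, w3, w4 = 0.25, 0.25, 0.15, 0.20   # input layer -> hidden layer
    w5, w6, w7, w8 = 0.30, 0.35, 0.40, 0.35   # hidden layer -> output layer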

2.1 Forward Propagation

2.1.1 Input layer to hidden layer

net(h1) = w1*i1 + w2*i2 + b1 = 0.25*0.02 + 0.25*0.04 + 0.4 = 0.005 + 0.01 + 0.4 = 0.415

The output of neuron h1 is obtained by passing net(h1) through the sigmoid activation function:

out(h1) = 1/(1 + e^(-net(h1))) = 1/(1 + 0.660340281) = 0.602286177

Similarly, we can get the value of out(h2):

net(h2) = w3*i1 + w4*i2 + b1 = 0.15*0.02 + 0.20*0.04 + 0.4 = 0.003 + 0.008 + 0.4 = 0.411

out(h2) = 1/(1 + e^(-net(h2))) = 1/(1 + 0.662986932) = 0.601327636
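
Continuing the sketch above, the same two hidden-unit computations in Python (the sigmoid is written out explicitly):

    import math

    def sigmoid(x):
        # Logistic activation: 1 / (1 + e^(-x))
        return 1.0 / (1.0 + math.exp(-x))

    # Uses i1, i2, b1 and w1..w4 from the setup sketch above.
    net_h1 = w1 * i1 + w2 * i2 + b1   # 0.415
    out_h1 = sigmoid(net_h1)          # ~0.602286177
    net_h2 = w3 * i1 + w4 * i2 + b1   # 0.411
    out_h2 = sigmoid(net_h2)          # ~0.601327636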

2.1.2 Hidden layer to output layer

Compute the values of output-layer neurons o1 and o2; the method is the same as for the input layer to the hidden layer.

net(o1) = w5*out(h1) + w6*out(h2) + b2 = 0.3*0.602286177 + 0.35*0.601327636 + 0.7 = 0.180685853 + 0.210464672 + 0.7 = 1.091150525

out(o1) = 1/(1 + e^(-net(o1))) = 1/(1 + 0.335829891) = 0.748598311

Similarly:

net(o2) = w7*out(h1) + w8*out(h2) + b2 = 0.4*0.602286177 + 0.35*0.601327636 + 0.7 = 0.240914471 + 0.210464672 + 0.7 = 1.151379143

out(o2) = 1/(1 + e^(-net(o2))) = 1/(1 + 0.316200383) = 0.759762733

The outputs o1 = 0.748598311 and o2 = 0.759762733 are still far from our desired o1 = 0.5 and o2 = 0.9.
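
In code, the output-layer pass is the same pattern applied to the hidden activations (continuing the sketch):

    # Uses out_h1, out_h2, b2 and w5..w8 from the sketches above.
    net_o1 = w5 * out_h1 + w6 * out_h2 + b2   # ~1.091150525
    out_o1 = sigmoid(net_o1)                  # ~0.748598311
    net_o2 = w7 * out_h1 + w8 * out_h2 + b2   # ~1.151379143
    out_o2 = sigmoid(net_o2)                  # ~0.759762733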

2.2 Calculating the total error

Formula:

E(total) = Σ (1/2)*(target - output)^2

That is, we calculate the error for each desired output and sum them:

E(total) = E(o1) + E(o2) = (1/2)*(0.748598311 - 0.5)^2 + (1/2)*(0.759762733 - 0.9)^2 = 0.030900560 + 0.009833246 = 0.040733806
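
Continuing the sketch, the same error computation in Python:

    # Squared-error loss with the conventional 1/2 factor, per output neuron.
    e_o1 = 0.5 * (target_o1 - out_o1) ** 2   # ~0.030900560
    e_o2 = 0.5 * (target_o2 - out_o2) ** 2   # ~0.009833246
    e_total = e_o1 + e_o2                    # ~0.040733806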

2.3 Backpropagation

We now compute the impact of each weight on the error; the following figure makes the backward transmission of the error more intuitive.

2.3.1 Updating the hidden-layer-to-output-layer weights

The hidden-layer-to-output-layer weights in the example above are w5, w6, w7, w8.

Take w6 as an example. To see how much influence w6 has on the overall error, we take the partial derivative of the total error with respect to w6: ∂E(total)/∂w6.

Obviously, E(total) is not written directly in terms of w6; w6 appears only in the formula for net(o1).

But by the chain rule for partial derivatives, we can express the derivative we want as a product of derivatives we can compute:

∂E(total)/∂w6 = ∂E(total)/∂out(o1) * ∂out(o1)/∂net(o1) * ∂net(o1)/∂w6

Let's calculate each partial derivative in turn.

Calculating ∂E(total)/∂out(o1):

This is the derivative of a composite function.

If f and g are functions differentiable with respect to x, then the derivative of the composite function (f∘g)(x) is:

(f∘g)'(x) = f'(g(x)) * g'(x)

Here g(x) = target(o1) - out(o1), so g'(x) = -1, giving:

∂E(total)/∂out(o1) = -(target(o1) - out(o1)) = -(0.5 - 0.748598311) = 0.248598311

Calculating ∂out(o1)/∂net(o1):

We know out(o1) = 1/(1 + e^(-net(o1))).

Let's differentiate it. This is again the derivative of a composite function, and the result is the familiar sigmoid derivative:

∂out(o1)/∂net(o1) = out(o1) * (1 - out(o1))

= 0.748598311*(1 - 0.748598311) = 0.748598311*0.251401689 = 0.18819888

Calculating ∂net(o1)/∂w6:

Since net(o1) = w5*out(h1) + w6*out(h2) + b2, only the w6 term survives differentiation, so ∂net(o1)/∂w6 = out(h2) = 0.601327636.

Finally, multiplying the three factors together:

∂E(total)/∂w6 = ∂E(total)/∂out(o1) * ∂out(o1)/∂net(o1) * out(h2)

= 0.248598311*0.18819888*0.601327636 = 0.028133669

2.3.1.1 Updating the weight w6

w6(new) = w6 - x * ∂E(total)/∂w6

where x is what we usually call the learning rate. Setting the learning rate x to 0.1, the new w6 is:

w6(new) = 0.35 - 0.1*0.028133669 = 0.347186633

In the same way, we can also calculate the new w5, w7, and w8.
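
As a sketch (continuing the Python example; lr stands in for the learning rate x above), here is the full chain for w6 and the analogous gradients for the other hidden-to-output weights:

    lr = 0.1   # learning rate (the "x" above)

    # Chain rule: dE/dw6 = dE/dout_o1 * dout_o1/dnet_o1 * dnet_o1/dw6
    d_e_out_o1 = -(target_o1 - out_o1)     # ~0.248598311
    d_out_net_o1 = out_o1 * (1 - out_o1)   # ~0.18819888
    grad_w6 = d_e_out_o1 * d_out_net_o1 * out_h2   # ~0.028133669
    w6_new = w6 - lr * grad_w6                     # ~0.347186633

    # Same pattern for the remaining hidden -> output weights.
    d_e_out_o2 = -(target_o2 - out_o2)
    d_out_net_o2 = out_o2 * (1 - out_o2)
    grad_w5 = d_e_out_o1 * d_out_net_o1 * out_h1
    grad_w7 = d_e_out_o2 * d_out_net_o2 * out_h1
    grad_w8 = d_e_out_o2 * d_out_net_o2 * out_h2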

But how do we calculate and update the weights w1, w2, w3, w4?

2.3.2 Updating the input-layer-to-hidden-layer weights

The procedure is similar to the previous one, as shown in the following figure:

Calculation formula:

∂E(total)/∂w1 = ∂E(total)/∂out(h1) * ∂out(h1)/∂net(h1) * ∂net(h1)/∂w1

2.3.2.1 Calculating ∂E(total)/∂out(h1)

Because out(h1) influences both output neurons, there is no single formula for E(total) in terms of out(h1); we must split the total error into E(o1) and E(o2) and differentiate each with respect to out(h1).

The formula is as follows:

∂E(total)/∂out(h1) = ∂E(o1)/∂out(h1) + ∂E(o2)/∂out(h1)

Each term is then expanded with the chain rule again:

∂E(o1)/∂out(h1) = ∂E(o1)/∂net(o1) * ∂net(o1)/∂out(h1) = ∂E(o1)/∂net(o1) * w5

Calculating ∂E(o1)/∂out(h1): both factors are already known from section 2.3.1, since ∂E(o1)/∂net(o1) = ∂E(o1)/∂out(o1) * ∂out(o1)/∂net(o1) = 0.248598311*0.18819888, and ∂net(o1)/∂out(h1) = w5 = 0.30.

Similarly, you can calculate ∂E(o2)/∂out(h1), with w7 = 0.40 in place of w5.

2.3.2.2 Calculating ∂out(h1)/∂net(h1)

As before for the sigmoid: ∂out(h1)/∂net(h1) = out(h1)*(1 - out(h1)).

2.3.2.3 Calculating ∂net(h1)/∂w1

Since net(h1) = w1*i1 + w2*i2 + b1, we get ∂net(h1)/∂w1 = i1 = 0.02.

Finally, the three factors are multiplied together:

∂E(total)/∂w1 = ∂E(total)/∂out(h1) * ∂out(h1)/∂net(h1) * ∂net(h1)/∂w1

2.3.2.4 Overall formula

Combining the previous steps, we can deduce the final formula:

∂E(total)/∂w1 = (∂E(o1)/∂out(h1) + ∂E(o2)/∂out(h1)) * out(h1)*(1 - out(h1)) * i1

2.3.2.5 Updating the weight w1

Exactly as for w6:

w1(new) = w1 - x * ∂E(total)/∂w1

Set the learning rate and compute the new weight value of w1.
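
Here is section 2.3.2 as a continuation of the sketch; the numeric comments follow from the values computed earlier in the article.

    # dE_o1/dnet_o1 and dE_o2/dnet_o2, reusing the factors from section 2.3.1.
    delta_o1 = d_e_out_o1 * d_out_net_o1   # ~0.046785924
    delta_o2 = d_e_out_o2 * d_out_net_o2   # ~-0.025596573

    # out_h1 feeds both output neurons, so both contributions are summed.
    d_e_out_h1 = delta_o1 * w5 + delta_o2 * w7   # ~0.003797148
    d_out_net_h1 = out_h1 * (1 - out_h1)         # ~0.2395375
    grad_w1 = d_e_out_h1 * d_out_net_h1 * i1     # ~1.819e-05

    w1_new = w1 - lr * grad_w1   # ~0.249998181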

3. Iterating to the best weights

We repeat the calculation with the new weights, iterating a certain number of times until the outputs are close to the expected o1 = 0.5 and o2 = 0.9; the weights w1...w8 at that point are the weights we want.
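
Tying everything together, here is a self-contained sketch of the whole iteration: it repeats the forward pass, backpropagation, and the weight updates until the outputs approach the targets. The learning rate of 0.5 and the iteration count of 10000 are arbitrary illustrative choices, not values from the derivation above.

    import math

    def sigmoid(x):
        return 1.0 / (1.0 + math.exp(-x))

    # Example values from section 2.
    i1, i2, b1, b2 = 0.02, 0.04, 0.4, 0.7
    t1, t2 = 0.5, 0.9
    w = [0.25, 0.25, 0.15, 0.20, 0.30, 0.35, 0.40, 0.35]   # w1..w8
    lr = 0.5   # assumed learning rate for this demo

    for step in range(10000):
        # Forward propagation (section 2.1).
        out_h1 = sigmoid(w[0] * i1 + w[1] * i2 + b1)
        out_h2 = sigmoid(w[2] * i1 + w[3] * i2 + b1)
        out_o1 = sigmoid(w[4] * out_h1 + w[5] * out_h2 + b2)
        out_o2 = sigmoid(w[6] * out_h1 + w[7] * out_h2 + b2)

        # Backpropagation: output-layer deltas (section 2.3.1).
        d_o1 = -(t1 - out_o1) * out_o1 * (1 - out_o1)
        d_o2 = -(t2 - out_o2) * out_o2 * (1 - out_o2)
        # Hidden-layer deltas (section 2.3.2): sum over both outputs.
        d_h1 = (d_o1 * w[4] + d_o2 * w[6]) * out_h1 * (1 - out_h1)
        d_h2 = (d_o1 * w[5] + d_o2 * w[7]) * out_h2 * (1 - out_h2)

        # Gradient descent update for all eight weights.
        grads = [d_h1 * i1, d_h1 * i2, d_h2 * i1, d_h2 * i2,
                 d_o1 * out_h1, d_o1 * out_h2, d_o2 * out_h1, d_o2 * out_h2]
        w = [wi - lr * g for wi, g in zip(w, grads)]

    print("outputs:", round(out_o1, 6), round(out_o2, 6))   # approaches 0.5 and 0.9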
