Programming Collective Intelligence Reading Notes 3 - Neural Networks

Use a neural network to build a ranking of search results that responds to the search keywords

For a search engine, every time a user clicks one of the returned search results rather than something else, he or she gives the engine timely feedback about which results are preferred.

Therefore, we can construct a neural network, feed it the query words, the search results returned to the user, and the user's click decisions, and then train the network on this data.

Figure 1 Multilayer perceptron

A network composed of multiple layers of neurons is called a multilayer perceptron (MLP).

Why do we need such a complex neural network instead of simply recording the queries and the number of times each search result is clicked?

Because the power of a neural network lies in its ability to make reasonable guesses about queries it has never seen before, based on their similarity to other queries.

To implement this neural network, we need to create several tables (a minimal sketch of setting them up follows the list):

① A mapping table between words and URLs

② Hiddennode table

③ Word2hiddennode table

④ Hiddennode2url table
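
As a rough sketch of setting these up, assuming SQLite for storage and the table names from the list above (the column layout, a from-id/to-id pair plus a connection strength, is my assumption, modelled on the book's approach):

import sqlite3

# table ①, the word/URL mapping, usually already exists in the search engine's index,
# so only the tables that hold the network itself are created here
conn = sqlite3.connect('nn.db')
conn.execute('create table if not exists hiddennode (create_key)')
conn.execute('create table if not exists word2hiddennode (fromid, toid, strength)')
conn.execute('create table if not exists hiddennode2url (fromid, toid, strength)')
conn.commit()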

Feed-forward method

The feed-forward method refers to forward propagation through the network.

For example, suppose we have a list of three words and a list of three URLs:

Figure 2 Input and Output lists

Now we enter word1 and word3, which means that the input nodes word1 and word3 are activated and each outputs 1. Because this combination of words has never been queried before, a new hidden-layer node is created for it. The inputs of this hidden node are the outputs of the input-layer nodes:

Figure 3 input activated

In addition to the input coming from the previous node, an important parameter of each connection between neurons is its weight. Here we assume that every word2hiddennode weight is 1/len(words) and every hiddennode2url weight is 0.1, which gives:

Figure 4 weighted neuron connection

Now our task is to determine the values of the three "?" marks. Obviously, they are the output value of the hidden node, which is calculated as follows:

output = tanh(input1*weight1 + input2*weight2 + ...)

Formula 1 Calculation of a neuron's output value

In layman's terms, the value on each input connection is multiplied by that connection's weight; the products are summed, and a tanh function (tanh(x) = (e^x - e^-x) / (e^x + e^-x)) is applied to the sum to obtain the result.

Figure 5 The tanh function

In this example, value = tanh(1*0.5 + 1*0.5) = tanh(1) ≈ 0.76

Now we know that the "?" in Figure 4 is 0.76. Next, we can calculate the output value of each URL.

The calculation method is again Formula 1, so:

url1_output = tanh(0.76*0.1) = tanh(0.076) ≈ 0.076

Similarly, url2_output = url3_output ≈ 0.076

The above is the entire calculation process of the feed-forward method. The key points are the setting of the connection weights between neurons and the response function applied to the sum (tanh in this example). The response function expresses how strongly each node responds to its input; an S-shaped (sigmoid-type) function, of which tanh is one, is usually chosen.
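
As a minimal sketch (plain Python with hard-coded weights, not the book's searchnet class), the whole feed-forward pass for this example looks like this:

import math

words = ['word1', 'word3']                # the activated input nodes, each outputting 1
wi = [1.0 / len(words)] * len(words)      # word -> hiddennode weights: 1/len(words) = 0.5
wo = [0.1, 0.1, 0.1]                      # hiddennode -> url weights

# the hidden node sums its weighted inputs and applies tanh
hidden = math.tanh(sum(1 * w for w in wi))            # tanh(1.0), about 0.76
url_outputs = [math.tanh(hidden * w) for w in wo]     # each about tanh(0.076) = 0.076
print(hidden, url_outputs)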

Now, using the feed-forward method, we can see that after the network is stimulated by word1 and word3, its response and output are as follows:

Figure 6 Feed-forward method output

You may wonder why the outputs are all the same. If they are the same, how do we know which URL should be placed at the top? Don't worry: the network has not been trained yet, so it does not know which URL should be output for the combination of word1 and word3. We can train it with some samples so that it adjusts the weights of the connections between neurons and learns to tell the URLs apart (which is highly relevant and which is less relevant).

PS: There is a printing error in this part of the book: the output of the feed-forward method should be (0.076, 0.076, 0.076), not (0.76, 0.76, 0.76).

Backpropagation

From the introduction to the feed-forward method, we know that if we never tell the neural network what is right and what is wrong, the network's computation alone is not enough to distinguish between the outputs. The interesting thing about a neural network is that it can be trained: by adjusting the weights of the connections between its neurons (training samples are used to train the weights), it learns to make choices for unknown inputs that are similar to those in the training samples.

Now let's walk through one round of backpropagation to adjust the weights of the connections in the neural network.

Let's go back to Figure 6. Assume that url1 is the correct output (that is, url1 is the result we expect when the combination of word1 and word3 is entered). Our target is therefore url1, that is, target[url1] = 1 and target[urlx] = 0 for the other URLs, as shown below:

Figure 7 provides a training sample for the neural network

Using the training sample, we can calculate the size of the error. The formula for the error is:

error = target - url_output

Formula 2 Calculation of the output-layer error

This gives:

Figure 8 error calculation at the output layer
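
With the feed-forward outputs from Figure 6 and the target values above, this gives error[url1] = 1 - 0.076 = 0.924 and error[url2] = error[url3] = 0 - 0.076 = -0.076.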

Once we know the output error of each URL, we can calculate the error gradient of each neuron's output value, layer by layer.

The error gradient of the output layer is calculated with the following formula, where k is the k-th element of the output layer:

output_delta[k] = (1/(1 - url_output[k]^2)) * error[k]

Formula 3 Calculation of the output-layer error gradient

In this way, we can calculate the error gradients of the three URL outputs:

output_delta[1] = (1/(1 - 0.076^2)) * 0.924 = 0.9294

output_delta[2] = (1/(1 - 0.076^2)) * (-0.076) = -0.0764

output_delta[3] = (1/(1 - 0.076^2)) * (-0.076) = -0.0764
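
As a quick check of these numbers in Python, using this note's convention dtanh(y) = 1/(1 - y*y) (the book's own code uses 1 - y*y; see the PS at the end):

def dtanh(y):
    # the convention argued for in this note; the book's nn.py defines dtanh(y) = 1 - y*y
    return 1.0 / (1.0 - y * y)

url_outputs = [0.076, 0.076, 0.076]
targets = [1.0, 0.0, 0.0]
output_deltas = [dtanh(o) * (t - o) for t, o in zip(targets, url_outputs)]
print(output_deltas)    # approximately [0.9294, -0.0764, -0.0764]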

After calculating the error gradient of the output layer, we can calculate the hidden-layer error gradient. The pseudocode for this calculation is:

hidden_deltas = [0.0] * len(hiddenids)
for j in range(len(hiddenids)):
    # accumulate this hidden node's error from the output deltas,
    # weighted by the hiddennode -> url connection strengths
    error = 0.0
    for k in range(len(url_ids)):
        error = error + output_deltas[k] * weight_of_output[j][k]
    hidden_deltas[j] = (1 / (1 - hiddennode_value[j] ** 2)) * error

That is:

hidden_delta[j] = (1/(1 - hiddennode_value[j]^2)) * Σ_k (output_delta[k] * weight_of_output[j][k])

Formula 4 Calculation of the hidden-layer error gradient

Because there is only one hidden-layer node in this example, its error gradient is:

hidden_delta = (1/(1 - 0.76^2)) * (0.9294*0.1 - 0.0764*0.1 - 0.0764*0.1) = 0.1839
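
Plugging the example numbers into a few lines of Python confirms this (again using the 1/(1 - y*y) convention):

output_deltas = [0.9294, -0.0764, -0.0764]
weight_of_output = [0.1, 0.1, 0.1]      # hiddennode -> url weights
hiddennode_value = 0.76

error = sum(d * w for d, w in zip(output_deltas, weight_of_output))
hidden_delta = (1.0 / (1.0 - hiddennode_value ** 2)) * error
print(hidden_delta)     # approximately 0.1839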

Now that the error gradients have been obtained, we can calculate the new weight values.

The calculation method is as follows:

change = gradient * value of the neuron feeding the connection

new_weight = old_weight + change * coefficient

PS: the coefficient (the learning rate) is set to 0.5 here.

Updating the output weights of the hidden layer

change1 = 0.9294*0.76 = 0.7063

wo1 = original wo1 + change1*0.5 = 0.1 + 0.7063*0.5 = 0.4532

change2 = -0.0764*0.76 = -0.0581

wo2 = original wo2 + change2*0.5 = 0.1 - 0.0581*0.5 = 0.0710

Similarly, wo3 = 0.0710

Updating the input weights of the hidden layer

change1 = 0.1839*1 = 0.1839

wi1 = original wi1 + change1*0.5 = 0.5 + 0.09195 = 0.5920

Similarly, wi2 = 0.5920
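
The whole update step can be sketched in a few lines of Python (variable names are mine; the coefficient 0.5 is applied as above):

N = 0.5                                    # the coefficient (learning rate)
hiddennode_value = 0.76
output_deltas = [0.9294, -0.0764, -0.0764]
hidden_delta = 0.1839

# update the hiddennode -> url weights (all start at 0.1)
wo = [0.1, 0.1, 0.1]
for k in range(3):
    change = output_deltas[k] * hiddennode_value
    wo[k] = wo[k] + N * change
print(wo)    # approximately [0.4532, 0.071, 0.071]

# update the word -> hiddennode weights (both start at 0.5; each input value is 1)
wi = [0.5, 0.5]
for j in range(2):
    change = hidden_delta * 1.0
    wi[j] = wi[j] + N * change
print(wi)    # approximately [0.592, 0.592]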

After the weights are updated, the network in the example (with its weights) becomes:

Figure 9 Neural Networks with updated weights

Now, if we re-enter word1 and word3, the neural network will respond as follows:

Figure 10 output values of neural networks with new weights

We can see that after training, the weights of the neural network have been adjusted automatically to fit the training sample.
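
Re-running the feed-forward pass with the updated weights (a sketch with the rounded values from above, not the book's searchnet class) shows that url1 now gets a much larger output than the other two:

import math

wi = [0.592, 0.592]             # word1 -> hiddennode, word3 -> hiddennode
wo = [0.4532, 0.071, 0.071]     # hiddennode -> url1, url2, url3

hidden = math.tanh(1 * wi[0] + 1 * wi[1])
url_outputs = [math.tanh(hidden * w) for w in wo]
print(url_outputs)    # url1's output is now clearly the largest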

Regarding backpropagation, there is another error in the book: the definition of the dtanh function should be 1/(1 - y*y) rather than 1 - y*y. The reason, quoted from the errata page, is as follows:

"The formulas to calculate output_deltas and hidden_deltas are wrong. according to the Delta rule (http://en.wikipedia.org/wiki/Delta_rule), either these formulas shocould be error divided by dtanh, or the dtanh itself shoshould be 1/(1-Y * Y )."

For more information about the delta rule, see Wikipedia; I am still learning it myself.

 

The errata page for this book is here:

http://oreilly.com/catalog/errataunconfirmed.csp?isbn=9780596529321

 
