Using a neural network to rank pages returned for a search keyword
For a search engine, every time a user clicks one of the search results rather than the others, that click gives the engine timely information about the user's preferences among the results.
Therefore, we can build a neural network, feed it the query words, the search results returned to the user, and the user's click decisions, and then train it.
Figure 1 Multilayer perceptron
A network composed of multiple layers of neurons is called a multilayer perceptron (MLP).
Why do we need such a complex neural network instead of simply recording, for each query, how many times each search result is clicked?
Because the power of a neural network is that it can make reasonable guesses about queries it has never seen before, based on their similarity to other queries.
To implement this neural network, we need to create several tables (a rough sketch of the schema follows this list):
① A linking table between words and URLs
② hiddennode table
③ word2hiddennode table
④ hiddennode2url table
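As a rough sketch of what the last three tables might look like in code (assuming SQLite as the storage backend; the column names are my own guess based on the hidden-node scheme described below, not necessarily the book's exact schema):

import sqlite3  # assuming SQLite as the backing store

conn = sqlite3.connect('searchnet.db')
# One row per hidden node; create_key records which word combination created it
conn.execute('create table if not exists hiddennode(create_key)')
# Connection weights from input-layer words to hidden nodes
conn.execute('create table if not exists word2hiddennode(fromid, toid, strength)')
# Connection weights from hidden nodes to output-layer URLs
conn.execute('create table if not exists hiddennode2url(fromid, toid, strength)')
conn.commit()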
Feed-forward method
The feed-forward method refers to forward propagation through the network.
For example, suppose we have a list of three words and a list of three URLs:
Figure 2 Input and Output lists
Now we input word1 and word3, which means the input nodes word1 and word3 are activated and each outputs 1. Next, we create a new hidden-layer node (the assumption is that this combination of words has never been queried before, so a new hidden node is generated). The input of the hidden node is the output of the input-layer nodes:
Figure 3 input activated
Besides the input from the preceding node, an important parameter of each neuron connection is its weight. Here we assume the word2hiddennode weight is 1/len(words) and the hiddennode2url weight is 0.1, so we have:
Figure 4 weighted neuron connection
Now our task is to determine the values corresponding to the three "?" marks. Obviously, they are the output values of the hidden node. The output value of the hidden node is calculated as follows:
Formula 1 Calculation of output values of neurons
In plain terms, the value on each incoming connection is multiplied by the weight of that connection, the products are summed, and a tanh function is applied to the sum to obtain the result: output = tanh(sum of input_i * weight_i).
Figure 5 The tanh function
In this example, value = tanh(1*0.5 + 1*0.5) = tanh(1) ≈ 0.76
Now we know that the "?" in Figure 4 is 0.76. Next, we can calculate the output value of each URL.
The calculation method is also Formula 1, so
url1_output = tanh(0.76*0.1) = tanh(0.076) ≈ 0.076
Similarly, url2_output = url3_output ≈ 0.076
The above is the entire calculation of the feed-forward method. The key points are the weights assigned to the neuron connections and the response function applied to the sum (tanh in this example). The response function expresses how strongly each node responds to its input; an S-shaped function is usually chosen, and tanh is one such function.
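Putting the whole feed-forward pass into a few lines of code, here is a minimal sketch that reproduces the numbers above (the weights are hard-coded to the values assumed in this example):

import math

def feedforward(input_values, word_weights, url_weights):
    # hidden node: weighted sum of the active inputs, squashed by tanh
    hidden = math.tanh(sum(v * w for v, w in zip(input_values, word_weights)))
    # each URL output: hidden value times its hidden-to-URL weight, squashed by tanh
    return hidden, [math.tanh(hidden * w) for w in url_weights]

# word1 and word3 are active; word-to-hidden weights are 1/len(words) = 0.5,
# hidden-to-URL weights are all 0.1
hidden, outputs = feedforward([1, 1], [0.5, 0.5], [0.1, 0.1, 0.1])
print(round(hidden, 2))                  # 0.76
print([round(o, 3) for o in outputs])    # [0.076, 0.076, 0.076]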
Now, through the feed-forward method, we can see how the whole network responds and what it outputs after being stimulated by word1 and word3:
Figure 6 Feed-forward method output
You may wonder why the outputs are all the same. If they are identical, how do we know which URL should be placed at the top? Don't worry: the network simply has not been trained yet, so it does not know which URL should be favored for the combination of word1 and word3. We can train it with samples so that it adjusts the weights of the connections between neurons and learns to differentiate the URLs (which are highly relevant and which are less relevant).
PS: There is a misprint in this part of the book: the output of the feed-forward method should be (0.076, 0.076, 0.076), not (0.76, 0.76, 0.76).
Backpropagation
From the introduction of the feed-forward method, we know that if we never tell the neural network what is right and what is wrong, the computation of the network alone is not enough to distinguish the outputs. The interesting thing about a neural network is that it can be trained: by adjusting the weights of the connections between neurons, it learns to make choices for unknown inputs that resemble the choices in the training samples (the training samples are what the weights are trained on).
Now let's run one pass of backpropagation to adjust the weights of the connections in the neural network.
Let's go back to Figure 6. Assume that url1 is the correct output (that is, url1 is the expected result when word1 and word3 are input together). Then our target is url1, that is, target[url1] = 1 and target[urlx] = 0 for the other URLs, as shown:
Figure 7 provides a training sample for the neural network
From the training sample, we can calculate the size of the error. The error formula is:
error = target - url_output
Formula 2 Formula for calculating the output-layer error
So we have:
Figure 8 error calculation at the output layer
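In code, with the feed-forward outputs from above and target = (1, 0, 0), the three errors are simply (a small sketch):

targets = [1.0, 0.0, 0.0]              # url1 is the expected result
url_outputs = [0.076, 0.076, 0.076]
errors = [t - o for t, o in zip(targets, url_outputs)]
print([round(e, 3) for e in errors])   # [0.924, -0.076, -0.076]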
Knowing the output error of each URL, we can calculate the error gradient of each neuron's output value, layer by layer.
The error gradient of the output layer is computed as output_delta[k] = dtanh(url_output[k]) * error[k], where k denotes the k-th element of the output layer and dtanh(y) = 1/(1 - y*y) (see the errata discussion at the end of this article):
Formula 3 formula for calculating the error gradient at the output layer
In this way, we can calculate the error gradient values of the three URL outputs:
output_delta[1] = 1/(1 - 0.076²) * 0.924 = 0.9294
output_delta[2] = 1/(1 - 0.076²) * (-0.076) = -0.0764
output_delta[3] = 1/(1 - 0.076²) * (-0.076) = -0.0764
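Continuing the sketch in code, and following the dtanh convention this article argues for (the book itself uses 1 - y*y; see the errata discussion at the end):

def dtanh(y):
    # the convention argued for in the errata discussion below
    return 1.0 / (1.0 - y * y)

url_outputs = [0.076, 0.076, 0.076]
errors = [0.924, -0.076, -0.076]
output_deltas = [dtanh(o) * e for o, e in zip(url_outputs, errors)]
print([round(d, 4) for d in output_deltas])   # [0.9294, -0.0764, -0.0764]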
After calculating the error gradient of the output layer, we can calculate the error gradient of the hidden layer. The pseudocode is:

hidden_deltas = [0.0] * len(hiddenids)
for j in range(len(hiddenids)):
    error = 0.0
    for k in range(len(url_ids)):
        error = error + output_delta[k] * weight_of_output[j][k]
    hidden_deltas[j] = (1 / (1 - hiddennode_value[j] * hiddennode_value[j])) * error
That is
Formula 4 Calculation Method of hidden layer error gradient
Because there is only one hidden-layer node in this example, its error gradient is:
hidden_delta = 1/(1 - 0.76²) * (0.9294*0.1 - 0.0764*0.1 - 0.0764*0.1) ≈ 0.1839
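Plugging the example numbers into the loop above gives the same value (a quick sanity check, using the same dtanh convention):

output_deltas = [0.9294, -0.0764, -0.0764]
hidden_value = 0.76
weights_to_urls = [0.1, 0.1, 0.1]      # hidden-to-URL weights before the update

error = sum(d * w for d, w in zip(output_deltas, weights_to_urls))
hidden_delta = (1.0 / (1.0 - hidden_value * hidden_value)) * error
print(round(hidden_delta, 4))          # 0.1839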
Now that the error gradient has been obtained, we can calculate the new weight value.
The calculation method is as follows:
change = error gradient * value of the neuron at the sending end of the connection
new_weight = old_weight + change * coefficient
PS: the coefficient (learning rate) is set to 0.5 here.
Update the output weights of the hidden layer (hidden node → URL)
change1 = 0.9294 * 0.76 = 0.7063
wo1 = old wo1 + change1 * 0.5 = 0.1 + 0.7063*0.5 = 0.4532
change2 = -0.0764 * 0.76 = -0.0581
wo2 = old wo2 + change2 * 0.5 = 0.1 - 0.0581*0.5 ≈ 0.0710
Similarly, wo3 ≈ 0.0710
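The same output-weight update, written as a short sketch (coefficient 0.5, hidden value 0.76, weights all starting at 0.1):

old_wo = [0.1, 0.1, 0.1]                      # hidden-to-URL weights
output_deltas = [0.9294, -0.0764, -0.0764]
hidden_value = 0.76
coefficient = 0.5

new_wo = [w + coefficient * d * hidden_value for w, d in zip(old_wo, output_deltas)]
print([round(w, 4) for w in new_wo])          # [0.4532, 0.071, 0.071]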
Update the input weights of the hidden layer (word → hidden node)
change1 = 0.1839 * 1 = 0.1839
wi1 = old wi1 + change1 * 0.5 = 0.5 + 0.09195 ≈ 0.5920
Similarly, wi2 ≈ 0.5920
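And the input-weight update as a short sketch; re-running the feed-forward arithmetic with the updated weights then shows url1 scoring clearly higher than the other two URLs:

import math

old_wi = [0.5, 0.5]            # word1->hidden and word3->hidden weights
hidden_delta = 0.1839
input_values = [1, 1]          # both query words are active
coefficient = 0.5

new_wi = [w + coefficient * hidden_delta * v for w, v in zip(old_wi, input_values)]
print([round(w, 4) for w in new_wi])   # [0.592, 0.592]

# re-run the feed-forward pass with the new weights: url1 now has the largest output
new_wo = [0.4532, 0.071, 0.071]
hidden = math.tanh(sum(v * w for v, w in zip(input_values, new_wi)))
print([round(math.tanh(hidden * w), 3) for w in new_wo])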
After the weights are updated, the neural network in the example (with its new weights) becomes:
Figure 9 Neural Networks with updated weights
Now, if we input word1 and word3 again, the neural network responds as follows:
Figure 10 output values of neural networks with new weights
We can see that after training, the weights of the neural network have been adjusted automatically to fit the training sample.
In the backpropagation part, there is another error in the book: the definition of the dtanh function should be 1/(1 - y*y) rather than 1 - y*y. The reason, quoted from the errata page:
"The formulas to calculate output_deltas and hidden_deltas are wrong. According to the delta rule (http://en.wikipedia.org/wiki/Delta_rule), either these formulas should be error divided by dtanh, or the dtanh itself should be 1/(1-y*y)."
For more information about the delta rule, see the Wikipedia article; I am still learning it myself.
The unconfirmed errata for this book are listed here:
http://oreilly.com/catalog/errataunconfirmed.csp?isbn=9780596529321