Neural Networks for Machine Learning: Lecture 4 Notes

The fourth lecture of Professor Geoffrey Hinton's Neural Networks for Machine Learning mainly describes how to use the backpropagation algorithm to learn feature representations of words.

Learning to predict the next word

The next few sections focus on how to use the backpropagation algorithm to learn feature representations of words. Starting with a very simple example, we show how backpropagation can turn relational information between symbols into feature vectors.

Given a family tree diagram, the task is to have the neural network capture the information in the tree and express it as propositions, as shown in the second picture below.
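Concretely, each proposition can be written as a (person1, relationship, person2) triple. Below is a minimal sketch in Python; the names follow Hinton's example family tree, but the particular facts listed here are only illustrative.

```python
# A few family-tree facts written as (person1, relationship, person2) triples.
# The names follow Hinton's example tree; the exact facts are illustrative.
triples = [
    ("colin", "has-father", "james"),
    ("colin", "has-mother", "victoria"),
    ("charlotte", "has-brother", "colin"),
]

# The learning task: given (person1, relationship), predict person2.
for p1, rel, p2 in triples:
    print(f"({p1} {rel} {p2})")
```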

The relational learning task is to find the regularities in a large set of such ternary relationships (triples) taken from the tree; a typical symbolic rule is shown in the red font. It is difficult to search for this kind of rule directly, because the search space of discrete symbolic rules is huge. A different approach is to use a neural network, which searches a continuous space of real-valued weights, and thereby try to capture the same relational regularities from the tree.

If the neural network can predict the third element of a triple from the first two, we say that the neural network has extracted the information in the tree. The figure shows the network: the bottom is the input and the top is the output. At the bottom we feed in a person P1 and a relationship R; at the top, the output is the person P2 whom the network finds to stand in relationship R to P1.
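A rough sketch of such a forward pass is given below. The sizes of the distributed-code layers and the central hidden layer, and the use of tanh units with a softmax output, are assumptions for illustration rather than the exact configuration from the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)

N_PEOPLE, N_RELS = 24, 12                 # 24 people, 12 relationships
D_PERSON, D_REL, D_HIDDEN = 6, 6, 12      # small distributed codes (sizes assumed)

# Weight matrices, randomly initialised; backpropagation would learn them.
W_person = rng.normal(scale=0.1, size=(N_PEOPLE, D_PERSON))
W_rel    = rng.normal(scale=0.1, size=(N_RELS, D_REL))
W_hidden = rng.normal(scale=0.1, size=(D_PERSON + D_REL, D_HIDDEN))
W_out    = rng.normal(scale=0.1, size=(D_HIDDEN, N_PEOPLE))

def forward(person1_id, rel_id):
    """Predict a distribution over the 24 people for (person1, relationship)."""
    # Indexing a row is equivalent to multiplying a one-hot input by the matrix.
    p_code = W_person[person1_id]          # distributed code of person 1
    r_code = W_rel[rel_id]                 # distributed code of the relationship
    h = np.tanh(np.concatenate([p_code, r_code]) @ W_hidden)  # central hidden layer
    logits = h @ W_out
    e = np.exp(logits - logits.max())
    return e / e.sum()                     # probability of each person being P2

probs = forward(person1_id=0, rel_id=3)
print(probs.argmax())                      # the network's guess for person 2
```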

What we need to do now is to encode the information in a neutral way. Because there are 24 people in the family tree given earlier, there are 24 neurons at the bottom of the network, and each person corresponds to exactly one of these 24 neurons. Similarly, there are 12 neurons corresponding to the 12 different relationships, and for a given person P1 and relationship R there should be a unique output person. Of course, for pairs that have no answer in the tree, for example Christopher's mother, whatever answer the network gives must be wrong.
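A minimal sketch of this neutral, local ("one-of-N") encoding in Python; the particular names listed are placeholders standing in for the full sets of 24 people and 12 relationships.

```python
import numpy as np

people = ["christopher", "penelope", "andrew", "christine"]       # ... 24 names in total
relations = ["has-father", "has-mother", "has-wife", "has-husband"]  # ... 12 in total

def one_hot(index, size):
    """Local ('one-of-N') encoding: a vector of zeros with a single 1."""
    v = np.zeros(size)
    v[index] = 1.0
    return v

# Christopher as a 24-dimensional input vector, "has-mother" as a 12-dimensional one.
p1 = one_hot(people.index("christopher"), 24)
r  = one_hot(relations.index("has-mother"), 12)
```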

There is a short quiz at this point in the video, which I think is worth thinking about.

We use an encoding like the one in the quiz so that the encoding itself introduces no similarity between people; the neural network should not be handed relational information through an improper encoding (we're not cheating by giving the network information about who is like whom). In other words, to the neural network each person's code is just an arbitrary symbol that carries no meaning by itself.

In the second layer of the network, the local encoding of person 1 is connected to a much smaller set of neurons, in this case only 6, so the network is forced to re-represent each of the 24 people as a pattern of activity over these 6 units, i.e. as a distributed encoding.

The figure shows the weights learned by the network (how they are learned will be introduced in later lectures). Each person is fed in as a 24-dimensional binary vector, and the figure shows the 6 encoding units; within each unit's block diagram the top row represents the English people and the bottom row the Italians. Looking carefully: for the first unit on the right, the top row is all positive (black) and the bottom row all negative (white), indicating that those 12 people are all English. The second unit on the right learns the generation: people of the oldest generation all correspond to medium-sized blocks, people of the middle generation to the smallest blocks, and people of the youngest generation to the largest blocks. The last unit on the left learns the branch of the family tree: people marked negative (white) all lie in one branch, and people marked positive (black) all lie in the left branch of the tree. (The same patterns hold for the Italian half of the tree, since it mirrors the English half.) So the neural network automatically digs out some of the information hidden in the family tree.
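If the network were implemented along the lines of the earlier sketch, the learned distributed code for each person would simply be a row of the person-encoding weight matrix, and the structure described above (nationality, generation, branch) could be read off by inspecting those weights. The snippet below is only an illustration and uses an untrained stand-in matrix.

```python
import numpy as np

# Continuing the earlier sketch: after training, person i's distributed code is
# simply row i of W_person (the weights from that person's input neuron to the
# 6 encoding units). Here W_person is an untrained stand-in.
rng = np.random.default_rng(0)
W_person = rng.normal(scale=0.1, size=(24, 6))

code_person_0 = W_person[0]          # the 6-dimensional code for person 0

# Inspecting one encoding unit across all 24 people is how the structure
# described above is read off: e.g. in the trained network, one unit's sign
# separates the English people from the Italians.
print(np.sign(W_person[:, 0]))
```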

Here are two images that tell us what the neural network learns.

The lecture also gives suggestions for applying this approach to much larger problems.

I did not fully understand this section; I hope students who did can share their understanding.

The remaining sections of the lecture are:

A brief diversion into cognitive science
Another diversion: the softmax output function
Neuro-probabilistic language models
Ways to deal with the large number of possible outputs
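One of the topics listed above is the softmax output function. As a quick reference, here is a minimal sketch of softmax in Python; subtracting the maximum logit is a standard numerical-stability trick, not something specific to the lecture.

```python
import numpy as np

def softmax(z):
    """Softmax output function: turns a vector of logits z into a
    probability distribution that sums to 1."""
    z = z - np.max(z)          # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

# Example: logits over a 5-word vocabulary.
logits = np.array([2.0, 1.0, 0.1, -1.0, 0.5])
probs = softmax(logits)
print(probs, probs.sum())      # probabilities, summing to 1.0
```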
