Reprints are welcome; when reprinting, please credit the source:
http://blog.csdn.net/neighborhoodguo/article/details/47193885
The content of the recent lectures has not been particularly difficult, and my comprehension has improved (if I may flatter myself), so these lectures have gone by quickly. Before I knew it, Lecture 9 was finished as well. This lecture covers the other kind of RNN, where the R stands for recursive rather than the earlier recurrent. The instructor uses recursive NNs for both NLP and CV tasks; personally I think they are a good fit for CV, while for NLP they feel a bit less convincing. In any case, this model solves many practical problems and performs well, so let me write it up here.
Let me first go over the outline of this lecture: first, how to turn a sentence into a vector; then how to do parsing; then how to build the objective function (the max-margin framework) and BPTS (Backpropagation Through Structure); and finally a few improved versions of the recursive NN, plus how this model can also be used for computer vision.
1. Semantic Vector Space for Sentences
Similar to the word vector space from earlier lectures, this time we project an entire sentence into a semantic vector space. The model is based on two assumptions about where the meaning of a sentence comes from: (1) the meanings of the words it contains; (2) the way the sentence is constructed. The second assumption is still under debate. The model discussed here can accomplish two tasks at once: it learns the tree structure of a sentence, and it learns the sentence's representation in the semantic vector space.
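To make the composition concrete, here is the standard simple recursive NN composition step as I understand it (a sketch; the tanh nonlinearity and bias term are the usual choices, not spelled out here):

p = \tanh\left( W [c_1; c_2] + b \right), \qquad c_1, c_2, p \in \mathbb{R}^{d}, \; W \in \mathbb{R}^{d \times 2d}

Two child vectors c_1 and c_2 (words or already-built phrases) are stacked and mapped back into the same d-dimensional space, so the parent vector p can itself be composed further up the tree.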
What is a parsing tree?
The figure above shows the parse tree described in this lecture. The recurrent neural network from the previous lectures actually corresponds to something like the parse tree below, which can be seen as a special case of the parse tree above.
Which of these two representations is correct has not been settled (it is still debated from a cognitive standpoint).
How do we learn this parse tree? Clever people invented a method along the lines of beam search. It works bottom-up: starting from the lowest level, compute which two adjacent nodes give the best score when combined, take the pair with the largest score and merge them into one node (quite greedy), and keep going until everything has been merged into a single parse tree at the top.
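Here is a minimal Python sketch of that greedy bottom-up merging loop (the compose and score_merge functions are hypothetical stand-ins for the model's composition and scoring; a real beam search would keep several candidate merges instead of only the single best one):

import numpy as np

def greedy_parse(leaf_vectors, compose, score_merge):
    """Greedily build a parse tree over a sequence of leaf vectors.

    compose(c1, c2)     -> parent vector for two adjacent nodes
    score_merge(c1, c2) -> scalar score for merging the two nodes
    """
    # each entry is (vector, tree); leaves start out as their own trees
    nodes = [(vec, i) for i, vec in enumerate(leaf_vectors)]
    while len(nodes) > 1:
        # score every pair of adjacent nodes
        scores = [score_merge(nodes[i][0], nodes[i + 1][0])
                  for i in range(len(nodes) - 1)]
        best = int(np.argmax(scores))                       # highest-scoring adjacent pair
        parent_vec = compose(nodes[best][0], nodes[best + 1][0])
        parent_tree = (nodes[best][1], nodes[best + 1][1])
        nodes[best:best + 2] = [(parent_vec, parent_tree)]  # replace children by parent
    return nodes[0]                                         # (root vector, nested-tuple tree)

# toy usage with random parameters
d = 4
W = 0.1 * np.random.randn(d, 2 * d)
v = 0.1 * np.random.randn(d)
compose = lambda a, b: np.tanh(W @ np.concatenate([a, b]))
score = lambda a, b: float(v @ compose(a, b))
root_vec, tree = greedy_parse([np.random.randn(d) for _ in range(5)], compose, score)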
2. Objective Function: the Max-Margin Framework
About the objective function on the slides: when I later looked at the objective function in the recommended reading, I found that the sign is reversed. My guess is that the instructor simply wrote it with the sign flipped.
The paper gives the objective function, in which Δ(y_i, ŷ) is κ multiplied by the number of incorrectly labeled nodes:
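In symbols (my reconstruction from that description; N(ŷ) is the set of nodes of the proposed tree ŷ):

\Delta(y_i, \hat{y}) = \kappa \sum_{d \in N(\hat{y})} \mathbf{1}\{\, d \text{ is not a correct node of } y_i \,\}

so every wrongly built or wrongly labeled node adds a fixed penalty κ to the margin that the correct tree has to beat.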
The score consists of two parts: the first part involves v, which is learned by our model; the second part is the log probability from the PCFG, i.e. the probability of the production rule moved into log space.
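Written out, the score of a candidate tree would look roughly like this (my reconstruction from the description above; p_n is the parent vector computed at node n by the composition shown earlier):

s(x, y) = \sum_{n \in \text{nodes}(y)} \left( v^{\top} p_n + \log P_{\mathrm{PCFG}}(\text{rule at } n) \right)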
The lecture does not cover the max-margin framework in much detail; the second recommended paper explains it very well. Here is an excerpt:
Finally we arrive at the max-margin formula. Our aim is to minimize C(W).
So why is this optimal? I thought about it for a while; here is a plain-language way to put it: if W is not the optimal W, then the max(...) on the left will pick some tree other than y_i, and with l_i added, r_i ends up large, so the objective is not minimized. If W is optimal, then the max(...) is sure to pick y_i, Δ is zero, and the total is minimal. Such a W must make score(y_i) larger than every other score(y) by a margin of at least l_i(y).
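Putting the pieces together, the objective described here can be written as (my reconstruction; l_i(y) is the structured loss Δ(y_i, y) from above and Y(x_i) is the set of candidate trees for sentence x_i):

r_i(W) = \max_{y \in Y(x_i)} \big[\, s(x_i, y) + l_i(y) \,\big] - s(x_i, y_i), \qquad C(W) = \sum_i r_i(W)

Minimizing C(W) forces s(x_i, y_i) to exceed s(x_i, y) by at least the margin l_i(y) for every competing tree y.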
3. BPTS
BPTS gets relatively little space in the paper; the slides are quite detailed, and part of the Pset2 code walks through it as well. There are three differences between BPTS and traditional BP:
First, the gradient of W is the sum of the contributions from all nodes. Second, as I understand it, the word vectors in the semantic vector space are themselves updated. Third, an extra error message is added: total error message = error message from the parent + error message from the node's own score.
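Below is a minimal numpy sketch of BPTS on a binary tree, just to illustrate the three points above. It backpropagates the total tree score \sum_n v^\top p_n (in the max-margin objective this would be run on the predicted tree and the correct tree with opposite signs); the dimensionality, the tanh composition, and the tree encoding are my own assumptions, not the Pset2 code:

import numpy as np

d = 4                                    # toy dimensionality (assumption)
W = 0.1 * np.random.randn(d, 2 * d)      # shared composition matrix
b = np.zeros(d)
v = 0.1 * np.random.randn(d)             # per-node score is v @ p

def forward(node, L):
    """node is a word id (leaf) or a pair (left, right); returns (vector, cached tree)."""
    if isinstance(node, int):
        return L[node], ('leaf', node)
    c1, t1 = forward(node[0], L)
    c2, t2 = forward(node[1], L)
    p = np.tanh(W @ np.concatenate([c1, c2]) + b)
    return p, ('internal', p, c1, c2, t1, t2)

def backward(tree, delta_parent, grads):
    """BPTS: accumulate gradients for W, b, v and the word vectors L."""
    if tree[0] == 'leaf':
        grads['dL'][tree[1]] += delta_parent          # (2) word vectors get gradients too
        return
    _, p, c1, c2, t1, t2 = tree
    delta = delta_parent + v                          # (3) parent error + own score error
    delta_z = delta * (1.0 - p ** 2)                  # back through tanh
    grads['dW'] += np.outer(delta_z, np.concatenate([c1, c2]))  # (1) summed over all nodes
    grads['db'] += delta_z
    grads['dv'] += p                                  # d(v @ p)/dv, also summed over nodes
    down = W.T @ delta_z                              # split the error for the two children
    backward(t1, down[:d], grads)
    backward(t2, down[d:], grads)

# toy usage: three word vectors parsed as ((0, 1), 2)
L = {i: 0.1 * np.random.randn(d) for i in range(3)}
grads = {'dW': np.zeros_like(W), 'db': np.zeros_like(b),
         'dv': np.zeros_like(v), 'dL': {i: np.zeros(d) for i in L}}
root_vec, cached = forward(((0, 1), 2), L)
backward(cached, np.zeros(d), grads)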
As improvements to the BPTS parameter update, one can adjust the learning rate or use a subgradient method (the paper uses a subgradient method; CS229 also covers SMO, which can be compared against it).
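Since the max inside r_i makes the objective non-differentiable at some points, the subgradient update looks just like gradient descent, only with any subgradient plugged in at the kinks (my paraphrase):

W \leftarrow W - \alpha_t \, g_t, \qquad g_t \in \partial C(W_t)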
4. Improved Versions of the Recursive NN
The first half of this section covers the simplest, plain recursive NN. At the end, an improved version is introduced: the SU-RNN (syntactically untied RNN),
that is, different weight matrices are chosen depending on the syntactic categories of the children.
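A tiny sketch of the untying idea (the category names and the per-pair indexing scheme are my assumptions for illustration; the actual model's parameterization may differ):

import numpy as np

d = 4
CATEGORIES = ['NP', 'VP', 'PP']                       # toy syntactic categories
# one composition matrix per (left category, right category) pair
W_syn = {(a, c): 0.1 * np.random.randn(d, 2 * d)
         for a in CATEGORIES for c in CATEGORIES}
b = np.zeros(d)

def compose_su(c1, cat1, c2, cat2):
    """SU-RNN composition: the weight matrix is untied by the children's categories."""
    W = W_syn[(cat1, cat2)]                           # pick W based on the two children
    return np.tanh(W @ np.concatenate([c1, c2]) + b)

# e.g. an NP child and a VP child use W_syn[('NP', 'VP')]
p = compose_su(np.random.randn(d), 'NP', np.random.randn(d), 'VP')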
Finally there is a computer-vision demo, showing that running the recursive NN on NLP and on CV follows almost the same step-by-step decomposition.
Website:
nlp.stanford.edu
http://repository.cmu.edu/robotics
www.socher.org
Copyright notice: this is the blogger's original article; please do not reproduce it without the blogger's permission.