CS224D Lecture 16 Notes

Tags: stanford, nlp

Reprinting is welcome; please credit the source:

Http://www.cnblogs.com/NeighborhoodGuo/p/4728185.html

The last lecture is finally done, and the Stanford NLP course is coming to an end. I'm really happy; this course has taught me a lot.

This lecture is about applying deep learning to NLP. Most of the content was already covered in earlier lectures and in the recommended readings, so this is largely a review lecture.

As usual, an overview first: 1. Model Overview; 2. Character RNNs on text and code; 3. Morphology; 4. Logic; 5. Q&A; 6. Image-Sentence Mapping.

Model Overview

The instructor strongly recommended GloVe word vectors in class.

The dimensionality of the word vectors often determines the number of parameters of a model.
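As a rough illustration (the sizes below are made up), the embedding matrix alone already contributes vocabulary size times embedding dimension parameters:

```python
# Hypothetical sizes, just to show how the word-vector
# dimensionality drives the parameter count.
vocab_size = 100_000   # number of words in the vocabulary
embed_dim = 300        # dimensionality of each word vector

embedding_params = vocab_size * embed_dim
print(f"Embedding matrix parameters: {embedding_params:,}")  # 30,000,000
```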

Phrase vectors are mainly composed by averaging word vectors, recursive neural networks, convolutional neural networks, or recurrent neural networks.

Many of these recursive composition functions are variants of the MV-RNN.
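A minimal sketch of the simplest composition option listed above, averaging, assuming we already have word vectors (the toy vectors below are made up; in practice they would come from GloVe or word2vec):

```python
import numpy as np

# Toy word vectors; in practice these would be pretrained embeddings.
word_vectors = {
    "deep":     np.array([0.2, 0.1, 0.4]),
    "learning": np.array([0.3, 0.5, 0.1]),
    "rocks":    np.array([0.1, 0.2, 0.2]),
}

def average_phrase_vector(phrase):
    """Compose a phrase vector by averaging its word vectors."""
    vectors = [word_vectors[w] for w in phrase.split()]
    return np.mean(vectors, axis=0)

print(average_phrase_vector("deep learning rocks"))
```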

Parse trees fall into three main types: the first is the constituency tree, which has an advantage in capturing syntactic structure; the second is the dependency tree, which has an advantage in capturing semantic structure.

The third is the balanced tree, which is structurally very similar to a CNN.

There are three types of objective function: the first is max-margin, the second is cross-entropy, and the third is the auto-encoder objective (its use in NLP is less clear, so it was not discussed in class).
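A minimal numpy sketch of the first two objectives for a single example, assuming we have model scores for each class (all values are toy numbers):

```python
import numpy as np

scores = np.array([2.0, 0.5, -1.0])   # model scores for 3 classes (toy values)
correct = 0                           # index of the correct class

# Max-margin: the correct score should beat every other score by a margin.
margin = 1.0
others = np.delete(scores, correct)
max_margin_loss = np.sum(np.maximum(0.0, margin + others - scores[correct]))

# Cross-entropy: softmax over the scores, then negative log-likelihood.
exp_scores = np.exp(scores - scores.max())    # shift for numerical stability
probs = exp_scores / exp_scores.sum()
cross_entropy_loss = -np.log(probs[correct])

print(max_margin_loss, cross_entropy_loss)
```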

Optimization falls into two main categories: the first is the optimization algorithm itself, e.g. SGD, SGD with momentum, L-BFGS, Adagrad, AdaDelta.

The second category is optimization tricks, such as regularization and dropout.
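A minimal sketch of one item from each category: an Adagrad parameter update and an inverted dropout mask. The parameters, gradients, and shapes are placeholders, not from any particular model:

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Adagrad update (placeholder parameters and gradient) ---
theta = rng.normal(size=5)          # parameters
grad = rng.normal(size=5)           # gradient from backprop (placeholder)
cache = np.zeros_like(theta)        # running sum of squared gradients
lr, eps = 0.01, 1e-8

cache += grad ** 2
theta -= lr * grad / (np.sqrt(cache) + eps)

# --- Inverted dropout on a hidden layer (placeholder activations) ---
h = rng.normal(size=5)              # hidden activations
keep_prob = 0.5
mask = (rng.random(h.shape) < keep_prob) / keep_prob  # scale at train time
h_dropped = h * mask
```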

Morphology

Some English words have a standard root from which many derived words are formed. In some cases the root occurs very frequently while the derived words occur relatively rarely.

As a result, the model represents the root quite accurately, while the representations of the derived words remain blurry.

To address this problem, there is an improvement to the model.

Derived words are parsed into a tree rooted at the stem, and the word vector is then composed from the root together with its prefixes and suffixes, which links each derived word to its root.
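A minimal sketch of this idea, composing a derived word's vector from its stem and affix vectors with a small recursive layer (all vectors and the weight matrix are random placeholders, not the exact model from the lecture):

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 4

# Placeholder morpheme vectors (in practice these would be learned).
morpheme_vec = {
    "un":    rng.normal(size=dim),
    "happy": rng.normal(size=dim),
    "ness":  rng.normal(size=dim),
}

# One shared composition layer: a pair of child vectors -> parent vector.
W = rng.normal(size=(dim, 2 * dim)) * 0.1
b = np.zeros(dim)

def compose(left, right):
    """Combine two morpheme/word vectors into one, recursively."""
    return np.tanh(W @ np.concatenate([left, right]) + b)

# "unhappiness" ~ compose(compose("un", "happy"), "ness")
stem = compose(morpheme_vec["un"], morpheme_vec["happy"])
word = compose(stem, morpheme_vec["ness"])
print(word)
```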

Logic

The main goal is to recognize logical relations, such as entailment, between sentences.

The model used is still a recursive neural network, quite similar to the models seen earlier.

Q&A

This part of the lecture comes from a previously recommended paper. The idea is that computers will eventually be able to converse with people, like a more complete version of Siri on Apple's phones.

Computers could also take part in quiz shows like Lucky 52 or Happy Dictionary, and even do better than humans. That would be awesome.

Image-Sentence Mapping

The latest research results here come from Professor Fei-Fei Li's group: given an image, the computer can describe its contents.

The simple approach is to project images and sentences into the same vector space; when an image comes in, Euclidean distance is used to find the closest few sentences as its description. Conversely, the same setup can be used for image retrieval.
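A minimal sketch of this retrieval step, assuming the images and sentences have already been projected into the same vector space (the vectors below are random placeholders):

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 8

image_vec = rng.normal(size=dim)                 # an embedded image (placeholder)
sentence_vecs = rng.normal(size=(100, dim))      # 100 embedded candidate sentences

# Euclidean distance from the image to every candidate sentence.
distances = np.linalg.norm(sentence_vecs - image_vec, axis=1)

# Indices of the few closest sentences, to use as the description.
top_k = np.argsort(distances)[:5]
print(top_k, distances[top_k])
```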

However, the retrieved sentences are limited to a fixed pool; the computer cannot truly "describe" the image, so there is an improved version.

First, a CNN encodes the image into a vector, and then an LSTM generates the sentence. This is a bit like machine translation, with the source language replaced by an image.
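A minimal sketch of this encoder-decoder idea in PyTorch (the sizes, the start-token index, and the random "image feature" are placeholder assumptions; a real system would use a pretrained CNN and trained weights):

```python
import torch
import torch.nn as nn

# Image feature conditions an LSTM that generates a caption word by word.
vocab_size, embed_dim, hidden_dim, img_feat_dim = 1000, 64, 128, 256

img_to_hidden = nn.Linear(img_feat_dim, hidden_dim)   # project image feature
word_embed = nn.Embedding(vocab_size, embed_dim)
lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
to_vocab = nn.Linear(hidden_dim, vocab_size)

image_feature = torch.randn(1, img_feat_dim)          # stand-in for CNN output

# Initialize the LSTM state from the image, like the source sentence in MT.
h0 = torch.tanh(img_to_hidden(image_feature)).unsqueeze(0)  # (1, 1, hidden_dim)
c0 = torch.zeros_like(h0)

caption = [0]                                          # assume index 0 = <start>
state = (h0, c0)
for _ in range(10):                                    # generate up to 10 words
    prev = torch.tensor([[caption[-1]]])               # (batch=1, seq=1)
    out, state = lstm(word_embed(prev), state)
    next_word = to_vocab(out[:, -1]).argmax(dim=-1).item()
    caption.append(next_word)

print(caption)
```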

Finally, the evaluation metric for this model is called mean rank.

When finding a description for an image, many candidate sentences are considered, and some of them are incorrect.

Sort all the candidate sentences by relevance from high to low, record the rank of the correct sentence, and take the mean of these ranks; that is the mean rank.

Of course, the smaller the better!
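A minimal sketch of computing mean rank, assuming that for each test image we have relevance scores for all candidate sentences and know which one is correct (the scores below are toy values):

```python
import numpy as np

# Toy relevance scores: each row is one test image, each column a candidate sentence.
scores = np.array([
    [0.9, 0.2, 0.5, 0.1],
    [0.3, 0.8, 0.4, 0.6],
])
correct = np.array([2, 1])   # index of the correct sentence for each image

ranks = []
for row, c in zip(scores, correct):
    order = np.argsort(-row)                      # sort by relevance, high to low
    rank = int(np.where(order == c)[0][0]) + 1    # 1-based rank of the correct one
    ranks.append(rank)

print("mean rank:", np.mean(ranks))   # lower is better
```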
