Stanford University Natural Language Processing, Lesson 1: Introduction


I. Course Introduction

Stanford University launched an online natural language processing course on Coursera in March 2012, taught by two leading figures in the NLP field, Dan Jurafsky and Chris Manning:
https://class.coursera.org/nlp/

The following are my study notes for the course. They are based mainly on the course slides (ppt/pdf), supplemented by other reference materials, with my own elaboration and annotations added. Everyone is welcome to study together on "I Love Open Class".

Courseware summary: collected courseware for the Stanford University Natural Language Processing open course

II. Overview of Natural Language Processing: What Is Natural Language Processing (NLP)?

1) Related Technologies and applications

    • Question answering (QA): a computing system that can understand complex questions and give answers accurately, reliably and quickly, exemplified by IBM's Watson;
    • Information extraction (IE): transforms unstructured or semi-structured natural language text into structured data, for example automatically generating calendar entries from the content of an email;
    • Sentiment analysis (SA): also known as orientation analysis or opinion mining; the process of analyzing, processing, summarizing and reasoning over subjective, emotionally colored text, for example analyzing a large number of web pages to determine users' sentiment toward attributes of a "digital camera" such as zoom, price, size, weight, flash and ease of use;
    • Machine translation (MT): automatically converts text from one natural language into another, such as Chinese-English machine translation.
    • ... ...
2) Development status
    • Mostly solved: POS tagging, named entity recognition, spam detection
    • Making good progress: sentiment analysis, coreference resolution, word sense disambiguation, parsing, machine translation, information extraction
    • Still very challenging: question answering, paraphrasing, summarization, dialog systems

3) The main difficulty of NLP: ambiguity
    • Lexical analysis ambiguity
      • Word segmentation: the Chinese string "严守一把手机关了" can be segmented as "严守一/把/手机/关/了" ("Yan Shouyi turned off the phone") or as "严守/一把手/机关/了" (roughly "strictly guard / top leader / organ / particle")
      • POS tagging: a word such as "plan" has different parts of speech in different contexts. It is a verb in "I/plan/v ..." and a noun in "I/completed/the/plan/n"
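    • Syntactic analysis ambiguity
      • "The wolf bit the hunter's dog to death."
      • "The dog that bit the hunter to death is missing."
      • In the Chinese original, both sentences contain the same word sequence ("bit to death / the hunter's / dog"); it is a verb with its object in the first sentence and a relative clause modifying "dog" in the second, so the parser must choose between two structures.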
    • Semantic analysis ambiguity
      • Machine translation: the sentence "At last, a computer that understands you like your mother" can have several meanings:
        • The computer understands you as well as your mother understands you
        • The computer understands that you like your mother
        • The computer understands you as well as it understands your mother
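    • Ambiguity in NLP applications
      • Pinyin-to-character conversion: in the pinyin string "ji qi fan yi ji qi ying yong ji qi le ren men ji qi nong hou de xing qu", the syllable pair "ji qi" occurs four times and stands for a different word each time; how should each occurrence be converted to the correct word?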
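
The word segmentation ambiguity above can be made concrete with a small sketch. It enumerates every way to split a string into dictionary words. The toy dictionary and the function name `segmentations` are assumptions made for illustration only; a real segmenter would use a large lexicon and a statistical model to rank the candidates rather than list them all.

```python
def segmentations(text, dictionary):
    """Return every way to split `text` into words found in `dictionary`."""
    if not text:
        return [[]]  # exactly one way to segment the empty string
    results = []
    for end in range(1, len(text) + 1):
        word = text[:end]
        if word in dictionary:
            # Keep this word and segment the remainder recursively.
            for rest in segmentations(text[end:], dictionary):
                results.append([word] + rest)
    return results


# Toy dictionary (hypothetical, chosen to cover the example string only).
toy_dictionary = {"严守", "严守一", "一把手", "把", "手机", "机关", "关", "了"}

results = segmentations("严守一把手机关了", toy_dictionary)
for words in results:
    print("/".join(words))
```

With this dictionary the string has two complete segmentations, and nothing in the dictionary alone says which one is intended. Choosing between them requires context or corpus statistics.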
4) Why is natural language understanding so difficult?
    • User-generated content contains a great deal of non-standard language, such as colloquialisms, idioms and dialect
    • Word segmentation problems
    • New words are constantly being coined
    • World knowledge and context
    • A wide variety of entity names
    • ... ...

To solve the problems above, we need to master more linguistic knowledge, build knowledge-base resources, and find a way to combine these different kinds of knowledge and resources. The approach most widely used today is the probabilistic model or statistical model, also called the "empiricist" model. Its modeling process is based on large-scale real corpora: statistics are collected for language units at every level, and, starting from the statistics of lower-level units, statistical inference and related techniques are used to compute statistics for higher-level units.

Its counterpart is the "rationalist" model, a deterministic language model based on Chomsky's formal language theory. It rests on the hypothesis that innate grammatical rules exist in the human brain and that language is generated by this innate language faculty. Building a language model therefore means simulating that faculty with a set of hand-written language rules.


This course focuses on statistical NLP techniques such as the Viterbi algorithm, Bayes and maximum entropy classifiers, N-gram language models, and so on.
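
As a rough illustration of the statistical approach, the sketch below estimates a bigram (2-gram) language model from a tiny corpus and uses the word-pair counts to score whole sentences, that is, statistics on lower-level units are combined into a score for a higher-level unit. The corpus, the function names and the use of unsmoothed maximum-likelihood estimates are assumptions for illustration; it is not the course's implementation, and practical models add smoothing so that unseen word pairs do not get probability zero.

```python
from collections import Counter


def train_bigram_model(sentences):
    """Count bigrams and their left-hand histories in whitespace-tokenized text."""
    bigram_counts = Counter()
    history_counts = Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for prev, word in zip(tokens, tokens[1:]):
            bigram_counts[(prev, word)] += 1
            history_counts[prev] += 1
    return bigram_counts, history_counts


def sentence_probability(model, sentence):
    """Multiply maximum-likelihood estimates P(word | prev) over the sentence."""
    bigram_counts, history_counts = model
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    probability = 1.0
    for prev, word in zip(tokens, tokens[1:]):
        if history_counts[prev] == 0:
            return 0.0  # history never seen in training
        probability *= bigram_counts[(prev, word)] / history_counts[prev]
    return probability


# Toy corpus (hypothetical); real models are trained on large corpora.
corpus = [
    "the wolf bit the dog",
    "the dog bit the hunter",
    "the hunter is missing",
]
model = train_bigram_model(corpus)

seen_like = sentence_probability(model, "the wolf bit the hunter")
unseen = sentence_probability(model, "the wolf is missing")
print(seen_like, unseen)
```

The first sentence never appears in the corpus but is built from observed word pairs, so it gets a probability of about 0.04. The second contains the unseen pair "wolf is" and gets exactly zero, which is the problem smoothing is meant to address.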

III. References

    1. Lecture slides: Introduction
    2. http://en.wikipedia.org
    3. Guan Yi, Basic Course in Statistical Natural Language Processing (PPT)
    4. Zhao Yanyan, A Survey of Text Sentiment Analysis
    5. Liu Qun, Wang Haifeng, Wang Huilin, Zong Chengqing, Chi Tiejun, Yidong, Ju Jingpo, Chen Jiajun, Zhang Min, Progress and Prospects of Machine Translation Technology, 30th anniversary of the founding of the Chinese Information Society, December 4-5, 2011, Beijing

Reprinted from: I Love Open Class
