Language models (notes from "Nlp_stanford classroom")

Source: Internet
Author: User

First, the language model

Goal: compute the joint probability of a sentence or a sequence of words, $P(W) = P(w_1 w_2 \ldots w_n)$.

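Applications:

    • Machine translation: rank candidate translations by probability and prefer the more fluent one
    • Spelling correction: a sentence containing a misspelling is less probable than the same sentence with the intended word, so the more probable word is chosen as the correction
    • Speech recognition: among candidate transcriptions that sound alike, prefer the sentence with the higher probability
    • Summarization and question answering systems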

Related task: given the words seen so far, compute the conditional probability of the next word, for example $P(w_5 \mid w_1 w_2 w_3 w_4)$. This is closely related to the joint probability $P(w_1 w_2 w_3 w_4 w_5)$.

Any model that computes either of these two probabilities is called a language model (LM).

Second, how to calculate the probability

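Method: the chain rule of conditional probability

$$P(x_1, x_2, \ldots, x_n) = P(x_1)\,P(x_2 \mid x_1)\,P(x_3 \mid x_1, x_2) \cdots P(x_n \mid x_1, \ldots, x_{n-1})$$

Applied to the words of a sentence, this gives:

$$P(w_1 w_2 \ldots w_n) = \prod_{i=1}^{n} P(w_i \mid w_1 w_2 \ldots w_{i-1})$$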
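As a worked illustration of the chain rule, here is the expansion for a five-word phrase. The phrase is chosen only as an example and is not taken from this page.

```latex
\begin{aligned}
P(\text{its water is so transparent}) ={}& P(\text{its}) \\
&\times P(\text{water} \mid \text{its}) \\
&\times P(\text{is} \mid \text{its water}) \\
&\times P(\text{so} \mid \text{its water is}) \\
&\times P(\text{transparent} \mid \text{its water is so})
\end{aligned}
```

Each factor conditions on the entire prefix before the word, which is why the later factors become so hard to estimate.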

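Question: how do we estimate these conditional probabilities?

Method one: count and divide

$$P(w_i \mid w_1 \ldots w_{i-1}) = \frac{\mathrm{Count}(w_1 \ldots w_{i-1} w_i)}{\mathrm{Count}(w_1 \ldots w_{i-1})}$$

But this does not work in practice.

Reason: there are far too many possible sentences, so no corpus can ever be complete enough to estimate these counts reliably.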
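The counting approach and its failure mode can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the three-sentence corpus and the helper names `prefix_counts` and `count_and_divide` are assumptions made up for this example.

```python
from collections import Counter

# Hypothetical toy corpus; a real estimate would need vastly more text.
corpus = [
    "the cat sat on the mat",
    "the cat sat on the sofa",
    "the dog sat on the mat",
]

def prefix_counts(sentences):
    """Count every sentence prefix (as a tuple of words) in the corpus."""
    counts = Counter()
    for sentence in sentences:
        words = sentence.split()
        for end in range(1, len(words) + 1):
            counts[tuple(words[:end])] += 1
    return counts

def count_and_divide(counts, history, word):
    """Estimate P(word | history) as Count(history + word) / Count(history).

    Returns None when the history was never observed, because the ratio
    is undefined in that case.
    """
    history = tuple(history)
    if counts[history] == 0:
        return None
    return counts[history + (word,)] / counts[history]

counts = prefix_counts(corpus)
seen = count_and_divide(counts, "the cat sat on the".split(), "mat")
unseen = count_and_divide(counts, "the bird sat on the".split(), "mat")
print(seen, unseen)
```

The history "the cat sat on the" occurs in the corpus, so an estimate exists. The equally plausible history "the bird sat on the" never occurs, so no estimate can be formed at all. Longer histories make this far more common.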

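Method two: the Markov assumption

Condition each word only on the word immediately before it:

$$P(w_i \mid w_1 w_2 \ldots w_{i-1}) \approx P(w_i \mid w_{i-1})$$

Or, more generally, on the previous $k$ words:

$$P(w_i \mid w_1 w_2 \ldots w_{i-1}) \approx P(w_i \mid w_{i-k} \ldots w_{i-1})$$

That is, every factor in the chain rule product is replaced by its approximation.

So:

$$P(w_1 w_2 \ldots w_n) \approx \prod_{i=1}^{n} P(w_i \mid w_{i-k} \ldots w_{i-1})$$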
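One way to see why truncating the history helps is to count how many distinct histories a model has to estimate from. The sketch below is illustrative only: the toy corpus and the helper name `contexts` are assumptions for this example.

```python
# Hypothetical toy corpus, used only to compare the number of histories.
corpus = [
    "the cat sat on the mat",
    "the cat sat on the sofa",
    "the dog sat on the mat",
]

def contexts(sentences, k=None):
    """Collect the distinct histories a model must estimate from.

    k=None keeps the full history (no Markov assumption);
    k=1 keeps only the previous word, k=2 the previous two, and so on.
    """
    seen = set()
    for sentence in sentences:
        words = sentence.split()
        for i in range(1, len(words)):
            history = words[:i] if k is None else words[max(0, i - k):i]
            seen.add(tuple(history))
    return seen

full = contexts(corpus)              # histories without the Markov assumption
first_order = contexts(corpus, k=1)  # histories when only one word is kept
print(len(full), len(first_order))
```

On this corpus the full-history model has 9 distinct histories and the first-order model has 5. The shorter histories are shared across sentences, so each one is observed more often.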

Third, Markov models

1. Unigram model

It assumes that the words are independent of each other:

$$P(w_1 w_2 \ldots w_n) \approx \prod_{i=1}^{n} P(w_i)$$
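A unigram model can be sketched as relative word frequencies multiplied together. The corpus and the function names `train_unigram` and `unigram_sentence_prob` below are assumptions chosen for illustration, and unseen words simply get probability 0 here (no smoothing).

```python
from collections import Counter
from math import isclose, prod

# Hypothetical toy corpus, lower-cased for simplicity.
corpus = ["i am sam", "sam i am", "i do not like green eggs and ham"]

def train_unigram(sentences):
    """Estimate P(w) as the relative frequency of each word."""
    counts = Counter(word for s in sentences for word in s.split())
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}

def unigram_sentence_prob(model, sentence):
    """Multiply the word probabilities; unseen words get probability 0."""
    return prod(model.get(word, 0.0) for word in sentence.split())

model = train_unigram(corpus)
p_ordered = unigram_sentence_prob(model, "i am sam")
p_shuffled = unigram_sentence_prob(model, "sam am i")
print(isclose(p_ordered, p_shuffled))
```

Because the words are treated as independent, any reordering of the same words receives the same probability. The bigram model removes exactly this weakness.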

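2. Bigram model

Each word is conditioned only on the word before it:

$$P(w_i \mid w_1 w_2 \ldots w_{i-1}) \approx P(w_i \mid w_{i-1})$$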
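A bigram model estimated by counting might look like the sketch below. It uses the maximum-likelihood estimate Count(previous word, word) / Count(previous word). The toy corpus, the boundary markers `<s>` and `</s>`, and the function names are assumptions for this example, and no smoothing is applied.

```python
from collections import Counter
from math import isclose

# Hypothetical toy corpus.
corpus = ["I am Sam", "Sam I am", "I do not like green eggs and ham"]

def train_bigram(sentences):
    """Count each word as a history and each (previous word, word) pair."""
    history_counts = Counter()
    bigram_counts = Counter()
    for sentence in sentences:
        words = ["<s>"] + sentence.split() + ["</s>"]
        for prev, word in zip(words, words[1:]):
            history_counts[prev] += 1
            bigram_counts[(prev, word)] += 1
    return history_counts, bigram_counts

def bigram_prob(model, prev, word):
    """Maximum-likelihood estimate of P(word | prev); 0.0 if prev is unseen."""
    history_counts, bigram_counts = model
    if history_counts[prev] == 0:
        return 0.0
    return bigram_counts[(prev, word)] / history_counts[prev]

def sentence_prob(model, sentence):
    """Multiply the bigram probabilities over the padded sentence."""
    words = ["<s>"] + sentence.split() + ["</s>"]
    p = 1.0
    for prev, word in zip(words, words[1:]):
        p *= bigram_prob(model, prev, word)
    return p

model = train_bigram(corpus)
p_start_i = bigram_prob(model, "<s>", "I")   # 2 of the 3 sentences start with "I"
p_seen = sentence_prob(model, "I am Sam")
p_reordered = sentence_prob(model, "Sam am I")
print(p_start_i, p_seen, p_reordered)
```

Unlike the unigram model, word order now matters: "Sam am I" contains bigrams that never occur in the corpus, so its unsmoothed probability is 0.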

3. N-gram models

The same idea extends to trigrams, 4-grams, 5-grams, and so on.

In general, though, this is an insufficient model of language, because language has long-distance dependencies.

For example, in "the computer which ... crashed", the verb "crashed" depends on the subject "computer", but a long clause separates the two, so a Markov model with a short context window struggles to capture the dependency.

In practice, however, n-gram models turn out to handle this well enough for many applications.
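The long-distance problem can be made concrete by looking at which words fall inside an n-gram model's context window. In the sketch below, the full sentence is only one possible way to fill in the elided clause, and the helper names `ngrams` and `context_of` are likewise assumptions for this example.

```python
def ngrams(words, n):
    """Return all n-grams (as tuples) in a list of words."""
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]

def context_of(words, index, n):
    """Return the n-1 words an n-gram model conditions on at position index."""
    return words[max(0, index - (n - 1)):index]

# Hypothetical sentence; the clause between subject and verb is an assumption.
sentence = ("the computer which I had just put into the machine room "
            "on the fifth floor crashed")
words = sentence.split()
verb_position = words.index("crashed")

trigram_context = context_of(words, verb_position, 3)
five_gram_context = context_of(words, verb_position, 5)
print(trigram_context)
print("computer" in five_gram_context)
```

With a trigram model the only words visible when predicting "crashed" are "fifth floor", and even a 5-gram window does not reach back to "computer". For this sentence the window would have to span 15 words before the subject became visible.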
