Maximum likelihood estimation vs. maximum a posteriori estimation, logistic regression vs. Bayes classification

Source: Internet
Author: User
Tags: constant

While taking Andrew Ng's machine learning course, I thought I had a very clear grasp of the relationships between maximum likelihood estimation, maximum a posteriori estimation, logistic regression, and Bayes classification. But after taking my school's pattern recognition course, that understanding was completely overturned... Let me sort it out here.

First, the concepts of maximum likelihood estimation (MLE) and maximum a posteriori estimation (MAP). These two concepts represent the two schools of statistics: frequentist and Bayesian. MLE (the frequentist school) holds that the unknown parameter in the model is unknown but is a constant. MAP (the Bayesian school) holds that the unknown parameter in the model is a random variable that itself follows some distribution; that distribution is not known to us, and it is what we need to solve for.
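The distinction can be made concrete with a small sketch. This is a minimal illustration, not from the article: it estimates a coin's heads probability θ, where the Beta(a, b) prior and the specific counts are assumptions chosen for the example.

```python
# MLE treats theta as a fixed constant and maximizes the likelihood;
# MAP treats theta as a random variable and maximizes the posterior.
# A Beta(a, b) prior is assumed here purely for illustration.
k, n = 7, 10          # observed data: 7 heads in 10 flips (toy numbers)
a, b = 2.0, 2.0       # Beta prior pseudo-counts (an assumption)

theta_mle = k / n                          # argmax of the likelihood
theta_map = (k + a - 1) / (n + a + b - 2)  # mode of the Beta posterior

print(theta_mle, theta_map)  # MAP is pulled toward the prior mean 0.5
```

With a flat prior (a = b = 1), the MAP estimate reduces to the MLE, which is one way to see MLE as a special case of MAP.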

Let's begin the full derivation with some groundwork. We are given a training set {x^(i), y^(i)}, i = 1, ..., m. First we assume that any two training samples are independent of each other; all of the reasoning below rests on this assumption. Now we describe MLE and MAP in turn.

For MLE, the unknown parameter θ in the model is a constant, as mentioned above. Thus the probability that y^(i) occurs given x^(i) can be written p(y^(i) | x^(i); θ). The semicolon is used precisely because θ is a constant; this is what Ng means by "parameterized by θ". Then, since the training samples are independent of each other, the probability of the entire training set occurring, i.e. the likelihood L(θ), is the product of the terms p(y^(i) | x^(i); θ), which is the following formula:
L(θ) = ∏_{i=1}^{m} p(y^(i) | x^(i); θ)
How do we determine θ? Because this training set has actually occurred, the θ that makes L(θ) largest is taken to be the true θ. We therefore differentiate L(θ) with respect to θ, and the θ at which the derivative equals 0 is the θ we want to solve for.
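The maximization can also be sketched numerically. A minimal example, assuming a hypothetical 1-D logistic model p(y = 1 | x; θ) = σ(θx) and toy data (both are assumptions, not from the article); maximizing the log-likelihood gives the same θ as maximizing L(θ), since log is monotonic:

```python
import math

# Toy non-separable 1-D data (hypothetical, for illustration only)
xs = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
ys = [0, 0, 1, 0, 1, 1]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def log_likelihood(theta):
    # log L(theta) = sum_i log p(y^(i) | x^(i); theta)
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(theta * x)
        total += math.log(p if y == 1 else 1.0 - p)
    return total

# Crude grid search for the theta that maximizes the likelihood,
# standing in for setting the derivative to zero analytically
thetas = [i / 100.0 for i in range(1, 501)]
theta_hat = max(thetas, key=log_likelihood)
print(theta_hat)
```

In practice one would maximize the log-likelihood with gradient ascent rather than a grid, but the grid makes the "pick the θ that makes the observed data most probable" idea explicit.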

Next, a look at MAP, whose concept, as stated above, is based on the model
