Maximum likelihood estimation (MLE) and maximum a posteriori estimation (MAP) in machine learning

Maximum likelihood estimation (MLE)

Suppose we are given a set of data known to be drawn from some distribution, but we do not know the distribution's specific parameters. In other words, "the model is fixed, the parameters are unknown."

For example, in linear regression we assume the samples follow a normal distribution but do not know its mean and variance. In logistic regression we assume the label follows a Bernoulli distribution: the model maps the independent variable X through a logistic function to a probability p = g(x), while the observed y takes the discrete values 0 or 1. Because the error follows a Bernoulli distribution rather than a Gaussian, least squares cannot be used to estimate the model parameters; maximum likelihood estimation (MLE) is used instead. The goal of MLE is to find the parameters under which the model assigns the highest probability to the observed data:

$$\hat{\theta}_{\text{MLE}} = \arg\max_{\theta} P(X \mid \theta)$$

Here $P(X \mid \theta)$ is the likelihood function, the probability of observing the data $X$ under the parameters $\theta$. Assuming each observation is independent, we have

$$P(X \mid \theta) = \prod_{i=1}^{n} P(x_i \mid \theta)$$

To make differentiation convenient, we usually take the logarithm of the objective. Maximizing the likelihood function is then equivalent to maximizing the log-likelihood:

$$\hat{\theta}_{\text{MLE}} = \arg\max_{\theta} \sum_{i=1}^{n} \log P(x_i \mid \theta)$$
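To make this concrete, here is a minimal numerical sketch of MLE (our illustration, not from the original article): it fits the mean and standard deviation of Gaussian samples by maximizing the log-likelihood with SciPy, and compares the result against the closed-form answers (sample mean and sample standard deviation).

```python
# Minimal MLE sketch, assuming NumPy/SciPy are available.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(0)
data = rng.normal(loc=3.0, scale=2.0, size=1000)  # samples from N(3, 2^2)

def neg_log_likelihood(params):
    mu, sigma = params
    if sigma <= 0:                 # keep the scale parameter valid
        return np.inf
    return -np.sum(norm.logpdf(data, loc=mu, scale=sigma))

result = minimize(neg_log_likelihood, x0=[0.0, 1.0], method="Nelder-Mead")
print("MLE:         mu=%.3f sigma=%.3f" % tuple(result.x))
print("Closed form: mu=%.3f sigma=%.3f" % (data.mean(), data.std()))
```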

Here is a simple coin-flipping example. Suppose we have an asymmetric coin; record heads as H and tails as T. Flipping it 10 times yields 2 heads and 8 tails.

Estimate the probability that the coin lands heads.

Obviously the answer is 0.2. Now let us solve it with the MLE mindset. Each coin flip follows a Bernoulli distribution; let the probability of heads be $\theta$. The likelihood of a single flip is:

$$P(x \mid \theta) = \theta^{x}(1-\theta)^{1-x}$$

where $x = 1$ means heads and $x = 0$ means tails. The log-likelihood of the 10 flips is then:

$$\ell(\theta) = \sum_{i=1}^{10} \log P(x_i \mid \theta) = 2\log\theta + 8\log(1-\theta)$$

Taking the derivative:

$$\frac{d\ell}{d\theta} = \frac{2}{\theta} - \frac{8}{1-\theta}$$

Setting the derivative to 0, it is easy to get:

$$\hat{\theta} = \frac{2}{10} = 0.2$$
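As a quick check in code (the flip sequence below is hypothetical; only the 2-heads-in-10 count comes from the example): the MLE for a Bernoulli parameter is just the sample frequency of heads, and a grid search over the log-likelihood lands on the same value.

```python
import numpy as np

# Hypothetical sequence with 2 heads (1) in 10 flips, matching the example's counts.
flips = np.array([1, 0, 0, 0, 1, 0, 0, 0, 0, 0])

# Closed form: the MLE is the sample frequency of heads.
print(flips.mean())  # 0.2

# Numerical check: evaluate the log-likelihood on a grid and take the argmax.
thetas = np.linspace(0.01, 0.99, 99)
log_lik = flips.sum() * np.log(thetas) + (len(flips) - flips.sum()) * np.log(1 - thetas)
print(thetas[log_lik.argmax()])  # ~0.2
```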



Maximum a posteriori estimation (MAP)

The MLE above finds a set of parameters that maximizes the likelihood function, i.e. $\hat{\theta}_{\text{MLE}} = \arg\max_{\theta} P(X \mid \theta)$. Now make the problem a little more complicated: what if the parameter itself has a prior probability? In the coin-flipping example, suppose our experience tells us that coins are generally symmetric, so $\theta = 0.5$ is the most plausible value and $\theta = 0.2$ is relatively unlikely. How should we estimate the parameter then? This is the problem MAP addresses. MAP optimizes the posterior probability, i.e. it maximizes the probability of $\theta$ given the observed data:

$$\hat{\theta}_{\text{MAP}} = \arg\max_{\theta} P(\theta \mid X)$$

Expanding the formula above with Bayes' rule:

$$P(\theta \mid X) = \frac{P(X \mid \theta)\, P(\theta)}{P(X)} \propto P(X \mid \theta)\, P(\theta)$$

since $P(X)$ does not depend on $\theta$.

We can see that the first factor is the likelihood function and the second is the prior knowledge about the parameter. After taking the logarithm:

$$\hat{\theta}_{\text{MAP}} = \arg\max_{\theta} \left[ \sum_{i} \log P(x_i \mid \theta) + \log P(\theta) \right]$$
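In other words, the log-prior acts as a penalty added to the log-likelihood. A minimal sketch under assumptions of ours (Gaussian data with a known scale, and an N(0, 1) prior on the mean; neither comes from the article):

```python
# Sketch: MAP as penalized MLE -- maximize log-likelihood plus log-prior.
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import norm

rng = np.random.default_rng(0)
data = rng.normal(loc=3.0, scale=2.0, size=20)  # few samples, so the prior matters

def neg_log_posterior(mu):
    log_lik = np.sum(norm.logpdf(data, loc=mu, scale=2.0))  # sigma treated as known
    log_prior = norm.logpdf(mu, loc=0.0, scale=1.0)         # assumed N(0, 1) prior on mu
    return -(log_lik + log_prior)

map_mu = minimize_scalar(neg_log_posterior, bounds=(-10, 10), method="bounded").x
print("MLE mu: %.3f" % data.mean())
print("MAP mu: %.3f (shrunk toward the prior mean 0)" % map_mu)
```

With only a few observations the prior visibly pulls the estimate toward its mean; as data accumulates, the likelihood dominates and MAP approaches MLE.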

Back to the coin-flipping example. Suppose the prior on the parameter $\theta$ follows a Beta distribution, namely:

$$P(\theta) = \frac{\theta^{\alpha-1}(1-\theta)^{\beta-1}}{B(\alpha, \beta)}$$

and each coin flip follows a Bernoulli distribution:

$$P(x_i \mid \theta) = \theta^{x_i}(1-\theta)^{1-x_i}$$

Then the derivative of the objective function is:

$$\frac{d}{d\theta}\left[\sum_{i=1}^{n} \log P(x_i \mid \theta) + \log P(\theta)\right]$$

The derivative of the first term was already given in the MLE section above; the derivative of the second term is:

$$\frac{d}{d\theta} \log P(\theta) = \frac{\alpha - 1}{\theta} - \frac{\beta - 1}{1 - \theta}$$

Setting the derivative to 0 and solving gives:

$$\hat{\theta}_{\text{MAP}} = \frac{n_H + \alpha - 1}{n + \alpha + \beta - 2}$$

where $n_H$ denotes the number of heads among the $n$ flips. Seen this way, the difference between MLE and MAP is that the MAP result carries extra terms contributed by the parameters of the prior distribution.
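A quick sketch of the closed-form result just derived (the helper name map_coin is our own):

```python
def map_coin(n_heads, n, alpha, beta):
    # MAP estimate for a Bernoulli parameter with a Beta(alpha, beta) prior,
    # per the formula above: (n_heads + alpha - 1) / (n + alpha + beta - 2).
    return (n_heads + alpha - 1) / (n + alpha + beta - 2)

print(map_coin(2, 10, 1, 1))    # 0.2   -- Beta(1,1) is uniform: MAP equals MLE
print(map_coin(2, 10, 5, 5))    # 0.333 -- symmetric prior pulls toward 0.5
print(map_coin(2, 10, 50, 50))  # 0.472 -- stronger prior pulls even closer to 0.5
```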

Supplemental Knowledge: Beta distribution

The Beta distribution is a common prior distribution. Its shape is controlled by two parameters, $\alpha$ and $\beta$, and its support is the interval $[0, 1]$.

The Beta distribution attains its maximum (its mode) at:

$$x = \frac{\alpha - 1}{\alpha + \beta - 2}$$

So in the coin-flipping setting, if the prior knowledge is that the coin is symmetric, we let $\alpha = \beta$, which places the mode at 0.5. But even when the two are equal, their common value still affects the final result: the larger the two values, the more the prior concentrates around 0.5 and the less likely the estimate is to deviate from symmetry, as the sketch below illustrates.
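A small sketch using scipy.stats.beta (the particular values of $\alpha = \beta$ are our choice): the mode stays at 0.5 while the distribution tightens as the common parameter grows.

```python
from scipy.stats import beta

for a in (2, 5, 50):
    mode = (a - 1) / (2 * a - 2)   # mode (alpha-1)/(alpha+beta-2) with alpha == beta
    spread = beta.std(a, a)        # standard deviation of Beta(a, a)
    print("alpha = beta = %2d  mode = %.2f  std = %.3f" % (a, mode, spread))
```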
