Application of maximum likelihood estimation (MLE) and maximum a posteriori (MAP) estimation in machine learning


Maximum likelihood estimation (MLE)

Suppose we are given a batch of data that we know was drawn at random from some distribution, but we do not know the distribution's specific parameters; in other words, "the model is determined, the parameters are unknown."

For example, in linear regression we assume that the samples follow a normal distribution, but we do not know its mean and variance. In logistic regression, the model maps the independent variable x to a probability p = g(x) through the logistic function, while the dependent variable y takes the discrete values 0 or 1 and therefore follows a Bernoulli distribution. Because the error term is Bernoulli rather than Gaussian, least squares cannot be used to estimate the model parameters, but maximum likelihood estimation can. The goal of MLE is to find the set of parameters $\theta$ under which the model assigns the highest probability to the observed data $X$:

$$\hat{\theta}_{MLE} = \arg\max_{\theta} P(X \mid \theta)$$

Here $P(X \mid \theta)$ is the likelihood function; it gives the probability of the observed data appearing under the parameter $\theta$. If we assume that the observations are independent, then:

$$P(X \mid \theta) = \prod_{i=1}^{n} P(x_i \mid \theta)$$

To make differentiation convenient, we usually take the logarithm of this objective. Maximizing the likelihood function is therefore equivalent to maximizing the log-likelihood:

$$\hat{\theta}_{MLE} = \arg\max_{\theta} \sum_{i=1}^{n} \log P(x_i \mid \theta)$$
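
To make the role of the logarithm concrete, here is a minimal Python sketch (an illustration, not from the original article) comparing the raw likelihood with the log-likelihood for some hypothetical Bernoulli data; the two objectives agree up to floating-point error, but the log form is numerically stabler and easier to differentiate:

```python
import numpy as np

# Hypothetical Bernoulli observations: 1 = heads, 0 = tails
data = np.array([1, 0, 0, 1, 0, 0, 0, 1, 0, 0])
theta = 0.3  # an arbitrary candidate parameter value

# Likelihood: a product of per-observation probabilities
# (this underflows quickly as the number of observations grows)
likelihood = np.prod(theta**data * (1 - theta)**(1 - data))

# Log-likelihood: a sum of logs, numerically stable
log_likelihood = np.sum(data * np.log(theta) + (1 - data) * np.log(1 - theta))

print(likelihood, np.exp(log_likelihood))  # identical up to rounding
```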

Let's take the simple example of tossing a coin. Suppose we have a coin whose heads and tails are not symmetric; record heads as H and tails as T. We toss it 10 times, and the result is 2 heads and 8 tails.

What is the probability that this coin lands heads?

Intuitively, the probability is 0.2. Now let us solve it with the idea of MLE. Each coin toss follows a Bernoulli distribution; if the probability of heads is $\theta$, the likelihood function is:

$$L(\theta) = \prod_{i=1}^{10} P(x_i \mid \theta) = \prod_{i=1}^{10} \theta^{x_i} (1-\theta)^{1-x_i}$$

where $x_i = 1$ denotes heads and $x_i = 0$ denotes tails. Taking the logarithm gives:

$$\ell(\theta) = \sum_{i=1}^{10} \left[ x_i \log\theta + (1-x_i)\log(1-\theta) \right] = 2\log\theta + 8\log(1-\theta)$$

Taking the derivative with respect to $\theta$:

$$\frac{d\ell}{d\theta} = \frac{\sum_i x_i}{\theta} - \frac{n - \sum_i x_i}{1-\theta} = \frac{2}{\theta} - \frac{8}{1-\theta}$$

Setting the derivative to 0, we easily obtain:

$$\hat{\theta} = \frac{\sum_i x_i}{n} = \frac{2}{10}$$

That's 0.2.
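
As a quick check, here is a small Python sketch (again an illustration, not part of the original article) that confirms the closed-form MLE for this data, both directly and by a brute-force search over the log-likelihood:

```python
import numpy as np

# 2 heads and 8 tails, matching the example above
data = np.array([1, 1, 0, 0, 0, 0, 0, 0, 0, 0])

# Closed-form MLE for a Bernoulli parameter: the sample mean
theta_mle = data.mean()
print(theta_mle)  # 0.2

# Sanity check: grid search over the log-likelihood
grid = np.linspace(0.001, 0.999, 999)
loglik = data.sum() * np.log(grid) + (len(data) - data.sum()) * np.log(1 - grid)
print(grid[np.argmax(loglik)])  # ~0.2
```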

Maximum a posteriori probability (MAP)

The MLE above finds a set of parameters that maximizes the likelihood function, namely:

$$\hat{\theta}_{MLE} = \arg\max_{\theta} P(X \mid \theta)$$

Now the problem becomes a little more complicated: what if the parameter itself has a prior probability? In the coin-tossing example above, suppose our experience tells us that coins are usually symmetric, i.e. $\theta = 0.5$ is the most likely value and $\theta = 0.2$ is relatively unlikely. How should we estimate the parameter then? This is the problem MAP addresses. MAP maximizes the posterior probability, that is, the probability of the parameter given the observed data:

$$\hat{\theta}_{MAP} = \arg\max_{\theta} P(\theta \mid X)$$

Expanding the above with Bayes' rule:

$$P(\theta \mid X) = \frac{P(X \mid \theta)\, P(\theta)}{P(X)} \propto P(X \mid \theta)\, P(\theta)$$

where the denominator $P(X)$ can be dropped because it does not depend on $\theta$.

We can see that the first factor is the likelihood function and the second is the prior knowledge of the parameter. After taking the log, the objective becomes:

$$\hat{\theta}_{MAP} = \arg\max_{\theta} \left[ \log P(X \mid \theta) + \log P(\theta) \right]$$
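
Before working through the algebra, here is a small numerical sketch of this objective (the Beta(5, 5) prior below is an assumption chosen for illustration, not from the original article): the posterior is evaluated on a grid, and the normalizing constant $P(X)$ is ignored because it does not affect the argmax:

```python
import numpy as np

# Assumed prior: Beta(5, 5), encoding a belief that the coin is roughly symmetric
alpha, beta = 5.0, 5.0
data = np.array([1, 1, 0, 0, 0, 0, 0, 0, 0, 0])  # 2 heads, 8 tails

grid = np.linspace(0.001, 0.999, 999)
log_prior = (alpha - 1) * np.log(grid) + (beta - 1) * np.log(1 - grid)
log_lik = data.sum() * np.log(grid) + (len(data) - data.sum()) * np.log(1 - grid)
log_post = log_lik + log_prior  # log P(X) is a constant and drops out of the argmax

print(grid[np.argmax(log_post)])  # MAP estimate, pulled toward 0.5 by the prior
```

With this prior the grid maximum lands near 0.33, between the MLE of 0.2 and the prior mode of 0.5.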

Back to the coin-tossing example. Suppose we have a prior estimate of the parameter: $\theta$ follows a beta distribution, namely:

$$P(\theta) = \frac{1}{B(\alpha, \beta)}\, \theta^{\alpha-1} (1-\theta)^{\beta-1}$$

and each coin toss follows a Bernoulli distribution:

$$P(x \mid \theta) = \theta^{x} (1-\theta)^{1-x}$$

The derivative of the objective function is then the sum of two terms:

$$\frac{d}{d\theta} \left[ \log P(X \mid \theta) + \log P(\theta) \right] = \frac{d \log P(X \mid \theta)}{d\theta} + \frac{d \log P(\theta)}{d\theta}$$

The first term was already derived in the MLE section above; the second is:

$$\frac{d \log P(\theta)}{d\theta} = \frac{\alpha - 1}{\theta} - \frac{\beta - 1}{1-\theta}$$

Setting the derivative to 0 and solving:

$$\hat{\theta}_{MAP} = \frac{\sum_i x_i + \alpha - 1}{n + \alpha + \beta - 2}$$

where $\sum_i x_i$ denotes the number of heads. As you can see here, the difference between MLE and MAP is that the MAP result additionally involves the parameters of the prior distribution.
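
A minimal sketch of this closed-form estimator (the $\alpha$ and $\beta$ values below are assumptions chosen for illustration) makes the effect of the prior visible:

```python
def map_estimate(n_heads, n, alpha, beta):
    """MAP estimate for a Bernoulli parameter under a Beta(alpha, beta) prior."""
    return (n_heads + alpha - 1) / (n + alpha + beta - 2)

n_heads, n = 2, 10
print(map_estimate(n_heads, n, alpha=1, beta=1))    # 0.2   (uniform prior: reduces to the MLE)
print(map_estimate(n_heads, n, alpha=5, beta=5))    # 0.333 (pulled toward 0.5)
print(map_estimate(n_heads, n, alpha=50, beta=50))  # 0.472 (stronger prior, closer to 0.5)
```

Note that a Beta(1, 1) prior is uniform over [0, 1], so its MAP estimate coincides with the MLE.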

Supplemental Knowledge: Beta distribution

The beta distribution is a common prior distribution. Its shape is controlled by two parameters, $\alpha$ and $\beta$, and its domain is $[0, 1]$:

$$P(x; \alpha, \beta) = \frac{1}{B(\alpha, \beta)}\, x^{\alpha-1} (1-x)^{\beta-1}$$

For $\alpha, \beta > 1$, the beta distribution reaches its maximum when $x$ equals:

$$x = \frac{\alpha - 1}{\alpha + \beta - 2}$$

So in the coin-tossing setting, if the prior knowledge is that the coin is symmetric, we let $\alpha = \beta$, which places the mode at 0.5. But even when they are equal, the magnitude of the two values still affects the final result: the larger $\alpha$ and $\beta$ are, the more concentrated the prior is around 0.5 and the less likely the estimate is to deviate from symmetry.
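
The following sketch (using scipy, as an illustration rather than part of the original article) shows this concentration: with $\alpha = \beta$, increasing both parameters piles more prior density onto 0.5 and less onto values away from it:

```python
from scipy.stats import beta as beta_dist

# Mode of Beta(a, b) is (a - 1) / (a + b - 2) for a, b > 1;
# with a == b the mode is 0.5, and larger values concentrate mass around it.
for a in (2, 5, 50):
    dist = beta_dist(a, a)
    print(a, round(dist.pdf(0.5), 3), round(dist.pdf(0.3), 3))
```

As $a$ grows, the density at 0.5 increases while the density at 0.3 shrinks toward zero.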
