Maximum likelihood estimation in machine learning

Source: Internet
Author: User

Translated from: http://blog.csdn.net/zengxiantao1994/article/details/72787849 (Maximum likelihood estimation)

I had encountered maximum likelihood estimation many times before, but was never quite clear about its underlying principle. While recently studying Bayesian classification I gained a new understanding of it, summarized as follows.


Bayesian decision

First, consider Bayesian classification. We all know the classical Bayes formula:

P(w|x) = P(x|w) P(w) / P(x)

Here P(w) is the prior probability, the probability of each category; P(x|w) is the class-conditional probability, the probability of observing x given that the sample belongs to category w; and P(w|x) is the posterior probability, the probability that the sample belongs to category w given that x has been observed. With this posterior probability we can classify samples: the larger the posterior, the more likely the sample belongs to that category, and the more reason we have to assign it to that category.

Let's look at an intuitive example. Known: in summer, in a certain park, the probability that a man wears sandals is 1/2 and the probability that a woman wears sandals is 2/3; the ratio of men to women in the park is usually 2:1. Question: if you randomly meet someone in the park who is wearing sandals, what is the probability that this person is male (or female)?

From the classification point of view, this is exactly the question posed above: something has been observed, and we ask the probability that it belongs to each class. That is, we want the posterior probability.

Let w1 denote male, w2 denote female, and let x denote the observation "wearing sandals".

From the problem statement we know:

P(w1) = 2/3, P(w2) = 1/3
P(x|w1) = 1/2, P(x|w2) = 2/3

Whether men and women wear sandals are independent events, so by the law of total probability:

P(x) = P(x|w1) P(w1) + P(x|w2) P(w2) = (1/2)(2/3) + (2/3)(1/3) = 5/9

(If we only care about the classification decision, we only need to compare the sizes of the posterior probabilities; the value of P(x) itself does not matter.)

The Bayes formula then gives:

P(w1|x) = P(x|w1) P(w1) / P(x) = (1/3) / (5/9) = 3/5
P(w2|x) = P(x|w2) P(w2) / P(x) = (2/9) / (5/9) = 2/5
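The sandals calculation above can be sketched in a few lines of code. The `bayes_posteriors` helper and the variable names are my own illustration, not from the original post; the priors and likelihoods are the numbers given in the example.

```python
# Hypothetical sketch of the sandals example: posteriors via Bayes' rule.

def bayes_posteriors(priors, likelihoods):
    """Return posteriors P(w_i | x) given priors P(w_i)
    and class-conditional likelihoods P(x | w_i)."""
    joint = [p * l for p, l in zip(priors, likelihoods)]
    evidence = sum(joint)  # P(x), the normalizing constant
    return [j / evidence for j in joint]

# Classes: w1 = male, w2 = female; observation x = "wearing sandals".
priors = [2/3, 1/3]        # male:female ratio in the park is 2:1
likelihoods = [1/2, 2/3]   # P(sandals | male), P(sandals | female)

p_male, p_female = bayes_posteriors(priors, likelihoods)
print(p_male, p_female)    # 0.6 and 0.4, i.e. 3/5 and 2/5
```

Note that `evidence` cancels out when we only compare classes, which is exactly why the text says the value of P(x) does not matter for the decision.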

Problem formulation

But not all practical problems are so fortunate. Often we can obtain only a limited number of sample data, and the prior probabilities and class-conditional probabilities (the underlying distribution of each class) are unknown. To classify using only the sample data, a feasible approach is to first estimate the prior probabilities and the class-conditional probabilities, and then apply the Bayesian classifier.

Estimating the prior probability is relatively simple: (1) the class label of each training sample is known (supervised learning); (2) one can rely on experience; (3) the frequency of each class in the training set can be used as its estimate.

Estimating the class-conditional probability is much harder. The reasons include: the probability density function contains all the information about a random variable; the sample data may be scarce; the dimension of the feature vector x may be very large; and so on. In short, it is difficult to estimate the class-conditional probability density directly. The solution is to convert the completely unknown probability density into the estimation of parameters: assume a parametric form for the density, so that the density-estimation problem becomes a parameter-estimation problem. Maximum likelihood estimation is one such parameter-estimation method. Of course, the choice of the probability density model is crucial: if the model is correct, then as the number of samples grows we obtain increasingly accurate estimates; if the model is wrong, no amount of parameter fitting will produce a meaningful result.


Important premise

As mentioned above, parameter estimation is only a simplification used in solving the real problem (because directly estimating the class-conditional probability density function is very difficult). Therefore, samples to which the maximum likelihood estimation method is applied must satisfy some prerequisite assumptions.

Important premise: the distribution of the training samples can represent the true distribution. The samples in each sample set must be independent and identically distributed (i.i.d.) random variables, and there must be sufficiently many training samples.


Maximum likelihood estimation

The principle of maximum likelihood estimation is often illustrated with a picture (the illustration from the original post is not reproduced in this conversion).


To sum up, the purpose of maximum likelihood estimation is to use known sample results to infer, in reverse, the parameter values that are most likely to have produced those results.

Principle: maximum likelihood estimation is a statistical method built on the maximum likelihood principle, and is an application of probability theory to statistics. It provides a way to evaluate model parameters from given observation data, under the premise: "the model is fixed, the parameters are unknown." Through a number of experiments we observe the results, and the parameter value under which the observed sample is most probable is called the maximum likelihood estimate.

Since the samples in the sample set are i.i.d., we can estimate the parameter vector θ using the sample set D of a single class. Denote the known sample set by:

D = {x1, x2, …, xN}

Likelihood function: the joint probability density of the samples, viewed as a function of θ, is called the likelihood function of θ:

l(θ) = p(D|θ) = p(x1, x2, …, xN | θ) = ∏_{i=1}^{N} p(xi|θ)

If θ̂ is the value in the parameter space that maximizes the likelihood function l(θ), then θ̂ is the "most probable" parameter value, and it is the maximum likelihood estimator of θ. It is a function of the sample set, recorded as:

θ̂ = d(x1, x2, …, xN) = d(D)



Solving for the maximum likelihood estimate

ML estimate: the value of θ that maximizes the probability of the observed sample set:

θ̂ = argmax_θ l(θ) = argmax_θ ∏_{i=1}^{N} p(xi|θ)

In practice, for ease of analysis, the log-likelihood function is defined:

H(θ) = ln l(θ)

θ̂ = argmax_θ H(θ) = argmax_θ ln l(θ) = argmax_θ Σ_{i=1}^{N} ln p(xi|θ)
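A quick numerical sketch of why the logarithm is used in practice, beyond analytical convenience: the product of many densities underflows floating-point arithmetic, while the sum of logs stays finite. The standard-normal model below is an assumed example, not from the original post.

```python
import math
import random

random.seed(0)

def normal_pdf(x, mu=0.0, sigma=1.0):
    # Density of N(mu, sigma^2) at x.
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

samples = [random.gauss(0, 1) for _ in range(5000)]

# Raw likelihood: a product of 5000 numbers each below 0.4 underflows to 0.0.
likelihood = 1.0
for x in samples:
    likelihood *= normal_pdf(x)

# Log-likelihood: the same information, computed stably as a sum.
log_likelihood = sum(math.log(normal_pdf(x)) for x in samples)

print(likelihood)       # 0.0 due to floating-point underflow
print(log_likelihood)   # a large negative but finite number
```

Since ln is monotonically increasing, maximizing H(θ) = ln l(θ) gives the same θ̂ as maximizing l(θ) itself.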

1. Only one unknown parameter (θ is a scalar)

When the likelihood function satisfies the regularity conditions of continuity and differentiability, the maximum likelihood estimator is the solution of the differential equation:

d ln l(θ) / dθ = 0
2. Multiple unknown parameters (θ is a vector)

Then θ can be expressed as an unknown vector with S components:

θ = [θ1, θ2, …, θS]^T

Denote the gradient operator:

∇_θ = [∂/∂θ1, ∂/∂θ2, …, ∂/∂θS]^T

If the likelihood function satisfies the conditions of continuity and differentiability, the maximum likelihood estimator is a solution of the equation:

∇_θ H(θ) = ∇_θ ln l(θ) = Σ_{i=1}^{N} ∇_θ ln p(xi|θ) = 0
A solution of this equation is only an estimate; it approaches the true value only as the number of samples tends to infinity.


Examples of maximum likelihood estimates

Example 1: suppose the samples follow a normal distribution N(μ, σ²). Then the likelihood function is:

L(μ, σ²) = ∏_{i=1}^{N} (1 / (√(2π) σ)) exp(−(xi − μ)² / (2σ²))

Taking the logarithm:

ln L(μ, σ²) = −(N/2) ln(2π) − (N/2) ln σ² − (1/(2σ²)) Σ_{i=1}^{N} (xi − μ)²

Taking derivatives with respect to μ and σ² and setting them to zero gives the system of equations:

∂ ln L / ∂μ = (1/σ²) Σ_{i=1}^{N} (xi − μ) = 0
∂ ln L / ∂σ² = −N/(2σ²) + (1/(2σ⁴)) Σ_{i=1}^{N} (xi − μ)² = 0

Solving jointly:

μ̂ = (1/N) Σ_{i=1}^{N} xi = x̄
σ̂² = (1/N) Σ_{i=1}^{N} (xi − x̄)²

The likelihood equation has a unique solution (μ̂, σ̂²), and it must be the maximum point, because as |μ| → ∞, or as σ² → 0 or σ² → ∞, the non-negative likelihood function tends to 0. So the maximum likelihood estimates of μ and σ² are the sample mean and the (biased) sample variance above.
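The closed-form result for the normal case can be checked numerically. This is a minimal sketch assuming simulated data; note the 1/N factor in the variance estimate, matching the derivation (not the unbiased 1/(N−1) version).

```python
import math
import random

random.seed(42)

# Simulate data from an assumed true model N(3, 2^2).
true_mu, true_sigma = 3.0, 2.0
data = [random.gauss(true_mu, true_sigma) for _ in range(100_000)]

n = len(data)
mu_hat = sum(data) / n                              # MLE of mu: sample mean
sigma2_hat = sum((x - mu_hat) ** 2 for x in data) / n  # MLE of sigma^2: 1/N factor

print(mu_hat)                  # close to 3.0
print(math.sqrt(sigma2_hat))   # close to 2.0
```

As the text notes, the estimates approach the true values only as the sample size grows; with small N, σ̂² in particular is noticeably biased low.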


Example 2: suppose the samples follow the uniform distribution on [a, b]. The probability density function of x is:

f(x|a, b) = 1/(b − a) for a ≤ x ≤ b, and 0 otherwise

For a sample x1, …, xN:

L(a, b) = 1/(b − a)^N if a ≤ xi ≤ b for all i, and 0 otherwise

Clearly L(a, b), as a function of the two variables a and b, is not continuous, so it cannot be maximized by taking derivatives; the maximum must be found from the definition of maximum likelihood estimation. To maximize L(a, b), b − a should be as small as possible; but b cannot be smaller than max{x1, …, xN}, otherwise L(a, b) = 0, and likewise a cannot be greater than min{x1, …, xN}. Therefore the maximum likelihood estimates of a and b are:

â = min{x1, …, xN}
b̂ = max{x1, …, xN}
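The uniform case is worth checking numerically too, since it is a standard example where the MLE is found without calculus. A minimal sketch with simulated data from an assumed Uniform[2, 5]:

```python
import random

random.seed(7)

# Simulate from an assumed true model Uniform[2, 5].
a_true, b_true = 2.0, 5.0
data = [random.uniform(a_true, b_true) for _ in range(10_000)]

# MLE: the sample extremes, per the argument above.
a_hat, b_hat = min(data), max(data)
print(a_hat, b_hat)   # slightly inside [2.0, 5.0]
```

Note the estimates always lie strictly inside the true interval (â ≥ a, b̂ ≤ b), so this MLE is biased for finite N, though the bias vanishes as N → ∞.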



Summary

General steps for maximum likelihood estimator:

(1) Write down the likelihood function;

(2) Take the logarithm of the likelihood function and simplify it;

(3) Take the derivative with respect to the parameters and set it to zero, obtaining the likelihood equation;

(4) Solve the likelihood equation.
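When the likelihood equation in step (4) has no convenient closed form, the same steps can be carried out numerically by maximizing the log-likelihood directly. A sketch for an assumed Exponential(λ) model, where the closed-form MLE λ̂ = N / Σxi is available for comparison (the grid search stands in for a proper optimizer):

```python
import math
import random

random.seed(1)

# Simulate from an assumed true model Exponential(lambda = 2).
data = [random.expovariate(2.0) for _ in range(5000)]
n = len(data)
s = sum(data)

# Step (1)-(2): the log-likelihood of Exponential(lam) is
#   ln L(lam) = n*ln(lam) - lam * sum(x_i)
def log_likelihood(lam):
    return n * math.log(lam) - lam * s

# Steps (3)-(4) done numerically: search a grid for the maximizer
# instead of solving d ln L / d lam = 0 analytically.
grid = [0.01 * k for k in range(1, 1001)]   # lam in (0, 10]
lam_grid = max(grid, key=log_likelihood)

lam_closed = n / s   # analytic solution of the likelihood equation
print(lam_grid, lam_closed)   # both near 2.0
```

In practice one would replace the grid search with a numerical optimizer; the grid simply makes the "maximize the log-likelihood" step explicit.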

The characteristics of maximum likelihood estimation:

1. It is simpler than many other estimation methods;

2. Convergence: it is unbiased or asymptotically unbiased, and its properties improve as the number of samples increases;

3. If the assumed class-conditional probability model is correct, it usually gives good results. But if the assumed model is wrong, the estimates can be very poor.



