MLE (Maximum Likelihood Estimation) vs. LS (Least Squares) and MAP (Maximum a Posteriori)

Source: Internet
Author: User
Preface

Maximum likelihood estimation is a basic idea that runs through machine learning; many specific algorithms and models are built on it, or can be explained in terms of it. For example:

MLE explains why ordinary linear regression minimizes the squared error (i.e., least squares) rather than, say, the fourth power of the error;

The relation and difference between the MLE idea and the MAP idea, which leads to the frequentist vs. Bayesian schools of probability and statistics, and to an interpretation of regularization in machine learning (MAP, Bayesian estimation, the naive Bayes classifier, and even logistic regression (LR) are all related; these are discussed in other articles);

The EM algorithm (expectation maximization), one of the classic algorithms in machine learning, is built on the MLE idea (K-means is in fact a special case of EM; EM is discussed in other articles).
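As a preview of the first point above, here is the standard sketch of the derivation. Assume the regression targets are generated as y_i = \theta^\top x_i + \epsilon_i with Gaussian noise \epsilon_i \sim N(0, \sigma^2) (this noise assumption is the key premise, not something stated explicitly so far in this article). Then maximizing the likelihood is equivalent to minimizing the sum of squared errors:

$$
\hat{\theta} = \arg\max_{\theta} \sum_{i=1}^n \ln \frac{1}{\sqrt{2\pi\sigma^2}} \exp\!\left(-\frac{(y_i - \theta^\top x_i)^2}{2\sigma^2}\right) = \arg\min_{\theta} \sum_{i=1}^n (y_i - \theta^\top x_i)^2
$$

The first term of each log-density is a constant in \theta, so only the squared-error term matters; a fourth-power loss would instead correspond to a different (non-Gaussian) noise distribution.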

This article elaborates the idea of maximum likelihood and discusses its connection to LS and MAP.

1. MLE (Maximum Likelihood Estimation)
Here is the first key question: what is the difference between likelihood and probability? Roughly, probability describes a forward process: given a specific model and its parameters, we deduce how plausible each outcome is, and the resulting values carry probabilistic meaning on their own. Likelihood describes the reverse process: given an observed outcome, we ask how well different models or parameter settings explain it; the absolute value of a likelihood is not meaningful by itself, only the ratios between the likelihoods of different hypotheses are.

1.1 Problem definition (applicable scenario)

We are given a set of samples (data) that are all drawn from the same distribution (identically distributed), and each sample is independent of the others (independent events); in other words, the samples are i.i.d. We do not know the specific distribution, but we assume it belongs to a known distribution family, so we only need to determine its parameters. In short: "the model is determined, the parameters are unknown."

In this setting, maximum likelihood estimation can be used to estimate the model parameters: we look for the set of parameters under which the probability that the model produces the observed data is greatest. For example, if we have determined that the distribution is Gaussian, our goal is simply to find its mean and variance.
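For the Gaussian case just mentioned, the MLE has a closed form: the sample mean and the (biased, 1/n) sample variance. A minimal sketch, using synthetic data with arbitrary illustration values for the "true" parameters:

```python
import math
import random

random.seed(0)

# Draw i.i.d. samples from a Gaussian with known "true" parameters
# (mu=2.0, sigma=0.5 are arbitrary values chosen for illustration).
true_mu, true_sigma = 2.0, 0.5
samples = [random.gauss(true_mu, true_sigma) for _ in range(10_000)]

# Closed-form Gaussian MLE: the sample mean and the 1/n sample
# variance jointly maximize the likelihood of the observed data.
n = len(samples)
mu_hat = sum(samples) / n
var_hat = sum((x - mu_hat) ** 2 for x in samples) / n
sigma_hat = math.sqrt(var_hat)

print(mu_hat, sigma_hat)  # both should land close to 2.0 and 0.5
```

With 10,000 samples the estimates are typically within a few hundredths of the true values; the point is that "estimating the parameters" here reduces to two simple statistics of the data.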
(Note that the three parts of the definition above are exactly the three rather strong assumptions behind maximum likelihood estimation.)

1.2 Likelihood function

After defining the problem, we use the likelihood function to quantify "the probability that the model produces the observed data". It can be understood as the conditional probability P(x|\theta), where \theta is the model parameter we want to estimate and x is the data that has been observed. The likelihood function is defined as follows:
L(\theta; x_1, x_2, \ldots, x_n) = f(x_1, x_2, \ldots, x_n|\theta) = \prod_{i=1}^n f(x_i|\theta)

Here f is the probability density function of the model; for example, the probability density function of the Gaussian distribution is

f(x|\theta) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)

Since we assume the samples are independent, the joint probability of all n samples splits into the product of n independent factors, as above. In practice one usually works with the logarithm of the likelihood, which simplifies the algebra while leaving the maximizer unchanged; this is called the log-likelihood:

\ln L(\theta; x_1, \ldots, x_n) = \sum_{i=1}^n \ln f(x_i|\theta)

1.3 Maximum likelihood estimation
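The log-likelihood defined above is easy to compute directly, and it behaves exactly as the likelihood/probability distinction suggests: its absolute value is not meaningful on its own, but parameters that explain the data well score higher than parameters that do not. A small sketch (parameter values here are arbitrary for illustration):

```python
import math
import random

def gaussian_log_likelihood(samples, mu, sigma):
    # ln L(theta; x_1..x_n) = sum_i ln f(x_i | theta)
    # for f the Gaussian density with parameters (mu, sigma).
    return sum(
        -0.5 * math.log(2 * math.pi * sigma ** 2)
        - (x - mu) ** 2 / (2 * sigma ** 2)
        for x in samples
    )

random.seed(1)
samples = [random.gauss(0.0, 1.0) for _ in range(1_000)]

# Comparing hypotheses: the true parameters should score higher
# than a clearly wrong guess for the mean.
ll_true = gaussian_log_likelihood(samples, mu=0.0, sigma=1.0)
ll_wrong = gaussian_log_likelihood(samples, mu=3.0, sigma=1.0)
print(ll_true > ll_wrong)
```

Summing logs instead of multiplying densities also avoids numerical underflow: a product of 1,000 small densities would round to zero in floating point, while the log-sum stays well-scaled.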

After defining the problem and fixing the objective function (the likelihood function), all that remains is to maximize it, i.e., to find the set of model parameters \hat{\theta}_{MLE} under which the probability of the observed data is largest:

\hat{\theta}_{MLE} = \arg\max_{\theta} L(\theta; x_1, \ldots, x_n) = \arg\max_{\theta} \sum_{i=1}^n \ln f(x_i|\theta)
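When no closed form is available, this arg max can be approximated numerically. A deliberately simple sketch using a coarse grid search over candidate means (with the standard deviation fixed at 1 purely to keep the example one-dimensional):

```python
import math
import random

random.seed(2)
samples = [random.gauss(1.5, 1.0) for _ in range(5_000)]

def log_likelihood(mu):
    # Gaussian log-likelihood with sigma fixed at 1 for simplicity.
    return sum(
        -0.5 * math.log(2 * math.pi) - (x - mu) ** 2 / 2
        for x in samples
    )

# Grid search over candidate means in [0, 3]; the maximizer should
# land near the sample mean, which is the closed-form Gaussian MLE.
grid = [i / 100 for i in range(0, 301)]
mu_mle = max(grid, key=log_likelihood)
print(mu_mle)
```

In practice one would use gradient-based optimization rather than a grid, but the grid makes the "pick the parameters that maximize the log-likelihood" step concrete.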
