1. Notes
Maximum likelihood estimation (MLE, sometimes abbreviated ML) is a statistical method for estimating the parameters of a model from observed data. Given a sample set, it finds the parameter values of the assumed probability density function under which that sample is most probable.
For example:
Suppose we want to survey the heights of all students in a school. We know that height follows a normal distribution (the model is fixed), but the mean and variance of that distribution are unknown (the parameters are unknown).
1.1 Algorithm Concepts:
"Model fixed, parameters unknown."
Given: the model (with some or all of its parameters unknown) and the sample set.
Estimate: the unknown parameters of the model.
1.2 Core ideas:
The parameter values we choose as our estimate are those under which the given sample is most likely to have been produced.
1.3 Algorithm Prerequisites (must be met):
All samples are assumed to be independent and identically distributed (i.i.d.).
For example, in the height example above, suppose we randomly select 50 students and measure their heights to form the sample set. Selecting the students at random is what is meant to make the samples independent.
2. Likelihood definition
2.1 Likelihood function:
In maximum likelihood estimation, we look for the parameters of the given model that make the observed group of samples most probable. The likelihood function is defined as follows:
$$L(\theta \mid x_1, x_2, \ldots, x_n) = f(x_1, \ldots, x_n \mid \theta) = f(x_1 \mid \theta) \cdot f(x_2 \mid \theta) \cdots f(x_n \mid \theta)$$
where $x_1, \ldots, x_n$ are the independent and identically distributed samples (the dataset), $f$ is the model, and $\theta$ is the model parameter.
The likelihood function measures how probable it is that the model produces this set of samples. The key to writing one down is therefore to find an expression that gives the probability of the observed sample and contains the model's unknown parameters. That may sound obvious, but in practice constructing the likelihood function for a new problem takes real effort: you have to work out the probabilistic structure underlying the problem.
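As a rough illustration of the definition above, the following sketch evaluates the likelihood of a normal model as a product of per-sample densities. The height values, the function names `normal_pdf` and `likelihood`, and the two candidate parameter settings are assumptions made up for this example; they do not come from the original text.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density f(x | theta) of a normal model with theta = (mu, sigma)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def likelihood(samples, mu, sigma):
    """L(theta | x_1..x_n): product of f(x_i | theta), valid for i.i.d. samples."""
    result = 1.0
    for x in samples:
        result *= normal_pdf(x, mu, sigma)
    return result


# Hypothetical heights in cm (illustrative values, not real measurements).
heights = [162.0, 168.5, 171.0, 174.5, 179.0]

# Two candidate parameter settings for the same fixed model.
l_near = likelihood(heights, mu=171.0, sigma=6.0)  # mean close to the data
l_far = likelihood(heights, mu=150.0, sigma=6.0)   # mean far from the data

# The candidate that explains the data better has the larger likelihood.
print("near:", l_near, "far:", l_far, "near is larger:", l_near > l_far)
```

The data stay fixed while the parameters vary, which is the sense in which the likelihood is a function of θ rather than of the samples.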
2.2 Log-likelihood:
$$\ln L(\theta \mid x_1, \ldots, x_n) = \sum_{i=1}^{n} \ln f(x_i \mid \theta)$$
In practice, however, the likelihood is seldom written in terms of a model function $f$; it is usually written directly in terms of probabilities:
$$\ln L(\theta \mid x_1, \ldots, x_n) = \ln \prod_{i=1}^{n} P(x_i; \theta) = \sum_{i=1}^{n} \ln P(x_i; \theta)$$
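A minimal sketch of the log-likelihood for a normal model may help here. It checks numerically that the sum of log-densities equals the log of the product of densities, and that the standard closed-form normal estimates (sample mean, and variance with divisor n) give a higher log-likelihood than nearby parameter values. The data and the function names are hypothetical, and the closed-form estimator is a textbook result that the original text does not derive.

```python
import math


def normal_logpdf(x, mu, sigma):
    """ln f(x | mu, sigma) for a normal density."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


def log_likelihood(samples, mu, sigma):
    """ln L(theta | x_1..x_n) = sum over i of ln f(x_i | theta)."""
    return sum(normal_logpdf(x, mu, sigma) for x in samples)


def normal_mle(samples):
    """Closed-form normal MLE: sample mean and the divisor-n standard deviation."""
    n = len(samples)
    mu_hat = sum(samples) / n
    var_hat = sum((x - mu_hat) ** 2 for x in samples) / n
    return mu_hat, math.sqrt(var_hat)


# Hypothetical heights in cm (illustrative values, not real measurements).
heights = [162.0, 168.5, 171.0, 174.5, 179.0]

mu_hat, sigma_hat = normal_mle(heights)
best = log_likelihood(heights, mu_hat, sigma_hat)

# The sum of logs should agree with the log of the product of densities.
product = math.prod(math.exp(normal_logpdf(x, mu_hat, sigma_hat)) for x in heights)
print("sum of logs matches log of product:", math.isclose(best, math.log(product)))

# Nudging either parameter away from the estimate should lower the log-likelihood.
nudges = [(1.0, 0.0), (-1.0, 0.0), (0.0, 1.0), (0.0, -1.0)]
lower = [log_likelihood(heights, mu_hat + dm, sigma_hat + ds) < best for dm, ds in nudges]
print("estimate beats all nudged values:", all(lower))
```

Because the logarithm is monotonically increasing, the parameters that maximize the log-likelihood are the same ones that maximize the likelihood itself, so working with the sum loses nothing.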
The average log-likelihood is:
l^=1n