Maximum likelihood estimation is a wonderful idea, and I think whoever invented it was remarkably talented. I doubt I could have come up with it myself.
Maximum likelihood estimation and Bayesian estimation represent the viewpoints of the frequentist and Bayesian schools, respectively. Frequentists hold that the parameters exist objectively and are fixed; they are merely unknown. The frequentist school therefore cares most about the likelihood function: once the parameters have been estimated, the output y is fixed for a given input x. The maximum likelihood estimate of the parameters θ from the training data D is:

$$\theta_{MLE} = \arg\max_{\theta} \, p(D \mid \theta)$$
Bayesians, by contrast, hold that the parameters are themselves random and not essentially different from any other random variable. Because the parameters cannot be pinned to a single value, the output for a given input x cannot be represented by one definite y; it has to be expressed probabilistically. The Bayesian prediction is therefore an expectation over the parameters:

$$p(y \mid x, D) = \int p(y \mid x, \theta) \, p(\theta \mid D) \, d\theta$$
where x is the input, y is the output, D is the training data set, and θ denotes the model parameters.
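This formula is called full-Bayesian prediction. The question now is how to obtain the posterior probability p(θ | D). By Bayes' theorem we have:

$$p(\theta \mid D) = \frac{p(D \mid \theta) \, p(\theta)}{p(D)} = \frac{p(D \mid \theta) \, p(\theta)}{\int p(D \mid \theta') \, p(\theta') \, d\theta'}$$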
Unfortunately, this posterior is often hard to compute: the normalizing constant in Bayes' formula requires integrating over all parameters, and typically no closed-form (analytic) solution can be found. In that case we fall back on an approximation, maximum a posteriori (MAP) estimation, which replaces the full posterior with the single parameter value at which it peaks.
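When the model has only one parameter, the integral can still be approximated numerically. The following is a minimal sketch under assumptions that are not in the original text: a coin-flip (Bernoulli) model with no input x, a uniform prior on θ, and made-up toy data. The function names are hypothetical. It evaluates likelihood times prior on a grid, normalizes, and averages θ over the result to get the full-Bayesian prediction.

```python
# Grid approximation of the posterior for a coin-flip (Bernoulli) model.
# Assumptions (not from the original text): uniform prior on theta, toy data,
# and no input x, so the prediction is simply E[y | D].

def grid_posterior(data, grid_size=1001):
    """Return (grid, posterior weights) for theta = P(y = 1)."""
    heads = sum(data)
    tails = len(data) - heads
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    # Unnormalized posterior: likelihood p(D | theta) times prior p(theta) = 1.
    unnorm = [(t ** heads) * ((1 - t) ** tails) for t in grid]
    total = sum(unnorm)  # plays the role of the normalizing constant p(D)
    return grid, [u / total for u in unnorm]

def predictive_mean(grid, weights):
    """Full-Bayesian prediction: average theta over the posterior."""
    return sum(t * w for t, w in zip(grid, weights))

data = [1, 1, 1, 0, 1, 0, 1, 1]  # 6 heads, 2 tails (toy data)
grid, weights = grid_posterior(data)
bayes_prediction = predictive_mean(grid, weights)  # averages over all theta
mle_prediction = sum(data) / len(data)             # plugs in a single theta
```

A grid like this only works in very low dimensions; the number of grid points grows exponentially with the number of parameters, which is one reason the point-estimate shortcut is attractive.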
MAP estimation resembles maximum likelihood estimation, except that it includes a prior distribution, which reflects the view that the parameters are themselves random variables. In practice the prior is usually specified through hyperparameters.
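Written out, the relationship looks as follows. This is the standard textbook form, reconstructed here and not copied from the original post. The normalizing constant p(D) does not depend on θ, so it drops out of the maximization:

$$\theta_{MAP} = \arg\max_{\theta} \, p(\theta \mid D) = \arg\max_{\theta} \, p(D \mid \theta) \, p(\theta) = \arg\max_{\theta} \, \big[ \log p(D \mid \theta) + \log p(\theta) \big]$$

Dropping the log p(θ) term, or equivalently choosing a flat prior, leaves exactly the maximum likelihood objective.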
From the above we can see that, on the one hand, both maximum likelihood estimation and MAP estimation are point estimates of the parameters. In the frequentist school the parameters are fixed, and so are the predicted values. MAP estimation is an approximation used by the Bayesian school, because full-Bayesian estimation is not always feasible. On the other hand, MAP can be regarded as a trade-off between the prior and the MLE: if the amount of data is large enough, the MAP and maximum likelihood estimates tend to agree; if there is no data at all, the MAP estimate is determined by the prior alone.
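Both limiting cases can be checked on a small example. The sketch below assumes a coin-flip model with a Beta(a, b) prior on θ (with a, b > 1), for which the MAP estimate has a well-known closed form; the model, the prior, the function names, and the 70% head rate are illustrative assumptions, not part of the original post.

```python
# MLE versus MAP for a coin-flip model with a Beta(a, b) prior on theta.
# Assumptions (not from the original text): Bernoulli data, Beta(2, 2) prior,
# and toy data with a 70% head rate.

def mle(heads, n):
    """Maximum likelihood estimate of theta; undefined when there is no data."""
    if n == 0:
        return None
    return heads / n

def map_estimate(heads, n, a=2.0, b=2.0):
    """Mode of the Beta(a + heads, b + n - heads) posterior (requires a, b > 1)."""
    return (heads + a - 1) / (n + a + b - 2)

for n in (0, 10, 1000, 100000):
    heads = 7 * n // 10  # toy data: 70% heads
    print(n, mle(heads, n), map_estimate(heads, n))
```

With n = 0 the MLE is undefined while the MAP estimate equals the prior mode, 0.5. As n grows, the prior's contribution to the numerator and denominator becomes negligible and the two estimates approach each other.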
Reference Link: http://blog.csdn.net/lzt1983/article/details/10131839
Maximum likelihood estimation (MLE) and maximum a posteriori (MAP) estimation