1. The EM algorithm is an iterative algorithm for maximum likelihood estimation, or maximum a posteriori probability estimation, of probabilistic models with latent variables. A probabilistic model with latent variables is written as $P(Y,Z|\theta)$, where $Y$ is the observed-variable data, $Z$ is the latent-variable data, and $\theta$ is the model parameter. The EM algorithm carries out maximum likelihood estimation by iteratively maximizing the log-likelihood function $L(\theta)=\log P(Y|\theta)$ of the observed data. Each iteration consists of two steps. E-step: take the expectation of $\log P(Y,Z|\theta)$ with respect to $P(Z|Y,\theta^{(i)})$:
$Q(\theta,\theta^{(i)})=\sum_{Z}\log P(Y,Z|\theta)\,P(Z|Y,\theta^{(i)})$
which is called the Q function; here $\theta^{(i)}$ is the current estimate of the parameter. M-step: maximize the Q function to obtain the new parameter estimate: $\theta^{(i+1)}=\arg\max_{\theta}Q(\theta,\theta^{(i)})$
When constructing a specific EM algorithm, the key is to define the Q function. In each iteration, the EM algorithm raises the log-likelihood function $L(\theta)$ by maximizing the Q function.
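As an illustration of the E-step and M-step above, here is a minimal sketch of EM for a two-component Bernoulli (coin) mixture; the data, initial values, and variable names are assumptions made for this example only, not taken from the text.

```python
import numpy as np

# Hypothetical data: number of heads in n_tosses flips of a coin that is
# drawn, for each observation, from one of two biased coins (the latent
# variable Z selects the coin).
n_tosses = 10
heads = np.array([9, 8, 2, 1, 7, 3, 9, 0, 8, 2])

# Current parameter estimate theta^(i) = (pi, p1, p2):
# pi     -> probability of picking coin 1
# p1, p2 -> head probabilities of the two coins
pi, p1, p2 = 0.5, 0.6, 0.4

for _ in range(20):
    # E-step: posterior P(Z=1 | y, theta^(i)) for each observation,
    # i.e. the weights that define the Q function.
    lik1 = pi * p1 ** heads * (1 - p1) ** (n_tosses - heads)
    lik2 = (1 - pi) * p2 ** heads * (1 - p2) ** (n_tosses - heads)
    gamma = lik1 / (lik1 + lik2)

    # M-step: maximize Q(theta, theta^(i)); for a Bernoulli mixture the
    # maximizer is a weighted relative frequency, so it has a closed form.
    pi = gamma.mean()
    p1 = (gamma * heads).sum() / (gamma * n_tosses).sum()
    p2 = ((1 - gamma) * heads).sum() / ((1 - gamma) * n_tosses).sum()

print(pi, p1, p2)  # parameter estimate after 20 iterations
```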
2. The EM algorithm increases the likelihood function value of the observed data after each iteration, that is, $P(Y|\theta^{(i+1)})\geq P(Y|\theta^{(i)})$.
In general, the EM algorithm converges, but it is not guaranteed to converge to the global optimum.
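This monotonicity can be checked numerically. The sketch below runs EM on a toy one-dimensional two-component Gaussian mixture (the data and initial values are assumptions for the demonstration only) and asserts that the observed-data log-likelihood never decreases across iterations.

```python
import numpy as np
from scipy.stats import norm

# Hypothetical 1-D data drawn from two Gaussian components.
rng = np.random.default_rng(1)
y = np.concatenate([rng.normal(-2.0, 1.0, 150), rng.normal(3.0, 1.5, 100)])

# Initial guesses for the mixing weights, means, and standard deviations.
w  = np.array([0.5, 0.5])
mu = np.array([-1.0, 1.0])
sd = np.array([1.0, 1.0])

def log_likelihood(y, w, mu, sd):
    dens = w[0] * norm.pdf(y, mu[0], sd[0]) + w[1] * norm.pdf(y, mu[1], sd[1])
    return np.log(dens).sum()

prev = -np.inf
for _ in range(50):
    # E-step: responsibilities P(Z=k | y, theta^(i)).
    r1 = w[0] * norm.pdf(y, mu[0], sd[0])
    r2 = w[1] * norm.pdf(y, mu[1], sd[1])
    total = r1 + r2
    r1, r2 = r1 / total, r2 / total

    # M-step: closed-form updates for a Gaussian mixture.
    w  = np.array([r1.mean(), r2.mean()])
    mu = np.array([(r1 * y).sum() / r1.sum(), (r2 * y).sum() / r2.sum()])
    sd = np.sqrt(np.array([(r1 * (y - mu[0]) ** 2).sum() / r1.sum(),
                           (r2 * (y - mu[1]) ** 2).sum() / r2.sum()]))

    # P(Y|theta^(i+1)) >= P(Y|theta^(i)): the log-likelihood is non-decreasing.
    cur = log_likelihood(y, w, mu, sd)
    assert cur >= prev - 1e-9
    prev = cur
```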
3. The EM algorithm is widely used in the learning of probabilistic models with latent variables. Parameter estimation for the Gaussian mixture model is an important application of the EM algorithm; the unsupervised learning of the hidden Markov model, introduced in the next chapter, is also an important application of the EM algorithm.
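In practice, library implementations of the Gaussian mixture model estimate its parameters with EM; for instance, scikit-learn's GaussianMixture (used below on hypothetical data) exposes the fitted weights, means, and covariances together with the number of EM iterations it took to converge.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Hypothetical 1-D data from two Gaussian components.
rng = np.random.default_rng(2)
y = np.concatenate([rng.normal(-2.0, 1.0, 150),
                    rng.normal(3.0, 1.5, 100)]).reshape(-1, 1)

# GaussianMixture fits the mixing weights, means, and covariances by EM.
gmm = GaussianMixture(n_components=2, random_state=0).fit(y)
print(gmm.weights_)      # mixing coefficients
print(gmm.means_)        # component means
print(gmm.covariances_)  # component covariances
print(gmm.n_iter_)       # number of EM iterations used
```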
4. The EM algorithm can also be interpreted as a maximization-maximization algorithm of the F function. The EM algorithm has many variants, such as the GEM (generalized EM) algorithm, which is characterized by increasing the value of the F function at each iteration, thereby increasing the likelihood function value.
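For reference, one standard way to write the F function (given here as an assumption, since the text above does not state it explicitly) is $F(\tilde{P},\theta)=E_{\tilde{P}}[\log P(Y,Z|\theta)]+H(\tilde{P})$, where $\tilde{P}$ is a distribution over the latent variable $Z$ and $H(\tilde{P})=-E_{\tilde{P}}\log\tilde{P}(Z)$ is its entropy. The E-step maximizes $F$ over $\tilde{P}$ with $\theta$ fixed, and the M-step maximizes $F$ over $\theta$ with $\tilde{P}$ fixed, which is the sense in which EM is a maximization-maximization algorithm.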
These are the key points of the EM algorithm and its extensions.