In *The Beauty of Mathematics*, Wu Jun calls the expectation-maximization (EM) algorithm "God's algorithm". Below we discuss the application of the EM algorithm to the Gaussian mixture distribution.
Continuing the questions left open in "Gaussian discriminant analysis and Gaussian mixture distribution (part one)", what follows does not rely on the notion of hidden variables at all, only on simple calculus, and the process is quite interesting.
The following discussion uses a few pieces of background knowledge, which we first state as supplementary knowledge points:
Supplementary Knowledge Point one: the definition of a concave function.
Let $f$ be a function defined on an interval $I$. If for any $x_1, x_2 \in I$ and any $\lambda \in [0, 1]$ the following condition is met:
$$f\big(\lambda x_1 + (1-\lambda) x_2\big) \ge \lambda f(x_1) + (1-\lambda) f(x_2),$$
then $f$ is called a concave function on $I$.
Supplementary Knowledge Point two: if the second derivative of a function satisfies $f''(x) \le 0$ on an interval, then the function is concave there. Since $(\log x)'' = -1/x^2 < 0$, $\log x$ is obviously a concave function.
Supplementary Knowledge Point three: Jensen's inequality.
If $f$ is a concave function on an interval $I$, then for any $x_1, \dots, x_n \in I$ and any weights $\lambda_1, \dots, \lambda_n \ge 0$ satisfying $\sum_{j=1}^{n} \lambda_j = 1$, we have
$$f\Big(\sum_{j=1}^{n} \lambda_j x_j\Big) \ge \sum_{j=1}^{n} \lambda_j f(x_j).$$
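As a quick numerical sanity check of points two and three (a minimal sketch in Python; the points and weights below are arbitrary values I chose, not from the original post), one can verify Jensen's inequality for the concave function $\log x$:

```python
import numpy as np

# Arbitrary positive points and non-negative weights summing to 1.
x = np.array([0.5, 2.0, 7.0])
lam = np.array([0.2, 0.5, 0.3])

lhs = np.log(np.dot(lam, x))   # log of the weighted average
rhs = np.dot(lam, np.log(x))   # weighted average of the logs

print(lhs, rhs, lhs >= rhs)    # Jensen for concave log: lhs >= rhs
```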
___________________________________________________________________________________________
Suppose the data set $X = \{x^{(1)}, x^{(2)}, \dots, x^{(m)}\}$
consists of sample points drawn from the following Gaussian mixture distribution:
$$p(x) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x \mid \mu_k, \Sigma_k), \qquad \pi_k \ge 0, \quad \sum_{k=1}^{K} \pi_k = 1.$$
We then write down the log-likelihood function of the model:
$$\ell(\theta) = \sum_{i=1}^{m} \log \sum_{k=1}^{K} \pi_k \, \mathcal{N}\big(x^{(i)} \mid \mu_k, \Sigma_k\big).$$
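As an aside, this log-likelihood is easy to evaluate directly; below is a minimal Python sketch (the parameter names `pi`, `mu`, `sigma` and the use of `scipy.stats.multivariate_normal` are my own choices, not from the original post):

```python
import numpy as np
from scipy.stats import multivariate_normal

def gmm_log_likelihood(X, pi, mu, sigma):
    """Return sum_i log sum_k pi_k * N(x_i | mu_k, sigma_k).

    X     : (m, d) data matrix
    pi    : (K,)   mixing weights, non-negative and summing to 1
    mu    : (K, d) component means
    sigma : (K, d, d) component covariance matrices
    """
    K = len(pi)
    # densities[i, k] = pi_k * N(x_i | mu_k, sigma_k)
    densities = np.column_stack([
        pi[k] * multivariate_normal.pdf(X, mean=mu[k], cov=sigma[k])
        for k in range(K)
    ])
    return np.sum(np.log(densities.sum(axis=1)))
```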
We now transform the log-likelihood of the Gaussian mixture above. For each sample $x^{(i)}$, introduce a set of weights $Q_i(k)$ over the components, and multiply and divide each term by $Q_i(k)$:
$$\ell(\theta) = \sum_{i=1}^{m} \log \sum_{k=1}^{K} Q_i(k)\, \frac{\pi_k \, \mathcal{N}\big(x^{(i)} \mid \mu_k, \Sigma_k\big)}{Q_i(k)}.$$
We are only multiplying by one and then dividing by one, so the equals sign still holds, provided that for every $i$ the following conditions are met:
$$Q_i(k) \ge 0, \qquad \sum_{k=1}^{K} Q_i(k) = 1.$$
(Note: because $Q_i(k)$ appears in the denominator, we first consider the case $Q_i(k) > 0$; when we later extract the EM algorithm's updates, the denominator drops out anyway.) Having calculated to this point, we find that the optimization problem could in theory be attacked by gradient descent, the Karush-Kuhn-Tucker conditions, and other such tools, but doing so directly is too difficult, and this is exactly the motivation for proposing the EM algorithm.
According to Supplementary Knowledge Point two, $\log x$ is a concave function; then, applying Jensen's inequality (Supplementary Knowledge Point three) with weights $Q_i(k)$, we get:
$$\ell(\theta) \ge \sum_{i=1}^{m} \sum_{k=1}^{K} Q_i(k) \log \frac{\pi_k \, \mathcal{N}\big(x^{(i)} \mid \mu_k, \Sigma_k\big)}{Q_i(k)}.$$
More interestingly, the equals sign is established exactly when (the following formula should look familiar; it is the E-step of the EM algorithm)
$$Q_i(k) = \frac{\pi_k \, \mathcal{N}\big(x^{(i)} \mid \mu_k, \Sigma_k\big)}{\sum_{j=1}^{K} \pi_j \, \mathcal{N}\big(x^{(i)} \mid \mu_j, \Sigma_j\big)},$$
i.e. when $Q_i(k)$ is the posterior probability ("responsibility") that component $k$ generated sample $x^{(i)}$. (Jensen's inequality becomes an equality when the ratio $\pi_k \mathcal{N}(x^{(i)} \mid \mu_k, \Sigma_k) / Q_i(k)$ does not depend on $k$; requiring $\sum_k Q_i(k) = 1$ then forces exactly this choice.)
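Concretely, the E-step above amounts to a few lines of code; this is a sketch under the same assumed parameter names as the earlier block, not the author's implementation:

```python
import numpy as np
from scipy.stats import multivariate_normal

def e_step(X, pi, mu, sigma):
    """Return Q with Q[i, k] = pi_k * N(x_i | mu_k, sigma_k) / sum_j pi_j * N(x_i | mu_j, sigma_j)."""
    K = len(pi)
    densities = np.column_stack([
        pi[k] * multivariate_normal.pdf(X, mean=mu[k], cov=sigma[k])
        for k in range(K)
    ])
    # Normalize each row so the responsibilities of one sample sum to 1.
    return densities / densities.sum(axis=1, keepdims=True)
```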
With the E-step in hand, the M-step is easy to do: hold the $Q_i(k)$ fixed and maximize the lower bound with respect to the parameters. I will not push through the derivation carefully and just give the M-step directly:
$$\pi_k = \frac{1}{m}\sum_{i=1}^{m} Q_i(k), \qquad \mu_k = \frac{\sum_{i=1}^{m} Q_i(k)\, x^{(i)}}{\sum_{i=1}^{m} Q_i(k)}, \qquad \Sigma_k = \frac{\sum_{i=1}^{m} Q_i(k)\, \big(x^{(i)} - \mu_k\big)\big(x^{(i)} - \mu_k\big)^{\mathsf{T}}}{\sum_{i=1}^{m} Q_i(k)}.$$
Note: in this step the $Q_i(k)$ in the denominator inside the logarithm is dropped, since it does not depend on the parameters being maximized; I will not belabor why it can be dropped, nor why the iteration converges.
[Figure: schematic diagram of the EM iteration steps]
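A minimal sketch of the M-step updates above, with the same assumed names as before (the formulas reduce to weighted sample means and weighted sample covariances):

```python
import numpy as np

def m_step(X, Q):
    """Update (pi, mu, sigma) from responsibilities Q of shape (m, K)."""
    m, d = X.shape
    Nk = Q.sum(axis=0)                    # effective number of samples per component
    pi = Nk / m                           # pi_k = (1/m) * sum_i Q[i, k]
    mu = (Q.T @ X) / Nk[:, None]          # weighted means
    sigma = np.empty((len(Nk), d, d))
    for k in range(len(Nk)):
        diff = X - mu[k]
        sigma[k] = (Q[:, k, None] * diff).T @ diff / Nk[k]   # weighted covariances
    return pi, mu, sigma
```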
The above is "God's algorithm".
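Putting the pieces together, the whole algorithm alternates the two steps until the log-likelihood stops improving. The loop below reuses the `e_step`, `m_step`, and `gmm_log_likelihood` sketches above; the initialization scheme is an arbitrary choice of mine, not from the original post:

```python
import numpy as np

def fit_gmm(X, K, n_iter=100, tol=1e-6, seed=0):
    """Alternate E-step and M-step; the log-likelihood is non-decreasing across iterations."""
    rng = np.random.default_rng(seed)
    m, d = X.shape
    # Simple initialization: uniform weights, K random data points as means, identity covariances.
    pi = np.full(K, 1.0 / K)
    mu = X[rng.choice(m, size=K, replace=False)]
    sigma = np.array([np.eye(d) for _ in range(K)])
    prev = -np.inf
    for _ in range(n_iter):
        Q = e_step(X, pi, mu, sigma)                  # E-step: posterior responsibilities
        pi, mu, sigma = m_step(X, Q)                  # M-step: re-estimate the parameters
        ll = gmm_log_likelihood(X, pi, mu, sigma)     # should never decrease
        if ll - prev < tol:
            break
        prev = ll
    return pi, mu, sigma
```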
————————————————————————————————————————————————————————————————————————————
References for learning the EM algorithm:
1. Book: Chapter 9 of Pattern Recognition and Machine Learning, which also discusses why the K-means clustering algorithm is a special case of the EM algorithm.
2. Andrew Ng's lecture notes and the accompanying public course videos.