GMM (Gaussian Mixture Model)

GMM stands for Gaussian mixture model. Like K-means, GMM is a common clustering algorithm; the key difference is that GMM is a "soft clustering" algorithm: instead of assigning each sample to exactly one center, it yields the probability that each sample belongs to each component. Because of this property, GMM is widely used in image segmentation and speech processing.
Running K-means on n samples yields K center points; running GMM yields K Gaussian distributions. A single Gaussian distribution is written as \phi(x \mid \theta) in Formula 1, where the parameter \theta stands for the mean \mu and the standard deviation \sigma.

\phi(x \mid \theta) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right) \quad \text{(Formula 1)}
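As a quick sanity check, Formula 1 can be written directly in Python (the function name `gaussian_pdf` is an illustrative choice, not from the original article):

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Univariate Gaussian density phi(x | theta) from Formula 1,
    with theta = (mu, sigma)."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)
```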

Since GMM yields K Gaussian distributions, the resulting mixture density can be expressed as Formula 2.

p(x \mid \theta) = \sum_{k=1}^{K} w_k \,\phi(x \mid \theta_k) \quad \text{(Formula 2)}
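Formula 2 is simply a weighted sum of K Gaussian densities; a minimal sketch (the names `mixture_pdf`, `weights`, `mus`, `sigmas` are illustrative):

```python
import math

def mixture_pdf(x, weights, mus, sigmas):
    """p(x | theta) = sum_k w_k * phi(x | theta_k), Formula 2.
    weights are the mixing coefficients w_k and must sum to 1."""
    return sum(
        w * math.exp(-(x - m) ** 2 / (2 * s ** 2)) / (math.sqrt(2 * math.pi) * s)
        for w, m, s in zip(weights, mus, sigmas))
```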

Here w_k is the probability that the component \phi(x \mid \theta_k) is selected, so \sum_{k=1}^{K} w_k = 1 and w_k \ge 0. The next task is to estimate the optimal set of parameters \mu, \sigma, and w, which is done by maximum likelihood estimation. Before solving, we make the fairly strong assumption that all samples are independent of one another. The log-likelihood is then given by Formula 3.

L(\theta) = \sum_{i=1}^{n} \log p(x_i \mid \theta) = \sum_{i=1}^{n} \log \sum_{k=1}^{K} w_k\,\phi(x_i \mid \theta_k) \quad \text{(Formula 3)}

The estimate we want is \hat{\theta} = \arg\max_\theta L(\theta). For the equation in Formula 3, it is difficult to obtain the maximum by taking the derivative and setting it to zero directly, because of the sum inside the logarithm. We therefore solve the problem with the EM algorithm; its principle is described in Appendix A.
Here Jensen's inequality is used to find a lower-bound function; a description of Jensen's inequality can be found in Appendix B. Since f(x) = \log(x) is a concave function, the direction of the usual Jensen inequality is reversed. Let \gamma_{ik} denote the probability that sample x_i belongs to the k-th component, as defined in Formula 4; the derivation shown in Formula 5 then follows.

\gamma_{ik} = \frac{w_k\,\phi(x_i \mid \theta_k)}{\sum_{k=1}^{K} w_k\,\phi(x_i \mid \theta_k)} \quad \text{(Formula 4)}

\begin{aligned}
L(\theta) &= \sum_{i=1}^{n} \log \sum_{k=1}^{K} w_k\,\phi(x_i \mid \theta_k)
= \sum_{i=1}^{n} \log \sum_{k=1}^{K} \gamma_{ik}\,\frac{w_k\,\phi(x_i \mid \theta_k)}{\gamma_{ik}} \\
&= \sum_{i=1}^{n} \log E\!\left[\frac{w_k\,\phi(x_i \mid \theta_k)}{\gamma_{ik}}\right]
\ge \sum_{i=1}^{n} E\!\left[\log \frac{w_k\,\phi(x_i \mid \theta_k)}{\gamma_{ik}}\right]
= \sum_{i=1}^{n} \sum_{k=1}^{K} \gamma_{ik} \log \frac{w_k\,\phi(x_i \mid \theta_k)}{\gamma_{ik}}
\end{aligned} \quad \text{(Formula 5)}
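Formula 4 is the E-step of the algorithm: for each sample, normalize the weighted component densities. A sketch for a single sample (names are illustrative):

```python
import math

def responsibilities(x, weights, mus, sigmas):
    """gamma_k = w_k phi(x|theta_k) / sum_j w_j phi(x|theta_j), Formula 4."""
    numer = [
        w * math.exp(-(x - m) ** 2 / (2 * s ** 2)) / (math.sqrt(2 * math.pi) * s)
        for w, m, s in zip(weights, mus, sigmas)]
    total = sum(numer)
    return [v / total for v in numer]
```

By construction the returned probabilities sum to 1, which is what makes the "soft" assignment of a sample to the K components well defined.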

To facilitate the subsequent derivation, Formula 6 is used to denote the right-hand side of the inequality above.

H(w, \mu, \sigma) = \sum_{i=1}^{n} \sum_{k=1}^{K} \gamma_{ik} \log \frac{w_k\,\phi(x_i \mid \theta_k)}{\gamma_{ik}} \quad \text{(Formula 6)}

Taking the partial derivatives of H with respect to \mu_k and \sigma_k:

\frac{\partial H(w,\mu,\sigma)}{\partial \sigma_k} = \sum_{i=1}^{n} \gamma_{ik}\left[\frac{(x_i-\mu_k)^2}{\sigma_k^3} - \frac{1}{\sigma_k}\right]
\qquad
\frac{\partial H(w,\mu,\sigma)}{\partial \mu_k} = \sum_{i=1}^{n} \gamma_{ik}\,\frac{x_i-\mu_k}{\sigma_k^2}

Setting the derivatives to zero gives the maximum in the current iteration, at which point

\sigma_k^2 = \frac{\sum_{i=1}^{n} \gamma_{ik}\,(x_i-\mu_k)^2}{\sum_{i=1}^{n} \gamma_{ik}}
\qquad
\mu_k = \frac{\sum_{i=1}^{n} \gamma_{ik}\, x_i}{\sum_{i=1}^{n} \gamma_{ik}}

For w_k, which is subject to the constraint \sum_{k=1}^{K} w_k = 1, the Lagrange multiplier method yields the maximum; the calculation is omitted here and the result given directly:

w_k = \frac{1}{n} \sum_{i=1}^{n} \gamma_{ik}
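Putting the pieces together, with Formula 4 as the E-step and the three closed-form updates as the M-step, a minimal EM loop for a one-dimensional GMM might look like the sketch below. The initialisation scheme and stopping threshold are illustrative choices, not part of the original derivation:

```python
import math

def em_gmm(data, K, n_iter=200, tol=1e-6):
    """Fit a K-component 1-D Gaussian mixture by EM; stop when the
    log-likelihood improves by less than tol."""
    lo, hi = min(data), max(data)
    # crude initialisation: spread the means over the data range
    mus = [lo + (hi - lo) * (k + 1) / (K + 1) for k in range(K)]
    sigmas = [(hi - lo) / (2 * K) + 1e-3] * K
    weights = [1.0 / K] * K
    prev_ll = -float("inf")
    for _ in range(n_iter):
        # E-step: responsibilities gamma_ik (Formula 4), plus log-likelihood
        gamma, ll = [], 0.0
        for x in data:
            numer = [
                w * math.exp(-(x - m) ** 2 / (2 * s ** 2)) / (math.sqrt(2 * math.pi) * s)
                for w, m, s in zip(weights, mus, sigmas)]
            total = sum(numer)
            ll += math.log(total)
            gamma.append([v / total for v in numer])
        # M-step: closed-form updates for w_k, mu_k, sigma_k
        n = len(data)
        for k in range(K):
            nk = sum(g[k] for g in gamma)
            weights[k] = nk / n
            mus[k] = sum(g[k] * x for g, x in zip(gamma, data)) / nk
            var = sum(g[k] * (x - mus[k]) ** 2 for g, x in zip(gamma, data)) / nk
            sigmas[k] = math.sqrt(var) + 1e-12  # guard against variance collapse
        if ll - prev_ll < tol:
            break
        prev_ll = ll
    return weights, mus, sigmas
```

Note that each iteration is guaranteed not to decrease the log-likelihood, which is exactly the property of the EM lower-bound construction discussed in Appendix A.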

The new \mu_k, \sigma_k, and w_k are substituted into H(w, \mu, \sigma) to obtain a new value, and iteration stops when the change falls below a threshold. After stopping, we obtain the K Gaussian distributions that satisfy the final condition, together with their parameters.

Appendix A: The EM algorithm

For a strictly concave function f(\theta) whose maximum is sought, one can first define a lower-bound function g_{\theta_t}(\theta) \le f(\theta) at \theta_t, with equality g_{\theta_t}(\theta) = f(\theta) if and only if \theta = \theta_t. Taking \theta_{t+1} = \arg\max_\theta g_{\theta_t}(\theta), the following must hold.

f(\theta_{t+1}) \ge g_{\theta_t}(\theta_{t+1}) \ge g_{\theta_t}(\theta_t) = f(\theta_t)

We then define a new lower-bound function g_{\theta_{t+1}}(\theta) at \theta_{t+1}, and iterate in the same way; \theta_t eventually approaches \hat{\theta}.

Appendix B: Jensen's inequality

If f(x) is a convex function, the following inequality holds:

E[f(x)] \ge f(E[x])
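The inequality can be checked empirically for the convex function f(x) = x^2, where E[f(x)] - f(E[x]) equals the sample variance and is therefore non-negative:

```python
import random

random.seed(0)  # reproducible draw
xs = [random.gauss(0.0, 1.0) for _ in range(10_000)]
mean_of_square = sum(x * x for x in xs) / len(xs)   # E[f(x)] with f(x) = x^2
square_of_mean = (sum(xs) / len(xs)) ** 2           # f(E[x])
assert mean_of_square >= square_of_mean             # Jensen's inequality holds
```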


from:http://www.duzhongxiang.com/gmm/
