Dirichlet process and Dirichlet process mixture model
[Original article: http://www.cnblogs.com/breezedeus/archive/2012/11/05/2754940.html. Please credit the source when reposting.]
The Dirichlet process (DP) is often described as a distribution over distributions. Each sample drawn from a DP is itself a probability distribution, and it is discrete with probability one: it places nonzero probability mass on a countably infinite set of points. Interestingly, several well-known constructions can be derived from the DP: the Chinese restaurant process (CRP), the Pólya urn scheme, and the stick-breaking process. A brief introduction can be found in Edwin Chen's blog post "Infinite Mixture Models with Nonparametric Bayes and the Dirichlet Process".
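To make the stick-breaking view concrete, here is a minimal sketch that draws an approximate sample from a DP by truncating the construction after a fixed number of sticks. The function names, the truncation level `num_sticks`, and the choice of a standard normal as the base distribution are my own assumptions for illustration, not something taken from the references above.

```python
import random

def stick_breaking_weights(alpha, num_sticks, rng):
    """Return the first `num_sticks` stick-breaking weights and the unbroken remainder."""
    weights = []
    remaining = 1.0  # length of the stick that has not been broken off yet
    for _ in range(num_sticks):
        frac = rng.betavariate(1.0, alpha)  # break fraction ~ Beta(1, alpha)
        weights.append(remaining * frac)    # mass assigned to this atom
        remaining *= 1.0 - frac
    return weights, remaining

def sample_truncated_dp(alpha, base_sampler, num_sticks, rng):
    """Approximate DP draw: atom locations from the base distribution, weights from stick breaking."""
    weights, leftover = stick_breaking_weights(alpha, num_sticks, rng)
    atoms = [base_sampler(rng) for _ in range(num_sticks)]
    return atoms, weights, leftover

rng = random.Random(0)
atoms, weights, leftover = sample_truncated_dp(
    alpha=2.0,
    base_sampler=lambda r: r.gauss(0.0, 1.0),  # assumed base distribution: N(0, 1)
    num_sticks=50,
    rng=rng,
)
```

The result is a discrete distribution: `atoms[k]` carries probability `weights[k]`, and `leftover` is the mass lost to truncation. A smaller `alpha` concentrates the mass on the first few atoms, while a larger `alpha` spreads it over many.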
These properties allow the DP to serve as the prior distribution over parameters in nonparametric Bayesian clustering models. The Dirichlet process mixture (DPM) is the typical representative of such models. A DPM can be viewed as a generalization of the finite mixture (FM) model: an FM (such as the Gaussian mixture model) requires the number of clusters to be specified in advance, while a DPM does not and instead determines the number of clusters from the data itself. In theory, the expected number of clusters in a DPM grows logarithmically with the number of sample points. Researchers have proposed many algorithms for training a DPM, from Gibbs sampling, to collapsed Gibbs sampling, to variational methods. I implemented collapsed Gibbs sampling; its speed is a major bottleneck, and running it on large datasets is very laborious. Another problem with the DPM is that, because the number of clusters is determined automatically by the algorithm (although the hyperparameter alpha can roughly regulate it), the resulting number of clusters may differ greatly from what you expect.
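The logarithmic growth and the role of alpha can be checked with the Chinese restaurant process, which describes the cluster assignments a DP induces. The sketch below is only an illustration with names of my own choosing: customer i+1 joins an existing table k with probability counts[k] / (i + alpha) and opens a new table with probability alpha / (i + alpha), so the expected number of tables after n customers is the sum of alpha / (alpha + i) for i = 0, ..., n-1, which is roughly alpha * log(1 + n / alpha).

```python
import random

def crp_assignments(n, alpha, rng):
    """Simulate table assignments for n customers under a CRP with concentration alpha."""
    counts = []       # counts[k] = number of customers at table k
    assignments = []  # assignments[i] = table index of customer i
    for i in range(n):
        # Unnormalized weights: counts[k] for existing tables, alpha for a new one.
        u = rng.random() * (i + alpha)
        table = len(counts)  # default: open a new table
        cumulative = 0.0
        for k, c in enumerate(counts):
            cumulative += c
            if u < cumulative:
                table = k
                break
        if table == len(counts):
            counts.append(1)
        else:
            counts[table] += 1
        assignments.append(table)
    return assignments, counts

def expected_num_tables(n, alpha):
    """Exact expected number of occupied tables after n customers."""
    return sum(alpha / (alpha + i) for i in range(n))

rng = random.Random(0)
assignments, counts = crp_assignments(n=1000, alpha=1.0, rng=rng)
print(len(counts), expected_num_tables(1000, 1.0))
```

Comparing `len(counts)` across a few values of `n` and `alpha` shows both effects mentioned above: the number of clusters grows slowly with the data size, and alpha shifts it up or down without fixing it exactly.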
If you want to learn more about the DP and DPM, Yee W. Teh's homepage has many relevant papers, slides, and presentations, as well as open-source DPM software written in MATLAB. For the various algorithms and their detailed derivations, I recommend Xiaodong Yu's blog, which contains a very detailed set of study notes (although with a few small typos) and further references. I have also written a summary, but I could not be bothered to typeset it in LaTeX, so I packaged it as images and put it on a network drive; only the last page, the reference list, is pasted below. Those references can be found and downloaded directly via Google. Readers who are not interested in the theory can simply skip it, haha.