Mahout Series: Dirichlet distribution

A Dirichlet distribution can be seen as a distribution over distributions. To understand this sentence, consider an example: suppose we have a die with six faces, labeled {1, 2, 3, 4, 5, 6}. We throw it 10,000 times, and the six faces come up {2000, 2000, 2000, 2000, 1000, 1000} times respectively. If we estimate each face's probability by the ratio of its count to the total number of throws, we get the probability vector {0.2, 0.2, 0.2, 0.2, 0.1, 0.1}. Now suppose we are not satisfied and repeat the whole experiment 10,000 times, throwing the die 10,000 times in each experiment. We want to know how likely it is, across these experiments, that the estimated six-face probability vector comes out as {0.2, 0.2, 0.2, 0.2, 0.1, 0.1} (the next experiment might instead give {0.1, 0.1, 0.2, 0.2, 0.2, 0.2}). In other words, we are asking about the distribution of the probability distribution over the die's six faces, and such a distribution is a Dirichlet distribution.
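To make the dice story concrete, here is a minimal sketch, assuming numpy and a die whose true face probabilities are {0.2, 0.2, 0.2, 0.2, 0.1, 0.1}: each repeated batch of throws yields a slightly different empirical probability vector, and it is exactly this variation over probability vectors that the Dirichlet distribution describes.

import numpy as np

# Repeat the 10,000-throw experiment a few times; each run produces one
# empirical probability vector, i.e. one point on the simplex.
rng = np.random.default_rng(0)
true_p = np.array([0.2, 0.2, 0.2, 0.2, 0.1, 0.1])   # assumed true face probabilities

for i in range(5):                                   # 10,000 repeats in the text; 5 here
    counts = rng.multinomial(10_000, true_p)         # face counts for one experiment
    empirical_p = counts / counts.sum()              # estimated probability vector
    print(f"experiment {i}: {np.round(empirical_p, 3)}")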

With the paragraph above as an intuitive picture, here is some background material:

The Wikipedia article on the Dirichlet distribution seemed too involved for someone without enough background. I found a CMU slide deck, Dirichlet Distribution, Dirichlet Process and Dirichlet Process Mixture, and a University of Washington tutorial, Introduction to the Dirichlet Distribution and Related Processes.

In that CMU deck, the statement "the beta distribution is the conjugate prior of the binomial distribution" is what made things click: since the beta distribution is the conjugate prior of the binomial distribution, the Dirichlet distribution is the conjugate prior of the multinomial distribution. To understand the Dirichlet distribution, then, we first need to understand the multinomial distribution, and to understand that relationship we should first look at how the beta distribution relates to the Bernoulli and binomial distributions. So the multinomial distribution, the beta distribution, and conjugacy are the three key pieces of background for understanding the Dirichlet distribution, and all of them are covered in PRML Section 2.1, which is where these notes come from.
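As a quick reminder of the beta-binomial case that everything else builds on, here is a minimal sketch with made-up numbers: with a Beta(a, b) prior on a coin's heads probability and h heads out of n tosses, the posterior is simply Beta(a + h, b + n - h).

# Beta-binomial conjugate update (illustrative numbers only).
a, b = 2.0, 2.0              # assumed prior pseudo-counts for heads / tails
heads, tails = 7, 3          # assumed observed tosses

a_post, b_post = a + heads, b + tails        # posterior is again a beta distribution
posterior_mean = a_post / (a_post + b_post)  # E[heads probability | data]

print(f"posterior: Beta({a_post}, {b_post}), mean = {posterior_mean:.3f}")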

Now we formally turn to the Dirichlet distribution, starting with the parameter μ of the multinomial distribution. In the Bernoulli distribution, the parameter μ is the probability of a coin toss landing on a particular side, because the Bernoulli state space is just {0, 1}. In the multinomial case the state space has K values, so μ becomes a vector μ = (μ_1, ..., μ_K)^T. The likelihood function of the multinomial distribution has the form ∏_{k=1}^{K} μ_k^{m_k}, so, by analogy with choosing the beta distribution as the conjugate prior in the Bernoulli/binomial case, the conjugate prior for the multinomial should have the functional form:

p(μ|α) ∝ ∏_{k=1}^{K} μ_k^{α_k − 1}    (eq. 2.37)

Here ∑_k μ_k = 1, and α = (α_1, ..., α_K)^T is the parameter vector of the Dirichlet distribution. Normalizing equation 2.37 finally gives the Dirichlet distribution proper:

Dir(μ|α) = Γ(α_0) / (Γ(α_1) ⋯ Γ(α_K)) · ∏_{k=1}^{K} μ_k^{α_k − 1}

where α_0 = ∑_{k=1}^{K} α_k. This function looks a bit like the beta distribution (it reduces to the beta distribution when K = 2) and also a bit like the multinomial distribution. Like the beta distribution, the Dirichlet distribution is a distribution over the parameter μ of the corresponding multinomial distribution, except that μ is now a vector. The figure below shows an example of a Dirichlet probability density function when μ = (μ_1, μ_2, μ_3) has only three components. The triangle in the middle of the diagram is the simplex lying in a plane; its three vertices correspond to μ = (1, 0, 0), μ = (0, 1, 0) and μ = (0, 0, 1), so any point inside the triangle is a valid value of μ, and the vertical axis gives the probability density (PDF) at that μ on the simplex.
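To connect the formula to the K = 3 picture, here is a small sketch (standard library only, with an arbitrarily chosen α) that evaluates Dir(μ|α) directly from the gamma-function form at a few points on the simplex.

from math import gamma, prod

def dirichlet_pdf(mu, alpha):
    # Dir(mu | alpha) evaluated straight from the formula above;
    # mu must lie on the simplex (non-negative, summing to 1).
    a0 = sum(alpha)
    norm = gamma(a0) / prod(gamma(a) for a in alpha)
    return norm * prod(m ** (a - 1) for m, a in zip(mu, alpha))

alpha = [2.0, 2.0, 2.0]                              # assumed symmetric parameters
for mu in ([1/3, 1/3, 1/3], [0.6, 0.3, 0.1], [0.98, 0.01, 0.01]):
    print(mu, "->", round(dirichlet_pdf(mu, alpha), 4))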

To estimate the parameter μ, we use the fact that posterior ∝ likelihood × prior, which gives the functional form:

p(μ|D, α) ∝ p(D|μ) p(μ|α) ∝ ∏_{k=1}^{K} μ_k^{α_k + m_k − 1}

From this form we can see that the posterior is also a Dirichlet distribution. Normalizing it, just as we did for the posterior in the beta-distribution case, we get:

p(μ|D, α) = Dir(μ|α + m) = Γ(α_0 + N) / (Γ(α_1 + m_1) ⋯ Γ(α_K + m_K)) · ∏_{k=1}^{K} μ_k^{α_k + m_k − 1}

where m = (m_1, ..., m_K)^T are the observed counts of each outcome and N = ∑_{k=1}^{K} m_k is the total number of observations.
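Putting the pieces together for the dice example, here is a minimal sketch of this conjugate update (prior α chosen arbitrarily, counts m taken from the experiment at the top of the post): the posterior is Dir(μ|α + m), and its mean is (α_k + m_k) / (α_0 + N).

import numpy as np

# Dirichlet-multinomial conjugate update for the dice counts above.
alpha = np.ones(6)                                  # assumed uniform prior Dir(1, ..., 1)
m = np.array([2000, 2000, 2000, 2000, 1000, 1000])  # observed face counts

alpha_post = alpha + m                              # posterior parameters: Dir(mu | alpha + m)
posterior_mean = alpha_post / alpha_post.sum()      # (alpha_k + m_k) / (alpha_0 + N)

print("posterior parameters:", alpha_post)
print("posterior mean      :", np.round(posterior_mean, 3))
# With this much data the posterior mean is essentially the empirical
# frequencies {0.2, 0.2, 0.2, 0.2, 0.1, 0.1}, and samples drawn with
# np.random.default_rng(0).dirichlet(alpha_post) concentrate tightly around them.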
