Contents
Reading instructions
A little tail from the first few sections
Dirichlet distribution
1 Review of the beta distribution
2 Dirichlet distribution
Derivation of the Dirichlet distribution
Beta distribution and Dirichlet distribution
3 How to better understand this distribution
0. Reading instructions
The body of this blog post contains the minimal necessary knowledge that is closely related to LDA. The contents of gray boxes are extensions and supplements; skipping them will not affect your understanding. A gray box is a paragraph of the following form:
This is a paragraph shown in a gray box.
This content is supplementary; skipping it will not affect your understanding.
1. A little tail from the first few sections
In the first section, when we derived the gamma distribution from the binomial distribution, we used the following equation:
$$P(C \le k) = \frac{n!}{k!\,(n-k-1)!} \int_p^1 t^k (1-t)^{n-k-1}\, dt, \quad C \sim B(n,p)$$
Looking at it now, the left-hand side is the cumulative probability of the binomial distribution, and the right-hand side is in fact a probability integral of the $\mathrm{Beta}(t \mid k+1, n-k)$ distribution. The equation was stated without proof at the time, so let us prove it now.
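Before the proof, the identity can be checked numerically. The following is only a minimal sketch using the Python standard library: the function names `binomial_cdf` and `beta_tail`, the Simpson's-rule integration, and the values of `n`, `k`, `p` are all illustrative assumptions, not part of the original derivation.

```python
from math import comb, factorial

def binomial_cdf(k, n, p):
    """P(C <= k) for C ~ B(n, p), summed directly from the pmf."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def beta_tail(k, n, p, steps=20_000):
    """n!/(k!(n-k-1)!) * integral from p to 1 of t^k (1-t)^(n-k-1) dt.

    The integral is approximated with composite Simpson's rule,
    so `steps` must be even.
    """
    coeff = factorial(n) / (factorial(k) * factorial(n - k - 1))
    f = lambda t: t**k * (1 - t)**(n - k - 1)
    h = (1 - p) / steps
    total = f(p) + f(1.0)
    for i in range(1, steps):
        # Simpson weights: 4 on odd interior nodes, 2 on even ones
        total += (4 if i % 2 else 2) * f(p + i * h)
    return coeff * total * h / 3

# Arbitrary illustrative values with 0 <= k < n
n, k, p = 10, 3, 0.4
lhs = binomial_cdf(k, n, p)
rhs = beta_tail(k, n, p)
print(lhs, rhs)  # the two numbers should agree to many decimal places
```

Agreement for a few parameter choices is of course not a proof; it only confirms that the equation is plausible before we derive it.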
We can construct a binomial distribution as follows: take random variables $X_1, X_2, \cdots, X_n \overset{iid}{\sim} \mathrm{Uniform}(0,1)$, and call a Bernoulli trial a success when $X_i < p$, so that the number of successes $C$ follows $B(n,p)$.
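This construction can be simulated directly. The sketch below assumes that $C$ denotes the number of indices $i$ with $X_i < p$, and compares the empirical distribution of $C$ with the exact $B(n,p)$ probabilities. The function name `simulate_counts`, the seed, and the parameter values are illustrative choices only.

```python
import random
from math import comb

def simulate_counts(n, p, trials, seed=0):
    """Empirical distribution of C = #{i : X_i < p}, with X_i iid Uniform(0,1)."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    counts = [0] * (n + 1)
    for _ in range(trials):
        xs = [rng.random() for _ in range(n)]
        c = sum(x < p for x in xs)  # number of "successful" trials
        counts[c] += 1
    return [cnt / trials for cnt in counts]

n, p = 5, 0.3  # arbitrary illustrative values
empirical = simulate_counts(n, p, trials=100_000)
exact = [comb(n, c) * p**c * (1 - p)**(n - c) for c in range(n + 1)]

# Print empirical frequency next to the exact binomial probability
for c in range(n + 1):
    print(c, round(empirical[c], 4), round(exact[c], 4))
```

The two columns should be close to each other, which is what we expect if the count of $X_i < p$ really is binomially distributed.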
Obviously, the following equation holds:
$$P(C \le k) = P(X_{(k+1)} > p)$$
Here $X_{(k+1)}$ is an order statistic, namely the $(k+1)$-th value when the $X_i$ are sorted in increasing order. The left side of the equation says that the number of successful trials is at most $k$, and the right side indicates the first