The pLSA model is frequentist in spirit: each document's distribution over the K topics is a fixed but unknown parameter, and so is each topic's distribution over words, so training seeks point estimates of a fixed doc-topic model and a fixed topic-word model. The Bayesian school disagrees. Since a document's topic distribution and a topic's word distribution are unknown, we should not solve for exact values; instead we can only infer a probability distribution over the doc-topic probability model and the topic-word probability model themselves.
LDA model document generation process
We treat the doc-topic probability model as a K-dimensional vector (K is the number of topics) and each topic-word probability model as a V-dimensional vector (V is the number of distinct words). In pLSA, throwing a doc-topic die is a K-outcome multinomial experiment and throwing a topic-word die is a V-outcome multinomial experiment, so it is natural to model the prior of the doc-topic die with a K-dimensional Dirichlet distribution and the prior of each topic-word die with a V-dimensional Dirichlet distribution. Remolding pLSA in this Bayesian way yields the LDA model. The LDA document generation process is illustrated as follows
Since the topic-word model is document-independent, we generate all K topic-word dice from the Dirichlet distribution once, before any document is generated. The doc-topic model, by contrast, is specific to each document, so before generating each document we must draw one doc-topic die from the Dirichlet distribution. The LDA document generation process is as follows
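The generation order just described can be sketched as a toy simulation (a minimal sketch, assuming numpy; the sizes K, V, M and the 10-word document length are made-up toy values, not part of the model):

```python
import numpy as np

rng = np.random.default_rng(0)

K, V, M = 3, 8, 2            # topics, vocabulary size, documents (toy sizes)
alpha = np.full(K, 0.5)      # symmetric Dirichlet prior on doc-topic dice
beta = np.full(V, 0.1)       # symmetric Dirichlet prior on topic-word dice

# Before any document is generated: draw all K topic-word dice at once.
phi = rng.dirichlet(beta, size=K)          # shape (K, V)

corpus = []
for m in range(M):
    # Before generating document m: draw that document's own doc-topic die.
    theta_m = rng.dirichlet(alpha)         # shape (K,)
    doc = []
    for _ in range(10):                    # 10 words per toy document
        z = rng.choice(K, p=theta_m)       # K-outcome throw of the doc-topic die
        w = rng.choice(V, p=phi[z])        # V-outcome throw of die number z
        doc.append(w)
    corpus.append(doc)

print(corpus)
```

Note that `phi` is drawn exactly once for the whole corpus, while `theta_m` is redrawn for every document, mirroring the order stated above.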
Physical decomposition of the LDA model
Physical process decomposition
The generation of the n-th word of the m-th document can be decomposed into the following two processes:
1. Draw a doc-topic die $\vec\theta_m$ from the Dirichlet($\vec\alpha$) distribution, throw this die (a K-outcome multinomial experiment), and generate the topic $z$ ($1 \le z \le K$).

2. The K topic-word dice (numbered 1 to K) have been generated in advance from the Dirichlet($\vec\beta$) distribution; select the die numbered $z$, throw it (a V-outcome multinomial experiment), and generate the word $w$.
Mathematical description of the LDA model
The first physical process, $\vec\alpha \to \vec\theta_m \to \vec z_m$, is clearly a Dirichlet-multinomial conjugate structure.

Compare with the formula below (reproduced following "LDA Math Gossip"; it is simply the multinomial likelihood integrated against the Dirichlet prior):

$$p(\vec z_m \mid \vec\alpha) = \int p(\vec z_m \mid \vec\theta_m)\, p(\vec\theta_m \mid \vec\alpha)\, d\vec\theta_m = \frac{\Delta(\vec n_m + \vec\alpha)}{\Delta(\vec\alpha)}, \qquad \Delta(\vec\alpha) = \frac{\prod_{k=1}^K \Gamma(\alpha_k)}{\Gamma\!\left(\sum_{k=1}^K \alpha_k\right)}$$

Here $\vec n_m = (n_m^{(1)}, \ldots, n_m^{(K)})$, where $n_m^{(k)}$ is the number of words in the m-th document assigned to topic $k$ (that is, the number of die throws that produced topic $k$; these counts are unknown to us). The posterior distribution of the parameter is

$$\mathrm{Dir}(\vec\theta_m \mid \vec n_m + \vec\alpha)$$
Because the M documents in the corpus are mutually independent, we obtain M independent Dirichlet-multinomial conjugate structures, so the probability of generating all the topics of the corpus is

$$p(\vec z \mid \vec\alpha) = \prod_{m=1}^M \frac{\Delta(\vec n_m + \vec\alpha)}{\Delta(\vec\alpha)} \qquad (1)$$
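Formula (1) can be evaluated numerically from the per-document topic counts alone. A minimal sketch using only the standard library (computing $\log \Delta(\cdot)$ via `math.lgamma`; the counts and hyperparameters below are made-up toy values):

```python
from math import lgamma

def log_delta(v):
    """log Delta(v) = sum_k log Gamma(v_k) - log Gamma(sum_k v_k)."""
    return sum(lgamma(x) for x in v) - lgamma(sum(v))

def log_p_z(topic_counts_per_doc, alpha):
    """log p(z | alpha) = sum over documents m of
    log Delta(n_m + alpha) - log Delta(alpha), i.e. formula (1) in log space."""
    return sum(
        log_delta([n + a for n, a in zip(n_m, alpha)]) - log_delta(alpha)
        for n_m in topic_counts_per_doc
    )

alpha = [0.5, 0.5, 0.5]          # K = 3, symmetric prior (toy values)
counts = [[4, 1, 0], [0, 2, 3]]  # n_m^(k): topic counts for M = 2 documents
print(log_p_z(counts, alpha))
```

Working in log space avoids overflow in the Gamma functions; note that an empty document contributes $\log 1 = 0$, as it should.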
Since the topic-word probability distributions are independent of the documents, the K topic-word dice carry K Dirichlet distributions, so we should expect K Dirichlet-multinomial conjugate structures.
In the LDA process as described so far, however, it is hard to see the K per-topic V-outcome experiments directly, so we reorder the process.
1. For each word of each document, perform one doc-topic die multinomial throw followed by one topic-word die multinomial throw.

This is revised to:

2. For each document, first perform all n doc-topic die throws, then perform the corresponding n topic-word die throws (n is the number of words in the document).

And modified further to:

3. For the whole corpus, first perform all N doc-topic die throws and sort the results into K classes, one class per topic; then, within each of the K classes, perform the corresponding topic-word die throws, N throws in total (N is the number of words in the entire corpus).
Throwing the topic-word dice within each of the K classes is exactly K independent V-outcome experiments, one per topic-word die. The above procedure can be represented by the following two expressions:

$$\vec w = (\vec w_{(1)}, \ldots, \vec w_{(K)}), \qquad \vec z = (\vec z_{(1)}, \ldots, \vec z_{(K)})$$

Here $\vec z$ collects the results of all the doc-topic multinomial throws sorted by topic (every document's throws, classified by outcome), and $\vec w_{(k)}$ collects the words generated by topic $k$'s topic-word die.
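The reordering in step 3 amounts to sorting the corpus-wide (word, topic) pairs by their topic assignment. A small sketch (the token list and variable names are illustrative, not from the source):

```python
from collections import defaultdict

# Flattened corpus: (word id, topic assignment) for every token, in document order.
tokens = [(5, 0), (2, 1), (5, 0), (7, 2), (2, 1), (0, 0)]

# Sort the doc-topic throw results into K classes, one class per topic.
words_by_topic = defaultdict(list)
for w, z in tokens:
    words_by_topic[z].append(w)

# Each class is now one V-outcome experiment for that topic's topic-word die.
print(dict(words_by_topic))  # → {0: [5, 5, 0], 1: [2, 2], 2: [7]}
```

After this regrouping, the words in `words_by_topic[k]` are exactly the vector $\vec w_{(k)}$: repeated throws of the single die numbered $k$.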
So the second physical process, $\vec\beta \to \vec\varphi_k \to \vec w_{(k)}$, is also a Dirichlet-multinomial conjugate structure.
We have

$$p(\vec w_{(k)} \mid \vec\beta) = \frac{\Delta(\vec n_k + \vec\beta)}{\Delta(\vec\beta)}$$

Here $\vec n_k = (n_k^{(1)}, \ldots, n_k^{(V)})$, where $n_k^{(t)}$ is the number of times topic $k$ generates word $t$ (unknown to us). The posterior distribution is

$$\mathrm{Dir}(\vec\varphi_k \mid \vec n_k + \vec\beta)$$
Since the K topics generate their words independently, we obtain K independent Dirichlet-multinomial conjugate structures, so the probability of generating all the words of the corpus is

$$p(\vec w \mid \vec z, \vec\beta) = \prod_{k=1}^K \frac{\Delta(\vec n_k + \vec\beta)}{\Delta(\vec\beta)} \qquad (2)$$
Since the generation of topics and the generation of words are independent, combining (1) and (2) gives

$$p(\vec w, \vec z \mid \vec\alpha, \vec\beta) = p(\vec w \mid \vec z, \vec\beta)\, p(\vec z \mid \vec\alpha) = \prod_{k=1}^K \frac{\Delta(\vec n_k + \vec\beta)}{\Delta(\vec\beta)} \prod_{m=1}^M \frac{\Delta(\vec n_m + \vec\alpha)}{\Delta(\vec\alpha)}$$
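The joint probability depends only on the two count matrices $n_m^{(k)}$ and $n_k^{(t)}$. A self-contained sketch that multiplies the two products in log space via `math.lgamma` (the counts and hyperparameters are made-up toy values whose token totals agree):

```python
from math import lgamma

def log_delta(v):
    """log Delta(v) = sum_k log Gamma(v_k) - log Gamma(sum_k v_k)."""
    return sum(lgamma(x) for x in v) - lgamma(sum(v))

def log_joint(doc_topic_counts, topic_word_counts, alpha, beta):
    """log p(w, z | alpha, beta): the log of formula (1) plus the log of (2)."""
    lp_z = sum(log_delta([n + a for n, a in zip(n_m, alpha)]) - log_delta(alpha)
               for n_m in doc_topic_counts)
    lp_w = sum(log_delta([n + b for n, b in zip(n_k, beta)]) - log_delta(beta)
               for n_k in topic_word_counts)
    return lp_z + lp_w

alpha = [0.5, 0.5]             # K = 2 (toy prior)
beta = [0.1, 0.1, 0.1]         # V = 3 (toy prior)
n_mk = [[3, 1], [0, 2]]        # topic counts per document (toy, 6 tokens)
n_kt = [[2, 1, 0], [1, 0, 2]]  # word counts per topic (toy, same 6 tokens)
print(log_joint(n_mk, n_kt, alpha, beta))
```

This joint distribution over counts is exactly the quantity that collapsed Gibbs sampling for LDA is built on, since both dice parameters have been integrated out.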
Reference: "LDA Math Gossip"