This series of notes is based on Daphne Koller's Probabilistic Graphical Models Stanford open course on Coursera (https://class.coursera.org/pgm-2012-002/class/index).
Main contents include (if reprinting, please credit the original source http://blog.csdn.net/yangliuy):
1. Representation of probabilistic graphical models, including Bayesian networks and Markov networks and their variants.
2. Inference methods, including exact inference (variable elimination, clique trees) and approximate inference (belief propagation message passing, Markov chain Monte Carlo methods).
3. Learning methods for the parameters and structure of probabilistic graphical models.
4. Using probabilistic graphical models for statistical decision modeling.
Lecture 1. Bayesian Network Basics
1. Bayesian network definition
A Bayesian network is a directed acyclic graph whose nodes represent random variables and whose edges represent probabilistic relationships between the random variables. The joint probability distribution can be expressed by the Bayesian chain rule:

P(X1, ..., Xn) = ∏_i P(Xi | ParG(Xi))

where ParG(Xi) denotes the random variables corresponding to the parent nodes of node Xi in the graph G.
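As a minimal sketch of this chain-rule factorization, consider a hypothetical three-node network D → G ← I with made-up CPT values (the variable names and numbers below are illustrative assumptions, not from the course):

```python
# Tiny network D -> G <- I; the joint factorizes by the chain rule as
#   P(D, I, G) = P(D) * P(I) * P(G | D, I)
# All CPT numbers below are hypothetical, chosen only for illustration.
P_D = {0: 0.6, 1: 0.4}   # course difficulty: 0 = easy, 1 = hard
P_I = {0: 0.7, 1: 0.3}   # student intelligence: 0 = low, 1 = high
# P(G = 1 | D, I): probability of a good grade given difficulty and intelligence
P_G1 = {(0, 0): 0.6, (0, 1): 0.95, (1, 0): 0.2, (1, 1): 0.8}

def joint(d, i, g):
    """P(D=d, I=i, G=g) computed via the Bayesian chain rule."""
    p_g1 = P_G1[(d, i)]
    return P_D[d] * P_I[i] * (p_g1 if g == 1 else 1 - p_g1)

# Sanity check: the joint distribution must sum to 1 over all assignments.
total = sum(joint(d, i, g) for d in (0, 1) for i in (0, 1) for g in (0, 1))
print(round(total, 10))  # 1.0
```

Each factor involves only a node and its parents, which is exactly what makes the representation compact compared with a full joint table.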
2. Flow of probabilistic influence in Bayesian networks
The flow of probabilistic influence reflects the conditional independence of random variables in a Bayesian network, as illustrated by the classic student network example.
The Bayesian network model in the figure captures the relationships among the following five random variables:
- Difficulty: the difficulty of the course
- Intelligence: the intelligence of the student
- Grade: the student's grade in the course
- SAT: the student's SAT score
- Letter: whether the student receives a letter of recommendation from the professor
Among the six cases on the left, in the v-structure X → W ← Y the value of X does not affect the probability of Y. This is because W is not an observed variable: its value is unknown, so the value of the random variable X does not influence the value of the random variable Y. Interestingly, once W becomes an observed variable, this conclusion is reversed.
When W ∈ Z, that is, when W is an observed variable, all of the judgments become the opposite. Take X → W ← Y as an example. Suppose the value of W is known, say the student's grade is B. Then the student's intelligence and the course's difficulty are no longer independent: if the course is known to be easy, the student is less likely to be intelligent, whereas if the course is known to be difficult, the student has a higher probability of being intelligent. This is the "explaining away" effect. The other cases can be verified against the Bayesian network example on the right.
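The reversal at a v-structure can be checked numerically. The sketch below uses a hypothetical network D → G ← I with made-up CPT values (an illustrative assumption, not the course's numbers) and computes conditionals by brute-force enumeration:

```python
# Numerical illustration of "explaining away" at the v-structure D -> G <- I.
# All CPT numbers are hypothetical, chosen only for illustration.
P_D = {0: 0.6, 1: 0.4}   # course difficulty: 0 = easy, 1 = hard
P_I = {0: 0.7, 1: 0.3}   # student intelligence: 0 = low, 1 = high
P_G1 = {(0, 0): 0.6, (0, 1): 0.95, (1, 0): 0.2, (1, 1): 0.8}  # P(G=good | D, I)

def joint(d, i, g):
    p = P_G1[(d, i)]
    return P_D[d] * P_I[i] * (p if g == 1 else 1 - p)

def prob(event, given=lambda d, i, g: True):
    """P(event | given) by brute-force enumeration of the joint."""
    num = sum(joint(d, i, g) for d in (0, 1) for i in (0, 1) for g in (0, 1)
              if event(d, i, g) and given(d, i, g))
    den = sum(joint(d, i, g) for d in (0, 1) for i in (0, 1) for g in (0, 1)
              if given(d, i, g))
    return num / den

# Marginally, D and I are independent: observing D tells us nothing about I,
# so P(I=1 | D=1) equals the prior P(I=1) = 0.3.
print(round(prob(lambda d, i, g: i == 1, lambda d, i, g: d == 1), 4))

# Once the grade G is observed, D and I become dependent: a good grade in a
# hard course makes high intelligence more likely than a good grade alone.
p_good = prob(lambda d, i, g: i == 1, lambda d, i, g: g == 1)
p_good_hard = prob(lambda d, i, g: i == 1, lambda d, i, g: g == 1 and d == 1)
print(round(p_good, 4), round(p_good_hard, 4))
```

The second print shows the two conditionals differ, confirming that conditioning on the collider G opens the trail between D and I.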
3. Active trails
After the analysis in the previous part, we can summarize the following conclusions:
A trail X1 – ... – Xn is active if:
- it has no v-structures.
A trail X1 – ... – Xn is active given Z if:
- for any v-structure Xi-1 → Xi ← Xi+1, we have that Xi or one of its descendants is in Z;
- no other Xi is in Z.
Clearly, if the trail X1 – ... – Xn is active given Z, then X1 and Xn are not conditionally independent given Z.
4. Independence in graphs
Here we summarize d-separation: X and Y are d-separated given Z if there is no active trail between X and Y given Z. D-separation in the graph implies conditional independence in any distribution that factorizes over the graph.
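The active-trail rules above can be turned into a small brute-force d-separation checker. The sketch below enumerates simple trails in the skeleton of a tiny student-style DAG (node names and edges are an illustrative assumption) and applies the two rules directly:

```python
# Brute-force d-separation checker based on the active-trail rules.
# Hypothetical student network: D -> G <- I, I -> S, G -> L.
parents = {"D": [], "I": [], "G": ["D", "I"], "S": ["I"], "L": ["G"]}

children = {v: [] for v in parents}
for v, ps in parents.items():
    for p in ps:
        children[p].append(v)

def descendants(v):
    """All strict descendants of v in the DAG."""
    out, stack = set(), [v]
    while stack:
        for c in children[stack.pop()]:
            if c not in out:
                out.add(c)
                stack.append(c)
    return out

def trails(x, y, path=None):
    """All simple trails from x to y in the skeleton (directions ignored)."""
    path = path or [x]
    if x == y:
        yield path
        return
    for nxt in parents[x] + children[x]:
        if nxt not in path:
            yield from trails(nxt, y, path + [nxt])

def trail_active(trail, Z):
    for a, b, c in zip(trail, trail[1:], trail[2:]):
        if b in children[a] and b in children[c]:      # v-structure a -> b <- c
            if not ({b} | descendants(b)) & Z:         # active only if b or a
                return False                           # descendant is observed
        elif b in Z:                                   # non-collider blocked
            return False                               # when observed
    return True

def d_separated(x, y, Z):
    return not any(trail_active(t, set(Z)) for t in trails(x, y))

print(d_separated("D", "I", set()))   # True: unobserved v-structure blocks
print(d_separated("D", "I", {"G"}))   # False: observing G activates the trail
print(d_separated("D", "I", {"L"}))   # False: L is a descendant of G
print(d_separated("D", "L", {"G"}))   # True: observing G blocks D -> G -> L
```

Enumerating all simple trails is exponential in general; it is fine for toy graphs like this one, while practical implementations use the linear-time Bayes-ball reachability algorithm instead.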