From: http://blog.csdn.net/xianlingmao/article/details/5774435
Graphical model is a general term for a family of techniques that use graphs to represent probability distributions.
Its main advantage is that it encodes the conditional independencies of a probability distribution in the form of a graph, so that a specific, application-dependent distribution can be expressed as the product of many factors. This simplifies the calculation of marginal distributions. Here, a marginal distribution means the distribution of m variables computed from a given joint distribution over n variables (m < n).
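The saving can be seen on a small case. The sketch below assumes a hypothetical chain X1 -> X2 -> X3 of binary variables with made-up probabilities (none of these numbers come from the article). It computes the marginal P(X3) twice: once by summing the full joint, and once by using the factorization to sum out one variable at a time.

```python
from itertools import product

# Hypothetical chain X1 -> X2 -> X3 over binary variables (values 0 and 1).
# All numbers are invented for illustration.
p_x1 = {0: 0.6, 1: 0.4}
p_x2_given_x1 = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.2, 1: 0.8}}  # [x1][x2]
p_x3_given_x2 = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.5, 1: 0.5}}  # [x2][x3]


def marginal_x3_brute_force():
    """Sum the full joint P(x1, x2, x3) over x1 and x2."""
    result = {0: 0.0, 1: 0.0}
    for x1, x2, x3 in product((0, 1), repeat=3):
        result[x3] += p_x1[x1] * p_x2_given_x1[x1][x2] * p_x3_given_x2[x2][x3]
    return result


def marginal_x3_factored():
    """Use the factorization: sum out x1 first, then x2."""
    p_x2 = {
        x2: sum(p_x1[x1] * p_x2_given_x1[x1][x2] for x1 in (0, 1))
        for x2 in (0, 1)
    }
    return {
        x3: sum(p_x2[x2] * p_x3_given_x2[x2][x3] for x2 in (0, 1))
        for x3 in (0, 1)
    }
```

Both functions return the same marginal. The factored version never builds the full joint table, which is what keeps the cost down as the number of variables grows.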
There are two main types of graphical models: Bayesian networks (directed graphical models) and Markov networks (undirected graphical models).
When talking about a graphical model, there are three main concerns:
1) Representation: what a graphical model should look like.
2) Inference: how to calculate the probability of a query when the graphical model is known. For example, having observed some nodes, we calculate the probabilities of other, unobserved nodes.
3) Learning: there are two types, structure learning and parameter learning.
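As a rough illustration of the inference task, the sketch below answers a conditional query by brute-force enumeration over a full joint table. The function name `conditional`, the table layout and the numbers are all assumptions made for this example. Practical inference algorithms exploit the graph structure instead of enumerating every assignment.

```python
def conditional(joint, query_index, evidence):
    """Return P(X_query | evidence) from a full joint table.

    joint maps tuples of values to probabilities.
    evidence maps a variable index to its observed value.
    Assumes the evidence has non-zero probability.
    """
    totals = {}
    for assignment, prob in joint.items():
        # Keep only the assignments that agree with everything observed.
        if all(assignment[i] == v for i, v in evidence.items()):
            value = assignment[query_index]
            totals[value] = totals.get(value, 0.0) + prob
    norm = sum(totals.values())
    return {value: p / norm for value, p in totals.items()}


# Hypothetical joint over two binary variables (X1, X2).
joint = {(0, 0): 0.3, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.4}

# Query: distribution of X1 (index 0) after observing X2 = 1 (index 1).
posterior = conditional(joint, query_index=0, evidence={1: 1})
```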
In this article, we will focus on the representation of graphical models. Future articles will discuss the other aspects.
1. Representation of directed graphical models
As the name implies, the structure of a directed graphical model is a directed graph. The directed graph represents a probability distribution and can therefore be used for inference.
How does a directed graphical model express a probability distribution through a directed graph?
For a probability distribution P(X1, X2, ..., Xn), we can use the chain rule of probability theory to write it as a product of factors:
P(X1, X2, ..., Xn) = P(X1) P(X2 | X1) P(X3 | X1, X2) ... P(Xn | X1, X2, ..., X_{n-1})
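The chain rule holds for any joint distribution, which a small numerical check can confirm. The sketch below builds a hypothetical joint over three binary variables from arbitrary weights, obtains each conditional as a ratio of marginals, and multiplies them back together. The helper names are illustrative only.

```python
from itertools import product

# Hypothetical joint over (X1, X2, X3); the weights are arbitrary.
weights = [1, 2, 3, 4, 5, 6, 7, 8]
total = sum(weights)
joint = {xs: w / total for xs, w in zip(product((0, 1), repeat=3), weights)}


def prefix_marginal(joint, prefix):
    """P(X1..Xk = prefix), found by summing out the remaining variables."""
    k = len(prefix)
    return sum(p for xs, p in joint.items() if xs[:k] == prefix)


def chain_rule_product(joint, xs):
    """P(x1) * P(x2 | x1) * P(x3 | x1, x2) for one assignment xs."""
    result = 1.0
    for k in range(1, len(xs) + 1):
        numerator = prefix_marginal(joint, xs[:k])
        denominator = prefix_marginal(joint, xs[:k - 1])  # empty prefix gives 1
        result *= numerator / denominator  # this ratio is P(x_k | x_1..x_{k-1})
    return result
```

For every assignment, `chain_rule_product(joint, xs)` agrees with `joint[xs]` up to floating-point rounding.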
This is the general form of a probability distribution. In a specific distribution, many random variables are independent or conditionally independent of each other.
The preceding formula can then be simplified further. For example, if X3 and X1 are independent given X2, then P(X3 | X1, X2) = P(X3 | X2). From the simplified factorization we build a directed graph as follows. Each random variable corresponds to a node of the graph. Then, for each factor, we draw an edge from every variable in the conditioning part (to the right of the bar) to the variable being conditioned (to the left of the bar). Once all factors have been processed, the directed graphical model is complete. This may sound abstract, so the construction is illustrated with a specific example below.
Assume that the probability distribution is P(X1, X2, X3) = P(X1) P(X2 | X1) P(X3 | X1).
Then its directed graphical model has three nodes and two edges, X1 -> X2 and X1 -> X3.
Conversely, given a directed graph, we can directly write down the probability distribution it represents. You can try to write the distribution from the graph above.
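Both directions of this correspondence can be sketched in a few lines. The representation below, a list of (child, parents) pairs, and the two function names are assumptions chosen for illustration. The factors encode the example P(X1) P(X2 | X1) P(X3 | X1).

```python
# Each factor P(child | parents) is stored as (child, parents).
factors = [("X1", ()), ("X2", ("X1",)), ("X3", ("X1",))]


def factors_to_edges(factors):
    """One edge parent -> child for every conditioning variable of every factor."""
    return [(parent, child) for child, parents in factors for parent in parents]


def edges_to_factorization(nodes, edges):
    """Read the factorization back off the graph as a plain-text string."""
    terms = []
    for node in nodes:
        parents = [src for src, dst in edges if dst == node]
        if parents:
            terms.append(f"P({node} | {', '.join(parents)})")
        else:
            terms.append(f"P({node})")
    return " ".join(terms)


edges = factors_to_edges(factors)
text = edges_to_factorization(["X1", "X2", "X3"], edges)
```

Here `edges` holds the two edges X1 -> X2 and X1 -> X3, and `text` reproduces the original factorization.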
Formally, the probability distribution represented by a directed graphical model can be written as P(X) = \prod_i P(Xi | Pa(Xi)), where X is the vector of random variables, \prod denotes a product, and Pa(Xi) denotes the parent nodes of Xi.
From the above description, we can see that fully representing a probability distribution requires two things. On the one hand, we need to know its topological structure, that is, the structure of the graph.
On the other hand, we also need to know the individual factors of the distribution, that is, each P(Xi | Pa(Xi)) in the formula above.
What does a complete directed graphical model look like? Another figure shows one.
Each node in the preceding figure has a conditional probability table (CPT). These tables are the parameters of the directed graphical model, namely P(Xi | Pa(Xi)).
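To make the role of the tables concrete, here is a minimal sketch of a structure plus CPTs stored as nested dictionaries. It assumes a hypothetical network X1 -> X2, X1 -> X3 over binary variables. The numbers are invented and are not the values from the figure.

```python
from itertools import product

# Structure: the parents of each node.
parents = {"X1": (), "X2": ("X1",), "X3": ("X1",)}

# Parameters: cpt[node][parent_values][value] = P(node = value | parents).
cpt = {
    "X1": {(): {0: 0.6, 1: 0.4}},
    "X2": {(0,): {0: 0.7, 1: 0.3}, (1,): {0: 0.2, 1: 0.8}},
    "X3": {(0,): {0: 0.9, 1: 0.1}, (1,): {0: 0.5, 1: 0.5}},
}


def joint_probability(assignment):
    """Multiply P(Xi | Pa(Xi)) over all nodes for one full assignment."""
    prob = 1.0
    for node, node_parents in parents.items():
        parent_values = tuple(assignment[p] for p in node_parents)
        prob *= cpt[node][parent_values][assignment[node]]
    return prob


# Example: P(X1 = 1, X2 = 1, X3 = 0) = 0.4 * 0.8 * 0.5
example = joint_probability({"X1": 1, "X2": 1, "X3": 0})

# Because every CPT row sums to 1, the joint sums to 1 with no extra normalization.
total = sum(
    joint_probability(dict(zip(("X1", "X2", "X3"), values)))
    for values in product((0, 1), repeat=3)
)
```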
2. Representation of undirected graphical models
Like a directed graphical model, an undirected graphical model represents a probability distribution and encodes the conditional independencies between variables in the graph, so that the distribution can be expressed as a product of factors. The difference is that the undirected graphical model is built on an undirected graph, while the directed graphical model is built on a directed graph.
Let's first look at an example:
The figure shows a complete representation of an undirected graphical model. The left side is its topology and the right side is its parameters.
An undirected graphical model is built around potential functions defined on the maximal cliques of the graph. In this example there are four maximal cliques: AC, AB, BD and CD. We therefore need to define a potential function on each of the four cliques, as shown on the right side. Note that a potential function must be positive.
In the end, the probability distribution of this undirected graphical model is P(A, B, C, D) = (1/Z) * \phi(A, C) * \phi(A, B) * \phi(C, D) * \phi(B, D).
Z is the normalization factor: the potential functions are not normalized, while probabilities must lie in [0, 1] and sum to 1, so normalization is required. \phi denotes the corresponding potential function (mathematical symbols cannot be typeset here, so LaTeX notation is used).
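The normalization step can be sketched directly for the four-variable example. The potential tables below are hypothetical stand-ins, since the values from the figure are not reproduced in the text. The helper `make_potential` is likewise just a convenience for this sketch.

```python
from itertools import product


def make_potential(same, different):
    """Positive pairwise table: one value when the two variables agree, another when they differ."""
    return {(u, v): (same if u == v else different) for u in (0, 1) for v in (0, 1)}


# Hypothetical potentials on the four cliques (groups) AC, AB, CD, BD.
phi_ac = make_potential(4.0, 1.0)
phi_ab = make_potential(3.0, 1.0)
phi_cd = make_potential(2.0, 1.0)
phi_bd = make_potential(5.0, 1.0)


def unnormalized(a, b, c, d):
    """Product of the four potentials; positive but not a probability."""
    return phi_ac[(a, c)] * phi_ab[(a, b)] * phi_cd[(c, d)] * phi_bd[(b, d)]


# Z sums the unnormalized product over every joint assignment.
Z = sum(unnormalized(*xs) for xs in product((0, 1), repeat=4))


def probability(a, b, c, d):
    """P(A, B, C, D) = (1/Z) * product of potentials."""
    return unnormalized(a, b, c, d) / Z
```

Dividing by `Z` is what turns the raw products into values that sum to 1. Note that computing `Z` this way enumerates all assignments, which is only feasible for very small models.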
Therefore, the probability distribution represented by an undirected graphical model can be formally expressed as
P(X) = (1/Z) * \prod_{i=1}^{n} \phi(C_i(X)), where C_i denotes the i-th clique and n is the number of cliques.
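The article does not spell out Z itself. Under the usual definition it is the sum of the unnormalized product over all joint assignments, which gives the following typeset form. The subscript on \phi_i is an added convention here, marking that each clique (group) may have its own potential function.

```latex
P(X) = \frac{1}{Z} \prod_{i=1}^{n} \phi_i\bigl(C_i(X)\bigr),
\qquad
Z = \sum_{X} \prod_{i=1}^{n} \phi_i\bigl(C_i(X)\bigr)
```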
3. Summary
For both directed and undirected graphical models, we need to pay attention to two aspects: determining the structure and determining the parameters. For a directed graphical model, the conditional probability tables need to be determined. For an undirected graphical model, the potential function of each clique needs to be determined.