**01 Brief Introduction**

A probabilistic graphical model is the product of combining graph theory with probability theory, and it was pioneered largely by the well-known Judea Pearl. I like probabilistic graphical models very much: they are a powerful, visual tool for modeling the relationships among many variables. There are two major directions: undirected graphical models and directed graphical models.

An undirected graphical model is also called a Markov network. It has many applications, such as image processing based on Markov random fields (for example image segmentation and stereo matching), and there are also structured learning methods that combine it with machine learning to obtain the model parameters. Strictly speaking, all of these evaluate the posterior probability P(Y | X), that is, the probability of each label Y given the observed data X. The label with the highest posterior probability is then selected as the prediction. This process is also called probabilistic inference.

Directed graphical models, also known as Bayesian networks, are widely used as well, and their range of applications is easy to guess: medical diagnosis, for example, and much of machine learning. However, they are also somewhat controversial. This brings us back to the centuries-old debate between the Bayesian school and the frequentist school. The Bayesian school assumes prior probabilities, whereas the frequentist school, which holds that the parameters of a model exist objectively, considers such priors somewhat subjective. Because the assumed prior distribution is somewhat arbitrary, the predictions of a Bayesian model are seen as somewhat "watered down" and therefore unsuitable for strict fields such as precision manufacturing and law. Of course, if we refused to take the Bayesian perspective, all the machine learning models mentioned above would have to be dismissed too. In practice, a large amount of data can make up for this "defect". An example of an undirected graph and a directed graph is shown in Figure 1:

Figure 1: (a) undirected graph (hidden Markov), (b) directed graph
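To make "evaluate the posterior P(Y | X) and pick the label with the highest posterior" concrete, here is a minimal sketch on a two-node directed model Y -> X. The labels, the observations, and every probability in it are made up for illustration and do not come from any real dataset.

```python
# Hypothetical two-node directed model: Y (label) -> X (observation).
# All numbers below are invented for illustration only.
prior = {"flu": 0.2, "healthy": 0.8}              # P(Y)
likelihood = {                                     # P(X | Y)
    "flu": {"fever": 0.8, "no_fever": 0.2},
    "healthy": {"fever": 0.1, "no_fever": 0.9},
}


def posterior(x):
    """Return P(Y | X = x) as a dict, computed with Bayes' rule."""
    joint = {y: prior[y] * likelihood[y][x] for y in prior}   # P(Y, X = x)
    evidence = sum(joint.values())                            # P(X = x)
    return {y: p / evidence for y, p in joint.items()}


def predict(x):
    """Select the label with the highest posterior probability."""
    post = posterior(x)
    return max(post, key=post.get)


if __name__ == "__main__":
    print(posterior("fever"))
    print(predict("fever"), predict("no_fever"))
```

Changing `prior` shifts the posterior, which is exactly the point the frequentist objection is aimed at.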

The probabilistic graphical model draws on the strengths of both graph theory and probability. Graph theory plays an important role in many fields of computation, such as combinatorial optimization, statistical physics, and economics. Each node in a graph can be regarded as a variable with n states (its value range), and an edge between two nodes indicates a relationship between the variables. Besides being a language for building models, the graph also lets us evaluate the complexity and feasibility of a model: the running time of an algorithm or the magnitude of an error bound can be analyzed through the structural properties of the graph. In fact, many engineering problems can be expressed as graphs and eventually turned into a search or query problem, where the goal is to locate the target quickly. Is there any problem that is not a search problem? A tree is a graph, the traveling salesman problem is defined on a graph, and the graph coloring problem is defined on a graph. They differ in the structural properties of their graphs, and for a tree we can readily estimate the time complexity.

The earliest inference methods for probabilistic graphical models also exploit the structure of the graph: they convert a complex graph into a tree (a graph that is easy to process) and solve the problem there. This algorithm is called the junction tree algorithm. In simple terms, the junction tree algorithm uses the conditional independence properties of the graph to decompose it, organizes the pieces in the form of a tree, and finally uses the tree's convenient structure for inference by message passing. It is one of the important inference algorithms in the later posts. However, not every graph can be converted into a tree that is practical to work with. Generally, only graphs with suitably sparse edges yield a tractable junction tree, so the junction tree algorithm has certain limitations.

If we cannot build a junction tree, how do we do inference? Fortunately, there are several other methods, and they come from the second component of the probabilistic graphical model: probability. Once we are in a probability space, sampling methods based on Markov chain Monte Carlo (MCMC) can also be used to obtain the maximum posterior probability. MCMC is a frequently used tool, and there are also mean field and variational methods. The algorithmic complexity of these solutions is shown in Figure 2:

Figure 2: Comparison of inference algorithm performance
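As a rough illustration of the sampling route, the sketch below runs Gibbs sampling (one particular MCMC method, chosen here only because it is short) on a tiny three-node chain Markov network and compares the estimated marginals with brute-force enumeration. The graph, the coupling, and the bias values are all assumptions made up for this example.

```python
import itertools
import math
import random

# Hypothetical 3-node chain Markov network x0 - x1 - x2 with binary states.
# The coupling and bias values are invented for illustration only.
EDGES = [(0, 1), (1, 2)]
COUPLING = 1.0             # rewards neighbouring nodes that agree
BIAS = [0.5, 0.0, -0.5]    # per-node preference for state 1


def score(x):
    """Unnormalised log-probability of a full assignment x."""
    s = sum(BIAS[i] * x[i] for i in range(len(x)))
    s += sum(COUPLING for i, j in EDGES if x[i] == x[j])
    return s


def exact_marginals():
    """P(x_i = 1) by enumerating all 2^3 assignments (feasible only for tiny graphs)."""
    states = list(itertools.product([0, 1], repeat=3))
    weights = [math.exp(score(x)) for x in states]
    z = sum(weights)
    return [sum(w for x, w in zip(states, weights) if x[i] == 1) / z
            for i in range(3)]


def gibbs_marginals(n_samples=20000, burn_in=1000, seed=0):
    """Estimate P(x_i = 1) by resampling one variable at a time."""
    rng = random.Random(seed)
    x = [0, 0, 0]
    counts = [0, 0, 0]
    for t in range(burn_in + n_samples):
        for i in range(3):
            # Conditional P(x_i = 1 | all other variables).
            x[i] = 1
            s1 = score(x)
            x[i] = 0
            s0 = score(x)
            p1 = 1.0 / (1.0 + math.exp(s0 - s1))
            x[i] = 1 if rng.random() < p1 else 0
        if t >= burn_in:
            for i in range(3):
                counts[i] += x[i]
    return [c / n_samples for c in counts]


if __name__ == "__main__":
    print("exact:", exact_marginals())
    print("gibbs:", gibbs_marginals())
```

The enumeration grows as 2^n and is only there as a check. The sampler touches one node and its neighbours at a time, which is why it still works when no tractable tree can be built.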

The solutions in Figure 2 are also commonly used in probabilistic graphical models and in statistical machine learning. Both MCMC and variational methods originate from statistical physics. Recently, deep learning has also become an application of probabilistic graphical models. Even if you object, I still want to file it under probabilistic graphical models ^.^: RBM, CRBM, and DBM are all very special probabilistic graphical models. The whole idea, from modeling to solving, is centered on obtaining the maximum posterior probability of the nodes of the graphical model. I won't go into the MCMC solution, since there are already plenty of examples of it in deep learning. The variational method, on the other hand, is very attractive: Michael Jordan, Andrew Ng's advisor, holds that the promising way to use variational methods in statistical machine learning is to link convex analysis with the exponential family of distributions. And where there is convex analysis there is also convex relaxation, because the data is discrete; you can refer to my previous blog post on convex relaxation. There are also some solutions specific to graphical models, such as belief propagation (the sum-product algorithm) and expectation propagation.
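For a feel of what belief propagation does in the easiest case, here is a minimal sum-product sketch on a three-node chain, where message passing is exact. The potentials are made up for illustration, and the brute-force function exists only to check the result.

```python
import itertools

# Hypothetical chain x0 - x1 - x2 with binary variables.
# unary[i][a] and pair[a][b] are invented non-negative potentials.
unary = [[1.0, 2.0], [1.0, 1.0], [3.0, 1.0]]
pair = [[2.0, 1.0], [1.0, 2.0]]    # same table on both edges, favours agreement


def normalise(v):
    z = sum(v)
    return [a / z for a in v]


def sum_product_marginals():
    """Exact node marginals on a chain via forward and backward messages."""
    n = len(unary)
    fwd = [[1.0, 1.0] for _ in range(n)]   # message arriving at node i from the left
    bwd = [[1.0, 1.0] for _ in range(n)]   # message arriving at node i from the right
    for i in range(1, n):
        fwd[i] = [sum(fwd[i - 1][a] * unary[i - 1][a] * pair[a][b] for a in (0, 1))
                  for b in (0, 1)]
    for i in range(n - 2, -1, -1):
        bwd[i] = [sum(bwd[i + 1][b] * unary[i + 1][b] * pair[a][b] for b in (0, 1))
                  for a in (0, 1)]
    # Belief at a node: its own potential times both incoming messages.
    return [normalise([fwd[i][s] * unary[i][s] * bwd[i][s] for s in (0, 1)])
            for i in range(n)]


def brute_force_marginals():
    """Same marginals by summing over all joint assignments, for comparison."""
    n = len(unary)
    marg = [[0.0, 0.0] for _ in range(n)]
    for x in itertools.product((0, 1), repeat=n):
        w = 1.0
        for i in range(n):
            w *= unary[i][x[i]]
        for i in range(n - 1):
            w *= pair[x[i]][x[i + 1]]
        for i in range(n):
            marg[i][x[i]] += w
    return [normalise(m) for m in marg]


if __name__ == "__main__":
    print(sum_product_marginals())
    print(brute_force_marginals())
```

The message-passing version costs time linear in the chain length, while the enumeration is exponential. On graphs with loops the same updates give only an approximation.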

The above is a brief introduction to probabilistic graphical models. The following blog posts will record my study notes according to the outline below. While trying not to violate the honor code, I will use Professor Daphne Koller's assignments to help with the explanation.

I. Introduction to and definition of graphical models

II. Computing marginal probabilities with the variational method, using exponential families and convex analysis

III. Approximate solution methods, such as belief propagation and expectation propagation

IV. Mean field solutions and other optimization-based solutions

V. Structured learning

References:

[1] Graphical Models, Exponential Families, and Variational Inference. Martin J. Wainwright and Michael I. Jordan

[2] Probabilistic Graphical Models. Daphne Koller

[3] The Design and Analysis of Computer Algorithms. Alfred V. Aho