Over the weekend I gave a talk on Bayesian networks to colleagues. Each time after such a session it feels like a pity that nothing gets recorded, so I have written up the notes, materials, and key points from the talk as this article.
1. Definition of Bayesian networks
A Bayesian network is a directed acyclic graph (DAG) in which each node represents a variable and each edge represents a dependency between variables; each node stores the conditional probability distribution of its variable given the node's parents.
Each node is influenced by its parents; that is, a parent node represents a cause and its child node represents an effect.
Mathematically, the joint probability distribution of the variables in a Bayesian network equals the product, over all nodes, of each node's conditional probability given its parents.
That is:

P(x1, x2, …, xn) = ∏i P(xi | Parents(xi))
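As a concrete sketch of this factorization, here is a tiny hand-built chain network x1 → x2 → x3; the node names and CPT numbers are invented for illustration:

```python
# Minimal sketch: the joint probability is the product of per-node
# conditionals. The network x1 -> x2 -> x3 and all CPT values are invented.

# P(x1)
p_x1 = {0: 0.6, 1: 0.4}
# P(x2 | x1): outer key is the parent's value
p_x2_given_x1 = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.2, 1: 0.8}}
# P(x3 | x2)
p_x3_given_x2 = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.5, 1: 0.5}}

def joint(x1, x2, x3):
    """P(x1, x2, x3) = P(x1) * P(x2 | x1) * P(x3 | x2)."""
    return p_x1[x1] * p_x2_given_x1[x1][x2] * p_x3_given_x2[x2][x3]

# Sanity check: the joint distribution sums to 1 over all assignments.
total = sum(joint(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1))
print(total)  # 1.0
```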
2. Inference in Bayesian networks
Inference in a Bayesian network means answering arbitrary probability queries over the network's variables, for example P(x2=0), P(x3=1 | x2=0), or P(x2=0, x3=1, x4=0).
(1) Exact inference
For a relatively small Bayesian network, exact inference can be used. From the network structure we can write down the joint probability distribution, and then, using the law of total probability and Bayes' theorem, derive any probability query over the network.
For example, consider a Bayesian network such as the following:
The derivation of a probability query on it proceeds as follows:
Exact inference can be optimized with dynamic programming during the computation (such as the variable elimination method), or with techniques from graph theory (such as clique-tree based inference).
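Before any such optimization, exact inference can always be done by brute-force enumeration over the joint distribution, applying the total probability and Bayes formulas directly. A sketch on a toy chain network x1 → x2 → x3 (all CPT numbers invented):

```python
from itertools import product

# Toy chain x1 -> x2 -> x3; all CPT values are invented for illustration.
p_x1 = {0: 0.6, 1: 0.4}
p_x2_given_x1 = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.2, 1: 0.8}}
p_x3_given_x2 = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.5, 1: 0.5}}

def joint(x1, x2, x3):
    return p_x1[x1] * p_x2_given_x1[x1][x2] * p_x3_given_x2[x2][x3]

def prob(query, evidence=None):
    """P(query | evidence) by brute-force enumeration over the joint.

    query/evidence are dicts like {'x2': 0}; this only scales to
    tiny networks, which is exactly why elimination methods exist.
    """
    evidence = evidence or {}
    num = den = 0.0
    for x1, x2, x3 in product((0, 1), repeat=3):
        world = {'x1': x1, 'x2': x2, 'x3': x3}
        p = joint(x1, x2, x3)
        if all(world[k] == v for k, v in evidence.items()):
            den += p                      # total probability of the evidence
            if all(world[k] == v for k, v in query.items()):
                num += p                  # joint of query and evidence
    return num / den                      # Bayes' theorem

print(prob({'x2': 0}))                    # P(x2=0)
print(prob({'x3': 1}, {'x2': 0}))         # P(x3=1 | x2=0)
```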
(2) Approximate inference
Sometimes the Bayesian network is too large for exact inference, and approximate inference must be used instead.
There are many approximate inference methods; here we describe inference with Gibbs sampling, an MCMC (Markov chain Monte Carlo) method.
I. Samples
A sample consists of observed data and unobserved data, i.e. x1, x2, ?, x3, ?, …, xn, where "?" marks an unobserved value. The goal of inference is to find the probability distribution of the unknown nodes given the observed values, i.e. P(? | x1, x2, …, xn).
II. Markov blanket
In a Bayesian network, the Markov blanket of a node x consists of x's parents, x's children, and the other parents of x's children (excluding x itself). Below, MB(x) denotes the Markov blanket of node x.
III. Algorithm Flow
Initialization: initialize a conditional probability distribution for each unknown variable, sample from it, and assign the sampled values to the unknown nodes.
(1) Randomly select an unknown node.
(2) Compute the node's conditional distribution given its Markov blanket, P(?) = P(? | MB(?)).
(3) Sample from this distribution and assign the sampled value to the node.
(4) Return to step (1) and iterate until convergence.
3. Training of Bayesian networks
(1) Known structure, complete samples
Use maximum likelihood estimation (which for discrete values reduces to counting frequencies) to obtain the conditional probability distribution of each node.
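A minimal sketch of this counting, assuming a toy structure where node x2 has a single parent x1 (the variable names and data are invented):

```python
from collections import Counter

# Counting-based MLE for one node's CPT. Assumed toy structure: x1 -> x2.
# The complete samples below are invented for illustration.
samples = [
    {'x1': 0, 'x2': 0}, {'x1': 0, 'x2': 0}, {'x1': 0, 'x2': 1},
    {'x1': 1, 'x2': 1}, {'x1': 1, 'x2': 1}, {'x1': 1, 'x2': 0},
    {'x1': 0, 'x2': 0}, {'x1': 1, 'x2': 1},
]

def estimate_cpt(samples, child, parent):
    """P(child | parent) estimated by relative frequencies (the MLE)."""
    pair = Counter((s[parent], s[child]) for s in samples)
    total = Counter(s[parent] for s in samples)
    return {pv: {cv: pair[(pv, cv)] / total[pv] for cv in (0, 1)}
            for pv in (0, 1)}

cpt = estimate_cpt(samples, 'x2', 'x1')
print(cpt[0][0])  # P(x2=0 | x1=0) = 3/4
```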
(2) Known structure, incomplete samples
If some nodes cannot be observed (i.e. the samples are incomplete), the EM algorithm can be used for training. The rough procedure is as follows:
Initialization: randomly initialize the conditional probability distribution of each node.
E-step: using the current conditional probability distributions of the nodes, fill in the missing values of each sample (for a continuous variable, fill in the mean; for a discrete variable, fill in the most probable value).
M-step: using the "completed" observations, re-estimate each node's probability distribution by maximum likelihood (counting), replacing the previous estimate.
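A sketch of this fill-in-then-recount loop (a "hard" EM variant, matching the most-probable-value completion described above) for a single binary node x2 with parent x1, where None marks an unobserved x2; data and starting distribution are invented:

```python
from collections import Counter

# Hard-EM sketch for one binary node 'x2' with parent 'x1'; None marks an
# unobserved x2. The data and the starting CPT are invented.
data = [
    {'x1': 0, 'x2': 0}, {'x1': 0, 'x2': None}, {'x1': 0, 'x2': 0},
    {'x1': 1, 'x2': 1}, {'x1': 1, 'x2': None}, {'x1': 1, 'x2': 1},
]
# Initialization: an arbitrary starting CPT P(x2 | x1).
cpt = {0: {0: 0.5, 1: 0.5}, 1: {0: 0.5, 1: 0.5}}

for _ in range(5):  # a few EM rounds suffice for this toy example
    # E-step: fill each missing x2 with its most probable value under cpt.
    completed = [dict(s, x2=max((0, 1), key=lambda v: cpt[s['x1']][v]))
                 if s['x2'] is None else s for s in data]
    # M-step: re-estimate the CPT by counting on the completed data.
    pair = Counter((s['x1'], s['x2']) for s in completed)
    total = Counter(s['x1'] for s in completed)
    cpt = {pv: {cv: pair[(pv, cv)] / total[pv] for cv in (0, 1)}
           for pv in (0, 1)}

print(cpt[0][0], cpt[1][1])
```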
(3) Unknown structure
There are roughly three ways to obtain a Bayesian network's structure:
I. Modeling by domain experts.
II. Correlation-based structure learning
The general idea is to compute a correlation measure between each pair of variables (such as mutual information or the chi-square test), add edges between strongly correlated nodes, and then determine the direction of each edge by how well the resulting model fits the samples.
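As a sketch of the first step, here is empirical mutual information between two discrete variables, the kind of pairwise score used to decide which nodes to connect (the sample data is invented):

```python
from collections import Counter
from math import log2

# Empirical mutual information I(X; Y) between two discrete variables,
# estimated from paired samples. The data below is invented.
xs = [0, 0, 0, 0, 1, 1, 1, 1]
ys = [0, 0, 0, 1, 1, 1, 1, 0]

def mutual_information(xs, ys):
    """I(X; Y) = sum over (x, y) of p(x,y) * log2(p(x,y) / (p(x) p(y)))."""
    n = len(xs)
    pxy = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    return sum((c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), c in pxy.items())

print(round(mutual_information(xs, ys), 4))  # correlated -> well above 0
print(mutual_information([0, 1, 0, 1], [0, 0, 1, 1]))  # independent -> 0
```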
III. Score-based structure learning
First, define a scoring function (such as MDL) that measures how good a Bayesian network is; it usually trades off the network's structural complexity (the simpler, the better) against its fit to the samples (the better the fit, the better).
Then use a heuristic search algorithm (such as simulated annealing) to search the space of network structures, returning a local optimum as the result.
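A sketch of such a scoring function on two binary variables, comparing the structure with no edge against x1 → x2. The score used here is an MDL-style trade-off, data log-likelihood minus a (k/2)·log2(N) complexity penalty, where k is the number of free CPT parameters; the data and candidate structures are invented:

```python
from collections import Counter
from math import log2

# MDL-style score: log-likelihood minus (k/2)*log2(N), k = free parameters.
# Strongly correlated invented data over two binary variables (x1, x2).
data = [(0, 0)] * 40 + [(0, 1)] * 10 + [(1, 0)] * 10 + [(1, 1)] * 40

def score_independent(data):
    """Structure with no edge: P(x1) * P(x2); k = 2 free parameters."""
    n = len(data)
    c1 = Counter(a for a, _ in data)
    c2 = Counter(b for _, b in data)
    ll = sum(log2(c1[a] / n) + log2(c2[b] / n) for a, b in data)
    return ll - (2 / 2) * log2(n)

def score_edge(data):
    """Structure x1 -> x2: P(x1) * P(x2 | x1); k = 3 free parameters."""
    n = len(data)
    c1 = Counter(a for a, _ in data)
    c12 = Counter(data)
    ll = sum(log2(c1[a] / n) + log2(c12[(a, b)] / c1[a]) for a, b in data)
    return ll - (3 / 2) * log2(n)

# Correlated data: the edge's extra parameter pays for itself, so the
# edged structure should score higher despite the larger penalty.
print(score_independent(data), score_edge(data))
```

A heuristic search (hill climbing, simulated annealing) would repeatedly propose adding, removing, or reversing an edge and keep changes that improve this score.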