Bayesian Network Summary

Source: Internet
Author: User

Over the weekend I gave my colleagues a talk on Bayesian networks. It always feels like a pity when nothing is written down after a sharing session, so I have collected the notes, materials, and key points from my preparation into this article.


1. Definition of Bayesian networks

A Bayesian network is a directed acyclic graph (DAG) in which each node represents a variable and each edge represents a dependency between two variables. Each node stores the conditional probability distribution of its variable given its parent nodes.


Each node is directly influenced by its parent nodes. Intuitively, a parent node represents a cause and its child node represents an effect.

Mathematically, the joint probability distribution of all variables in a Bayesian network equals the product of the conditional probabilities of each node given its parents.

That is:

P(x1, x2, ..., xn) = P(x1 | parents(x1)) * P(x2 | parents(x2)) * ... * P(xn | parents(xn))
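The factorization of the joint distribution can be illustrated with a minimal Python sketch. The four-node network (x1 -> x2, x1 -> x3, and x2, x3 -> x4), the names `parents`, `cpt`, and `joint`, and all probability values are assumptions made up for illustration; they do not come from the original talk.

```python
from itertools import product

# Hypothetical network: x1 -> x2, x1 -> x3, (x2, x3) -> x4; all variables binary.
parents = {"x1": (), "x2": ("x1",), "x3": ("x1",), "x4": ("x2", "x3")}

# cpt[node][parent values] = P(node = 1 | parents). The numbers are made up.
cpt = {
    "x1": {(): 0.5},
    "x2": {(0,): 0.5, (1,): 0.1},
    "x3": {(0,): 0.2, (1,): 0.8},
    "x4": {(0, 0): 0.05, (0, 1): 0.9, (1, 0): 0.9, (1, 1): 0.99},
}

def joint(assignment):
    """P(x1, ..., x4) as the product of P(node | parents(node))."""
    p = 1.0
    for node, pa in parents.items():
        p1 = cpt[node][tuple(assignment[q] for q in pa)]
        p *= p1 if assignment[node] == 1 else 1.0 - p1
    return p

# One entry of the joint distribution: 0.5 * 0.9 * 0.8 * 0.9.
print(joint({"x1": 1, "x2": 0, "x3": 1, "x4": 1}))

# Summing over all 16 assignments should give 1 (up to rounding).
total = sum(joint(dict(zip(parents, v))) for v in product((0, 1), repeat=4))
print(total)
```

Storing one small table per node is what keeps the representation compact: this network needs 9 numbers instead of the 15 required for an unrestricted joint table over four binary variables.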



2. Inference in Bayesian networks

Inference in a Bayesian network means answering probability queries about its variables. For example, a network can answer arbitrary queries such as P(x2=0), P(x3=1|x2=0), and P(x2=0,x3=1,x4=0).

(1) Exact inference

A relatively simple Bayesian network can be handled with exact inference. From the structure of the network we can write down the joint probability distribution, and then derive any probability over the network using the law of total probability and Bayes' formula.

For example, suppose the Bayesian network is as follows:


The probability query is then derived as follows:


The computation in exact inference can be optimized with dynamic programming (for example, variable elimination), or with graph-theoretic techniques (for example, clique-based methods such as the junction tree algorithm).
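As a rough illustration of exact inference, the sketch below answers the three example queries by brute-force enumeration: it sums the joint distribution (law of total probability) and divides to condition (Bayes' formula). This is plain enumeration, not variable elimination or a clique-based method. The network, the probability values, and the helper names `prob` and `conditional` are hypothetical.

```python
from itertools import product

# Hypothetical network: x1 -> x2, x1 -> x3, (x2, x3) -> x4; all variables binary.
parents = {"x1": (), "x2": ("x1",), "x3": ("x1",), "x4": ("x2", "x3")}
# cpt[node][parent values] = P(node = 1 | parents). The numbers are made up.
cpt = {
    "x1": {(): 0.5},
    "x2": {(0,): 0.5, (1,): 0.1},
    "x3": {(0,): 0.2, (1,): 0.8},
    "x4": {(0, 0): 0.05, (0, 1): 0.9, (1, 0): 0.9, (1, 1): 0.99},
}
nodes = list(parents)

def joint(a):
    """Joint probability of a full assignment, from the network factorization."""
    p = 1.0
    for node, pa in parents.items():
        p1 = cpt[node][tuple(a[q] for q in pa)]
        p *= p1 if a[node] == 1 else 1.0 - p1
    return p

def prob(event):
    """P(event): sum the joint over every full assignment consistent with event."""
    total = 0.0
    for values in product((0, 1), repeat=len(nodes)):
        a = dict(zip(nodes, values))
        if all(a[k] == v for k, v in event.items()):
            total += joint(a)
    return total

def conditional(query, evidence):
    """P(query | evidence) = P(query, evidence) / P(evidence)."""
    return prob({**query, **evidence}) / prob(evidence)

print(prob({"x2": 0}))
print(conditional({"x3": 1}, {"x2": 0}))
print(prob({"x2": 0, "x3": 1, "x4": 0}))
```

Enumeration visits all 2^n assignments, so it is only practical for small networks. Variable elimination computes the same sums but reuses intermediate results instead of recomputing them.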

(2) Approximate inference

Sometimes the Bayesian network is too large for exact inference, and approximate inference is used instead.

There are many approximate inference methods. This section describes how to use Gibbs sampling, an MCMC (Markov chain Monte Carlo) method.

I. Sample

A sample consists of observed data and unknown data, for example x1, x2, ?, x3, ?, ..., xn, where ? marks an unobserved value. The purpose of inference is to find the probability distribution of the unknown nodes given the observed values, that is, P(? | x1, x2, ..., xn).

II. Markov blanket

The Markov blanket of a node x in a Bayesian network consists of the parents of x, the children of x, and the other parents of those children (excluding x itself). In the description below, MB(x) denotes the Markov blanket of node x.
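The definition of the Markov blanket translates directly into a few lines of Python. In this sketch the graph is stored as a hypothetical `parents` dictionary (node -> tuple of parent names), and the example network is made up for illustration.

```python
def markov_blanket(node, parents):
    """Parents of `node`, its children, and the children's other parents."""
    children = [c for c, pa in parents.items() if node in pa]
    blanket = set(parents[node]) | set(children)
    for child in children:
        blanket |= set(parents[child])
    blanket.discard(node)  # the node itself is excluded
    return blanket

# Hypothetical network: x1 -> x2, x1 -> x3, (x2, x3) -> x4.
parents = {"x1": (), "x2": ("x1",), "x3": ("x1",), "x4": ("x2", "x3")}

print(sorted(markov_blanket("x2", parents)))  # ['x1', 'x3', 'x4']
print(sorted(markov_blanket("x1", parents)))  # ['x2', 'x3']
```

Here x3 belongs to MB(x2) even though there is no edge between them, because both are parents of x4.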

III. Algorithm flow

Initialization: initialize the conditional probability distribution of each unknown variable, sample from it, and assign the sampled value to the unknown node.

(1) Randomly select an unknown node.

(2) Assign a value to the node by sampling from its current conditional probability distribution.

(3) Recompute the node's distribution as P(?) = P(? | MB(?)).

(4) Return to step (1) and iterate until convergence.
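The flow above can be sketched as a small Gibbs sampler that estimates P(x3=1 | x2=0). This is a minimal illustration under stated assumptions, not a tuned implementation: the network, its probability values, the function names, the uniform random initialization, the fixed iteration count used in place of a convergence check, and the burn-in length are all choices made for this example.

```python
import random

# Hypothetical network: x1 -> x2, x1 -> x3, (x2, x3) -> x4; all variables binary.
parents = {"x1": (), "x2": ("x1",), "x3": ("x1",), "x4": ("x2", "x3")}
# cpt[node][parent values] = P(node = 1 | parents). The numbers are made up.
cpt = {
    "x1": {(): 0.5},
    "x2": {(0,): 0.5, (1,): 0.1},
    "x3": {(0,): 0.2, (1,): 0.8},
    "x4": {(0, 0): 0.05, (0, 1): 0.9, (1, 0): 0.9, (1, 1): 0.99},
}
children = {n: [c for c, pa in parents.items() if n in pa] for n in parents}

def p_node(node, value, state):
    """P(node = value | parents(node)) under the given state."""
    p1 = cpt[node][tuple(state[q] for q in parents[node])]
    return p1 if value == 1 else 1.0 - p1

def p_given_blanket(node, state):
    """P(node = 1 | MB(node)): the node's own factor times its children's factors."""
    weights = []
    for value in (0, 1):
        trial = {**state, node: value}
        w = p_node(node, value, trial)
        for child in children[node]:
            w *= p_node(child, trial[child], trial)
        weights.append(w)
    return weights[1] / (weights[0] + weights[1])

def gibbs(query, evidence, n_iter=100000, burn_in=1000, seed=0):
    """Estimate P(query = 1 | evidence) from the visited states."""
    rng = random.Random(seed)
    unknown = [n for n in parents if n not in evidence]
    # Initialization: give every unknown node a random value.
    state = {**evidence, **{n: rng.randint(0, 1) for n in unknown}}
    hits = 0
    for i in range(n_iter):
        node = rng.choice(unknown)                      # (1) pick an unknown node
        p1 = p_given_blanket(node, state)               # (3) P(? | MB(?))
        state[node] = 1 if rng.random() < p1 else 0     # (2) resample its value
        if i >= burn_in:
            hits += state[query]
    return hits / (n_iter - burn_in)

estimate = gibbs("x3", {"x2": 0})
print(estimate)  # should be close to the exact value 0.41 / 0.7
```

Only the node's own factor and the factors of its children appear in `p_given_blanket`, because every other term of the joint distribution cancels during normalization. That is why the Markov blanket is all the sampler needs to look at.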


3. Training of Bayesian networks

(1) Known structure, complete samples

Use maximum likelihood estimation to obtain the conditional probability distribution of each node. For discrete variables this reduces to counting relative frequencies.
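For binary variables, the counting approach might look like the following sketch. The function name `fit_cpt`, the sample layout (one dictionary per sample), and the data are assumptions made up for illustration.

```python
from collections import Counter

def fit_cpt(samples, node, node_parents):
    """Estimate P(node = 1 | parents) by relative frequency (the discrete MLE)."""
    totals, ones = Counter(), Counter()
    for s in samples:
        key = tuple(s[p] for p in node_parents)
        totals[key] += 1
        ones[key] += s[node]
    # Parent configurations that never occur in the data get no entry.
    return {key: ones[key] / totals[key] for key in totals}

# Hypothetical complete samples for the two-node network x1 -> x2.
samples = [
    {"x1": 0, "x2": 0}, {"x1": 0, "x2": 1}, {"x1": 0, "x2": 1},
    {"x1": 1, "x2": 0}, {"x1": 1, "x2": 0}, {"x1": 1, "x2": 1},
]

p_x1 = fit_cpt(samples, "x1", ())
p_x2 = fit_cpt(samples, "x2", ("x1",))
print(p_x1)  # {(): 0.5}
print(p_x2)  # P(x2=1 | x1=0) = 2/3, P(x2=1 | x1=1) = 1/3
```

With little data, some parent configurations may be rare or missing. Adding pseudo-counts is a common remedy, but it is not part of the plain maximum likelihood estimate shown here.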

(2) Known structure, incomplete samples

If some nodes cannot be observed (that is, the samples are incomplete), the network can be trained with the EM algorithm. The rough process is as follows:

Initialization: randomly initialize the conditional probability distribution of each node.

E-step: complete the samples using the current conditional probability distribution of each node (fill a continuous variable with its mean, and a discrete variable with its most probable value).

M-step: from the now "complete" samples, obtain a new probability distribution for each node by maximum likelihood estimation or counting, and replace the previous one.

The E-step and M-step are repeated until the distributions stop changing.
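The process above can be sketched for the smallest possible case: a two-node network x1 -> x2 in which x1 is sometimes missing. The sketch fills each missing value with its single most probable value, as described above, which is sometimes called hard EM; standard EM would weight both possible values by their posterior probabilities instead. The data, the function names, the fixed (rather than random) starting point, and the fixed iteration count are assumptions chosen to keep the example reproducible.

```python
def m_step(samples):
    """Estimate P(x1=1) and P(x2=1 | x1) by counting completed samples."""
    p_x1 = sum(x1 for x1, _ in samples) / len(samples)
    p_x2 = {}
    for v in (0, 1):
        rows = [x2 for x1, x2 in samples if x1 == v]
        # Fall back to 0.5 if a parent value never occurs (an assumption of this sketch).
        p_x2[v] = sum(rows) / len(rows) if rows else 0.5
    return p_x1, p_x2

def e_step(samples, p_x1, p_x2):
    """Fill each missing x1 with its most probable value given the observed x2."""
    filled = []
    for x1, x2 in samples:
        if x1 is None:
            weights = []
            for v in (0, 1):
                prior = p_x1 if v == 1 else 1.0 - p_x1
                likelihood = p_x2[v] if x2 == 1 else 1.0 - p_x2[v]
                weights.append(prior * likelihood)  # proportional to P(x1=v | x2)
            x1 = 1 if weights[1] > weights[0] else 0
        filled.append((x1, x2))
    return filled

def hard_em(samples, n_iter=20):
    p_x1, p_x2 = 0.5, {0: 0.3, 1: 0.7}  # arbitrary starting distributions
    for _ in range(n_iter):
        filled = e_step(samples, p_x1, p_x2)
        p_x1, p_x2 = m_step(filled)
    return p_x1, p_x2

# Hypothetical data as (x1, x2) pairs; None marks an unobserved x1.
data = [(0, 0), (0, 0), (0, 1), (1, 1), (1, 1), (1, 0),
        (None, 1), (None, 0), (None, 1)]
p_x1, p_x2 = hard_em(data)
print(p_x1, p_x2)
```

The result depends on the starting point, so in practice the procedure is often run from several random initializations and the best fit is kept.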

(3) Unknown structure

There are roughly three ways to obtain the structure of a Bayesian network:

I. Modeling by experts

II. Correlation-based structure learning

The general idea is to compute the pairwise dependence between variables (for example with mutual information or a chi-square test), add edges between strongly dependent nodes, and then determine the direction of each edge by how well the result fits the samples.
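As an illustration of the dependence measure, the sketch below computes the empirical mutual information between two discrete columns. The function name and the toy data are made up, and the step of choosing a threshold and orienting the edges is not shown.

```python
from collections import Counter
from math import log2

def mutual_information(xs, ys):
    """Empirical mutual information I(X; Y), in bits, of two discrete columns."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )

a = [0, 0, 1, 1, 0, 0, 1, 1]
b = [0, 0, 1, 1, 0, 0, 1, 1]  # identical to a: fully dependent
c = [0, 1, 0, 1, 0, 1, 0, 1]  # every (a, c) combination is equally frequent

print(mutual_information(a, b))  # 1.0
print(mutual_information(a, c))  # 0.0
```

A correlation-based learner would compute this value for every pair of variables and keep an edge only where it exceeds some threshold. Mutual information is symmetric, so it cannot determine the direction of an edge by itself.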

III. Score-based approach

First, define a scoring function, such as MDL, that describes how good a Bayesian network is. It usually considers both the complexity of the network structure (the simpler the better) and the degree of fit to the samples (the closer the fit the better).

Second, use a heuristic search algorithm (such as simulated annealing) to search the space of network structures, and return a local optimum as the result.
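The sketch below shows one possible scoring function: a BIC-style score, which has the same form as the usual MDL score up to sign convention. It rewards fit through the maximized log-likelihood and penalizes complexity through the number of free parameters. The function name, the data, and the two candidate structures are assumptions for illustration, and the heuristic search itself is not shown.

```python
from collections import Counter
from math import log

def bic_score(samples, parents):
    """Maximized log-likelihood minus (log N / 2) * number of free parameters."""
    n = len(samples)
    score = 0.0
    for node, pa in parents.items():
        totals, joint = Counter(), Counter()
        for s in samples:
            key = tuple(s[p] for p in pa)
            totals[key] += 1
            joint[key, s[node]] += 1
        # Fit term: sum over observed cells of count * log(relative frequency).
        score += sum(c * log(c / totals[key]) for (key, _), c in joint.items())
        # Complexity term: one free parameter per parent configuration (binary nodes).
        score -= 0.5 * log(n) * 2 ** len(pa)
    return score

# Hypothetical data: x2 copies x1, except that it is flipped in every tenth sample.
samples = []
for i in range(40):
    x1 = i % 2
    x2 = 1 - x1 if i % 10 == 0 else x1
    samples.append({"x1": x1, "x2": x2})

no_edge = {"x1": (), "x2": ()}
with_edge = {"x1": (), "x2": ("x1",)}
score_no_edge = bic_score(samples, no_edge)
score_with_edge = bic_score(samples, with_edge)
print(score_no_edge, score_with_edge)  # the structure with the edge scores higher
```

A search procedure such as simulated annealing would repeatedly propose small changes to the structure (adding, removing, or reversing an edge while keeping the graph acyclic) and use a score like this one to decide whether to accept them.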
