Structural Inference of Hierarchies in Networks (Network Hierarchy Inference)

Source: Internet
Author: User

1. Problem

Hierarchical structure is an important property of complex networks. The paper gives a precise definition of hierarchical structure, a probabilistic model that generates random graphs with arbitrary hierarchies, and a statistical method for inferring hierarchical structure from real-world complex networks (definition, model, and inference method). Finally, the inferred probabilistic model is used to generate additional network data (a null model), which serves to annotate the elements of the network (vertices and edges) and to run hypothesis tests on them.

Input: network data (vertex set and edge set)

Output: a hierarchy over the vertices

Method: maximum likelihood estimation (ML) and Markov chain Monte Carlo (MCMC)

Shortcomings of existing methods:

Existing methods (such as hierarchical clustering) cannot guarantee that the inferred model is an unbiased estimate of the true model. In other words, there is no guarantee that the inferred hierarchy reflects the hierarchical structure of the real-world network, and it is unclear how much of the inferred hierarchy is an artifact of the inference algorithm itself.

2. Theoretical basis

Maximum likelihood estimation (ML)

Maximum likelihood estimation is a standard way to estimate model parameters: it chooses the parameter values under which the observed data are most probable.

Markov chain Monte Carlo method (MCMC)

MCMC is a sampling method. Here, the hierarchical model of the network is treated as a random parameter vector, so sampling this vector amounts to sampling models. Once the MCMC algorithm has converged, each model is sampled with probability proportional to the likelihood that it generates the real-world network.

Bayesian model averaging

MCMC sampling yields a set of hierarchical models. A single final model is obtained by applying Bayesian model averaging to this set. Bayesian model averaging is a way of turning the distribution over parameters produced by a Bayesian method into a final point estimate of each parameter.

3. Method

Hierarchy definition (definition)

The hierarchy here means dividing the nodes of the network into groups, dividing each group into subgroups, and repeating this recursively until every subgroup contains a single node. Such a structure is usually drawn as a tree whose leaves are the nodes of the network. For a network with n nodes, the tree can be written as D = {D_1, D_2, ..., D_{n-1}}, where each D_i is an internal node of the tree and stands for the set of network nodes beneath it.

Stochastic graph model (model) for hierarchical structure

For an internal node i of the hierarchy tree, L_i and R_i denote the numbers of leaves in its left and right subtrees, E_i denotes the number of edges in the network that run between the two subtrees, and θ_i denotes the probability that a node in the left subtree is connected to a node in the right subtree. L_H denotes the likelihood of the model, the product over all internal nodes of θ_i^{E_i} (1 - θ_i)^{L_i R_i - E_i}. Setting the partial derivative of the likelihood with respect to each θ_i to zero gives θ_i = E_i / (L_i R_i), at which the likelihood attains its maximum.
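The step from the likelihood to the estimate of θ_i can be written out as follows. The derivation assumes the likelihood is a product of independent Bernoulli terms, one per internal node, with L_i and R_i the leaf counts of the left and right subtrees and E_i the number of edges between them.

```latex
\begin{align}
\mathcal{L}_H(D,\theta) &= \prod_{i=1}^{n-1} \theta_i^{E_i}\,(1-\theta_i)^{L_i R_i - E_i} \\
\ln \mathcal{L}_H &= \sum_{i=1}^{n-1} \Bigl[ E_i \ln\theta_i + (L_i R_i - E_i)\ln(1-\theta_i) \Bigr] \\
\frac{\partial \ln \mathcal{L}_H}{\partial \theta_i}
  &= \frac{E_i}{\theta_i} - \frac{L_i R_i - E_i}{1-\theta_i} = 0
  \quad\Longrightarrow\quad
  \hat{\theta}_i = \frac{E_i}{L_i R_i}
\end{align}
```

Each θ_i appears in only one term of the sum, so the parameters can be maximized independently of one another.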
For a particular hierarchical tree D, it is easy to estimate the model parameters θ, but it is not easy to find the best tree among all possible hierarchical trees. Instead, candidate hierarchies are sampled with MCMC, so that each hierarchy is sampled with probability proportional to the maximum likelihood with which it generates the real-world network data.
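As a rough sketch of how the maximized likelihood of one fixed tree might be computed, the code below walks a tree and sums the per-node log-likelihood terms with θ_i set to E_i / (L_i R_i). The nested-tuple tree encoding, the function names, and the five-vertex toy graph are all assumptions made for illustration.

```python
import math
from itertools import product

def leaves(tree):
    """Vertex ids under a (sub)tree; a leaf is an int, a node is a pair."""
    if isinstance(tree, int):
        return {tree}
    return leaves(tree[0]) | leaves(tree[1])

def log_likelihood(tree, edges):
    """Log-likelihood of `edges` under `tree`, with each theta_i at its MLE."""
    if isinstance(tree, int):
        return 0.0
    left, right = tree
    l_set, r_set = leaves(left), leaves(right)
    # E_i: observed edges with one endpoint in each subtree of this node.
    e_i = sum(1 for u, v in product(l_set, r_set)
              if (u, v) in edges or (v, u) in edges)
    pairs = len(l_set) * len(r_set)   # L_i * R_i
    theta = e_i / pairs               # maximum likelihood estimate
    term = 0.0
    if 0 < theta < 1:                 # theta of 0 or 1 contributes log(1) = 0
        term = e_i * math.log(theta) + (pairs - e_i) * math.log(1 - theta)
    return term + log_likelihood(left, edges) + log_likelihood(right, edges)

# Toy graph: a triangle {0, 1, 2}, a pair {3, 4}, and one bridging edge.
edges = {(0, 1), (1, 2), (0, 2), (3, 4), (2, 3)}
good_tree = (((0, 1), 2), (3, 4))   # groups match the dense regions
bad_tree = (((0, 3), 1), (2, 4))    # groups cut across them

good_ll = log_likelihood(good_tree, edges)
bad_ll = log_likelihood(bad_tree, edges)
print(good_ll > bad_ll)
```

The tree whose groups follow the dense parts of the graph scores higher, which is the signal the sampler relies on when comparing hierarchies.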

Sampling methods for hierarchies (inference)

The sampling method in this paper is the Metropolis-Hastings algorithm, an MCMC method. Each state of the Markov chain is a hierarchical tree, and the diagrams show how the chain moves from one state to another. In the first diagram, the selected node is an internal node of the tree, associated with three subtrees, a, b, and c: its own child subtrees and the subtree of its sibling. For any such node, these subtrees can be rearranged into two alternative configurations, shown in the second and third diagrams. During sampling, an internal node is selected uniformly at random, and one of the two alternative configurations is chosen at random as the candidate state. With these moves, any tree can be reached from any other, so the Markov chain is ergodic. The transitions are made to satisfy detailed balance in the standard Metropolis-Hastings way: if the likelihood of the candidate tree is greater than or equal to the likelihood of the current tree, the move is always accepted; otherwise it is accepted with probability equal to the ratio of the candidate likelihood to the current likelihood.
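The acceptance rule can be sketched generically, working with log-likelihoods to avoid underflow. This is not the paper's implementation: the function `metropolis_step` and its arguments are hypothetical, the proposal is assumed to be symmetric, and the subtree rearrangement is replaced by a toy three-state chain so the example stays short and checkable.

```python
import math
import random

def metropolis_step(state, log_like, propose, rng):
    """One Metropolis-Hastings step, assuming a symmetric proposal."""
    candidate = propose(state, rng)
    delta = log_like(candidate) - log_like(state)
    # Always accept if the likelihood does not decrease; otherwise accept
    # with probability L(candidate) / L(state) = exp(delta).
    if delta >= 0 or rng.random() < math.exp(delta):
        return candidate
    return state

# Toy stand-in for the tree space: three states on a ring with
# likelihoods 1, 2 and 4. A real sampler would propose subtree swaps.
weights = [1.0, 2.0, 4.0]

def log_like(state):
    return math.log(weights[state])

def propose(state, rng):
    return (state + rng.choice([-1, 1])) % 3

rng = random.Random(0)
steps = 20000
state, visits = 0, [0, 0, 0]
for _ in range(steps):
    state = metropolis_step(state, log_like, propose, rng)
    visits[state] += 1

freqs = [v / steps for v in visits]
print(freqs)
```

After enough steps the visit frequencies should be close to 1/7, 2/7, and 4/7, that is, proportional to the likelihoods, which is the property the text describes for sampled hierarchies.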

Bayesian model averaging

Once the MCMC algorithm has converged, there are several ways to obtain the final model: pick one sample at random, pick the sample with the largest likelihood, or average over many samples. Picking at random is too risky, and picking by maximum likelihood easily leads to overfitting. In Bayesian methods, the expectation of the posterior distribution of a parameter (the mean of the samples) is usually taken as its point estimate. Here, the authors use a technique called the majority consensus tree, which is commonly used to build a conservative phylogenetic tree by combining multiple phylogenetic trees (for example, trees produced by different algorithms).
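One simple reading of the majority consensus idea is to keep every group of vertices that appears in more than half of the sampled trees. The sketch below does exactly that; counting every sample equally, the nested-tuple tree encoding, and the three hand-written sample trees are assumptions made for illustration, and the paper's own weighting scheme may differ.

```python
from collections import Counter

def leaves(tree):
    """Vertex ids under a (sub)tree; a leaf is an int, a node is a pair."""
    if isinstance(tree, int):
        return {tree}
    return leaves(tree[0]) | leaves(tree[1])

def clusters(tree):
    """Vertex sets of all internal nodes, as hashable frozensets."""
    if isinstance(tree, int):
        return set()
    left, right = tree
    return {frozenset(leaves(tree))} | clusters(left) | clusters(right)

def majority_clusters(samples):
    """Groups that occur in more than half of the sampled trees."""
    counts = Counter(c for tree in samples for c in clusters(tree))
    return {c for c, k in counts.items() if k > len(samples) / 2}

# Three hypothetical samples from a converged chain.
samples = [
    (((0, 1), 2), (3, 4)),
    ((0, (1, 2)), (3, 4)),
    (((0, 1), 3), (2, 4)),
]
consensus = majority_clusters(samples)

for group in sorted(consensus, key=len, reverse=True):
    print(sorted(group))
```

Any two groups that each appear in more than half of the samples must share a tree, so they are either nested or disjoint, and the surviving groups can always be assembled into a single (possibly non-binary) tree.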

4. Data

Zachary's Karate Club

This is a social network with 34 vertices and 78 edges. Each edge represents an acquaintance between two members of a university karate club.

The NCAA Schedule Network

This network consists of 115 vertices and 613 edges. Each vertex represents a college football team, and each edge represents a game between two teams during the 2000 season. Both networks are standard test sets for graph clustering algorithms.

5. Experiments and conclusions

Using this algorithm, it is possible to recover the community structure that social scientists have already identified in these data (a supervised evaluation against their labels). Compared with other graph clustering algorithms, this algorithm can correctly classify nodes on the boundary of a community. In addition, the inferred hierarchy model can be used to generate more graph samples similar to the original network, whose node and edge statistics can then be analyzed and used to annotate the edges and vertices. Annotating edges and vertices in this way also makes it possible to identify anomalous vertices and edges.
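Generating samples from an inferred hierarchy might look like the sketch below: for every internal node, each pair of vertices split by that node is connected independently with the node's estimated probability. The function name `sample_graph`, the nested-tuple tree encoding, and the toy graph are assumptions made for illustration, and the probabilities are re-estimated from the observed edges rather than taken from a fitted model object.

```python
import random

def leaves(tree):
    """Vertex ids under a (sub)tree; a leaf is an int, a node is a pair."""
    if isinstance(tree, int):
        return {tree}
    return leaves(tree[0]) | leaves(tree[1])

def sample_graph(tree, edges, rng):
    """Draw a random edge set from the hierarchy fitted to `edges`."""
    if isinstance(tree, int):
        return set()
    left, right = tree
    pairs = [(u, v) for u in sorted(leaves(left)) for v in sorted(leaves(right))]
    observed = sum(1 for u, v in pairs if (u, v) in edges or (v, u) in edges)
    theta = observed / len(pairs)     # estimated connection probability
    # Connect each cross pair independently with probability theta.
    drawn = {(min(u, v), max(u, v)) for u, v in pairs if rng.random() < theta}
    return drawn | sample_graph(left, edges, rng) | sample_graph(right, edges, rng)

# Toy observed graph and a hierarchy that matches its dense regions.
edges = {(0, 1), (1, 2), (0, 2), (3, 4), (2, 3)}
tree = (((0, 1), 2), (3, 4))

rng = random.Random(1)
null_graphs = [sample_graph(tree, edges, rng) for _ in range(200)]

# Fraction of samples in which a given vertex pair is connected.
freq_2_3 = sum((2, 3) in g for g in null_graphs) / len(null_graphs)
print(freq_2_3)
```

Statistics computed over many such samples, such as how often a particular edge appears, give a baseline against which the observed vertices and edges can be compared.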

6. Further study

The MCMC method in the inference algorithm can be replaced by variational inference. A later article extends this paper to dynamic networks and uses variational inference to make the inference of hierarchical structure more efficient.

References

[1] Clauset A, Moore C, Newman M E J. Structural Inference of Hierarchies in Networks[J]. Lecture Notes in Computer Science, 2006, 4503: 1-13.
