Community Discovery Algorithms, NetworkX, and Gephi

Source: Internet
Author: User

I needed community discovery for a project I was working on, so I studied some of the questions around it.

1. Community discovery algorithms

(1) SCAN: a density-based community discovery algorithm

Paper: "SCAN: A Structural Clustering Algorithm for Networks". Authors: Xiaowei Xu, Nurcan Yuruk, Zhidan Feng, Thomas A. J. Schweiger. Conference: SIGKDD 2007.

Key concepts:

  • The similarity of two nodes v and w is the number of neighbors they have in common divided by the geometric mean of their neighbor counts, where a node's neighbors Γ(v) include the node itself: σ(v, w) = |Γ(v) ∩ Γ(w)| / sqrt(|Γ(v)| · |Γ(w)|).
  • The ε-neighbors of node v are the neighbors whose similarity to v is not less than ε: Nε(v) = {w ∈ Γ(v) | σ(v, w) ≥ ε}.
  • A core node is a node whose number of ε-neighbors is at least μ.
  • If node v is a core node and node w is an ε-neighbor of v, then w is said to be directly reachable from v.
  • Node w is reachable from node v if and only if there is a chain of nodes v1, ..., vn ∈ V with v1 = v and vn = w such that vi+1 is directly reachable from vi.
  • If some core node u can reach both node v and node w, then node v and node w are connected.
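The first three definitions can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the toy graph, and the values of ε and μ are assumptions made here, not something taken from the paper or from NetworkX. The graph is stored as a dict that maps each node to the set of its neighbors.

```python
from math import sqrt


def closed_neighborhood(adj, v):
    """Γ(v): the neighbors of v plus v itself."""
    return adj[v] | {v}


def similarity(adj, v, w):
    """σ(v, w): shared neighbors over the geometric mean of the neighbor counts."""
    gv, gw = closed_neighborhood(adj, v), closed_neighborhood(adj, w)
    return len(gv & gw) / sqrt(len(gv) * len(gw))


def eps_neighborhood(adj, v, eps):
    """Nε(v): members of Γ(v) whose similarity to v is at least eps."""
    return {w for w in closed_neighborhood(adj, v) if similarity(adj, v, w) >= eps}


def is_core(adj, v, eps, mu):
    """A core node has at least mu ε-neighbors."""
    return len(eps_neighborhood(adj, v, eps)) >= mu


# Hypothetical toy graph: a triangle a-b-c with a pendant node d attached to c.
adj = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}

print(similarity(adj, "a", "b"))          # a and b have identical closed neighborhoods
print(eps_neighborhood(adj, "c", 0.8))    # d is not similar enough to c
print(is_core(adj, "a", 0.8, 3), is_core(adj, "d", 0.8, 3))
```

With these assumed parameters (ε = 0.8, μ = 3) the triangle nodes are cores, while d is not, because its only neighbor c has a similarity of about 0.71 to it.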

Specific algorithm:

  • For each node v that has not yet been assigned to a community, check whether v is a core node. If it is, start a new community: put the ε-neighbors of v into a queue, then repeatedly take a node from the queue, assign the community label to every node directly reachable from it, and add the nodes that were still unassigned to the queue. This amounts to a graph traversal over directly reachable nodes (breadth-first, since a queue is used).
  • If v is not a core node, it is marked as a non-member.
  • Finally, check all non-member nodes: if a node's neighbors belong to two or more communities, it is labeled as a hub; otherwise it is labeled as an outlier.

ALGORITHM SCAN(G = <V, E>, ε, μ)
// All vertices in V are labeled as unclassified;
for each unclassified vertex v ∈ V do
    // STEP 1. Check whether v is a core;
    if COREε,μ(v) then
        // STEP 2.1. If v is a core, a new cluster is expanded;
        generate new clusterID;
        insert all x ∈ Nε(v) into queue Q;
        while Q ≠ ∅ do
            y = first vertex in Q;
            R = {x ∈ V | DirREACHε,μ(y, x)};
            for each x ∈ R do
                if x is unclassified or non-member then
                    assign current clusterID to x;
                if x is unclassified then
                    insert x into queue Q;
            remove y from Q;
    else
        // STEP 2.2. If v is not a core, it is labeled as non-member
        label v as non-member;
end for.
// STEP 3. Further classifies non-members
for each non-member vertex v do
    if (∃ x, y ∈ Γ(v) (x.clusterID ≠ y.clusterID)) then
        label v as hub
    else
        label v as outlier;
end for.
end SCAN.
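The pseudocode above can be turned into a compact Python sketch. This is an illustration written for this article, not the authors' reference implementation: the function names, the label strings, the test graph, and the parameter values ε = 0.7 and μ = 3 are all assumptions. One detail is interpreted rather than copied: a node is queued if it was unclassified before it received the cluster id.

```python
from collections import deque
from itertools import combinations
from math import sqrt

NON_MEMBER = "non-member"


def scan(adj, eps, mu):
    """Label every node with a cluster id (int), "hub", or "outlier"."""

    def gamma(v):
        # Γ(v): neighbors of v plus v itself
        return adj[v] | {v}

    def sigma(v, w):
        gv, gw = gamma(v), gamma(w)
        return len(gv & gw) / sqrt(len(gv) * len(gw))

    def eps_neighbors(v):
        return {w for w in gamma(v) if sigma(v, w) >= eps}

    label = {}  # a node missing from this dict is still unclassified
    cluster_id = 0
    for v in adj:
        if v in label:
            continue
        seeds = eps_neighbors(v)
        if len(seeds) < mu:
            label[v] = NON_MEMBER  # step 2.2: v is not a core
            continue
        cluster_id += 1  # step 2.1: v is a core, expand a new cluster
        queue = deque(seeds)
        while queue:
            y = queue.popleft()
            reach = eps_neighbors(y)
            if len(reach) < mu:
                continue  # only a core reaches other nodes directly
            for x in reach:
                state = label.get(x)  # None means unclassified
                if state is None or state == NON_MEMBER:
                    label[x] = cluster_id
                if state is None:
                    queue.append(x)
    # Step 3: split the remaining non-members into hubs and outliers.
    for v in adj:
        if label[v] == NON_MEMBER:
            ids = {label[x] for x in adj[v] if isinstance(label[x], int)}
            label[v] = "hub" if len(ids) >= 2 else "outlier"
    return label


def build_adjacency(edges):
    """Undirected adjacency sets from a list of (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


# Hypothetical test graph: two 4-node cliques, a node h bridging them,
# and a node o hanging off one clique.
edges = (
    list(combinations(["a1", "a2", "a3", "a4"], 2))
    + list(combinations(["b1", "b2", "b3", "b4"], 2))
    + [("h", "a1"), ("h", "b1"), ("o", "a1")]
)
labels = scan(build_adjacency(edges), eps=0.7, mu=3)
print(labels)
```

On this graph the two cliques come out as two clusters, h touches both and is labeled a hub, and o touches only one and is labeled an outlier.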

(2) Clique percolation: a community structure discovery algorithm for complex networks, using Python NetworkX

Paper: G. Palla, I. Derényi, I. Farkas, and T. Vicsek, "Uncovering the overlapping community structure of complex networks in nature and society," Nature, vol. 435, pp. 814-818, 2005.

Introduction to the clique percolation algorithm:

For a graph G, if there is a complete subgraph (an edge exists between every pair of its nodes) with k nodes, that complete subgraph is called a k-clique.

Furthermore, if two k-cliques share k-1 nodes, the two cliques are said to be "adjacent". A maximal set of k-cliques that can be reached from one another through a series of adjacent k-cliques forms a community. Such communities can overlap: some nodes can belong to several communities at the same time, which is the so-called overlapping community. The first figure shows two 3-cliques forming one community, and the second shows an overlapping community.
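The definition can be checked directly on a small graph. The sketch below is a brute-force illustration written for this article, not the NetworkX implementation: it tests every k-node subset, so it only suits tiny graphs, and the function name and the example graph are assumptions.

```python
from itertools import combinations


def k_clique_communities_bruteforce(adj, k):
    """Communities as unions of adjacent k-cliques (small graphs only)."""
    # Enumerate every k-clique by testing all k-node subsets.
    cliques = [
        frozenset(c)
        for c in combinations(sorted(adj), k)
        if all(v in adj[u] for u, v in combinations(c, 2))
    ]
    # Union-find over cliques: merge two cliques when they share k-1 nodes.
    parent = list(range(len(cliques)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(cliques)), 2):
        if len(cliques[i] & cliques[j]) == k - 1:
            parent[find(i)] = find(j)

    # The community of a group of cliques is the union of their nodes.
    groups = {}
    for i, clique in enumerate(cliques):
        groups.setdefault(find(i), set()).update(clique)
    return list(groups.values())


# Hypothetical example: triangles {1,2,3} and {2,3,4} share the edge 2-3,
# while triangle {4,5,6} shares only node 4 with them.
adj = {
    1: {2, 3},
    2: {1, 3, 4},
    3: {1, 2, 4},
    4: {2, 3, 5, 6},
    5: {4, 6},
    6: {4, 5},
}
communities = k_clique_communities_bruteforce(adj, 3)
print(sorted(sorted(c) for c in communities))
```

The first two triangles share k-1 = 2 nodes and merge into the community {1, 2, 3, 4}. The third triangle shares only one node with them, so it stays a separate community, and node 4 belongs to both.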

2. NetworkX installation, usage, and examples

NetworkX is a graph theory and complex network modeling tool written in Python. It comes with commonly used graph and complex network analysis algorithms, which makes complex network data analysis and simulation modeling convenient. This article mainly introduces its clique percolation algorithm.

(1) Installation

First, get the software from https://pypi.python.org/pypi/networkx/. I downloaded the WHL file.

Next, install the WHL file. For the specific installation process, see the guides available online, for example "How to install a WHL file on Windows 7 (Python)".

After installing, I looked at how to run the algorithm. I used the PyCharm IDE and ran the clique percolation algorithm. Below are the documentation of the NetworkX function that implements it and the script I ran.

"""Find k-clique communities in graph using the percolation method.

A k-clique community is the union of all cliques of size k that can be reached through adjacent (sharing k-1 nodes) k-cliques.

Parameters:

k (int) – Size of smallest clique
cliques (list or generator) – Precomputed cliques (use networkx.find_cliques(G))

Return type:

Yields sets of nodes, one for each k-clique community.
"""
import networkx as nx
import time


def find_community(graph, k):
    # Collect the k-clique communities of the graph into a list
    return list(nx.k_clique_communities(graph, k))


G = nx.Graph()
## testFile = open("F://2.txt", "r")  # 2-6
testFile = open("F://21.txt", "r")  # 5, 10
for line in testFile:
    # Each line holds one edge as "source,target"
    a = line.strip('\n').split(",")
    G.add_edge(a[0], a[1])
testFile.close()

for k in range(5, 10):
    print("############# k value: %d ################" % k)
    # time.clock() was removed in Python 3.8; perf_counter() replaces it
    start_time = time.perf_counter()
    rst_com = find_community(G, k)
    end_time = time.perf_counter()
    print("Calculation time (seconds): %.3f" % (end_time - start_time))
    print("Number of communities generated: %d" % len(rst_com))
    print(rst_com)

The input file lists one edge per line, in this format:

a,b
c,d
a,c
a,d
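A minimal sketch of reading this format, assuming each line holds exactly one edge as two comma-separated node names. The helper name read_edges is hypothetical, and an in-memory io.StringIO stands in for the real file so the snippet runs on its own.

```python
import io


def read_edges(lines):
    """Parse "source,target" lines into a list of (source, target) tuples."""
    edges = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        source, target = line.split(",")
        edges.append((source, target))
    return edges


# Stand-in for an opened edge file with four edges.
sample = io.StringIO("a,b\nc,d\na,c\na,d\n")
edges = read_edges(sample)
print(edges)
```

Each resulting tuple can then be passed to G.add_edge, as the script above does with a[0] and a[1].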

3. Gephi

Gephi is an open-source, free, cross-platform, JVM-based tool for complex network analysis. It is designed for interactive visualization and exploration of all kinds of networks and complex systems, including dynamic and hierarchical graphs.

You can download the software for free from the official website, gephi.org. A Chinese-language Gephi course is now available at Udemy.com/gephi.

Reference: http://blog.csdn.net/DawnRanger/article/details/51108433
