Using spectral clustering algorithm to solve the clustering of incomplete graphs

Last Update:2017-09-07 Source: Internet

Author: User

When clustering the nodes of an incomplete graph, it is hard to find a clustering algorithm that works well.

Take the 10th and 15th points as an example. They are not closely related in the graph, yet an ordinary clustering algorithm will usually put the 10th and 15th points in the same class, so the overall clustering result is poor.

Spectral clustering, by contrast, handles this kind of problem very well.

Let's focus on spectral clustering.

Spectral clustering divides the samples into two or k parts in a reasonable way. From the perspective of graph theory, spectral clustering is equivalent to a graph partitioning problem. Given a graph G = (V, E), each vertex in V represents a sample and each weighted edge represents the similarity between two samples. The goal of spectral clustering is to find a reasonable way to partition the graph into several subgraphs so that the weights (similarities) of the edges connecting different subgraphs are as low as possible, while the weights (similarities) of the edges within the same subgraph are as high as possible. Birds of a feather flock together.
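To make the partitioning objective concrete, here is a minimal sketch that computes the total weight of the edges a partition would cut. The 6-node graph, the matrix `W`, and the helper `cut_weight` are illustrative assumptions and do not come from the article's data set.

```python
import numpy as np

# Hypothetical graph: two triangles (nodes 0-1-2 and 3-4-5)
# joined by a single weak edge between nodes 2 and 3.
W = np.array([
    [0.0, 1.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.1, 0.0, 0.0],
    [0.0, 0.0, 0.1, 0.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 1.0, 0.0],
])

def cut_weight(W, labels):
    """Sum of the weights of edges whose endpoints carry different labels."""
    labels = np.asarray(labels)
    different = labels[:, None] != labels[None, :]
    # W is symmetric, so every cut edge is counted twice; halve the total.
    return W[different].sum() / 2.0

# Splitting between the triangles cuts only the weak 2-3 edge.
good_cut = cut_weight(W, [0, 0, 0, 1, 1, 1])
# Splitting inside a triangle cuts the strong edges 0-2 and 1-2.
bad_cut = cut_weight(W, [0, 0, 1, 1, 1, 1])
print(good_cut, bad_cut)
```

A partition with a smaller cut weight separates less similar points, which is the kind of partition spectral clustering looks for.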

(i) Algorithm steps

1) Construct a graph from the data. Each node of the graph corresponds to a data point, similar points are connected by edges, and the weight of each edge represents the similarity between the two data points. (Edges between points that are connected but not very similar are later cut by Cut/RatioCut/Ncut.) Write this graph as an adjacency matrix, denoted W.
2) Sum the elements of each column of W to get N numbers and place them on the diagonal of an N×N matrix (all other entries are 0). This diagonal matrix is the degree matrix D. Then let L = D - W; L is the Laplacian matrix.
3) Compute the first k eigenvalues of L (with the eigenvalues sorted from smallest to largest, so "first k" means the k smallest) and the corresponding eigenvectors.
4) Arrange the k eigenvectors as columns to form an N×k matrix. Treat each row as a vector in k-dimensional space and cluster the rows with the K-means algorithm. The cluster assigned to each row is the cluster of the corresponding node in the original graph, that is, of the corresponding one of the original N data points.
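The steps above can be sketched in NumPy as follows. This is a minimal illustration using the unnormalized Laplacian L = D - W, not the article's own code. The helper `simple_kmeans` (with a deterministic farthest-point initialization) and the 6-node matrix `W` are assumptions made for the example.

```python
import numpy as np

def simple_kmeans(X, k, n_iter=100):
    """Minimal K-means with farthest-point initialization (illustrative helper)."""
    centers = [X[0]]
    for _ in range(1, k):
        # Choose the point farthest from all centers picked so far.
        d = np.min([np.sum((X - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        # Assign each row to its nearest center.
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        # Move each center to the mean of its rows (keep it if it has none).
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

def spectral_clustering(W, k):
    D = np.diag(W.sum(axis=0))            # step 2: degree matrix from column sums
    L = D - W                             # step 2: unnormalized Laplacian
    eigvals, eigvecs = np.linalg.eigh(L)  # step 3: eigenvalues in ascending order
    U = eigvecs[:, :k]                    # step 3: eigenvectors of the k smallest
    return simple_kmeans(U, k)            # step 4: K-means on the rows of U

# Step 1 (assumed input): two triangles joined by one weak edge.
W = np.array([
    [0.0, 1.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.1, 0.0, 0.0],
    [0.0, 0.0, 0.1, 0.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 1.0, 0.0],
])
labels = spectral_clustering(W, 2)
print(labels)
```

On this toy graph the two triangles should end up in different clusters. Library implementations typically use a normalized Laplacian and a more robust K-means, so their results can differ from this sketch on harder inputs.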

(ii) Implementation of spectral clustering

In Python, this is easy to implement with SpectralClustering from sklearn.cluster:

from sklearn.cluster import SpectralClustering

labels = SpectralClustering(affinity='nearest_neighbors', n_clusters=20, n_neighbors=3).fit_predict(Route)

# labels =
[ 3,  3,  5,  7,  8,  8,  9,  4,  4, ..., ...,  2,  2, 15, 16, 17, 18,
  1,  2,  2,  2,  2,  2, ...,  9,  4,  4,  4,
  ..., ..., ..., ...,  3,  5,  4,  4,  9,  9,  8,  8,  8,
  8,  8,  5,  5,  8, ..., ...,  8, ...,  8,  7,  7,  6,  6,  6,  6, 11,
  3,  3,  3,  3,  3,  3,  1,  1,  1,  1,  1,  1, 13, 13, 13, 10, 12,
  0,  0, 10,  0,  0]

where:

1) n_clusters: the number of dimensions we reduce to when cutting the graph in spectral clustering, which is also the number of clusters. For example, to divide the data into 20 clusters, set n_clusters=20.

2) affinity: how the similarity matrix is built. There are three kinds of choices. The first is 'nearest_neighbors', the k-nearest-neighbor method. The second is 'precomputed', a user-supplied similarity matrix. The third is the fully connected method, which can use various kernel functions (the default is 'rbf', a Gaussian kernel) or a custom kernel function to define the similarity matrix. Because we are clustering a graph, this article uses affinity='nearest_neighbors' with n_neighbors=3.
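As a complement to the 'nearest_neighbors' call used in this article, the sketch below shows the 'precomputed' option, where the similarity matrix is supplied directly. The 6-node matrix `W` and the choice of `n_clusters=2` are assumptions made for illustration; `random_state` is set only to make the K-means step repeatable.

```python
import numpy as np
from sklearn.cluster import SpectralClustering

# Hypothetical similarity matrix: two triangles joined by one weak edge.
W = np.array([
    [0.0, 1.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.1, 0.0, 0.0],
    [0.0, 0.0, 0.1, 0.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 1.0, 0.0],
])

# With affinity='precomputed', fit_predict receives the similarity
# matrix itself instead of the raw sample coordinates.
model = SpectralClustering(n_clusters=2, affinity='precomputed', random_state=0)
labels = model.fit_predict(W)
print(labels)
```

This option is useful when the graph's edge weights are already known, since no neighbor search or kernel has to be chosen. The label numbers themselves are arbitrary; only the grouping matters.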

Plotting the result shows that this neatly solves the problem of points such as the 10th and 15th being assigned to the same class.

The data and code needed for this example have been uploaded to GitHub: https://github.com/yjx7/-spectralclustering.git. If you find it useful, please give it a star. Thank you.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.