Using the spectral clustering algorithm to cluster incomplete graphs

Source: Internet
Author: User

When clustering incomplete graphs, it is difficult to find a clustering algorithm that works well.

In the figure, for example, points 10 and 15 have similar coordinates but are not really close to each other in the graph. An ordinary clustering algorithm will usually put points 10 and 15 in the same cluster, so the result of general-purpose clustering is not very good.

  

Spectral clustering, however, handles this kind of problem very well.

Let's focus on spectral clustering.

Spectral clustering divides the samples into two, or k, reasonable parts. From the perspective of graph theory, spectral clustering is equivalent to a graph partitioning problem. That is, given a graph G = (V, E), where each vertex in V represents a sample and each weighted edge represents the similarity between two samples, the goal of spectral clustering is to find a reasonable way to partition the graph into several subgraphs such that the weights (similarities) of the edges connecting different subgraphs are as low as possible, while the weights (similarities) of the edges within the same subgraph are as high as possible. Birds of a feather flock together.

(i) Algorithm steps

    1. Construct a graph from the data, with each node of the graph corresponding to a data point. Connect the points (edges between points that are connected but not very similar are later cut, using Cut/RatioCut/NCut), and use the edge weights to represent the similarity between data points. Write this graph as an adjacency matrix, denoted W.
    2. Sum the elements of each column of W to get N numbers and put them on the diagonal (all other entries are 0) to form a diagonal matrix, the degree matrix D. Denote the result of D - W as the Laplacian matrix L.
    3. Compute the first k eigenvalues of L (with the eigenvalues ordered from smallest to largest) and the corresponding eigenvectors.
    4. Arrange the k eigenvectors as columns to form an N × k matrix. Treat each row as a vector in k-dimensional space and cluster the rows with the K-means algorithm. The cluster that each row is assigned to is the cluster of the corresponding node in the original graph, that is, of the corresponding one of the initial N data points.
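
The four steps above can be sketched with NumPy and scikit-learn's KMeans. This is only a minimal illustration, not the code from the original example: the function name spectral_clustering, the seed argument, and the six-node toy graph (two triangles joined by one weak edge) are assumptions made up for the sketch, and it uses the unnormalized Laplacian L = D - W exactly as described in step 2.

```python
import numpy as np
from sklearn.cluster import KMeans


def spectral_clustering(W, k, seed=0):
    """Cluster the nodes of a weighted graph given its adjacency matrix W.

    Hypothetical helper written for this sketch; returns (L, labels).
    """
    W = np.asarray(W, dtype=float)
    # Step 2: degree matrix D (column sums on the diagonal) and L = D - W.
    D = np.diag(W.sum(axis=0))
    L = D - W
    # Step 3: for a symmetric matrix, eigh returns eigenvalues in ascending
    # order, so the first k columns belong to the k smallest eigenvalues.
    _, eigvecs = np.linalg.eigh(L)
    U = eigvecs[:, :k]
    # Step 4: treat each row of U as a point in k-dimensional space.
    labels = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(U)
    return L, labels


# Step 1 (toy data, assumed for illustration): two triangles of strongly
# similar nodes, joined by a single weak edge between nodes 2 and 3.
W = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5)]:
    W[i, j] = W[j, i] = 1.0
W[2, 3] = W[3, 2] = 0.1

L, labels = spectral_clustering(W, k=2)
print(labels)
```

On this toy graph the weak edge is the one that gets cut, so each triangle ends up in its own cluster. Which triangle receives label 0 and which receives label 1 is arbitrary.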

(ii) Implementation of spectral clustering

In Python, this is easy to implement with SpectralClustering from sklearn.cluster:

    from sklearn.cluster import SpectralClustering
    labels = SpectralClustering(affinity='nearest_neighbors', n_clusters=20, n_neighbors=3).fit_predict(Route)

#labels =

[3,  3,  5,  7,  8,  8,  9, 4, 4,,, 2,  2, 15, 16, 17, 18,
1, 2, 2, 2, 2, 2, , 9, 4, 4, 4,
+, +, +, +, 3, 5, 4, 4, 9, 9, 8, 8, 8,
8, 8, 5, 5, 8,,, 8,, 8, 7, 7, 6, 6, 6, 6, 11,
3, 3, 3, 3, 3, 3, one, one, one , 1, 1, 1, 13, 13, 13, 10, 12,
0, 0, 10, 0, 0]


where:

1) n_clusters: the dimension of the subspace that the data is reduced to during spectral clustering, which is also the number of clusters. For example, to divide the data into 20 clusters, set n_clusters=20.

2) affinity: how the similarity matrix is built. There are three kinds of choices. The first is 'nearest_neighbors', the k-nearest-neighbors method. The second is 'precomputed', a user-supplied similarity matrix. The third is the fully connected method, which can use various kernel functions, including custom ones, to define the similarity matrix. Because we are clustering a graph, this article uses affinity='nearest_neighbors' with n_neighbors=3.
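
As a complement to the 'nearest_neighbors' setting used in this article, the following is a small sketch of the 'precomputed' option, where the similarity matrix is passed in directly. The toy adjacency matrix (two 5-node cliques joined by one weak edge) and the variable names are assumptions made up for illustration and do not come from the original data.

```python
import numpy as np
from sklearn.cluster import SpectralClustering

# Toy graph (assumed for illustration): two 5-node cliques with edge
# weight 1, joined by a single weak edge of weight 0.1.
n = 10
W = np.zeros((n, n))
W[:5, :5] = 1.0
W[5:, 5:] = 1.0
np.fill_diagonal(W, 0.0)      # no self-loops
W[4, 5] = W[5, 4] = 0.1       # weak link between the two cliques

# With affinity='precomputed', fit_predict expects the similarity
# (adjacency) matrix itself rather than a feature matrix.
model = SpectralClustering(n_clusters=2, affinity='precomputed', random_state=0)
graph_labels = model.fit_predict(W)
print(graph_labels)
```

This route may be convenient when the graph's edges and weights are already known, since no neighbor search on coordinates is needed. The numbering of the two clusters is arbitrary.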

Plotting the result shows that this neatly solves the problem of points with similar coordinates, such as points 10 and 15, being put into the same cluster. The result is shown in the figure.

The data and code required for this example have been uploaded to GitHub: https://github.com/yjx7/-spectralclustering.git. If you find it useful, please give it a star. Thank you.
