K-means clustering for image similarity calculation

Source: Internet
Author: User

Image similarity is usually judged by color, brightness, and texture. The most intuitive approach is histogram matching: the algorithm is simple, but it is sensitive to brightness changes, noise, and similar disturbances. The other approach is to extract features from each image and compute similarity on those features; SIFT features are a common choice, with the similarity of two images computed from their SIFT features. Different image types may call for different features. For face images, for example, a face recognition network can be used to extract a face feature vector. This article describes how to extract face feature vectors with LightCNN and then cluster the images.
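To make the brightness sensitivity of histogram matching concrete, here is a minimal sketch. It assumes 8-bit grayscale images held as NumPy arrays and uses histogram intersection as the matching score; the function name, the bin count, and the random test image are illustrative choices, not something the source specifies.

```python
import numpy as np

def histogram_intersection(img_a, img_b, bins=32):
    """Return a similarity in [0, 1] between two 8-bit grayscale images."""
    # Intensity histograms over the 8-bit range, normalised to sum to 1
    hist_a, _ = np.histogram(img_a, bins=bins, range=(0, 256))
    hist_b, _ = np.histogram(img_b, bins=bins, range=(0, 256))
    hist_a = hist_a / hist_a.sum()
    hist_b = hist_b / hist_b.sum()
    # Sum of bin-wise minima: 1.0 means the two histograms are identical
    return float(np.minimum(hist_a, hist_b).sum())

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(64, 64))   # synthetic stand-in for a real image
brighter = np.clip(image + 60, 0, 255)        # same content, shifted brightness

same_score = histogram_intersection(image, image)
shifted_score = histogram_intersection(image, brighter)
print(same_score, shifted_score)
```

The brightened copy shows the same content, yet its score drops below that of the unmodified image, which is the weakness described above.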

The first step is to extract a feature vector for each image. Because the images to be clustered are face images, LightCNN is used for the extraction. The code and models are available on GitHub: https://github.com/AlfredXiangWu/LightCNN.

After downloading the code and a model, run extract_features.py to extract the feature vector of each image. The feature vectors extracted by LightCNN have 256 dimensions.
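Once two images are represented by 256-dimensional feature vectors, their similarity can be computed directly on those vectors. The source does not say which metric to use; the sketch below assumes cosine similarity, a common choice, and uses random vectors as stand-ins for real LightCNN features.

```python
import numpy as np

def cosine_similarity(u, v):
    """Cosine of the angle between two feature vectors (1.0 = same direction)."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

rng = np.random.default_rng(0)
feat_a = rng.normal(size=256)                     # stand-in for one image's features
feat_b = feat_a + 0.1 * rng.normal(size=256)      # slightly perturbed copy of feat_a
feat_c = rng.normal(size=256)                     # unrelated vector

sim_ab = cosine_similarity(feat_a, feat_b)
sim_ac = cosine_similarity(feat_a, feat_c)
print(sim_ab, sim_ac)
```

The perturbed copy scores close to 1, while the unrelated vector scores close to 0.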

The feature vectors of all the images are stored in a single matrix, the feature matrix. For example, with 10 images the feature matrix is a 10×256 matrix.
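As a rough illustration of how such a feature matrix might be assembled and saved, the sketch below stacks ten per-image vectors into a 10×256 array and pickles it. The random vectors stand in for real extracted features, and the temporary directory is an assumption made only so the example is self-contained; the file name features.pkl matches the one loaded later.

```python
import os
import pickle
import tempfile

import numpy as np

rng = np.random.default_rng(0)
# Stand-ins for the 256-dimensional vectors extracted from 10 images
per_image_features = [rng.normal(size=256) for _ in range(10)]

# One row per image, one column per feature dimension
feature_matrix = np.vstack(per_image_features)
print(feature_matrix.shape)

# Save the matrix, then load it back the same way the clustering step does
path = os.path.join(tempfile.mkdtemp(), 'features.pkl')
with open(path, 'wb') as f:
    pickle.dump(feature_matrix, f)
with open(path, 'rb') as f:
    restored = np.asarray(pickle.load(f))
```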

The feature vectors are then clustered with K-means:

import pickle
import numpy as np
from sklearn.cluster import KMeans

feature_matrix = np.asarray(pickle.load(open('features.pkl', 'rb')))

num_clusters = 6
# algorithm='lloyd' is the current name for 'full'; n_jobs is no longer accepted
km_cluster = KMeans(n_clusters=num_clusters, max_iter=300, tol=1e-10, n_init=40,
                    init='k-means++', algorithm='lloyd')
result = km_cluster.fit(feature_matrix)

Here feature_matrix is the feature matrix, the images are grouped into num_clusters clusters, and result is the fitted K-means model, which holds the clustering results: the cluster label of each image (labels_), the cluster centers (cluster_centers_), and so on.
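The following sketch shows how the contents of a fitted KMeans object can be inspected. It runs on synthetic data (three well-separated blobs of 256-dimensional points) instead of real face features, and all the toy_ names are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Three well-separated blobs of 20 points each, in 256 dimensions
blob_centers = rng.normal(scale=10.0, size=(3, 256))
toy_features = np.vstack([c + rng.normal(size=(20, 256)) for c in blob_centers])

toy_result = KMeans(n_clusters=3, n_init=10, random_state=0).fit(toy_features)

toy_labels = toy_result.labels_             # cluster index assigned to each row
toy_centers = toy_result.cluster_centers_   # one 256-dimensional center per cluster
cluster_sizes = np.bincount(toy_labels, minlength=3)
print(toy_labels.shape, toy_centers.shape, cluster_sizes)
```

The labels_ array is aligned with the rows of the input matrix, so the i-th label belongs to the i-th image.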

The clustering result cannot be displayed directly: the images differ in size, and each feature vector has 256 dimensions. PCA is therefore used to reduce the dimensionality of the feature vectors. Specifically, each 256-dimensional feature vector is reduced to 2 dimensions, that is, to a coordinate point (x, y), so that the clustering result can be visualized in two-dimensional space.

from sklearn.decomposition import PCA

pca = PCA(n_components=2)                    # output: 2 dimensions
newdata = pca.fit_transform(feature_matrix)  # input: N dimensions
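A 2-D projection discards most of the information in a 256-dimensional vector, so it can help to check how much variance the two components keep. This is a minimal sketch on random stand-in data, with hypothetical toy_ names; on real features the retained fraction will differ.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
toy_features = rng.normal(size=(50, 256))   # stand-in for a real feature matrix

toy_pca = PCA(n_components=2)
toy_2d = toy_pca.fit_transform(toy_features)

# Fraction of the total variance captured by the two retained components
retained = float(toy_pca.explained_variance_ratio_.sum())
print(toy_2d.shape, retained)
```

A low retained fraction means that clusters which look mixed in the 2-D plot may still be well separated in the original 256-dimensional space.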

Finally, the coordinate point of each image is plotted in a color determined by its cluster label, which visualizes the clustering:

import matplotlib.pyplot as plt

labels = result.labels_
# One marker style per cluster: red, green, blue, black, magenta, cyan
styles = ['or', 'og', 'ob', 'ok', 'om', 'oc']
for cluster_id, style in enumerate(styles):
    # Select the 2-D points assigned to this cluster
    points = newdata[labels == cluster_id]
    plt.plot(points[:, 0], points[:, 1], style)

plt.show()

Clustering result (figure not reproduced here).
