Python OpenCV Usage Notes (12): Image Segmentation with the K-means Algorithm

K-means clustering is one of the simplest clustering methods. It groups data directly according to the distance between data points (Euclidean distance, geometric distance, and so on), and clustering is complete once the class that each data point belongs to no longer changes. The detailed principles are covered in the following blog post:

Cluster Analysis Notes: K-means MATLAB Algorithm (I)

A few more points about K-means:

    1. The most important point is the number of clusters. In some clustering and classification problems the number of classes is fixed in advance; these cases are usually simple, because you just specify that number directly. In other problems it is not known how many classes the data should be divided into. In that case you need an evaluation index that measures which number of clusters works best. For example, after clustering you can compute the distances between the cluster centers and the distances from the data in each cluster to its center, and use these as the criterion; the Davies-Bouldin index (DBI) is based on this idea. There are also other clustering methods intended for the case where the number of classes is not known.
    2. Choosing the initial cluster centers. K-means is guaranteed to converge, and after each iteration every new cluster center is the mean of all points assigned to that cluster, so the centers move away from their starting positions as long as the initial centers are distinct. The choice of initial centers affects the number of iterations, which costs a negligible amount of time on current computers. Note, however, that K-means converges only to a local optimum, so different initial centers can also lead to different final clusterings; running the algorithm several times and keeping the best result reduces this effect.
    3. Iteration termination conditions. Many conditions are possible, but the core question is how to decide that the clustering has essentially converged, that is, it is very close to a (local) optimum and no longer changes. Typically you can stop after a fixed number of iterations, stop when no cluster center changes between two consecutive iterations, or stop when the distance each center moves between two consecutive iterations is smaller than a small constant.
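The three points above can be illustrated with a minimal NumPy sketch of the basic iteration. This is not OpenCV's implementation; the function name kmeans_sketch and its parameters are assumptions made for illustration. It picks distinct random samples as initial centers and stops either after max_iter iterations or once no center moves by more than epsilon.

```python
import numpy as np


def kmeans_sketch(data, k, max_iter=100, epsilon=1e-4, seed=0):
    """Plain Lloyd-style k-means on an (N, D) float array (illustrative only).

    Stops when max_iter is reached or when no center moves more than epsilon.
    """
    rng = np.random.default_rng(seed)
    # Initial centers: k distinct samples picked at random.
    centers = data[rng.choice(len(data), size=k, replace=False)].copy()

    for iteration in range(1, max_iter + 1):
        # Assignment step: Euclidean distance from every point to every center.
        dists = np.linalg.norm(data[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)

        # Update step: each center becomes the mean of its assigned points.
        new_centers = centers.copy()
        for j in range(k):
            members = data[labels == j]
            if len(members) > 0:  # keep the old center if a cluster is empty
                new_centers[j] = members.mean(axis=0)

        # Largest distance any center moved in this iteration.
        shift = np.linalg.norm(new_centers - centers, axis=1).max()
        centers = new_centers
        if shift < epsilon:  # centers have effectively stopped moving
            break

    return labels, centers, iteration


# Two well-separated 1-D groups, e.g. dark and bright gray values.
data = np.array([[10.0], [12.0], [11.0], [200.0], [205.0], [198.0]])
labels, centers, n_iter = kmeans_sketch(data, k=2)
print(sorted(centers.ravel().tolist()), n_iter)
```

On data this well separated the result does not depend on the seed; on harder data, different seeds can end in different clusterings, which is why repeated attempts are useful.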

The K-means function in OpenCV is cv2.kmeans().
The function signature is: cv2.kmeans(data, K, bestLabels, criteria, attempts, flags)
(1) data: the data to cluster. It should be of type np.float32, with each feature in its own column. OpenCV's kmeans works on 32-bit floating-point samples, so data of other types (such as uint8 image data) needs to be converted with np.float32() first.
(2) K: the number of clusters. OpenCV's kmeans requires the number of clusters to be known in advance.
(3) bestLabels: preset class labels; pass None if there are none.
(4) criteria: the iteration termination criteria, a tuple of three elements in the form (type, max_iter, epsilon).
type can take the following values:
-- cv2.TERM_CRITERIA_EPS: stop when the accuracy (error) epsilon is reached.
-- cv2.TERM_CRITERIA_MAX_ITER: stop when the number of iterations reaches max_iter.
-- cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER: combine the two; stop as soon as either condition is met.
(5) attempts: the number of times the K-means algorithm is repeated; the best result is returned.
(6) flags: how the initial cluster centers are chosen, with two methods:
cv2.KMEANS_PP_CENTERS; cv2.KMEANS_RANDOM_CENTERS

Below, this function is used to cluster a grayscale image. The first thing to understand is that the input must be reshaped into a single column. We are clustering all pixels of the image by gray value, so the gray value is the only feature (dimension). The image itself is two-dimensional, so it cannot be passed in directly; it has to be flattened into one long column of values. As stated above, data holds one column per feature, and for a grayscale image the gray value is the single feature. For a color image, the three RGB channels each become one column, giving three feature columns. Finally, to display the clustering result as an image, the long column of labels has to be reshaped back to the rows and columns of the image. The complete code is as follows:

import cv2
import numpy as np
import matplotlib.pyplot as plt

img = cv2.imread('woman.jpg', 0)  # read the image as grayscale
plt.subplot(121), plt.imshow(img, 'gray'), plt.title('original')
plt.xticks([]), plt.yticks([])

# change img (2D) to 1D: one row per pixel, one column for the gray value
img1 = img.reshape((img.shape[0] * img.shape[1], 1))
img1 = np.float32(img1)

# define criteria = (type, max_iter, epsilon)
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10, 1.0)

# set flags: how to choose the initial centers
# --- cv2.KMEANS_PP_CENTERS; cv2.KMEANS_RANDOM_CENTERS
flags = cv2.KMEANS_RANDOM_CENTERS

# apply kmeans with 4 clusters and 10 attempts
compactness, labels, centers = cv2.kmeans(img1, 4, None, criteria, 10, flags)

# reshape the labels back to the shape of the image for display
img2 = labels.reshape((img.shape[0], img.shape[1]))
plt.subplot(122), plt.imshow(img2, 'gray'), plt.title('kmeans')
plt.xticks([]), plt.yticks([])
plt.show()
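For the color-image case mentioned above, the reshaping step could look like the following sketch. A small synthetic array stands in for an image loaded with cv2.imread, and the labels are a placeholder rather than real kmeans output; both are assumptions made so the example runs without an image file.

```python
import numpy as np

# Stand-in for a color image: height x width x 3 channels, uint8.
img = np.zeros((4, 6, 3), dtype=np.uint8)
img[:, 3:] = (255, 128, 0)  # right half gets a different color

# Each pixel becomes one row; the three channels are the three feature columns.
samples = np.float32(img.reshape((-1, 3)))
print(samples.shape)

# After clustering there is one label per row. A placeholder labeling is used
# here (0 for black pixels, 1 otherwise) just to show the reshape back to 2-D.
labels = (samples.sum(axis=1) > 0).astype(np.int32).reshape((-1, 1))
label_img = labels.reshape(img.shape[:2])
print(label_img.shape)
```

The samples array is what would be passed to cv2.kmeans as data, and the reshape of the labels is the same as in the grayscale code.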

This is the result with the number of clusters set to 4.
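The label image only contains the indices 0 to K-1. If you would rather show each region with the gray value of its cluster center, one possible approach is to look up the center for every label. The labels and centers below are hypothetical values in the shapes that cv2.kmeans returns, not the output for the image above.

```python
import numpy as np

# Hypothetical kmeans outputs for a 2 x 3 grayscale image and K = 2:
# labels is (N, 1) int32 and centers is (K, 1) float32.
labels = np.array([[0], [0], [1], [1], [0], [1]], dtype=np.int32)
centers = np.array([[30.4], [201.7]], dtype=np.float32)

# Look up the center of each pixel's cluster, then restore the image shape.
segmented = np.uint8(centers)[labels.flatten()].reshape((2, 3))
print(segmented)
```

The resulting array can be displayed with plt.imshow(segmented, 'gray') in the same way as the label image.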

Copyright notice: This is an original article by the blog author and may not be reproduced without the author's permission.
