Python OpenCV Usage Notes (12): Image Segmentation with the K-means Algorithm

Last Update:2015-07-24 Source: Internet

Author: User

K-means clustering is one of the simplest clustering methods. It classifies data directly according to the distance between data points (Euclidean distance, geometric distance, and so on), and clustering is complete when the class that each data point belongs to no longer changes. The detailed principles are covered in an earlier blog post:
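The procedure described above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the code from the referenced post or OpenCV's implementation; the function name kmeans_sketch, its parameters, and the sample data are made up for the example.

```python
import numpy as np

def kmeans_sketch(points, k, seed=0, max_iter=100):
    """Cluster the rows of `points` into k groups; returns (labels, centers)."""
    rng = np.random.default_rng(seed)
    # Start from k distinct data points chosen at random.
    centers = points[rng.choice(len(points), size=k, replace=False)].astype(np.float64)
    labels = np.full(len(points), -1)
    for _ in range(max_iter):
        # Euclidean distance from every point to every center.
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        new_labels = dists.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # no point changed class: clustering is complete
        labels = new_labels
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # keep the old center if a class is empty
                centers[j] = members.mean(axis=0)
    return labels, centers

# Two well-separated groups of 2-D points.
data = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
                 [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]])
labels, centers = kmeans_sketch(data, k=2)
print(labels)
print(centers)
```

The first three points should end up in one class and the last three in the other; which class gets index 0 depends on the random start.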

Cluster analysis note-K mean MATLAB algorithm (i)

A few more remarks about K-means:

Important point: the number of clusters. Some clustering and classification problems fix the number of classes in advance. In that case clustering is usually simple: you just specify that number directly. In other problems you do not know how many classes the data should be divided into. What then? You need an evaluation index to judge how many classes give the best clustering. For example, once clustering is finished you can compute the distances between the class centers and the distances from the data in each class to its center, and use them as a criterion; the DBI (Davies-Bouldin index) evaluation index is based on this idea. There are also other kinds of clustering that can be used when the number of classes is not known.
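As a rough sketch of such an evaluation index, the code below computes the Davies-Bouldin index, assuming that is what "DBI" refers to. The function name davies_bouldin and the hand-written labelings are only for illustration; in practice the labels would come from running K-means with different numbers of clusters.

```python
import numpy as np

def davies_bouldin(points, labels):
    """Davies-Bouldin index of a labelling; lower means better-separated classes."""
    ids = np.unique(labels)
    centers = np.array([points[labels == c].mean(axis=0) for c in ids])
    # Average distance from each class's points to its own center (scatter).
    scatter = np.array([
        np.linalg.norm(points[labels == c] - centers[i], axis=1).mean()
        for i, c in enumerate(ids)
    ])
    # Distance between every pair of class centers.
    gaps = np.linalg.norm(centers[:, None, :] - centers[None, :, :], axis=2)
    worst = []
    for i in range(len(ids)):
        # For class i, the most confusable other class gives the largest ratio.
        ratios = [(scatter[i] + scatter[j]) / gaps[i, j]
                  for j in range(len(ids)) if j != i]
        worst.append(max(ratios))
    return float(np.mean(worst))

# Six 1-D samples forming three obvious groups.
data = np.array([[0.0], [1.0], [10.0], [11.0], [20.0], [21.0]])
three_classes = np.array([0, 0, 1, 1, 2, 2])
two_classes = np.array([0, 0, 0, 0, 1, 1])   # merges the first two groups

dbi_three = davies_bouldin(data, three_classes)
dbi_two = davies_bouldin(data, two_classes)
print(dbi_three, dbi_two)
```

The labelling with the lower index is preferred, so here three classes would be chosen over two.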
The choice of initial cluster centers. K-means is an iterative algorithm that converges, and after each iteration the new cluster centers are the averages of all the points in the new classes, so the centers quickly move away from their starting positions as long as the initial centers are distinct. The initial centers mainly affect the number of iterations, and at the speed of current computers that cost is negligible. Note, however, that K-means converges to a local optimum, so different initial centers can occasionally lead to different final clusterings; this is why the attempts parameter described below reruns the algorithm several times and keeps the best result.
Iteration termination conditions. There are many possible termination conditions, but the core question is how to decide that the classification has essentially converged, that is, that it is very close to the optimal solution and no longer changing. In general you can use a maximum number of iterations as the termination condition. You can also treat the clustering as converged when the cluster centers do not change between two consecutive iterations, or when the distance that each class center moves between two consecutive iterations is smaller than a small constant.
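The termination conditions above can be sketched as follows. This is an illustrative NumPy loop, not OpenCV's internal logic; the function name run_until_converged and the parameters max_iter and eps are assumptions made for the example.

```python
import numpy as np

def run_until_converged(points, centers, max_iter=10, eps=1.0):
    """Iterate K-means updates; stop at max_iter or when centers move less than eps."""
    centers = np.asarray(centers, dtype=np.float64).copy()
    for iteration in range(1, max_iter + 1):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centers = centers.copy()
        for j in range(len(centers)):
            members = points[labels == j]
            if len(members) > 0:
                new_centers[j] = members.mean(axis=0)
        # Largest distance any center moved in this iteration.
        shift = np.linalg.norm(new_centers - centers, axis=1).max()
        centers = new_centers
        if shift < eps:
            return centers, iteration, "eps"
    return centers, max_iter, "max_iter"

data = np.array([[0.0], [2.0], [10.0], [12.0]])
start = [[0.0], [2.0]]

final_centers, iterations, reason = run_until_converged(data, start, max_iter=10, eps=0.5)
print(final_centers.ravel(), iterations, reason)
```

With a generous iteration limit the loop stops because the centers have stopped moving; with a very small limit it stops on the iteration count instead.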

In OpenCV, the K-means algorithm is provided by the function cv2.kmeans().
The format of the function is: cv2.kmeans(data, K, bestLabels, criteria, attempts, flags)
(1) data: the data to classify, with each feature in its own column. The data should be np.float32: OpenCV's implementation works on 32-bit floating-point samples, so integer data such as uint8 image pixels must be converted first.
(2) K: the number of clusters. OpenCV's K-means requires the number of classes to be known in advance.
(3) bestLabels: preset class labels. If there are none, pass None.
(4) criteria: the iteration termination criteria, a tuple of three elements in the format (type, max_iter, epsilon).
Here type has the following options:
-- cv2.TERM_CRITERIA_EPS: stop when the accuracy (error) epsilon is reached.
-- cv2.TERM_CRITERIA_MAX_ITER: stop when the number of iterations exceeds max_iter.
-- cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER: the two combined; stop as soon as either one is satisfied.
(5) attempts: the number of times the K-means algorithm is repeated; the best result is returned.
(6) flags: how the initial class centers are chosen. There are two methods:
cv2.KMEANS_PP_CENTERS; cv2.KMEANS_RANDOM_CENTERS

Here this function is used to classify a grayscale image. The first thing to understand is that the input must be converted to one dimension. We are clustering over the whole image, so the gray values form a single feature (dimension). The image itself is two-dimensional, so it cannot be passed in directly as data; it has to be reshaped into one long column of values. As stated above, data must have one column per feature, and for grayscale clustering the gray value is the only feature. For a color image, each of the three RGB channels has to be flattened into its own one-dimensional column. Finally, to display the classification result as an image, the long column of labels has to be reshaped back to the image's rows and columns. The detailed code is as follows:
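For the color case mentioned above, the reshaping might look like the sketch below. It uses a tiny synthetic array in place of a real image and hand-written labels in place of a clustering result, so it runs with NumPy alone; the variable names are only illustrative.

```python
import numpy as np

# Stand-in for a color image: height 2, width 3, 3 channels.
img = np.arange(2 * 3 * 3, dtype=np.uint8).reshape((2, 3, 3))

# One row per pixel, one column per channel, converted to float32.
samples = np.float32(img.reshape((-1, 3)))
print(samples.shape)

# Stand-in for the labels returned by clustering: one label per pixel, shape (N, 1).
labels = np.array([[0], [1], [0], [1], [0], [1]], dtype=np.int32)

# Back to the image's height and width so it can be shown as a picture.
label_image = labels.reshape(img.shape[:2])
print(label_image)
```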

    import cv2
    import numpy as np
    import matplotlib.pyplot as plt

    img = cv2.imread('woman.jpg', 0)  # read the image as grayscale
    plt.subplot(121), plt.imshow(img, 'gray'), plt.title('original')
    plt.xticks([]), plt.yticks([])

    # change img (2D) to 1D
    img1 = img.reshape((img.shape[0] * img.shape[1], 1))
    img1 = np.float32(img1)

    # define criteria = (type, max_iter, epsilon)
    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10, 1.0)

    # set flags: how to choose the initial centers
    # --- cv2.KMEANS_PP_CENTERS; cv2.KMEANS_RANDOM_CENTERS
    flags = cv2.KMEANS_RANDOM_CENTERS

    # apply kmeans
    compactness, labels, centers = cv2.kmeans(img1, 4, None, criteria, 10, flags)

    # reshape the labels back to the shape of the image
    img2 = labels.reshape((img.shape[0], img.shape[1]))
    plt.subplot(122), plt.imshow(img2, 'gray'), plt.title('kmeans')
    plt.xticks([]), plt.yticks([])
    plt.show()

This is the result of clustering into 4 classes.
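The displayed result contains only the class indices 0 to 3, so its gray levels are arbitrary. As an optional extra step that is not part of the original code, each pixel can be painted with the gray value of its class center. The sketch below uses made-up centers and labels for a 2x4 image so that it runs without an image file.

```python
import numpy as np

# Stand-ins for the outputs of a 4-class clustering of a 2x4 grayscale image.
centers = np.float32([[12.4], [200.7], [90.2], [150.0]])
labels = np.array([[0], [0], [2], [3], [1], [1], [2], [3]], dtype=np.int32)
height, width = 2, 4

# Look up each pixel's center, round to valid gray levels, restore the 2-D shape.
quantized = np.uint8(np.round(centers))[labels.flatten()].reshape((height, width))
print(quantized)
```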

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
