Suppose you are given some data whose attributes have already been converted to numbers. How do you cluster it?
That is what the K-means algorithm does.
In fact, once a record has been converted to numbers, you can think of it as a point. The whole set of records can then be treated as a group of points.
When we classify things, we are really looking at the distance between them. This is the so-called "birds of a feather flock together" idea: points that are close to each other belong together.
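To make "distance" concrete, here is a minimal sketch that assumes the straight-line (Euclidean) distance is the one being used; the text above does not name a specific distance, and the function name euclidean_distance is only illustrative.

```python
import math

def euclidean_distance(p, q):
    """Straight-line distance between two points with the same number of coordinates."""
    # Assumption: Euclidean distance; the article does not say which distance it uses.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

print(euclidean_distance((0, 0), (3, 4)))  # 5.0
```

The smaller this number, the more alike two records are considered to be.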
For example, suppose we want to cluster the data into 3 categories. We can randomly select three different points, then compute the distance from each record to each of the three points; a record belongs to the category of whichever point it is closest to. Once this pass is complete, the data is divided into three groups. However, it is not sound to accept the result of a single pass, because the three points were chosen at random. You can iterate again, choosing three new points. But how should the three points be chosen this time?
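The first pass described above can be sketched roughly as follows. This is only an illustration: the sample points, the name assign_to_nearest, and the use of Euclidean distance via math.dist (Python 3.8+) are assumptions, not something the text prescribes.

```python
import math
import random

def assign_to_nearest(points, centers):
    """Return, for each point, the index of the center it is closest to."""
    labels = []
    for p in points:
        distances = [math.dist(p, c) for c in centers]
        labels.append(distances.index(min(distances)))
    return labels

# Hypothetical two-dimensional records, made up for this example.
points = [(1, 1), (1, 2), (8, 8), (9, 8), (1, 9), (2, 9)]

# Randomly pick three different records as the starting points.
centers = random.sample(points, 3)
labels = assign_to_nearest(points, centers)
print(centers)
print(labels)
```

Running it several times gives different groupings, which is exactly why a single random pass should not be trusted.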
Hasn't the data just been divided into three groups (A, B, C)? If the records have only two dimensions (x, y), we add up the x-coordinates of all n points in group A and divide by n, then do the same for the y-coordinates. This gives a new point a1, the mean of group A. Doing the same for the other groups yields three new points a1, b1, c1. Then we again compute the distance from every record to each of these three points and assign each record to the category of the nearest one. After this iteration there are three new groups. For better results, keep iterating until the three points no longer change; at that point the clustering has converged. Note that this is the best grouping reachable from the chosen starting points, not necessarily the best possible grouping overall.
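Putting the steps together, the whole procedure might look like the sketch below. The names k_means and mean_point, the max_iterations and seed parameters, and the choice to keep the old center when a group ends up empty are all assumptions made for this illustration; the distance is again assumed to be Euclidean.

```python
import math
import random

def mean_point(group):
    """Average each coordinate over all points in the group."""
    n = len(group)
    return tuple(sum(coords) / n for coords in zip(*group))

def k_means(points, k, max_iterations=100, seed=None):
    """Group points into k clusters; returns (centers, groups)."""
    rng = random.Random(seed)
    # Start from k different records chosen at random.
    centers = rng.sample(points, k)
    for _ in range(max_iterations):
        # Assign every record to the group of its nearest center.
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centers[i]))
            groups[nearest].append(p)
        # Replace each center with the mean of its group.
        # Assumption: an empty group keeps its previous center.
        new_centers = [mean_point(g) if g else centers[i]
                       for i, g in enumerate(groups)]
        if new_centers == centers:
            break  # the centers no longer change, so stop
        centers = new_centers
    return centers, groups

# Hypothetical two-dimensional records, made up for this example.
points = [(1, 1), (1, 2), (8, 8), (9, 8), (1, 9), (2, 9)]
centers, groups = k_means(points, 3)
print(centers)
print(groups)
```

Because the starting points are random, different runs can settle on different groupings; a common remedy is to run the procedure several times and keep the grouping whose points are closest to their centers.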