MATLAB implementation of K-means Clustering algorithm


Clustering and classification are widely used techniques in data mining.



Clustering is unsupervised learning.

Classification is supervised learning.

Put simply: clustering starts from samples whose classes are unknown and groups them into clusters according to the similarity of the samples themselves.

Classification, by contrast, starts from known classes: each sample's features are matched against the features of the classes, and the sample is then assigned to the appropriate given class.
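To make the distinction concrete, here is a minimal MATLAB sketch; it assumes the Statistics and Machine Learning Toolbox, which provides kmeans and fitcknn (neither function is part of the original post):

    % Unsupervised: cluster 1-D samples into 2 groups without any labels
    X = [1; 2; 3; 40; 41; 42];
    idx = kmeans(X, 2);          % kmeans only sees the samples themselves

    % Supervised: train a classifier from labeled samples, then label a new one
    labels = [1; 1; 1; 2; 2; 2]; % known class of each training sample
    model = fitcknn(X, labels);  % nearest-neighbour classifier
    predict(model, 41.5)         % assigns the new sample to one of the given classes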



Because this article implements the K-means algorithm, the main families of clustering algorithms are briefly introduced next.



Clustering algorithms come in many types, which can be grouped as follows:

1. Partitioning methods: clustering algorithms based on this idea include K-means, PAM, CLARA, CLARANS, and STIRR.

2. Hierarchical methods: clustering algorithms based on this idea include BIRCH, CURE, ROCK, and Chameleon.

3. Density-based methods: clustering algorithms based on this idea include DBSCAN, OPTICS, DENCLUE, FDBSCAN, and incremental DBSCAN.

4. Grid-based methods: clustering algorithms based on this idea include STING, WaveCluster, and OptiGrid.

5. Model-based methods: clustering algorithms based on this idea include AutoClass, COBWEB, and CLASSIT.

6. Neural network methods: there are two kinds of clustering algorithms based on this idea: self-organizing feature maps (SOM) and competitive learning.
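For reference, several of these families have ready-made MATLAB implementations; a rough sketch of the corresponding calls, assuming the Statistics and Machine Learning Toolbox is installed (dbscan additionally requires R2019a or later):

    X = [1 2 3 25 26 27 53 54 55]';      % samples as a column vector

    idx_km = kmeans(X, 3);               % partitioning method (K-means)

    Z = linkage(X, 'average');           % hierarchical method
    idx_hc = cluster(Z, 'maxclust', 3);

    idx_db = dbscan(X, 5, 2);            % density-based method (epsilon = 5, minpts = 2)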





K-means is based on the partitioning idea, so here is the idea behind partition-based clustering:

1. For a set of sample data, first choose K cluster centers at random.

2. The cluster centers are then adjusted through repeated iterations so that the partition keeps improving. "Improving" here means that samples in the same cluster get closer and closer to their cluster center while samples in different clusters get farther and farther apart, and the cluster centers eventually converge to positions that no longer move.
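The quantity being driven down by this optimization is the total distance from each sample to its own cluster center. A minimal sketch of how that objective would be evaluated for a 1-D sample vector (the variable names are illustrative, not from the original post):

    % Within-cluster cost: sum over samples of the distance to their assigned center
    x       = [1 2 3 25 26 27 53 54 55];    % samples
    centers = [2 26 54];                    % one candidate set of cluster centers
    assign  = [1 1 1 2 2 2 3 3 3];          % cluster index of each sample

    cost = sum(abs(x - centers(assign)));   % K-means tries to make this as small as possible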





Since K-means is built on this partitioning idea, the essence of the K-means algorithm is, of course, consistent with it.

The K-means algorithm is as follows:

1. Let the sample set be X = {x(1), x(2), ...}.

2. Randomly select K of the samples as the initial cluster centers.

3. For each sample point that is not a cluster center, compute its distance to every cluster center and assign the sample to the nearest center. This gives the initial clustering.

4. Recompute the cluster center of each class, then recompute the distance from every sample point to the K cluster centers and reassign each sample to its nearest center. This is the first optimization of the clustering.

5. Repeat step 4 until the cluster centers no longer change between two consecutive iterations; this completes the final clustering.
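Before the full script, here is a compact, generic sketch of these five steps on a 1-D vector, using the textbook update in which each new center is the mean of its class (the listing below instead snaps each center to the data point nearest that mean). The variable names are illustrative, and implicit expansion requires MATLAB R2016b or later:

    x = [1 2 3 25 26 27 53 54 55];        % step 1: the sample set
    K = 3;
    c = x(randperm(numel(x), K));         % step 2: K random samples as initial centers
    newc = zeros(1, K);

    while true
        % steps 3/4: assign every sample to its nearest center
        [~, assign] = min(abs(x' - c), [], 2);
        % step 4: recompute each center as the mean of its class
        for j = 1:K
            if any(assign == j)
                newc(j) = mean(x(assign == j));
            else
                newc(j) = c(j);           % keep the old center if its class is empty
            end
        end
        % step 5: stop once the centers no longer change
        if isequal(newc, c)
            break;
        end
        c = newc;
    end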



The MATLAB implementation of K-means is as follows (k = 3):

clc
clear

% One-dimensional sample data and the given number of classes
CLOMSTATIC = [1,2,3,25,26,27,53,54,55];
len = length(CLOMSTATIC);          % length of the vector CLOMSTATIC
k = 3;

% Pick k distinct random sample points as the initial cluster centers
p = randperm(len);
Temp = p(1:k);
Center = zeros(1,k);
for i = 1:k
    Center(i) = CLOMSTATIC(Temp(i));
end

% Distance from every sample to every cluster center
TempDistance = zeros(len,k);

circulm = 1;
while 1
    p1 = 1;
    p2 = 1;
    p3 = 1;

    % From the second pass on, clear the old groups before re-clustering
    if (circulm ~= 1)
        clear Group1 Group2 Group3;
    end

    % Assign every sample to its nearest cluster center
    for i = 1:len
        for j = 1:k
            TempDistance(i,j) = abs(CLOMSTATIC(i) - Center(j));
        end
        [RowMin, RowIndex] = min(TempDistance(i,:));
        if (RowIndex == 1)
            Group1(p1) = CLOMSTATIC(i);
            p1 = p1 + 1;
        elseif (RowIndex == 2)
            Group2(p2) = CLOMSTATIC(i);
            p2 = p2 + 1;
        elseif (RowIndex == 3)
            Group3(p3) = CLOMSTATIC(i);
            p3 = p3 + 1;
        end
    end
    len1 = length(Group1);
    len2 = length(Group2);
    len3 = length(Group3);

    % Mean value of Group1, Group2 and Group3
    MeanGroup1 = mean(Group1);
    MeanGroup2 = mean(Group2);
    MeanGroup3 = mean(Group3);

    % The point closest to the mean of each class becomes the new cluster center
    AbsGroup1 = zeros(1,len1);
    for t = 1:len1
        AbsGroup1(t) = floor(abs(Group1(t) - MeanGroup1));
    end
    [MaxAbsGroup1, MaxAbsGroup1Index] = min(AbsGroup1);
    NewCenter(1) = Group1(MaxAbsGroup1Index);
    clear AbsGroup1;

    AbsGroup2 = zeros(1,len2);
    for t = 1:len2
        AbsGroup2(t) = floor(abs(Group2(t) - MeanGroup2));
    end
    [MaxAbsGroup2, MaxAbsGroup2Index] = min(AbsGroup2);
    NewCenter(2) = Group2(MaxAbsGroup2Index);
    clear AbsGroup2;

    AbsGroup3 = zeros(1,len3);
    for t = 1:len3
        AbsGroup3(t) = floor(abs(Group3(t) - MeanGroup3));
    end
    [MaxAbsGroup3, MaxAbsGroup3Index] = min(AbsGroup3);
    NewCenter(3) = Group3(MaxAbsGroup3Index);   % index into Group3, not Group2
    clear AbsGroup3;

    % If the new cluster centers equal the old ones, clustering has converged;
    % otherwise adopt the new centers and cluster again
    JudgeEqual = (NewCenter == Center);
    if (sum(JudgeEqual) == k)
        break;
    end
    Center = NewCenter;
    circulm = circulm + 1;
end

% Display the final groups
Group1
Group2
Group3



The clustering results are printed as the three groups Group1, Group2, and Group3 once the loop finishes.

Note: the code as originally posted never converged and had to be interrupted with Ctrl+C before the groups could be inspected. The cause was twofold: the cluster centers were never updated between iterations (the assignment Center = NewCenter was missing), and NewCenter(3) was indexed with MaxAbsGroup2Index instead of MaxAbsGroup3Index. With both corrections, as in the listing above, the loop exits as soon as the cluster centers stop changing. If anyone spots a further problem, corrections are very welcome.
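As a sanity check, the same vector can be clustered with the built-in kmeans function; this assumes the Statistics and Machine Learning Toolbox is installed and is not part of the original script:

    CLOMSTATIC = [1,2,3,25,26,27,53,54,55];
    [idx, C] = kmeans(CLOMSTATIC', 3);   % idx: cluster index of each sample, C: cluster centroids
    disp([CLOMSTATIC' idx])              % compare with Group1/Group2/Group3 above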



Please credit the author when reprinting: Xiao Liu
