Principles and implementation of the K-medoid (PAM) clustering algorithm for data mining

Source: Internet
Author: User

In the previous blog post, we introduced the k-means clustering algorithm.

It is beyond dispute that k-means, thanks to its simple procedure and fast convergence, is highly efficient and has been widely used in clustering applications.




However, k-means is not perfect: it is sensitive to noise and outliers, and the clustering error these isolated points introduce is a real headache.




Therefore, k-medoid, an improved algorithm based on k-means, came into being.

The core ideas of k-medoid and k-means are similar, but they differ most in how the cluster center is updated: when correcting a center, k-medoid computes, for each point in the cluster, the sum of its distances to all the other points in that cluster, and takes the point with the smallest sum as the new cluster center.

Because the center is always an actual data point, this difference lets k-medoid make up for the shortcoming of k-means: k-medoid is not sensitive to noise and outliers.
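To make this difference concrete, here is a minimal Python sketch (illustrative only, not from the original post; the data and function names are made up) contrasting the two center-update rules on a one-dimensional cluster containing an outlier:

```python
def kmeans_update(cluster):
    """k-means rule: the new center is the mean, which need not be a data point."""
    return sum(cluster) / len(cluster)

def kmedoid_update(cluster):
    """k-medoid rule: the new center is the member point whose summed
    distance to all other members is smallest."""
    return min(cluster, key=lambda p: sum(abs(p - q) for q in cluster))

cluster = [1, 2, 3, 100]          # 100 is an outlier
print(kmeans_update(cluster))     # 26.5 -- the mean is dragged toward the outlier
print(kmedoid_update(cluster))    # 2    -- the medoid stays with the bulk of the data
```

Since the medoid must be one of the cluster's own points, a single extreme value cannot pull the center far away, which is exactly why k-medoid tolerates noise better.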

However, things have two sides: the improvement in clustering accuracy is achieved by sacrificing clustering time. It is not hard to see that k-medoid must repeatedly compute, for every point, the sum of its distances to all other points in its cluster in order to update the center, which greatly increases the time to converge. So k-medoid seems powerless for large-scale data clustering and is only suitable for small-scale data sets.




Next, I will describe the k-medoid algorithm step by step:

1. Let the sample set be X = {x(1), x(2), …}.


2. First, randomly select K cluster centers from the sample.


3. Compute the distance from each remaining sample point to every cluster center, and assign each point to the cluster of its nearest center. This produces the initial clustering.


4. Within each cluster, compute for every point (other than the current center) the sum of its distances to all other points in the cluster; the point with the minimum sum becomes the new cluster center. This optimizes the clustering.


5. Repeat step 4 until the cluster centers do not change between two consecutive iterations, which completes the final clustering.

Note: Step 4 is the core difference between k-means and k-medoid.
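The five steps above can be sketched in Python as follows. This is a minimal one-dimensional sketch; the data values, seed, and function name are illustrative assumptions, not part of the original description:

```python
import random

def k_medoid(points, k, seed=0):
    """Minimal 1-D k-medoid clustering following the five steps above.
    `points` is a list of numbers; returns (centers, clusters)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)            # step 2: random initial centers
    while True:
        # step 3: assign each point to the cluster of its nearest center
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k), key=lambda c: abs(p - centers[c]))
            clusters[i].append(p)
        # step 4: in each cluster, the point minimizing the sum of
        # distances to all other points becomes the new center
        new_centers = [
            min(cl, key=lambda p: sum(abs(p - q) for q in cl))
            for cl in clusters
        ]
        # step 5: stop when the centers no longer change
        if new_centers == centers:
            return centers, clusters
        centers = new_centers

centers, clusters = k_medoid([1, 2, 3, 30, 31, 32], k=2)
print(sorted(centers))   # [2, 31]
```

On this toy data the two tight groups {1, 2, 3} and {30, 31, 32} are recovered with their middle points as medoids, whatever the random initialization.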






The MATLAB implementation code of k-medoid is as follows:


clc; clear;

clomstatic = [];                      % sample data vector (values omitted in the source)
Len = length(clomstatic);             % number of sample points
K = 3;                                % number of clusters

% Randomly choose K distinct sample points as the initial cluster centers
P = randperm(Len);
Temp = P(1:K);
Center = zeros(1, K);
for i = 1:K
    Center(i) = clomstatic(Temp(i));
end

TempDistance = zeros(Len, K);
while 1
    % Clustering step: assign each point to the group of its nearest center
    p1 = 1; p2 = 1; p3 = 1;
    Group1 = []; Group2 = []; Group3 = [];
    for i = 1:Len
        for j = 1:K
            TempDistance(i, j) = abs(clomstatic(i) - Center(j));
        end
        [~, RowIndex] = min(TempDistance(i, :));
        if RowIndex == 1
            Group1(p1) = clomstatic(i); p1 = p1 + 1;
        elseif RowIndex == 2
            Group2(p2) = clomstatic(i); p2 = p2 + 1;
        else
            Group3(p3) = clomstatic(i); p3 = p3 + 1;
        end
    end

    % Update step: in each group, the point whose summed distance to all
    % other points of the group is smallest becomes the new cluster center
    Groups = {Group1, Group2, Group3};
    NewCenter = zeros(1, K);
    for g = 1:K
        GroupG = Groups{g};
        LenG = length(GroupG);
        E = zeros(1, LenG);           % E(j): total distance from point j to the rest
        for j = 1:LenG
            for i = 1:LenG
                E(j) = E(j) + abs(GroupG(j) - GroupG(i));
            end
        end
        [~, MinIndex] = min(E);
        NewCenter(g) = GroupG(MinIndex);
    end

    % Termination check: if the new and old cluster centers are the same,
    % the clustering has converged; otherwise continue iterating
    if isequal(NewCenter, Center)
        break;
    end
    Center = NewCenter;
end


The result is as follows:







Reprinted by Liu

