Using R for k-means clustering

Source: Internet
Author: User

K-means clustering in R is done by calling the kmeans function. The following code runs a small clustering experiment on the iris dataset:

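newiris <- iris
newiris$Species <- NULL            # remove the class label from the training data
kc <- kmeans(newiris, 3)           # fit the clustering model
fitted(kc)                         # view the cluster center fitted to each observation
table(iris$Species, kc$cluster)    # summarize species against assigned cluster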
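Because kmeans picks its starting centers at random, the cluster numbers (and sometimes the partition itself) can change from run to run. A minimal sketch of one way to make the result repeatable, assuming it is acceptable to fix the seed with set.seed and to request several random starts through the nstart argument. The seed value 42, nstart = 25, and the name kc_seeded are illustrative choices, not part of the original experiment.

```r
# Sketch: a repeatable version of the iris clustering.
# The seed (42) and nstart (25) are arbitrary, illustrative values.
newiris <- iris
newiris$Species <- NULL

set.seed(42)                                            # fix the random starting centers
kc_seeded <- kmeans(newiris, centers = 3, nstart = 25)  # keep the best of 25 random starts

kc_seeded$size                            # number of observations in each cluster
kc_seeded$centers                         # one row of feature means per cluster
table(iris$Species, kc_seeded$cluster)    # species versus assigned cluster
```

The cluster numbers themselves are arbitrary labels, so read the table by rows (which clusters a species falls into) rather than expecting cluster 1 to match the first species.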

# Visualize the clustering results
plot(newiris[c("Sepal.Length", "Sepal.Width")], col = kc$cluster, pch = as.integer(iris$Species))  # colors show the clusters; shapes show the original classes of the training data
points(kc$centers[, c("Sepal.Length", "Sepal.Width")], col = 1:3, pch = 8, cex = 2)
[Figure: visualization of the cluster results]



The R help documentation for kmeans contains a very good example, reproduced below. Pay special attention to the equalities that the result of kmeans satisfies:

require(graphics)

# A 2-dimensional example
x <- rbind(matrix(rnorm(100, sd = 0.3), ncol = 2),
           matrix(rnorm(100, mean = 1, sd = 0.3), ncol = 2))
colnames(x) <- c("x", "y")
(cl <- kmeans(x, 2))
plot(x, col = cl$cluster)
points(cl$centers, col = 1:2, pch = 8, cex = 2)

# Sum of squares
# scale() centers the data: centering subtracts each column's mean from the values in that column.
# scale() can also standardize the data: standardizing divides the centered values by the column's standard deviation.
# With scale = FALSE, as used here, the data are only centered.
# See http://it.zhans.org/10/1834.htm.
ss <- function(x) sum(scale(x, scale = FALSE)^2)

## Cluster centers "fitted" to each obs.:
fitted.x <- fitted(cl)
head(fitted.x)

resid.x <- x - fitted(cl)

## Equalities: ----------------------------------
cbind(cl[c("betweenss", "tot.withinss", "totss")],  # the same two columns
      c(ss(fitted.x), ss(resid.x), ss(x)))

# k-means clustering satisfies the following conditions
stopifnot(all.equal(cl$totss, ss(x)),
          all.equal(cl$tot.withinss, ss(resid.x)),
          ## these three are the same:
          all.equal(cl$betweenss, ss(fitted.x)),
          all.equal(cl$betweenss, cl$totss - cl$tot.withinss),
          ## and hence also
          all.equal(ss(x), ss(fitted.x) + ss(resid.x)))
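The centering and standardization that scale performs can be checked by hand. The following is a small sketch on a made-up matrix; the values and the names m, centered, standardized, manual_centered, and manual_standardized are illustrative only and do not come from the help example.

```r
# Sketch: what scale() does, checked against manual column-wise arithmetic.
m <- cbind(a = c(1, 2, 3, 4), b = c(10, 20, 30, 60))   # made-up data

centered     <- scale(m, scale = FALSE)   # subtract each column's mean
standardized <- scale(m)                  # center, then divide by each column's sd

# Manual equivalents built with sweep()
manual_centered     <- sweep(m, 2, colMeans(m))                        # m minus column means
manual_standardized <- sweep(manual_centered, 2, apply(m, 2, sd), "/") # divide by column sds

# check.attributes = FALSE ignores the "scaled:center" / "scaled:scale" attributes
all.equal(centered, manual_centered, check.attributes = FALSE)
all.equal(standardized, manual_standardized, check.attributes = FALSE)

# Sum of squared deviations from the column means: 5 + 1400 = 1405
sum(centered^2)
```

The last line is the same quantity that a sum-of-squares helper of the form sum(scale(x, scale = FALSE)^2) computes: the total squared distance of the rows from the vector of column means.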
[Figure: visualization of the cluster results]


