In 1965, Professor Lotfi Zadeh of the University of California, Berkeley, first proposed the concept of the "fuzzy set". After more than ten years of development, fuzzy set theory began to be applied in a variety of practical fields. Fuzzy clustering analysis, which is cluster analysis based on fuzzy set theory, was introduced to overcome the drawback of either/or classification, where every sample must belong to exactly one class. FCM (Fuzzy C-means) is a clustering algorithm that uses membership degrees to express how strongly each data point belongs to each cluster. It is an improvement on the traditional hard clustering algorithm.
Algorithm Flow:
- Standardize the data matrix;
- Build the fuzzy similarity matrix and initialize the membership matrix;
- Iterate until the objective function converges to a minimum;
- Use the final membership matrix from the iterations to determine the class each data point belongs to, and display the final clustering result.
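The flow above can be sketched in a few lines of base R. This is only an illustrative sketch, not the code that `fanny` or `cmeans` actually run: the function name `fcm_sketch`, the fuzzifier `m = 2`, the tolerance, and the random initialization are all assumptions, and the fuzzy similarity matrix step is skipped in favor of initializing the memberships directly.

```r
# Minimal fuzzy c-means sketch in base R (illustrative only).
# fcm_sketch, m, tol and seed are hypothetical names/defaults chosen for this example.
fcm_sketch <- function(x, k, m = 2, max_iter = 100, tol = 1e-6, seed = 1) {
  set.seed(seed)
  x <- scale(as.matrix(x))                 # step 1: standardize the data matrix
  n <- nrow(x)
  u <- matrix(runif(n * k), n, k)          # step 2: random initial membership matrix
  u <- u / rowSums(u)                      # each row (sample) sums to 1
  obj_old <- Inf
  for (iter in seq_len(max_iter)) {
    w <- u^m
    centers <- t(w) %*% x / colSums(w)     # membership-weighted cluster centers (k x p)
    # squared Euclidean distance from every sample to every center (n x k)
    d2 <- sapply(seq_len(k), function(j) rowSums(sweep(x, 2, centers[j, ])^2))
    d2 <- pmax(d2, 1e-12)                  # guard against division by zero
    obj <- sum(w * d2)                     # objective function value
    inv <- d2^(-1 / (m - 1))
    u <- inv / rowSums(inv)                # membership update
    if (abs(obj_old - obj) < tol) break    # step 3: stop once the objective settles
    obj_old <- obj
  }
  # step 4: assign each sample to the cluster with the highest membership
  list(membership = u, centers = centers,
       cluster = apply(u, 1, which.max), iterations = iter)
}

fit <- fcm_sketch(iris[-5], 3)
table(iris$Species, fit$cluster)
```

Because the sketch standardizes the data and starts from random memberships, its cluster numbering and counts need not match the package results shown later.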
Advantage: Unlike the earlier "hard clustering", the FCM method calculates the membership of each sample in every class, which gives us a way to judge how reliable a sample's classification is. If a sample's membership in one class clearly dominates its membership in all the others, assigning the sample to that class is very safe. Conversely, if the sample's memberships are spread fairly evenly across all classes, we need other means to classify it.
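As a rough illustration of that reliability check, the sketch below labels a sample only when its highest membership passes a cut-off. The membership matrix `u` is made-up data and the `threshold` of 0.6 is an arbitrary assumption, not a value from the source.

```r
# Hypothetical membership matrix: 4 samples, 3 clusters (each row sums to 1).
u <- rbind(c(0.97, 0.02, 0.01),
           c(0.40, 0.35, 0.25),
           c(0.05, 0.90, 0.05),
           c(0.34, 0.33, 0.33))
threshold <- 0.6                                # assumed cut-off, not from the source
best  <- apply(u, 1, which.max)                 # cluster with the highest membership
score <- apply(u, 1, max)                       # the value of that membership
label <- ifelse(score >= threshold, best, NA)   # NA marks samples that need other means
data.frame(best, score, label)
```

With a real fit, `u` would be replaced by the membership matrix returned by the clustering function, for example `fannyz$membership` or `result1$membership` in the examples that follow.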
Cons: It has essentially all the shortcomings of K-means.
Fuzzy Clustering
- cluster::fanny
- e1071::cmeans
cluster::fanny
Requires installing the R package:
install.packages("cluster")
Example code:
library(cluster)
iris2 <- iris[-5]   # drop the Species column
fannyz <- fanny(iris2, 3, metric = "SqEuclidean")
summary(fannyz)
Classification Distribution:
> fannyz$clustering
  [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 3 2 3 3 3 3 3 3 3 3 3 3 3 3
 [66] 3 3 3 3 3 3 3 3 3 3 3 3 2 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 2 3 2 2 2 2 3 2 2 2 2 2 2 3 2 2 2 2 2 3 2 3 2 3 2 2 3 3 2 2
[131] 2 2 2 3 2 2 2 2 3 2 2 2 3 2 2 2 3 2 3
Sample Membership Degree
> fannyz$membership
             [,1]         [,2]         [,3]
 [1,] 0.996623586 0.0010720343 0.0023043797
 [2,] 0.975852543 0.0074979471 0.0166495094
 [3,] 0.979825922 0.0064145785 0.0137594999
 [4,] 0.967427446 0.0101075228 0.0224650314
 [5,] 0.994470355 0.0017679352 0.0037617094
 [6,] 0.934574112 0.0206196544 0.0448062334
 [7,] 0.979491667 0.0065045178 0.0140038150
 [8,] 0.999547263 0.0001412048 0.0003115325
 [9,] 0.930379787 0.0219024180 0.0477177955
 .......
Plot: clusplot(fannyz)
Results:
> table(iris$Species, fannyz$clustering)
              1  2  3
  setosa     50  0  0
  versicolor  0  3 47
  virginica   0 37 13
e1071::cmeans
Install the package:
install.packages("e1071")
Example code:
> library("e1071")
> x <- iris[-5]
> result1 <- cmeans(x, 3, 50)
> result1
Fuzzy c-means clustering with 3 clusters

Cluster centers:
  Sepal.Length Sepal.Width Petal.Length Petal.Width
1     5.888999    2.761093     4.364049   1.3973654
2     6.775092    3.052406     5.646882   2.0535855
3     5.003966    3.414086     1.482821   0.2535487

Memberships:
                1            2           3
 [1,] 0.0023043721 0.0010720485 0.996623579
 [2,] 0.0166481906 0.0074975084 0.975854301
 [3,] 0.0137586391 0.0064142953 0.979827066
 .......
Statistical results:
> table(iris$Species, result1$cluster)
              1  2  3
  setosa      0  0 50
  versicolor 47  3  0
  virginica  13 37  0
Example of a 3D display:
# install.packages("scatterplot3d")
library(scatterplot3d)
scatterplot3d(result1$membership, color = result1$cluster, type = "h", angle = 55, scale.y = 0.7, pch = 16, main = "Pertinence")
Resources:
- http://blog.csdn.net/Alex_luodazhi/article/details/47125149
- https://en.wikibooks.org/wiki/Data_Mining_Algorithms_In_R/Clustering/Fuzzy_Clustering_-_Fuzzy_C-means
ML: Clustering algorithm R packages - fuzzy clustering