Data format: Time (hours)/data for each province
Data input:
locdata <- read.csv("./data.csv")
At this point locdata is a data.frame.
It needs to be converted to a matrix:
locdata_mt <- as.matrix(locdata)
Drop the unwanted data (here, the first row):
locdata_mt <- locdata_mt[2:nrow(locdata_mt), 1:ncol(locdata_mt)]
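As a rough sketch of the input steps above, the following uses a small in-memory data frame in place of data.csv. The province names and values are made up for illustration, and the first row is dropped only to mirror the indexing shown above.

```r
# Hypothetical stand-in for read.csv("./data.csv"):
# one row per hour, one column per province (values are invented).
locdata <- data.frame(
  Beijing  = c(0, 12, 15, 20),
  Shanghai = c(0, 30, 28, 35),
  Sichuan  = c(0,  7,  9, 11)
)
is.data.frame(locdata)

# Convert the data frame to a numeric matrix.
locdata_mt <- as.matrix(locdata)
is.matrix(locdata_mt)

# Drop the first row and keep every column.
# drop = FALSE keeps the result a matrix even if only one row is left.
locdata_mt <- locdata_mt[2:nrow(locdata_mt), 1:ncol(locdata_mt), drop = FALSE]
dim(locdata_mt)
```

Note that as.matrix() only yields a numeric matrix when every column is numeric; a text column would turn the whole matrix into character strings.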
Next, run the clustering. It takes two steps:
1. Choose the distance measure
dist(x, method = "euclidean", diag = FALSE, upper = FALSE, p = 2)
R computes distances with dist(), where x is the sample matrix or data frame. Distances are computed between the rows of x.
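To make the call concrete, here is a minimal sketch on three hand-picked points (the matrix x and its row names are illustrative, not from the data above). It checks one entry of the result against the Euclidean formula.

```r
# Three sample points in the plane, one point per row.
x <- matrix(c(0, 0,
              3, 4,
              6, 8), ncol = 2, byrow = TRUE)
rownames(x) <- c("a", "b", "c")

# dist() returns a compact "dist" object holding the lower triangle.
d <- dist(x, method = "euclidean", diag = FALSE, upper = FALSE)
print(d)

# as.matrix() expands it to a full symmetric matrix for easy lookup.
m <- as.matrix(d)

# The a-b distance equals sqrt(3^2 + 4^2).
m["a", "b"]
sqrt(3^2 + 4^2)
```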
method selects which distance is calculated. Its values are:
euclidean Euclidean distance: the square root of the sum of squared differences
maximum Chebyshev distance
manhattan Absolute (city-block) distance
canberra Canberra (Lance) distance
minkowski Minkowski distance; the power is set with p
binary Distance for qualitative (binary) variables
(scale(x, center = TRUE, scale = TRUE) can be used to center and standardize the data before the distance is calculated.)
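The sketch below tries several of the method values on one small matrix and then shows the effect of scale(). The matrix is invented so that the second column is on a much larger scale than the first; without standardizing, that column dominates every distance.

```r
# Illustrative data: column 2 is roughly 100 times larger than column 1.
x <- matrix(c(1, 100,
              2, 300,
              4, 200), ncol = 2, byrow = TRUE)

d_euc <- dist(x, method = "euclidean")
d_max <- dist(x, method = "maximum")
d_man <- dist(x, method = "manhattan")
d_can <- dist(x, method = "canberra")
d_min <- dist(x, method = "minkowski", p = 3)  # p is the Minkowski power

# Rows 1 and 2 differ by 1 and 200, so:
as.matrix(d_man)[1, 2]   # |1| + |200|
as.matrix(d_max)[1, 2]   # max(|1|, |200|)

# Center each column to mean 0 and rescale it to standard deviation 1,
# then compute distances on the standardized values.
x_scaled <- scale(x, center = TRUE, scale = TRUE)
colMeans(x_scaled)
apply(x_scaled, 2, sd)
d_scaled <- dist(x_scaled, method = "euclidean")
```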
2. Choose the clustering method
hclust(d, method = "complete", members = NULL)
method specifies how clusters are merged. Its values are:
single Shortest distance (single linkage) method
complete Longest distance (complete linkage) method
median Median distance method
mcquitty McQuitty similarity method
average Group average method
centroid Centroid method
ward Ward's sum of squared deviations method (named ward.D since R 3.1.0, which also added ward.D2)
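To show how the choice of method changes the result, here is a small sketch comparing merge heights for single and complete linkage. The five points are made up: two tight pairs and one outlier.

```r
# Illustrative points: (a, b) and (c, d) are close pairs, e is far away.
x <- matrix(c( 0, 0,
               0, 1,
               5, 5,
               5, 6,
              12, 0), ncol = 2, byrow = TRUE)
rownames(x) <- c("a", "b", "c", "d", "e")
d <- dist(x)

hc_single   <- hclust(d, method = "single")    # nearest members decide
hc_complete <- hclust(d, method = "complete")  # farthest members decide
hc_average  <- hclust(d, method = "average")   # mean pairwise distance

# Heights at which clusters were merged, in merge order.
# Both methods join the two tight pairs at height 1, but complete
# linkage merges the larger clusters at greater heights than single.
hc_single$height
hc_complete$height
```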
d <- dist(locdata_mt, method = *)
hcl <- hclust(d, method = *)
Here * stands for one of the method values listed above.
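Put together, the two steps might look like the sketch below. The matrix contents, the row labels h1 to h4, and the choice of "euclidean" with "average" are assumptions made for illustration only.

```r
# Hypothetical data: each row is one item to be clustered.
locdata_mt <- matrix(c(12, 30,  7,
                       15, 28,  9,
                       20, 35, 11,
                       40, 10, 25),
                     ncol = 3, byrow = TRUE,
                     dimnames = list(c("h1", "h2", "h3", "h4"),
                                     c("Beijing", "Shanghai", "Sichuan")))

d   <- dist(locdata_mt, method = "euclidean")
hcl <- hclust(d, method = "average")

# merge: one row per step; negative numbers are original rows,
# positive numbers refer to clusters formed at earlier steps.
hcl$merge

# height: the distance at which each merge happened.
hcl$height

# Leaf labels in the left-to-right order used when plotting.
hcl$labels[hcl$order]
```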
Finally, draw the dendrogram, with all leaf labels aligned at the bottom:
plot(hcl, hang = -1)
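A self-contained sketch of the plotting step follows, on made-up points. The cutree() call at the end is not part of the steps above; it is included only as one possible way to read cluster labels off the tree, with k = 3 chosen arbitrarily.

```r
# Illustrative points: two tight pairs and one outlier.
x <- matrix(c( 0, 0,
               0, 1,
               5, 5,
               5, 6,
              12, 0), ncol = 2, byrow = TRUE,
            dimnames = list(c("a", "b", "c", "d", "e"), NULL))
hcl <- hclust(dist(x), method = "complete")

# A negative hang makes every label hang down from height 0,
# so all leaves line up along the bottom of the plot.
plot(hcl, hang = -1, main = "Cluster dendrogram", xlab = "", sub = "")

# Optional: cut the tree into k groups and get one label per row.
groups <- cutree(hcl, k = 3)
groups
```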
Displaying the degree of overlap graphically:
heatmap(as.matrix(...))
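The argument to as.matrix() is not given above. One plausible reading, shown here purely as an assumption, is to draw the heatmap from the distance object d, so that small distances (items that overlap closely) show up as blocks of similar color.

```r
# Illustrative points: two tight pairs and one outlier.
x <- matrix(c( 0, 0,
               0, 1,
               5, 5,
               5, 6,
              12, 0), ncol = 2, byrow = TRUE,
            dimnames = list(c("a", "b", "c", "d", "e"), NULL))
d <- dist(x)

# Expand the dist object to a full symmetric matrix with a zero diagonal.
m <- as.matrix(d)

# symm = TRUE tells heatmap() the matrix is symmetric, so rows and
# columns are reordered the same way by its built-in clustering.
heatmap(m, symm = TRUE)
```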
R Cluster Analysis