# R Cluster Analysis

Data format: time (in hours) against the data for each province

Data input:

At this point locdata is a data.frame.

It needs to be converted to a matrix:

locdata_m <- as.matrix(locdata)

Drop the unwanted data (here, the first row):

locdata_mt <- locdata_mt[c(2:nrow(locdata_mt)), c(1:ncol(locdata_mt))]
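The preparation steps can be tried on a small made-up table. This is only a sketch: the province names and values are invented, and the transpose with t() is an assumption, since the text does not show how locdata_mt is derived from locdata_m. Transposing would put the hour values into the first row, which the last step then drops.

```r
# Hypothetical sample data: one row per hour, one column per province.
locdata <- data.frame(
  hour     = 0:3,
  Beijing  = c(10, 12, 15, 11),
  Shanghai = c(20, 22, 25, 21),
  Sichuan  = c(5, 4, 6, 5)
)
class(locdata)                    # "data.frame"

# data.frame -> numeric matrix (all columns are numeric here)
locdata_m <- as.matrix(locdata)

# Assumption (not shown in the text): transpose so each province is a row.
locdata_mt <- t(locdata_m)

# Drop the first row (the hour values) and keep every column.
locdata_mt <- locdata_mt[c(2:nrow(locdata_mt)), c(1:ncol(locdata_mt))]
dim(locdata_mt)
rownames(locdata_mt)
```

After this, each row of locdata_mt is one province, which is the layout dist() expects because it measures distances between rows.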

Next comes the clustering itself, in two steps:

1. Choose how the distance is calculated

dist(x, method = "euclidean", diag = FALSE, upper = FALSE, p = 2)

R calculates distances with dist(x, method = "euclidean", diag = FALSE, upper = FALSE, p = 2), where x is the sample matrix or data frame. Distances are computed between the rows of x.

method selects which distance is calculated. Its possible values are:
- "euclidean": Euclidean distance, the square root of the sum of squared differences
- "maximum": Chebyshev distance
- "manhattan": absolute (city-block) distance
- "canberra": Canberra (Lance) distance
- "minkowski": Minkowski distance; use p to specify the power
- "binary": distance for qualitative (binary) variables
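To see what some of these options compute, the sketch below compares dist() with the same quantities worked out by hand for two made-up points. The matrix x and its row names are invented for illustration.

```r
# Two hypothetical observations (rows) with three variables each.
x <- rbind(a = c(1, 2, 3), b = c(4, 0, 3))

d_euc <- dist(x, method = "euclidean")
d_max <- dist(x, method = "maximum")
d_man <- dist(x, method = "manhattan")
d_min <- dist(x, method = "minkowski", p = 3)

# The same distances by hand, from the absolute coordinate differences.
diff_ab <- abs(x["a", ] - x["b", ])   # 3, 2, 0
sqrt(sum(diff_ab^2))                  # Euclidean
max(diff_ab)                          # Chebyshev ("maximum")
sum(diff_ab)                          # Manhattan
sum(diff_ab^3)^(1 / 3)                # Minkowski with p = 3
```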

(scale(x, center = TRUE, scale = TRUE) can be used to center and standardize the data before the distance is calculated.)
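A small sketch of why standardizing can matter: in the invented matrix below, one column is measured on a much larger scale than the other, so it dominates the unscaled distances. The column names and values are hypothetical.

```r
# Hypothetical data with two variables on very different scales.
x <- cbind(small = c(1, 2, 3, 4),
           large = c(1000, 3000, 2000, 4000))
rownames(x) <- c("a", "b", "c", "d")

# Center each column to mean 0 and scale it to standard deviation 1.
x_scaled <- scale(x, center = TRUE, scale = TRUE)
colMeans(x_scaled)        # numerically zero
apply(x_scaled, 2, sd)    # one for every column

dist(x)          # driven almost entirely by the "large" column
dist(x_scaled)   # both columns now contribute comparably
```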

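2. Choose the clustering method

hclust(d, method = "complete", members = NULL)

method selects how clusters are merged. Its possible values are:
- "single": shortest-distance (single linkage) method
- "complete": longest-distance (complete linkage) method
- "median": median (intermediate-distance) method
- "mcquitty": McQuitty's similarity method
- "average": group-average method
- "centroid": centroid (center-of-gravity) method
- "ward.D": Ward's sum-of-squared-deviations method (named "ward" before R 3.1.0; the newer "ward.D2" squares the dissimilarities before updating the clusters)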
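The effect of the merging method is easiest to see on a tiny example. The sketch below clusters four made-up points on a line with three of the methods and compares the heights at which the merges happen; the point names and values are invented.

```r
# Four hypothetical one-dimensional points: 0, 1, 3 and 7.
x <- matrix(c(0, 1, 3, 7), ncol = 1,
            dimnames = list(c("p1", "p2", "p3", "p4"), "value"))
d <- dist(x)

# Run hclust() once per method and collect the merge heights.
methods <- c("single", "complete", "average")
heights <- sapply(methods, function(m) hclust(d, method = m)$height)
heights   # one column per method, one row per merge step
```

All three methods join p1 and p2 first, at distance 1. They differ afterwards because the distance from the new cluster to the remaining points is the smallest, the largest, or the average of the pairwise distances, depending on the method.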

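d <- dist(locdata_mt, method = "euclidean")  # or any distance listed above

hcl <- hclust(d, method = "complete")  # or any merging method listed above

Finally, draw the dendrogram, with the leaf labels aligned at the bottom:

plot(hcl, hang = -1)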
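Putting the two steps and the plot together, a minimal end-to-end sketch might look like this. The matrix is invented sample data (rows are provinces, columns are hourly values), and Euclidean distance with complete linkage is just one possible choice.

```r
# Hypothetical data: rows are provinces, columns are hourly values.
locdata_mt <- rbind(
  Beijing  = c(10, 12, 15, 11),
  Tianjin  = c(11, 12, 14, 10),
  Shanghai = c(20, 22, 25, 21),
  Sichuan  = c(5, 4, 6, 5)
)

# Step 1: distances between rows. Step 2: hierarchical clustering.
d   <- dist(locdata_mt, method = "euclidean")
hcl <- hclust(d, method = "complete")

hcl$merge    # which rows or clusters were joined at each step
hcl$height   # the distance at which each merge happened

# A negative hang makes the leaf labels hang down from height 0.
plot(hcl, hang = -1)
```

In hcl$merge, negative numbers refer to original rows and positive numbers to clusters formed in earlier steps.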

Graphical display of the degree of overlap:

heatmap(as.matrix(...))
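The argument of as.matrix is not given in the text. One plausible reading, shown here purely as an assumption, is to draw the distance object as a heat map so that similar provinces show up as blocks. The sample data is invented.

```r
# Hypothetical data: rows are provinces, columns are hourly values.
locdata_mt <- rbind(
  Beijing  = c(10, 12, 15, 11),
  Tianjin  = c(11, 12, 14, 10),
  Shanghai = c(20, 22, 25, 21),
  Sichuan  = c(5, 4, 6, 5)
)

d <- dist(locdata_mt)

# Assumption: the heat map is drawn from the full distance matrix.
# as.matrix() expands the dist object into a symmetric matrix
# with zeros on the diagonal.
dm <- as.matrix(d)

# symm = TRUE treats rows and columns identically, as suits a
# square symmetric matrix.
heatmap(dm, symm = TRUE)
```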

