Reference: "Data exploration R language Combat", pp. 65-68
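install.packages("rattle")  # provides the example dataset
install.packages("ellipse") # provides plotcorr(), the function that draws the correlation plot
rm(list = ls())
library("ellipse") # load the packages
library("rattle")
data(weather) # load the dataset
head(weather) # view the first rows of the dataset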
##         Date Location MinTemp MaxTemp Rainfall Evaporation Sunshine
## 1 2007-11-01 Canberra     8.0    24.3      0.0         3.4      6.3
## 2 2007-11-02 Canberra    14.0    26.9      3.6         4.4      9.7
## 3 2007-11-03 Canberra    13.7    23.4      3.6         5.8      3.3
## 4 2007-11-04 Canberra    13.3    15.5     39.8         7.2      9.1
## 5 2007-11-05 Canberra     7.6    16.1      2.8         5.6     10.6
## 6 2007-11-06 Canberra     6.2    16.9      0.0         5.8      8.2
##   WindGustDir WindGustSpeed WindDir9am WindDir3pm WindSpeed9am
## 1          NW                       SW         NW            6
## 2         ENE                        E          W            4
## 3          NW                        N        NNE            6
## 4          NW                      WNW          W           30
## 5         SSE                      SSE        ESE           20
## 6          SE                       SE          E           20
##   WindSpeed3pm Humidity9am Humidity3pm Pressure9am Pressure3pm Cloud9am
## 1           20          68          29      1019.7      1015.0        7
## 2           17          80          36      1012.4      1008.4        5
## 3            6          82          69      1009.5      1007.2        8
## 4           24          62          56      1005.5      1007.0        2
## 5           28          68          49      1018.3      1018.5        7
## 6                                           1023.8      1021.7        7
##   Cloud3pm Temp9am Temp3pm RainToday RISK_MM RainTomorrow
## 1        7    14.4    23.6        No     3.6          Yes
## 2        3    17.5    25.7       Yes     3.6          Yes
## 3        7    15.4    20.2       Yes    39.8          Yes
## 4        7    13.5    14.1       Yes     2.8          Yes
## 5        7    11.1    15.4       Yes     0.0           No
## 6        5    10.9    14.8        No     0.2           No
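The preview shows that the data frame mixes dates, factors and numbers, while cor() needs numeric input. Rather than counting columns by eye, the numeric ones can be located programmatically. The following is a minimal sketch in base R; the small data frame `df` is a hypothetical stand-in built from a few values in the preview, not the full weather dataset.

```r
# Hypothetical stand-in for a data frame with mixed column types.
df <- data.frame(
  Date      = as.Date(c("2007-11-01", "2007-11-02", "2007-11-03")),
  Location  = factor(c("Canberra", "Canberra", "Canberra")),
  MinTemp   = c(8.0, 14.0, 13.7),
  MaxTemp   = c(24.3, 26.9, 23.4),
  RainToday = factor(c("No", "Yes", "Yes"))
)

# TRUE for every column that is numeric (dates and factors are not).
is_num <- vapply(df, is.numeric, logical(1))

# Positions of the numeric columns, and the numeric-only subset.
num_positions <- which(is_num)
num_data <- df[, is_num, drop = FALSE]

print(num_positions)
```

Applied to the real dataset, the same two lines (`vapply` and `which`) would show which column positions are safe to pass to cor().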
test_data <- weather[, 12:21] # columns 12 to 21 are numeric
cor_matrix <- cor(test_data, use = "pairwise") # correlation coefficient for each pair of variables
cor_matrix # display the result
##              WindSpeed9am WindSpeed3pm Humidity9am Humidity3pm Pressure9am
## WindSpeed9am   1.00000000   0.47296617  -0.2706229  0.14665712 -0.35633183
## WindSpeed3pm   0.47296617   1.00000000  -0.2660925 -0.02636775 -0.35980011
## Humidity9am   -0.27062286  -0.26609247   1.0000000  0.54671844  0.13572697
## Humidity3pm    0.14665712  -0.02636775   0.5467184  1.00000000 -0.08794614
## Pressure9am   -0.35633183  -0.35980011   0.1357270 -0.08794614  1.00000000
## Pressure3pm   -0.24795238  -0.33732535   0.1344205 -0.01005189  0.96789496
## Cloud9am       0.10184246  -0.02642642   0.3928416  0.55163264 -0.15755279
## Cloud3pm      -0.02247149   0.00720724   0.2719381  0.51010790 -0.14100043
## Temp9am        0.06407405  -0.01776636  -0.4365506 -0.25568147 -0.46041819
## Temp3pm       -0.23518635  -0.18756965  -0.3551186 -0.58167615 -0.25367375
##              Pressure3pm    Cloud9am    Cloud3pm     Temp9am    Temp3pm
## WindSpeed9am -0.24795238  0.10184246 -0.02247149  0.06407405 -0.2351864
## WindSpeed3pm -0.33732535 -0.02642642  0.00720724 -0.01776636 -0.1875697
## Humidity9am   0.13442050  0.39284158  0.27193809 -0.43655057 -0.3551186
## Humidity3pm  -0.01005189  0.55163264  0.51010790 -0.25568147 -0.5816761
## Pressure9am   0.96789496 -0.15755279 -0.14100043 -0.46041819 -0.2536738
## Pressure3pm   1.00000000 -0.12894408 -0.14383718 -0.49263629 -0.3454853
## Cloud9am     -0.12894408  1.00000000  0.52521793  0.02104135 -0.2023440
## Cloud3pm     -0.14383718  0.52521793  1.00000000  0.04094519 -0.1728142
## Temp9am      -0.49263629  0.02104135  0.04094519  1.00000000  0.8444058
## Temp3pm      -0.34548531 -0.20234405 -0.17281423  0.84440581  1.0000000
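The use argument of cor() only matters when the data contain missing values; "pairwise" is accepted as an abbreviation of "pairwise.complete.obs". The sketch below, on a small made-up data frame (`df` and its values are illustrative, not taken from the weather data), contrasts the default with pairwise and listwise deletion.

```r
# Made-up data: column c has two missing values.
df <- data.frame(
  a = c(1, 2, 3, 4, 5, 6),
  b = c(2, 1, 4, 3, 6, 5),
  c = c(6, NA, 4, 3, NA, 1)
)

# Default (use = "everything"): any pair involving a column with NA yields NA.
everything <- cor(df)

# Pairwise deletion: each coefficient uses all rows complete for that pair.
pairwise <- cor(df, use = "pairwise.complete.obs")

# Listwise deletion: rows with any NA are dropped before computing anything.
complete <- cor(df, use = "complete.obs")

# Rows without any NA, for comparison with the listwise result.
ok <- complete.cases(df)

print(pairwise)
print(complete)
```

With pairwise deletion the a/b coefficient is computed from all six rows, whereas listwise deletion uses only the four complete rows, so the two matrices differ for that pair.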
col <- 1:10 # fill colors
plotcorr(cor_matrix, col = col, type = "lower", diag = FALSE)
The stronger the correlation, the narrower the ellipse. An ellipse tilted to the left (\) indicates a negative correlation and one tilted to the right (/) a positive correlation, as with Temp3pm and Temp9am, whose coefficient is about 0.84.
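To make the link between a coefficient and the shape of its ellipse concrete, here is a minimal base R sketch. It uses one common parametrisation (two cosines shifted by half of acos(r)) and is an illustration only, not a claim about how plotcorr is implemented; the helper names `corr_ellipse` and `axis_ratio` are hypothetical.

```r
# Outline of the ellipse for correlation r: the two coordinates are cosines
# shifted by +/- acos(r) / 2, so r = 0 gives a circle and |r| = 1 a line.
corr_ellipse <- function(r, n = 200) {
  stopifnot(r >= -1, r <= 1)
  d <- acos(r)
  t <- seq(0, 2 * pi, length.out = n)
  cbind(x = cos(t + d / 2), y = cos(t - d / 2))
}

# Ratio of the short axis to the long axis: 1 for a circle, 0 for a line.
axis_ratio <- function(r) sqrt((1 - abs(r)) / (1 + abs(r)))

# Coefficients taken from the correlation matrix in this post.
strong <- corr_ellipse(0.8444058)   # Temp9am vs Temp3pm: narrow, tilted like /
weak   <- corr_ellipse(0.02104135)  # Cloud9am vs Temp9am: close to a circle
neg    <- corr_ellipse(-0.5816761)  # Humidity3pm vs Temp3pm: tilted like \

print(axis_ratio(c(0.8444058, 0.02104135, -0.5816761)))
```

Passing one of the results to `plot(strong, type = "l", asp = 1)` draws the outline; a smaller axis ratio means a narrower ellipse and a stronger correlation.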
# numbers = TRUE, diag = TRUE
plotcorr(cor_matrix, numbers = TRUE, type = "lower", diag = TRUE)
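The col argument does not have to be a plain sequence such as 1:10, which assigns colors by position rather than by value. One alternative is to derive the fill color from the coefficient itself, so that sign and strength are visible at a glance. The sketch below builds such a color matrix in base R for three of the variables; the palette, the names `palette11`, `idx` and `cols`, and the 11-step scale are illustrative choices, and the plotcorr call is guarded so it only runs interactively when the ellipse package is installed.

```r
# Three variables with coefficients copied by hand from the matrix above.
vars <- c("Temp9am", "Temp3pm", "Humidity3pm")
m <- matrix(c( 1.00000000,  0.84440581, -0.25568147,
               0.84440581,  1.00000000, -0.58167615,
              -0.25568147, -0.58167615,  1.00000000),
            nrow = 3, dimnames = list(vars, vars))

# 11 colors from red (r = -1) through white (r = 0) to blue (r = +1).
palette11 <- colorRampPalette(c("red", "white", "blue"))(11)

# Map each coefficient to a palette index between 1 and 11.
idx <- round(5 * m) + 6
cols <- palette11[idx]
dim(cols) <- dim(m)

# Draw only in an interactive session with the ellipse package available.
if (interactive() && requireNamespace("ellipse", quietly = TRUE)) {
  ellipse::plotcorr(m, col = cols, type = "lower", diag = FALSE)
}
```

With this mapping, strongly positive pairs appear in blue shades and strongly negative pairs in red shades.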
col: sets the fill color of the ellipses
type: selects which part of the matrix is displayed: the upper triangle, the lower triangle, or everything ("upper", "lower", "full")
diag: logical value; whether the main diagonal is displayed
numbers: logical value; whether to replace the ellipses with the correlation coefficients, which are shown multiplied by 10 and rounded
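As a quick check of what the numbers option displays, the scaling described above can be reproduced by hand. This is a minimal sketch that assumes the label is simply the coefficient multiplied by 10 and rounded to the nearest integer, as the description states; the vector `r` and its element names are illustrative, with values copied from the correlation matrix in this post.

```r
# Coefficients copied from the cor_matrix output in this post.
r <- c(
  Temp9am_Temp3pm         =  0.8444058,
  Humidity3pm_Temp3pm     = -0.5816761,
  Pressure9am_Pressure3pm =  0.96789496,
  WindSpeed3pm_Cloud3pm   =  0.00720724
)

# Multiply by 10 and round to the nearest integer.
labels <- round(10 * r)
print(labels)
```

Under this assumption 0.8444 is shown as 8, -0.5817 as -6, 0.9679 as 10 and 0.0072 as 0, so the labels give a coarse reading of the coefficients rather than exact values.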
Using ellipse to draw correlation plots