R Language Data Analysis Series, Part Five, by Comaple.zhang
This installment covers basic plotting in R.
This is a word cloud generated with R. For implementation details, see my GitHub project: https://github.com/comaple/R-wordcloud.git
All right, let's start our journey today:
Packages used in this note: RColorBrewer, for generating sequential color palettes, and plotrix, for three-dimensional charts.
Dataset used in this section: the Arthritis dataset from the vcd package.
Data set
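install.packages("vcd")
library(vcd)
install.packages("plotrix")       # install the plotting package as well
library(plotrix)
install.packages("RColorBrewer")  # needed later for brewer.pal()
library(RColorBrewer)
data(package = "vcd")             # list all datasets in the vcd package
class(Arthritis)   # check the type of the dataset
names(Arthritis)   # list the column names
arth <- Arthritis  # make a copy
arth[1:10, ]       # view the first 10 rows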
Bar chart
# The last column of this dataset, Improved, is a factor.
table(arth$Improved)  # count of each factor level
col <- RColorBrewer::brewer.pal(9, "YlOrRd")  # color sequence
barplot(table(arth$Improved), col = col, xlab = "Improved", ylab = "count", main = "Statistics of Improved")  # vertical bar chart
barplot(table(arth$Improved), col = col, horiz = TRUE, xlab = "count", ylab = "Improved", main = "Statistics of Improved")  # horizontal bar chart
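# counts is not defined in the original note; it is assumed here to be the Improved-by-Treatment table.
counts <- table(arth$Improved, arth$Treatment)
barplot(counts, col = col, legend = rownames(counts), width = 0.1)  # stacked bar chart
barplot(counts, col = col[1:3], legend = rownames(counts), width = 0.1, beside = TRUE)  # grouped bar chart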
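For stacking or grouping to make sense, the object passed to barplot() has to be a two-way table (or matrix). Here is a minimal sketch of that idea using the built-in mtcars data, chosen only so that it runs without extra packages; the name gear_by_cyl is made up for this illustration and is not part of the original note.

```r
# Two-way contingency table: rows = number of gears, columns = number of cylinders.
gear_by_cyl <- table(mtcars$gear, mtcars$cyl)
print(gear_by_cyl)

# Stacked bars: one bar per column, one segment per row.
barplot(gear_by_cyl, legend = rownames(gear_by_cyl),
        xlab = "Cylinders", ylab = "Count")

# Grouped bars: the same table with the rows drawn side by side.
barplot(gear_by_cyl, beside = TRUE, legend = rownames(gear_by_cyl),
        xlab = "Cylinders", ylab = "Count")
```

With beside = FALSE (the default) each column becomes one stacked bar; beside = TRUE splits it into one bar per row.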
Pie chart
par(mfrow = c(1, 2))  # side-by-side layout with two panels
label <- c("prime", "midlife", "elder", "old")
ages <- cut(arth$Age, breaks = c(20, 30, 50, 70, 100), labels = label)  # discretize the age column
pie(table(ages), family = "STKaiti")  # draw a pie chart
pie(table(ages), labels = paste(levels(ages), ":", round(table(ages) / sum(table(ages)) * 100, 2), "%"), family = "STKaiti", main = "Arthritis incidence by age group")
pie3D(table(ages), labels = paste(round(table(ages) / sum(table(ages)) * 100, 2), "%"), family = "STKaiti", main = "Arthritis incidence by age (%)", explode = 0.1)  # 3D pie chart
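The percentage labels used above can be built in a separate step, which makes them easier to check. A minimal sketch, assuming a small made-up age vector (the values and the names age, groups, pct and pie_labels are hypothetical and only serve the illustration):

```r
# Hypothetical ages, used only to illustrate the labelling step.
age <- c(25, 35, 45, 55, 65, 75, 28, 52)

# Same breaks and labels as in the note: (20,30], (30,50], (50,70], (70,100].
groups <- cut(age, breaks = c(20, 30, 50, 70, 100),
              labels = c("prime", "midlife", "elder", "old"))

# Share of each group in percent, rounded to two decimals.
pct <- round(prop.table(table(groups)) * 100, 2)

# One label per slice, e.g. "prime: 25%".
pie_labels <- paste0(names(pct), ": ", pct, "%")
pie(table(groups), labels = pie_labels, main = "Share by age group")
```

prop.table() does the same job as dividing table(ages) by its sum, just in a single call.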
Histogram
We use the mtcars dataset for this plot:
hist(mtcars$mpg, breaks = 12, col = col, freq = FALSE, xlab = "Miles per gallon", main = "Histogram of car mpg with density curve")  # draw the histogram
lines(density(mtcars$mpg), col = "blue", lwd = 2)  # add the kernel density curve
If you want to plot the density curve on its own, you can:
plot(density(mtcars$mpg), main = "Density of car mpg")
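As the plot's subtitle shows, the kernel density estimate uses a bandwidth of 2.477 and is computed from 32 observations. The kernel is Gaussian (the default for density()), and the bandwidth is the standard deviation of that kernel, not its variance.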
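The numbers reported on the plot can also be read directly from the object that density() returns. A small sketch using only base R; the variable name d is arbitrary, and the doubled bandwidth in the last line is just an illustrative choice:

```r
d <- density(mtcars$mpg)  # Gaussian kernel by default

d$n   # number of observations used
d$bw  # bandwidth picked by the default rule (bw.nrd0)

# A larger bandwidth gives a smoother curve: compare the default (solid)
# with twice the default (dashed).
plot(d, main = "Default vs. doubled bandwidth")
lines(density(mtcars$mpg, bw = 2 * d$bw), lty = 2)
```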
Box plot
A box plot describes the distribution of a continuous variable by drawing its five-number summary: the minimum (the bottom line in the plot), the lower quartile (the bottom edge of the box), the median (the line inside the box), the upper quartile (the top edge of the box), and the maximum (the topmost line). Outliers are shown as well: by default, R's boxplot() extends the whiskers only to the most extreme points within 1.5 times the interquartile range of the box and draws anything beyond that as individual points.
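The five numbers behind a box plot can be inspected without drawing anything. A minimal sketch with base R functions on mtcars$mpg; the variable names five and bstats are arbitrary:

```r
# Five-number summary: minimum, lower hinge, median, upper hinge, maximum.
five <- fivenum(mtcars$mpg)
print(five)

# The statistics boxplot() actually draws: whisker ends, box edges and median,
# plus any points that fall beyond the whiskers.
bstats <- boxplot.stats(mtcars$mpg)
bstats$stats  # lower whisker, lower hinge, median, upper hinge, upper whisker
bstats$out    # outliers, if any
```

Note that bstats$stats can differ from five at the ends when outliers are present, because the whiskers stop at the last non-outlying point.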
For example, take the mtcars dataset again, where mpg is fuel economy in miles per gallon and cyl is the number of engine cylinders. To compare how the number of cylinders affects miles per gallon, we can plot:
boxplot(mpg ~ cyl, data = mtcars, main = "Car mileage data", xlab = "Number of cylinders", ylab = "Miles per gallon")
We can clearly see that 4-cylinder cars have the highest fuel economy but also the widest spread, 6-cylinder cars are the most consistent, and 8-cylinder cars have the lowest fuel economy, with a spread between the other two.
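A reading taken from a box plot can be cross-checked with per-group statistics. A short sketch, assuming the mean and standard deviation are acceptable stand-ins for "level" and "spread" (the box plot itself shows medians and quartiles); the names mpg_mean and mpg_sd are arbitrary:

```r
# Mean and standard deviation of mpg for each cylinder count.
mpg_mean <- tapply(mtcars$mpg, mtcars$cyl, mean)
mpg_sd   <- tapply(mtcars$mpg, mtcars$cyl, sd)

# One column per cylinder count, rounded for readability.
round(rbind(mean = mpg_mean, sd = mpg_sd), 2)
```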