In R, the common graph types are the histogram, box plot, bar chart, dot chart, pie chart, and Q-Q plot.
1. Histogram
A histogram is a common way to visualize the distribution of data. It divides continuous data into equal-width bins and uses the height of each rectangle to show the count or frequency of the observations in that bin; a density curve is sometimes overlaid as an aid. It is a quick and easy way to explore a distribution.
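As a rough sketch of the idea, the following draws a histogram on the density scale and overlays a density curve. The data are simulated for illustration only, and the variable names x and h are assumptions, not part of the original example.

```r
# Simulated sample, for illustration only
set.seed(1)
x <- rnorm(200, mean = 50, sd = 10)

# freq = FALSE puts densities (not counts) on the y-axis,
# so the density curve can share the same scale
h <- hist(x, breaks = 10, freq = FALSE, main = "Histogram with density curve")
lines(density(x), lwd = 2)

# hist() also returns the bin edges and the count in each bin
h$breaks
h$counts
```

Setting freq = TRUE (the default for equal-width bins) shows counts instead, but then the density curve would no longer be on a comparable scale.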
2. Box plot
A box plot shows the distribution in more detail: it gives the positions of key points such as the median and quartiles, and it also singles out outliers. Marking the position and value of other important statistics, such as the mean, makes the overall structure of the data very clear.
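A minimal sketch of a box plot with the mean marked on it might look like this. The sample y is simulated, with one extreme value appended on purpose so that the plot has an outlier to show.

```r
# Hypothetical sample with one extreme value (75) appended
set.seed(2)
y <- c(rnorm(50, mean = 30, sd = 5), 75)

b <- boxplot(y, horizontal = TRUE, main = "Box plot with mean marked")

# Mark the mean as a diamond; a single horizontal box sits at position 1
points(mean(y), 1, pch = 18, cex = 1.5)

b$stats  # lower whisker, lower hinge, median, upper hinge, upper whisker
b$out    # points drawn beyond the whiskers
```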
3. Bar chart
A bar chart resembles a histogram. The difference is that a histogram is meant for continuous data, which is grouped into bins by the analyst to form the rectangles, whereas a bar chart is meant for discrete variables, with each value of the variable mapped to one bar.
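To illustrate the mapping from discrete values to bars, here is a small sketch. The grade vector is made up for the example; table() counts each distinct value and barplot() draws one bar per value.

```r
# Hypothetical discrete variable
grade <- c("A", "B", "B", "C", "A", "B", "C", "C", "B", "A", "B")

# One count per distinct value
counts <- table(grade)

# One bar per distinct value
barplot(counts, xlab = "Grade", ylab = "Count",
        main = "Bar chart of a discrete variable")
```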
4. Dot chart
A dot chart is essentially the same as a bar chart and is also used to present the distribution of the values of a discrete variable, but it uses dots on a background grid instead of bars.
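A dot chart of a discrete variable can be sketched as follows. The grade data are hypothetical, and the counts are converted to a plain named numeric vector because dotchart() expects a vector or matrix rather than a table.

```r
# Hypothetical discrete variable
grade <- c("A", "B", "B", "C", "A", "B", "C", "C", "B", "A", "B")
tab <- table(grade)

# Named numeric vector: the names become the row labels of the dot chart
v <- setNames(as.numeric(tab), names(tab))

dotchart(v, xlab = "Count", main = "Dot chart of a discrete variable")
```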
5. Pie chart
A pie chart is an effective way to examine the distribution of a single variable, and its slices are usually annotated with percentages.
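The percentage annotation can be sketched like this. The share vector and its category names are invented for the example; the labels are built by hand and passed to pie().

```r
# Hypothetical amounts for four categories
share <- c(Rent = 40, Food = 30, Transport = 20, Other = 10)

# Convert to whole-number percentages and build one label per slice
pct <- round(100 * share / sum(share))
lab <- paste0(names(share), " (", pct, "%)")

pie(share, labels = lab, main = "Pie chart with percentage labels")
```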
6. Q-Q plot
A Q-Q plot is a scatter plot. In the normal Q-Q plot, the quantiles of the standard normal distribution are on the horizontal axis and the sample values are on the vertical axis. A Q-Q plot can be used to check whether a sample follows a normal distribution: if it does, the points fall close to a straight line.
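As a small sketch of what goes on each axis, the following draws a normal Q-Q plot for a simulated sample (z is an assumption for illustration) and inspects the coordinates that qqnorm() returns.

```r
# Simulated sample, for illustration only
set.seed(3)
z <- rnorm(100)

# qqnorm() draws the plot and returns the plotted coordinates:
# x = theoretical normal quantiles, y = the sample values
q <- qqnorm(z)

# Reference line through the first and third quartiles
qqline(z)

head(q$x)
head(q$y)
```

For a sample that really is normal, the points should lie close to the reference line; systematic curvature or isolated points far from the line suggest a departure from normality.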
For example, to learn about the income of managers earning more than 100,000 yuan a year, we run an exploratory analysis of the 66 values in pay and display them with a histogram, a dot chart, a box plot, and a Q-Q plot.
The code is as follows:
library(MASS)
library(grid)
library(lattice)
library(splines)
library(survival)
library(Formula)
library(Hmisc)
pay <- c(11,19,14,22,14,28,13,81,12,43,11,16,31,16,23,42,22,26,17,22,13,27,180,16,
43,82,14,11,51,76,28,66,29,14,14,65,37,16,37,35,39,27,14,17,13,38,28,40,85,32,
25,26,16,12,54,40,18,27,16,14,33,29,77,50,19,34)
# Arrange the four plots in a 2 x 2 grid
par(mfrow = c(2, 2))
# Histogram of pay
hist(pay)
# Dot chart of pay
dotchart(pay)
# Box plot of pay
boxplot(pay, horizontal = TRUE)
# Q-Q plot of pay
qqnorm(pay)
# Add the reference line through the first and third quartiles
qqline(pay)
The result is as follows:
Figure (1) shows, in order, the histogram, dot chart, box plot, and Q-Q plot.
All four plots show one value lying far from the others. This is an outlier and needs to be removed; checking the vector pay shows that the value is 180.
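The article does not show the removal step itself. One possible way to do it, as a sketch only, is to drop the largest value and redraw the plots; the name pay_clean is an assumption introduced here.

```r
pay <- c(11,19,14,22,14,28,13,81,12,43,11,16,31,16,23,42,22,26,17,22,13,27,180,16,
         43,82,14,11,51,76,28,66,29,14,14,65,37,16,37,35,39,27,14,17,13,38,28,40,85,32,
         25,26,16,12,54,40,18,27,16,14,33,29,77,50,19,34)

# Locate the extreme value
max(pay)
which.max(pay)

# Drop it (assumes, as the text says, that only the largest value is removed)
pay_clean <- pay[pay != max(pay)]

# Redraw the same four plots without the outlier
par(mfrow = c(2, 2))
hist(pay_clean)
dotchart(pay_clean)
boxplot(pay_clean, horizontal = TRUE)
qqnorm(pay_clean)
qqline(pay_clean)
```

Filtering on the maximum is only appropriate here because a single value stands apart; with several suspicious points, a threshold chosen from the plots would be a more deliberate choice.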