ggplot2
ggplot2 is R's graphics toolkit: with very simple statements you can produce complex, attractive plots.
qplot
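Load ggplot2 (qplot is part of it)
library(ggplot2)
dsmall <- diamonds[sample(nrow(diamonds), 100), ]  # draw a random sample from the diamonds data set (the sample size is garbled in the source; 100 is assumed)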
# 1. Visualize groups with the basic aesthetics: colour, size, shape
# 1.1 Simple scatter plot (grouped by colour: diamonds of different colours are drawn as points of different colours)
# 1.2 Simple scatter plot (grouped by shape: different cuts are drawn as points of different shapes)
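The code for items 1.1 and 1.2 does not appear in the text. A minimal sketch of what they describe might look like the following; the seed and the sample size of 100 are assumptions made only so the example is reproducible, and note that qplot() is deprecated in recent ggplot2 releases (it still works, with a warning).

```r
library(ggplot2)

set.seed(1410)  # assumed seed, only to make the sample reproducible
dsmall <- diamonds[sample(nrow(diamonds), 100), ]  # assumed sample size

# 1.1 colour each point by the diamond's colour grade
p_colour <- qplot(carat, price, data = dsmall, colour = color)

# 1.2 draw each cut with a different point shape
p_shape <- qplot(carat, price, data = dsmall, shape = cut)

print(p_colour)
print(p_shape)
```

Because color and cut are factors, ggplot2 picks a discrete colour or shape scale and adds the legend automatically.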
# 2. Drawing different types of charts: the geom parameter
In qplot(x, y, data = data, geom = ""), the geom argument controls the type of graphic that is drawn.
I. Two-variable plots
(1) geom = "point": the default; draws a scatter plot of (x, y)
(2) geom = "smooth": draws a smoothed curve (fitted with loess, gam, lm, rlm, or glm)
(3) geom = "boxplot": draws a box plot, for when x is a categorical variable (factor) and y is numeric
II. Single-variable plots
(4) geom = "histogram": histogram
(5) geom = "density": kernel density estimate
(6) geom = "bar": bar chart
III. Time series
(7) geom = "line": line chart, usable for time series (when x is a date)
(8) geom = "path": path plot (see below)
# 2.1 Scatter plot and smooth line in the same plot
qplot(carat, price, data = dsmall, geom = c("point", "smooth"))
# Tuning: method = "" and related arguments
# (a) method = "loess": the default smoother for small samples; span sets the window width, from span = 0 (wiggly) to span = 1 (smooth)
qplot(carat, price, data = dsmall, geom = c("point", "smooth"), method = "loess", span = 0.2)
# (b) method = "gam": a GAM is more efficient than loess on large data sets; requires the mgcv package
library(mgcv)
qplot(carat, price, data = dsmall, geom = c("point", "smooth"), method = "gam", formula = y ~ s(x))
# (c) method = "lm": linear fit
qplot(carat, price, data = dsmall, geom = c("point", "smooth"), method = "lm")
# method = "lm", formula = y ~ ns(x, 3): natural cubic spline with 3 degrees of freedom; requires the splines package
library(splines)
qplot(carat, price, data = dsmall, geom = c("point", "smooth"), method = "lm", formula = y ~ ns(x, 3))
# method = "rlm": robust linear model, less affected by outliers; requires the MASS package
library(MASS)
qplot(carat, price, data = dsmall, geom = c("point", "smooth"), method = "rlm")
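The smoothing options above can also be written with ggplot() and geom_smooth(), where method, formula, and span are arguments of the smoothing layer only. This is a sketch rather than the author's code; the seed, the sample size of 100, and the object names are assumptions.

```r
library(ggplot2)

set.seed(1410)  # assumed seed
dsmall <- diamonds[sample(nrow(diamonds), 100), ]  # assumed sample size

base <- ggplot(dsmall, aes(carat, price)) + geom_point()

# (a) loess with a narrow window (more wiggly curve)
p_loess <- base + geom_smooth(method = "loess", formula = y ~ x, span = 0.2)

# (b) generalized additive model (needs the mgcv package installed)
p_gam <- base + geom_smooth(method = "gam", formula = y ~ s(x))

# (c) straight-line fit, and a natural spline with 3 degrees of freedom
p_lm <- base + geom_smooth(method = "lm", formula = y ~ x)
p_ns <- base + geom_smooth(method = "lm", formula = y ~ splines::ns(x, 3))

# robust linear model; the fitting function is passed directly
p_rlm <- base + geom_smooth(method = MASS::rlm, formula = y ~ x)

print(p_lm)
```

Passing the fitting function itself (MASS::rlm) or namespacing the spline basis (splines::ns) avoids attaching whole packages just for one plot.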
# 2.2 x is a categorical variable, y is continuous: box plot
qplot(color, price / carat, data = diamonds, geom = "boxplot")
# 2.3 Single variable: histogram
qplot(carat, data = diamonds, geom = "histogram")
# 2.4 Single variable: kernel density estimate
qplot(carat, data = diamonds, geom = "density")
# Density curves drawn in a different colour for each diamond colour grade
qplot(carat, data = diamonds, geom = "density", colour = color)
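# 2.5 Bar chart (the x variable is lost in the source; color is assumed)
qplot(color, data = diamonds, geom = "bar")
qplot(color, data = diamonds, geom = "bar", weight = carat)
# 2.6 Time series plot
qplot(date, unemploy / pop, data = economics, geom = "line")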
# 2.7 Path plot
# To examine the relationship between the unemployment rate (unemploy / pop) and the median duration of unemployment (uempmed), one option is a scatter plot, but that hides how the relationship evolved over time. A path plot joins the points in time order and uses shades of colour to represent the year: as the shade of blue changes from year to year, you can follow how the relationship between the unemployment rate and the duration of unemployment has shifted.
year <- function(x) as.POSIXlt(x)$year + 1900
qplot(unemploy / pop, uempmed, data = economics, geom = "path", colour = year(date))
We have already seen how aesthetic parameters let you compare groups within a single plot. Faceting instead compares groups by placing each subset in its own panel:
qplot(carat, data = diamonds, facets = color ~ ., geom = "histogram", binwidth = 0.1, xlim = c(0, 3))
The following plot adds new elements to the earlier ones: facets, multiple layers, and statistics. Facets and layers extend the data structure described above: each layer in each facet has its own data set. You can think of it as a three-dimensional array: the facets form a two-dimensional plane, and the layers extend it along a third dimension. In this example every layer uses the same data, but in principle different layers can use different data.
qplot(displ, hwy, data = mpg, facets = . ~ year) + geom_smooth()
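To illustrate the point that different layers can carry different data, here is a small sketch in which the second layer is given its own data frame. The summary table year_means is a hypothetical helper built for this example; it is not part of the original text.

```r
library(ggplot2)

# Hypothetical second data set: mean highway mileage per model year
year_means <- aggregate(hwy ~ year, data = mpg, FUN = mean)

p <- ggplot(mpg, aes(displ, hwy)) +
  geom_point() +                           # layer 1: uses the plot-level data (mpg)
  geom_hline(data = year_means,            # layer 2: uses its own data frame
             aes(yintercept = hwy), colour = "red") +
  facet_grid(. ~ year)                     # one panel per year

print(p)
```

Because year_means also contains the faceting variable year, each panel receives only its own mean line.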
ggplot
Basic plot types:
These geometric objects are the building blocks of ggplot. They combine to form complex graphics, and most of them correspond to a specific plot type.
geom_area()
geom_bar()
geom_line()
geom_point()
geom_polygon()
geom_text()
geom_tile()
> library("ggplot2")
> head(mpg)
  manufacturer model displ year cyl      trans drv cty
1         audi    a4   1.8 1999   4   auto(l5)   f  18
2         audi    a4   1.8 1999   4 manual(m5)   f  21
3         audi    a4   2.0 2008   4 manual(m6)   f  20
4         audi    a4   2.0 2008   4   auto(av)   f  21
5         audi    a4   2.8 1999   6   auto(l5)   f  16
6         audi    a4   2.8 1999   6 manual(m5)   f  18
  hwy fl   class
1  29  p compact
2  29  p compact
3  31  p compact
4  30  p compact
5  26  p compact
6  26  p compact
> p <- ggplot(mpg, aes(x = cty, y = hwy, colour = factor(year)))
> summary(p)
data: manufacturer, model, displ, year, cyl, trans, drv, cty, hwy, fl, class [234x11]
mapping: x = cty, y = hwy, colour = factor(year)
faceting: facet_null()
Next come geometric objects and statistical transformations. Put simply, the plot displays the elements defined above after a statistical transformation has been applied to them. Every stat_ function comes with a default geometric object, and every geom_ function comes with a default statistical transformation, so using either one is usually enough.
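The pairing of geoms and stats can be checked directly. The sketch below builds the same histogram twice, once starting from the geom and once from the stat; the bin width of 0.5 and the object names are arbitrary choices for illustration.

```r
library(ggplot2)

base <- ggplot(mpg, aes(displ))

# geom_histogram() draws bars and uses the "bin" stat by default
by_geom <- base + geom_histogram(binwidth = 0.5)

# stat_bin() does the binning and draws with the "bar" geom by default
by_stat <- base + stat_bin(binwidth = 0.5)

# Both layers compute the same bin counts
all.equal(layer_data(by_geom)$count, layer_data(by_stat)$count)
```

layer_data() returns the data frame a layer actually draws, which makes it a convenient way to inspect what a statistical transformation produced.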
p + geom_point()  # scatter plot
ggplot(mpg, aes(x = displ)) + geom_histogram(aes(y = ..count..), fill = "steelblue", colour = "#808080", binwidth = 0.1)  # histogram
ggplot(mpg, aes(y = displ, x = factor(cyl), fill = factor(cyl))) + geom_boxplot()  # box plot
ggplot(diamonds, aes(carat, price)) + stat_bin2d()  # two-dimensional density plot
p + geom_point() + stat_smooth(method = "lm", se = FALSE)
ggplot(mpg, aes(x = cty, y = hwy)) + geom_point(aes(colour = factor(year))) + stat_smooth(method = "lm", se = FALSE)  # note the difference between the two approaches
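The difference between the last two calls lies in where colour is mapped. Mapped in ggplot(), it is inherited by every layer, so the smoother is fitted once per year; mapped only inside geom_point(), it affects the points alone and a single line is fitted. A sketch that counts the fitted lines (the helper n_lines is hypothetical, written for this illustration):

```r
library(ggplot2)

# colour mapped at the plot level: inherited by points and smoother
global <- ggplot(mpg, aes(cty, hwy, colour = factor(year))) +
  geom_point() +
  stat_smooth(method = "lm", formula = y ~ x, se = FALSE)

# colour mapped only in the point layer: the smoother sees no grouping
local <- ggplot(mpg, aes(cty, hwy)) +
  geom_point(aes(colour = factor(year))) +
  stat_smooth(method = "lm", formula = y ~ x, se = FALSE)

# Hypothetical helper: number of groups in the smoothing layer (layer 2)
n_lines <- function(p) length(unique(layer_data(p, 2)$group))

n_lines(global)
n_lines(local)
```

The first count should be two (one regression line per year) and the second one.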
... class, ... class)) + geom_boxplot() + geom_jitter(alpha = 0.3) + theme(... = element_blank(), panel.background = element_rect(fill = NA, colour = "black"))
ggplot(mpg, aes(x = displ)) + stat_bin(aes(y = ..density.., fill = factor(year)), colour = "#909090") + stat_density(aes(ymax = ..density.., colour = factor(year)), geom = "line", size = 1.2) + facet_wrap(~ year, ncol = 1)
Basic concepts in ggplot2
Data and mapping: a mapping ties a variable in the data to a graphical property (aesthetic). The mapping controls the relationship between the two.
Scale: the scale controls how a graphical property is displayed once it has been mapped. Concretely, scales appear as the legend and the axis ticks. Scale and mapping are closely related concepts.
Geometric object (geom): geometric objects are the graphical elements we actually see in the plot, such as points, lines, and polygons.
Statistical transformation (stat): a computation applied to the raw data, such as adding a regression line to a bivariate scatter plot.
Coordinate system (coord): the coordinate system controls the axes and affects every graphical element; the axes can be transformed to meet different needs.
Layer: data, mapping, geometric object, statistical transformation, and so on together form a layer. Layers let you build a plot step by step and modify each layer individually.
Facet: conditional plotting. The data are split into groups in some way and each group is drawn separately; the facet specification controls how the groups are formed and how their panels are arranged.
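As a rough illustration of how these concepts fit together, the sketch below labels each part of a single plot. The particular variables, palette, and axis limits are arbitrary choices made for this example.

```r
library(ggplot2)

p <- ggplot(mpg, aes(x = displ, y = hwy, colour = factor(cyl))) +  # data and mapping
  geom_point() +                                                   # geometric object (layer 1)
  stat_smooth(aes(group = 1), method = "lm", formula = y ~ x,      # statistical transformation (layer 2)
              se = FALSE, colour = "black") +
  scale_colour_brewer(name = "Cylinders", palette = "Set1") +      # scale: legend title and colours
  coord_cartesian(ylim = c(10, 45)) +                              # coordinate system
  facet_wrap(~ year)                                               # facet: one panel per year

print(p)
```

Each `+` adds one component, so any of them can be changed or removed without touching the rest.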
Summary
ggplot2's plotting capabilities deserve further exploration.
Follow along with me: ggplot2 (1)