One of the more important plotting packages in the R language is ggplot2. It was created by Hadley Wickham in 2005 and received a major update in April 2012; at the time of writing, the author was rewriting the code and simplifying the syntax to make the package easier to develop and use. The core philosophy of ggplot2 is to separate plotting from data, and to separate the data-related parts of a plot from the data-independent parts. Plots are built from layers and mappings, which encourages structured thinking, while command-style adjustments are still available for flexibility. The resulting graphics look good without forcing you to manage tedious details. Because ggplot2 builds graphics from low-level components, you are limited only by your imagination.
A ggplot2 plot can be roughly divided into three parts:
(1) the data layer, (2) the geometric layer, and (3) the aesthetic layer.
If you have used Photoshop, layers will be familiar. A layer is like a sheet of cellophane that holds graphic elements; you can prepare layers separately and then stack them in different orders to produce the final image. Layers let you build a plot step by step, so it is easy to modify a single layer, add statistics, or even change the data, and the resulting graphic is usually attractive and close to what you intended.
ggplot2 is built around a few basic concepts:
- Data and mapping (Mapping)
- Scales (Scale)
- Geometric objects (Geom)
- Statistical transformations (Stat)
- Coordinate systems (Coord)
- Layers (Layer)
- Faceting (Facet)
Data and mapping (Mapping)
A mapping associates a variable in the data with a graphical property (an aesthetic) and controls the relationship between the two.
Scales (Scale)
A scale controls how a graphical property is displayed once it has been mapped. In concrete terms, scales show up as the legend and the axis ticks. Scales and mappings are closely related concepts.
Geometric objects (Geom)
Geometric objects are the graphical elements we actually see in the plot, such as points, lines, and rectangles.
Statistical transformations (Stat)
A statistical computation applied to the raw data, such as adding a regression line or a confidence interval to a bivariate scatter plot.
Faceting (Facet)
Conditional plotting: the data are split into groups in some way and each group is drawn separately. Faceting controls how the groups are formed and how the panels are arranged.
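As a rough illustration of how these concepts appear in code, the sketch below builds one plot with each component on its own line. The specific choices here (the "Set1" palette, method = "lm", faceting by drv) are assumptions made only for this example.

```r
library(ggplot2)

p <- ggplot(mpg, aes(x = cty, y = hwy, colour = factor(year))) +  # data and mapping
  geom_point() +                                          # geometric object: points
  stat_smooth(method = "lm", formula = y ~ x) +           # statistical transformation: fitted line with confidence band
  scale_colour_brewer(palette = "Set1", name = "year") +  # scale: colours and legend title
  coord_cartesian() +                                     # coordinate system (Cartesian)
  facet_wrap(~ drv)                                       # facet: one panel per drive type

# The plot holds two layers: the points and the smoother
length(p$layers)
```

Typing p (or print(p)) at the console draws the plot.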
We use mpg, a data set that ships with ggplot2. It contains a subset of the fuel economy data published by the EPA, covering the model years 1999 and 2008, and has 234 rows and 11 columns.
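A quick way to check these figures yourself is to inspect the data set after loading the package. This is only a small sketch using base R functions:

```r
library(ggplot2)

dim(mpg)         # number of rows and columns
names(mpg)       # column names, including cty, hwy, displ, year and class
table(mpg$year)  # how many rows fall in each model year
```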
After the ggplot2 package is loaded, you can use the following statement to draw a scatter plot:
    ggplot(data = mpg, mapping = aes(x = cty, y = hwy)) + geom_point() + aes(colour = factor(year))
Here data = mpg, mapping = aes(x = cty, y = hwy) is the data layer, geom_point() is the geometric layer, and aes(colour = factor(year)) is the aesthetic layer, which maps the year variable to the colour property. If we shorten the statement as follows:
    ggplot(data = mpg, mapping = aes(x = cty, y = hwy))
then no points are drawn, because the plot lacks a geometric layer. The scatter plot produced by the following statement consists entirely of black points, because it lacks the aesthetic layer:
    ggplot(data = mpg, mapping = aes(x = cty, y = hwy)) + geom_point()
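One way to see the difference between a plot with and without a geometric layer is to count the layers stored in the plot object. The variable names base and with_points are made up for this sketch.

```r
library(ggplot2)

# Data and mapping only: there is no geom, so there is nothing to draw the data with
base <- ggplot(data = mpg, mapping = aes(x = cty, y = hwy))
length(base$layers)

# Adding geom_point() appends one layer to the same plot specification
with_points <- base + geom_point()
length(with_points$layers)
```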
If the points look too small or too large, we can change their size with the size parameter. Setting size = x outside aes(), where x is the point size, gives every point the same size. Older tutorials often write size = I(x); this habit comes from qplot(), where I() stops the value from being treated as a mapping, and it is not needed inside geom_point(). A suitable size is usually found from experience or by trial and error.
    ggplot(data = mpg, mapping = aes(x = cty, y = hwy)) + geom_point(size = 7) + aes(colour = factor(year))
We can also add a fitted curve with its confidence interval. Because colour is mapped for the whole plot, the following statement draws two fitted curves and confidence intervals, one per year.
    ggplot(data = mpg, mapping = aes(x = cty, y = hwy)) + geom_point() + aes(colour = factor(year)) + stat_smooth()
If we want only one fitted curve and confidence interval for all the data, a small change is enough: move the colour mapping inside geom_point() so that it applies to the points only.
    ggplot(data = mpg, mapping = aes(x = cty, y = hwy)) + geom_point(aes(colour = factor(year))) + stat_smooth()
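The effect of where the colour mapping is placed can be checked by counting the groups the smoother is fitted to. The sketch below assumes method = "lm" only to keep the example quick and free of messages; the variable names are hypothetical.

```r
library(ggplot2)

# Colour mapped for the whole plot: every layer, including the smoother, is split by year
p_global <- ggplot(mpg, aes(x = cty, y = hwy, colour = factor(year))) +
  geom_point() +
  stat_smooth(method = "lm", formula = y ~ x)

# Colour mapped only inside geom_point(): the smoother sees a single group
p_local <- ggplot(mpg, aes(x = cty, y = hwy)) +
  geom_point(aes(colour = factor(year))) +
  stat_smooth(method = "lm", formula = y ~ x)

# layer_data() returns the computed data of one layer; layer 2 is the smoother
n_global <- length(unique(layer_data(p_global, 2)$group))
n_local  <- length(unique(layer_data(p_local, 2)$group))
c(global = n_global, local = n_local)
```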
Earlier we mapped the year variable to the colour of the points. We can also map the displ variable to point size, so that points are drawn in different sizes.
    ggplot(data = mpg, mapping = aes(x = cty, y = hwy)) +
      geom_point(aes(colour = factor(year), size = displ)) +
      stat_smooth()
Anyone who has used Photoshop will know about transparency, also called alpha. ggplot2 provides a matching parameter: changing alpha changes the transparency of the points. Alpha must lie between 0 and 1; values outside that range are not valid. To make the difference from the previous plot obvious, I use a small alpha value here. The default alpha is 1 (fully opaque).
    ggplot(data = mpg, mapping = aes(x = cty, y = hwy)) +
      geom_point(aes(colour = factor(year), size = displ), alpha = 0.25) +
      stat_smooth()
We can also add a title to the plot, fine-tune its appearance, and add labels for the x and y axes as well as other annotations.
    ggplot(mpg, aes(x = cty, y = hwy)) +
      geom_point(aes(colour = class, size = displ), alpha = 0.6, position = "jitter") +
      stat_smooth() +
      scale_size_continuous(range = c(4, 10)) +
      facet_wrap(~ year, ncol = 1) +
      ggtitle("Fuel consumption and vehicle class") +
      labs(y = 'Highway miles per gallon',
           x = 'City miles per gallon') +
      guides(size = guide_legend(title = 'Displacement'),
             colour = guide_legend(title = 'Vehicle class',
                                   override.aes = list(size = 5)))
Older blog posts and forum answers use the opts() function to set the plot title. That function exists only in early versions of ggplot2. If you call it with a newer version, R reports that the function cannot be found, because titles are now set with ggtitle() (or labs(title = ...)) and theme settings with theme(). The code above shows how ggtitle() is used.
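As a small sketch of the two current ways to set a title, the example below builds the same plot with ggtitle() and with labs(title = ...) and compares the stored title. The title text and variable names are made up for illustration.

```r
library(ggplot2)

p1 <- ggplot(mpg, aes(x = cty, y = hwy)) +
  geom_point() +
  ggtitle("City vs highway mileage")

p2 <- ggplot(mpg, aes(x = cty, y = hwy)) +
  geom_point() +
  labs(title = "City vs highway mileage")

# Both calls store the same title in the plot's labels
identical(p1$labels$title, p2$labels$title)
```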
The code above also uses the position parameter, which adjusts where the elements of a layer are placed. At the time of writing ggplot2 offered the five adjustments listed below, and later versions may add more. Try changing the position argument to see how the plot changes.
Fill: Stacks the elements first and then stretches each stack so that it fills the full height of the plotting area.
Dodge: When elements would otherwise be drawn on top of each other, "dodging" shifts them sideways so they sit next to each other, as in a side-by-side bar chart.
Identity: No adjustment; elements stay where the data put them. This is the default for geoms such as geom_point().
Jitter: When points overlap heavily, adds a little random noise so that the overlapping points become visible.
Stack: Stacks elements on top of each other in the vertical direction, as in a stacked bar chart.
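To make the differences concrete, the sketch below draws the same bar chart with three of these adjustments and then checks the bar heights that ggplot2 computes. Filling by drv is an assumption chosen only so that each bar has several segments.

```r
library(ggplot2)

base <- ggplot(mpg, aes(x = class, fill = drv))

p_stack <- base + geom_bar(position = "stack")  # segments piled on top of each other
p_dodge <- base + geom_bar(position = "dodge")  # segments placed side by side
p_fill  <- base + geom_bar(position = "fill")   # stacks rescaled to a common height

# With "fill", the tallest point of every stack is 1 (that is, 100%)
top_fill <- max(layer_data(p_fill, 1)$ymax)

# With "stack", the tallest stack equals the largest class count
top_stack <- max(layer_data(p_stack, 1)$ymax)

c(fill = top_fill, stack = top_stack)
```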
Sometimes we need to compare how one variable relates to several others, which means drawing them on the same plot. As in Photoshop, ggplot2 lets us stack different layers so that they are drawn on one plot. For example, to study how hwy and displ each relate to cty in the mpg data, we can draw both scatter plots together with the following code.
    ggplot(mpg, aes(x = cty)) +
      geom_point(aes(y = hwy), colour = "red") +
      geom_point(aes(y = displ), colour = "green")
As a result, the two dependent variables are shown in different colours.
Next, let's draw a pie chart showing the share of each vehicle class in the class variable, with the following code:
    ggplot(mpg) + geom_bar(aes(x = factor(1), fill = class), width = 1) +
      coord_polar(theta = "y")
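The pie chart shows proportions only visually. If you also want the numbers behind the slices, a minimal sketch with base R is:

```r
library(ggplot2)  # provides the mpg data set

# Count the rows per class and convert the counts to proportions
shares <- prop.table(table(mpg$class))

# Percentages rounded to one decimal place
round(100 * shares, 1)
```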
We can also use ggplot2 to draw a Coxcomb plot (also known as a rose diagram), which is just as simple. The width value sets how wide each sector is and therefore the gap between neighbouring sectors.
    ggplot(mpg, aes(x = factor(class))) +
      geom_bar(width = 0.7, aes(colour = factor(class))) + coord_polar()
We can also fill the sectors with colour; the code changes only slightly:
    ggplot(mpg, aes(x = factor(class), fill = class)) +
      geom_bar(width = 0.7) + coord_polar()
When reprinting, please cite the original link: http://blog.csdn.net/wzgl__wh/article/details/51901093
Quick Learning ggplot2