Seaborn is a well-encapsulated plotting library built on top of Matplotlib (plt). It has very powerful plotting capabilities.
1. Layout style setting (graphic style) and details setting
Drawing with Matplotlib:
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
x = np.linspace(0, 14, 100)  # endpoint and sample count were lost in the source; 14 and 100 are assumed
for i in range(1, 7):
    plt.plot(x, np.sin(x + i * .5) * (7 - i))
plt.show()
Output:
Default system style with Seaborn:
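import seaborn as sns
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
sns.set()  # apply the seaborn default style before drawing
x = np.linspace(0, 14, 100)  # endpoint and sample count were lost in the source; 14 and 100 are assumed
for i in range(1, 7):
    plt.plot(x, np.sin(x + i * .5) * (7 - i))
plt.show()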
Output:
Seaborn has five built-in plot styles:
- darkgrid
- whitegrid
- dark
- white
- ticks
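To see what actually differs between the five styles, one can inspect the parameter dictionaries returned by sns.axes_style(). The sketch below compares only two keys, "axes.grid" and "axes.facecolor"; the choice of keys is just for illustration.

```python
import seaborn as sns

# Compare a couple of settings across the five built-in styles.
style_names = ["darkgrid", "whitegrid", "dark", "white", "ticks"]
grid_by_style = {}
for name in style_names:
    params = sns.axes_style(name)  # dict of matplotlib rc parameters for this style
    grid_by_style[name] = params["axes.grid"]
    print(name, "grid:", params["axes.grid"], "facecolor:", params["axes.facecolor"])
```

The two "grid" styles turn the axes grid on; the other three leave it off.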
The following shows a commonly used one; try the others in code yourself.
whitegrid
import seaborn as sns
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
sns.set_style("whitegrid")  # set the style
data = np.random.normal(size=(20, 6)) + np.arange(6) / 2  # create data; the row count was lost in the source, 20 is assumed
sns.boxplot(data=data)  # draw a box plot
plt.show()
Output:
With this style the data values and how they line up with the axis are easy to read, and the look stays clean, so this style is recommended.
Specify the axis offset:
# f, ax = plt.subplots()
sns.violinplot(data=data)
sns.despine(offset=10)
The value of offset is the distance, in points, by which the axis spines are moved away from the plot.
Hide the left axis (for when only a single axis should be shown):
sns.despine(left=True)
Draw plots with two different styles:
import seaborn as sns
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
sns.set_style("whitegrid")
data = np.random.normal(size=(20, 6)) + np.arange(6) / 2  # the row count was lost in the source; 20 is assumed
with sns.axes_style("darkgrid"):
    sns.boxplot(data=data)
    plt.show()
sns.boxplot(data=data)
plt.show()
Plots drawn inside the with block use one style; plots drawn outside it use the other.
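The scoping of the with block can be checked without drawing anything, by reading Matplotlib's rcParams before, inside, and after the block. This is a minimal sketch; watching the "axes.grid" parameter is just one convenient way to observe the switch.

```python
import matplotlib as mpl
import seaborn as sns

sns.set_style("white")                    # global style: no grid
grid_outside_before = mpl.rcParams["axes.grid"]

with sns.axes_style("darkgrid"):          # temporary style: grid on
    grid_inside = mpl.rcParams["axes.grid"]

grid_outside_after = mpl.rcParams["axes.grid"]
print(grid_outside_before, grid_inside, grid_outside_after)
```

The temporary style applies only while the block is active; the global style is restored on exit.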
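Setting the plot layout (context):
import seaborn as sns
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
sns.set()  # apply seaborn defaults first, so they do not reset the context set below
sns.set_context("paper")  # there are other contexts besides paper; see help(sns.set_context)
plt.figure(figsize=(8, 6))  # figure size
x = np.linspace(0, 14, 100)  # endpoint and sample count were lost in the source; 14 and 100 are assumed
for i in range(1, 7):
    plt.plot(x, np.sin(x + i * .5) * (7 - i))
plt.show()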
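Besides "paper", seaborn provides the contexts "notebook", "talk" and "poster". A quick way to see how they differ is to compare the parameter dictionaries returned by sns.plotting_context(); the sketch below looks only at the base font size, chosen here as a representative parameter.

```python
import seaborn as sns

contexts = ["paper", "notebook", "talk", "poster"]
# Base font size that each context would apply.
font_sizes = {name: sns.plotting_context(name)["font.size"] for name in contexts}
for name in contexts:
    print(name, font_sizes[name])
```

The sizes grow from paper to poster, which is why the same code produces larger labels and thicker lines in the "talk" and "poster" contexts.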
2. Color palette
- Color is important.
- color_palette() accepts any color supported by matplotlib
- color_palette() called without arguments returns the default color cycle
- set_palette() sets the palette for all plots
6 Default Color Cycle themes: deep, muted, pastel, bright, dark, colorblind
Circular color palettes
When you have more than six categories to distinguish, the simplest approach is to draw evenly spaced colors around a circular color space (the hue changes while lightness and saturation stay constant). This is what most seaborn functions fall back to when they need more colors than the current default color cycle provides.
The most common way to do this is to use the hls color space, which is a simple transformation of RGB values.
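The idea of evenly spaced hues can be sketched with the standard-library colorsys module. The helper evenly_spaced_hls below is hypothetical and is not seaborn's implementation; the default lightness and saturation values are assumptions chosen for illustration.

```python
import colorsys

def evenly_spaced_hls(n_colors, lightness=0.6, saturation=0.65):
    """Return n_colors RGB tuples with evenly spaced hues (illustrative helper)."""
    hues = [i / n_colors for i in range(n_colors)]
    # Only the hue varies; lightness and saturation are held constant.
    return [colorsys.hls_to_rgb(h, lightness, saturation) for h in hues]

palette = evenly_spaced_hls(8)
for rgb in palette:
    print(tuple(round(c, 3) for c in rgb))
```

Because only the hue changes, every color in the list has the same lightness and saturation, which keeps any one category from standing out by accident.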
import seaborn as sns
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
sns.palplot(sns.color_palette("hls", 8))
plt.show()
Output:
import seaborn as sns
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
data = np.random.normal(size=(20, 8)) + np.arange(8) / 2  # the row count was lost in the source; 20 is assumed
sns.boxplot(data=data, palette=sns.color_palette("hls", 8))
plt.show()
The hls_palette() function controls the lightness and saturation of the colors:
- l: lightness
- s: saturation
sns.palplot(sns.hls_palette(8, l=.7, s=.9))
Using XKCD to name colors
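As a minimal sketch of what this looks like: seaborn exposes the xkcd color survey names through the sns.xkcd_rgb dictionary and the sns.xkcd_palette() function. The particular color names below are only examples.

```python
import seaborn as sns

# Names come from the xkcd color survey; these particular names are examples.
color_names = ["windows blue", "amber", "greyish"]
palette = sns.xkcd_palette(color_names)   # list of RGB tuples
single_hex = sns.xkcd_rgb["pale red"]     # hex string for one named color
print(palette)
print(single_hex)
```

The resulting palette can be passed anywhere a palette is accepted, for example to sns.palplot() or to the palette parameter of a plotting function.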
Sequential color palettes
The color changes with the data, for example getting darker as the values become more important.
sns.palplot(sns.color_palette("Blues"))
Output:
If you want to reverse a gradient, add the _r suffix to the palette name:
sns.palplot(sns.color_palette("BuGn_r"))
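The effect of the _r suffix can be checked numerically by comparing a palette with its reversed counterpart. In this sketch the sum of the RGB channels is used as a rough brightness measure, which is an assumption made for illustration.

```python
import numpy as np
import seaborn as sns

forward = np.array(sns.color_palette("BuGn", 6))
flipped = np.array(sns.color_palette("BuGn_r", 6))

# Sum of RGB channels as a rough brightness measure.
print("BuGn   first/last brightness:", forward[0].sum(), forward[-1].sum())
print("BuGn_r first/last brightness:", flipped[0].sum(), flipped[-1].sum())
```

"BuGn" runs from light to dark, and "BuGn_r" gives the same colors in the opposite order.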
Linear hue transformation (saturation and brightness) with cubehelix_palette():
sns.palplot(sns.cubehelix_palette(8, start=.75, rot=-.150))
light_palette() and dark_palette() build custom sequential palettes:
sns.palplot(sns.light_palette("green"))
The palette above runs from light to dark.
The one below runs from dark to light:
sns.palplot(sns.light_palette("navy", reverse=True))
x, y = np.random.multivariate_normal([0, 0], [[1, -.5], [-.5, 1]], size=300).T
pal = sns.dark_palette("green", as_cmap=True)
sns.kdeplot(x, y, cmap=pal)
Output:
3. Univariate analysis plots
%matplotlib inline
import numpy as np
import pandas as pd
from scipy import stats, integrate
import matplotlib.pyplot as plt
import seaborn as sns
sns.set(color_codes=True)
np.random.seed(sum(map(ord, "distributions")))
First import the libraries and seed the random number generator; the examples below use Gaussian-distributed data.
Then draw a histogram:
x = np.random.normal(size=100)
sns.distplot(x, kde=False)  # distplot is deprecated in newer seaborn releases; histplot is its replacement
sns.distplot(x, bins=20, kde=False)  # bins sets the number of histogram bins
To fit a distribution to the data and draw it over the histogram:
x = np.random.gamma(6, size=200)
sns.distplot(x, kde=False, fit=stats.gamma)
Generate data from a mean and a covariance matrix:
mean, cov = [0, 1], [(1, .5), (.5, 1)]  # mean is the mean vector, cov the covariance matrix
data = np.random.multivariate_normal(mean, cov, 200)  # generate 200 samples
df = pd.DataFrame(data, columns=["x", "y"])  # the data type is a pandas DataFrame
df  # display
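A quick sanity check on generated data of this kind is to compare the sample mean and covariance with the parameters that were passed in. The sketch below uses a seeded NumPy Generator instead of the global seed; the seed value 0 is an arbitrary assumption.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)            # seeded generator (the seed is arbitrary)
mean, cov = [0, 1], [(1, .5), (.5, 1)]
data = rng.multivariate_normal(mean, cov, 200)
df = pd.DataFrame(data, columns=["x", "y"])

print(df.mean())   # should be close to the mean vector above
print(df.cov())    # should be close to the covariance matrix above
```

With only 200 samples the estimates will not match exactly, but they should land near the requested values.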
Observe the joint distribution of two variables (scatter plot):
sns.jointplot(x="x", y="y", data=df)
Output:
If there is so much data that the points become too dense, a hexbin plot shows the distribution more clearly:
x, y = np.random.multivariate_normal(mean, cov, 1000).T
with sns.axes_style("white"):  # set the plot style
    sns.jointplot(x=x, y=y, kind="hex", color="k")  # kind="hex"
4. Multivariate analysis plots
iris = sns.load_dataset("iris")  # load the data
sns.pairplot(iris)
Output:
The data set has four numeric features. Each diagonal cell involves a single feature, so it shows that feature's histogram; each off-diagonal cell is a scatter plot of two features.
regplot() and lmplot() can both draw regression relationships; regplot() is recommended.
sns.regplot(x="total_bill", y="tip", data=tips)
Output:
If a variable takes only integer values, a plain regression plot is not well suited to it, for example:
sns.regplot(data=tips, x="size", y="tip")
Output:
We can add a small amount of random jitter to it:
sns.regplot(x="size", y="tip", data=tips, x_jitter=.05)
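The idea behind jitter can be sketched in plain NumPy: a small uniform random offset is added to each integer value so that points sharing a value no longer sit on the same vertical line. This illustrates the concept only and is not a copy of seaborn's code; the sample values are made up.

```python
import numpy as np

rng = np.random.default_rng(0)
size = np.array([1, 2, 2, 3, 4, 4, 4, 6])   # integer-valued variable (toy values)
jitter = 0.05
# Shift every point by a random amount in [-jitter, jitter).
jittered = size + rng.uniform(-jitter, jitter, len(size))
print(jittered)
```

The jitter affects only where the points are drawn; it should be kept small relative to the spacing of the values.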
Output:
Outliers
Violin plot
Import the libraries and data first:
import numpy as np
import pandas as pd
import matplotlib as mpl
import matplotlib.pyplot as plt
import seaborn as sns
sns.set(style="whitegrid", color_codes=True)
np.random.seed(sum(map(ord, "categorical")))
titanic = sns.load_dataset("titanic")
tips = sns.load_dataset("tips")
iris = sns.load_dataset("iris")
A basic plot:
sns.stripplot(x="day", y="total_bill", data=tips)
Output:
Points can overlap here, which makes the data harder to read.
You can add jitter:
sns.stripplot(x="day", y="total_bill", data=tips, jitter=True)  # jitter=True
Output:
This is still not ideal, so we can also use a swarm plot:
sns.swarmplot(x="day", y="total_bill", data=tips)
The points are now spread out evenly to the left and right.
You can also add a categorical feature to the plot with hue:
sns.swarmplot(x="day", y="total_bill", hue="sex", data=tips)
Output:
Box plot
- IQR is the interquartile range from statistics: the distance between the first quartile (Q1) and the third quartile (Q3)
- N = 1.5 × IQR; a value is an outlier if it is > Q3 + N or < Q1 - N
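The rule in the two points above can be worked through on a small array with NumPy. The values are made up for illustration.

```python
import numpy as np

values = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])  # toy data with one extreme value
q1, q3 = np.percentile(values, [25, 75])
iqr = q3 - q1
n = 1.5 * iqr
# A value is an outlier if it lies above Q3 + N or below Q1 - N.
outliers = values[(values > q3 + n) | (values < q1 - n)]
print(q1, q3, iqr, outliers)
```

Here Q1 = 3.25 and Q3 = 7.75, so IQR = 4.5 and N = 6.75; only the value 100 lies beyond Q3 + N = 14.5.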
sns.boxplot(x="day", y="total_bill", hue="time", data=tips)
Output:
The points drawn above the whiskers are the outliers.
Violin plot (shows the distribution):
sns.violinplot(x="total_bill", y="day", hue="time", data=tips)
Output:
Split by time, the plot is neither intuitive nor easy to read, so we can do this instead:
sns.violinplot(x="day", y="total_bill", hue="sex", data=tips, split=True)
Setting split=True draws the two hue levels as the two halves of a single violin, which is easier on the eye.
The central tendency of the values can be shown with a bar plot:
sns.barplot(x="sex", y="survived", hue="class", data=titanic)
A point plot describes the differences in how the values change more clearly:
sns.pointplot(x="sex", y="survived", hue="class", data=titanic)  # hue is the grouping variable
You can also set some parameters to make the point plot look better:
sns.pointplot(x="class", y="survived", hue="sex", data=titanic, palette={"male": "g", "female": "m"}, markers=["^", "o"], linestyles=["-", "--"])
Output:
Wide-form data
sns.boxplot(data=iris, orient="h")
orient="h" draws the plot horizontally.
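"Wide-form" means one column per variable, which is why the call above needs no x or y. For comparison, the sketch below converts a tiny wide table into the long form (one row per observation) with pandas; the numbers and the column names "measurement" and "value" are made up for illustration.

```python
import pandas as pd

# Wide form: one column per variable (toy values).
wide = pd.DataFrame({
    "sepal_length": [5.1, 4.9, 4.7],
    "sepal_width": [3.5, 3.0, 3.2],
})
# Long form: one row per (variable, value) pair.
long = wide.melt(var_name="measurement", value_name="value")
print(wide)
print(long)
```

The long form is the layout the earlier examples use, where x, y and hue name columns of the data.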
Multi-panel categorical plots
factorplot() combines the previous plot types; the type of plot is passed as a parameter. In seaborn 0.9 and later this function is named catplot().
sns.factorplot(x="day", y="total_bill", hue="smoker", data=tips)
sns.factorplot(x="day", y="total_bill", hue="smoker", data=tips, kind="bar")  # kind is the plot type
sns.factorplot(x="day", y="total_bill", hue="smoker",
Output:
sns.factorplot(x="time", y="total_bill", hue="smoker", col="day", data=tips, kind="box", size=4, aspect=.5)  # set the panel height and aspect ratio
About factorplot
seaborn.factorplot(x=None, y=None, hue=None, data=None, row=None, col=None, col_wrap=None, estimator=<function mean>, ci=95, n_boot=1000, units=None, order=None, hue_order=None, row_order=None, col_order=None, kind='point', size=4, aspect=1, orient=None, color=None, palette=None, legend=True, legend_out=True, sharex=True, sharey=True, margin_titles=False, facet_kws=None, **kwargs)
Parameters:
- x, y, hue: names of variables in the data set
- data: the data set
- row, col: additional categorical variables used to split the plot into panels
- col_wrap: maximum number of panels per row (integer)
- estimator: function that maps the vector of values in each category to a scalar
- ci: confidence interval (floating point number) or None
- n_boot: number of bootstrap iterations used when computing confidence intervals (integer)
- units: identifier of the sampling units, used for multilevel bootstrap and repeated-measures designs (variable name or vector data)
- order, hue_order: order of the category levels (lists of strings)
- row_order, col_order: order of the panel rows and columns (lists of strings)
- kind: point (default), bar, count, box, violin, strip, swarm
- size: height of each panel in inches (scalar)
- aspect: aspect ratio of each panel (scalar)
- orient: orientation, "v" or "h"
- color: a matplotlib color
- palette: a seaborn palette or a dictionary
- legend: whether to draw a legend for hue (True/False)
- legend_out: whether to extend the figure and draw the legend to the right of the plot (True/False)
- share{x,y}: whether the panels share axes (True/False)
5. Using FacetGrid and plotting multiple variables
Import first:
import numpy as np
import pandas as pd
import seaborn as sns
from scipy import stats
import matplotlib as mpl
import matplotlib.pyplot as plt
sns.set(style="ticks")
np.random.seed(sum(map(ord, "axis_grids")))
First look at the data:
tips = sns.load_dataset("tips")
tips.head()
Instantiate the grid first:
g = sns.FacetGrid(tips, col="time")
g = sns.FacetGrid(tips, col="time")
g.map(plt.hist, "tip")  # histogram, with tip on the x-axis
g = sns.FacetGrid(tips, col="sex", hue="smoker")
g.map(plt.scatter, "total_bill", "tip", alpha=.7)  # alpha is the transparency
g.add_legend()  # add a legend (on the far right)
g = sns.FacetGrid(tips, row="smoker", col="time", margin_titles=True)
g.map(sns.regplot, "size", "total_bill", color=".1", fit_reg=False, x_jitter=.1)  # fit_reg=False: do not draw the regression line; x_jitter: amount of jitter
g = sns.FacetGrid(tips, col="day", size=4, aspect=.5)  # panel height and aspect ratio (size is named height in seaborn 0.9 and later)
g.map(sns.barplot, "sex", "total_bill")  # x first, then y
To specify the order of the panels:
from pandas import Categorical
ordered_days = tips.day.value_counts().index
print(ordered_days)  # CategoricalIndex(['Sat', 'Sun', 'Thur', 'Fri'], ...)
ordered_days = Categorical(['Thur', 'Fri', 'Sat', 'Sun'])  # set the order
g = sns.FacetGrid(tips, row="day", row_order=ordered_days, size=1.7, aspect=4)
g.map(sns.boxplot, "total_bill")
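Weekday names sort alphabetically by default, which is rarely the order you want on a plot. As a small pandas-only sketch of imposing a custom order (the series below is made up):

```python
import pandas as pd

days = pd.Series(["Sat", "Thur", "Sun", "Fri", "Sat", "Sun"])  # toy values
day_order = ["Thur", "Fri", "Sat", "Sun"]
# An ordered Categorical sorts by the given category order, not alphabetically.
ordered = pd.Categorical(days, categories=day_order, ordered=True)
sorted_days = pd.Series(ordered).sort_values().tolist()
print(sorted_days)
```

The same list of day names is what gets passed as row_order in the FacetGrid call above.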
pal = dict(Lunch="seagreen", Dinner="gray")
g = sns.FacetGrid(tips, hue="time", palette=pal, size=5)  # palette sets the colors
g.map(plt.scatter, "total_bill", "tip", s=50, alpha=.7, linewidth=.5, edgecolor="white")  # s is the marker size
g.add_legend()
g = sns.FacetGrid(tips, hue="sex", palette="Set1", size=5, hue_kws={"marker": ["^", "v"]})  # marker shapes
g.map(plt.scatter, "total_bill", "tip", s=100, linewidth=.5, edgecolor="white")
g.add_legend()
with sns.axes_style("white"):  # set the style
    g = sns.FacetGrid(tips, row="sex", col="smoker", margin_titles=True, size=2.5)
g.map(plt.scatter, "total_bill", "tip", color="#334488", edgecolor="white", lw=.5)
g.set_axis_labels("Total bill (US Dollars)", "Tip")  # x-axis and y-axis labels
g.set(xticks=[10, 30, 50], yticks=[2, 6, 10])  # tick values to show; 30, 50 and 10 were lost in the source and are assumed
g.fig.subplots_adjust(wspace=.02, hspace=.02)  # spacing between subplots
iris = sns.load_dataset("iris")
g = sns.PairGrid(iris)  # plot multiple variables
g.map(plt.scatter)
g = sns.PairGrid(iris)
g.map_diag(plt.hist)  # plot type for the diagonal
g.map_offdiag(plt.scatter)  # plot type for the off-diagonal cells
g = sns.PairGrid(iris, hue="species")
g.map_diag(plt.hist)
g.map_offdiag(plt.scatter)
g.add_legend()
If you don't want to plot all the features, you can select the ones you need:
g = sns.PairGrid(iris, vars=["sepal_length", "sepal_width"], hue="species")  # select the features to plot
g.map(plt.scatter)
g = sns.PairGrid(tips, hue="size", palette="GnBu_d")  # use a gradient palette
g.map(plt.scatter, s=50, edgecolor="white")
g.add_legend()
6. Heat maps
Import the libraries first:
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np; np.random.seed(0)
import seaborn as sns; sns.set()
Generate some random data:
uniform_data = np.random.rand(3, 3)
# Example values:
# [[0.0187898  0.6176355  0.61209572]
#  [0.616934   0.94374808 0.6818203 ]
#  [0.3595079  0.43703195 0.6976312 ]]
heatmap = sns.heatmap(uniform_data)
Output:
ax = sns.heatmap(uniform_data, vmin=0.2, vmax=0.5)  # set the lower and upper limits of the color scale
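One way to confirm that vmin and vmax took effect is to read the color limits back from the drawn mesh. This is a minimal sketch with its own random data; the seed and the non-interactive backend are assumptions made so that it runs anywhere.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import numpy as np
import seaborn as sns

rng = np.random.default_rng(0)
uniform_data = rng.random((3, 3))
ax = sns.heatmap(uniform_data, vmin=0.2, vmax=0.5)
# The heat map is the first collection on the axes; get_clim() returns its color limits.
color_limits = ax.collections[0].get_clim()
print(color_limits)
```

Values below vmin or above vmax are drawn with the colors at the two ends of the color scale.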
normal_data = np.random.randn(3, 3)  # these random numbers include negative values
print(normal_data)
ax = sns.heatmap(normal_data, center=0)  # center the color scale at 0
flights = sns.load_dataset("flights")  # data provided by the library
flights.head()
flights = flights.pivot(index="month", columns="year", values="passengers")
print(flights)
ax = sns.heatmap(flights)
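The pivot step turns a long table (one row per month and year) into the two-dimensional grid that heatmap() expects. It can be illustrated with a handful of rows; the small table below is a stand-in built by hand, not the full flights data set.

```python
import pandas as pd

# Long form: one row per (year, month) observation (toy rows).
long = pd.DataFrame({
    "year": [1949, 1949, 1950, 1950],
    "month": ["Jan", "Feb", "Jan", "Feb"],
    "passengers": [112, 118, 115, 126],
})
# Wide form: months as rows, years as columns, passenger counts as cell values.
wide = long.pivot(index="month", columns="year", values="passengers")
print(wide)
```

Each cell of the resulting table becomes one colored cell of the heat map.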
Output:
If you want the numbers to be shown in the cells:
ax = sns.heatmap(flights, annot=True, fmt="d")  # annot shows the numbers; fmt sets the number format
Separate the cells so the data stand out more clearly:
ax = sns.heatmap(flights, linewidths=.5)  # width of the lines between cells
Custom colors:
ax = sns.heatmap(flights, cmap="YlGnBu")