The road to machine learning--seaborn

Source: Internet
Author: User

Seaborn is a high-level plotting library built on top of Matplotlib (plt). It has very strong plotting capabilities.

1. Plot styles and detail settings

Drawing with Matplotlib:

import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt

x = np.linspace(0, 14, 100)    # 14 and 100 are assumed: the original values were lost in extraction
for i in range(1, 7):
    plt.plot(x, np.sin(x + i * .5) * (7 - i))
plt.show()

Output:

Seaborn's default style:

import seaborn as sns
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt

sns.set()    # apply the seaborn defaults before any axes are created
x = np.linspace(0, 14, 100)    # 14 and 100 are assumed: the original values were lost in extraction
for i in range(1, 7):
    plt.plot(x, np.sin(x + i * .5) * (7 - i))
plt.show()

Output:

Seaborn has five built-in styles:

    • darkgrid
    • whitegrid
    • dark
    • white
    • ticks
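
One way to see what separates the five styles is to look at the parameter dictionaries that sns.axes_style() returns. The sketch below only inspects the grid setting; grid_on is a hypothetical variable name chosen for illustration.

```python
import seaborn as sns

style_names = ["darkgrid", "whitegrid", "dark", "white", "ticks"]

# axes_style(name) returns a dict of Matplotlib rc parameters for that style
# without changing the global settings.
grid_on = {name: sns.axes_style(name)["axes.grid"] for name in style_names}

for name, has_grid in grid_on.items():
    print(f"{name:10s} grid lines: {has_grid}")
```

The two "grid" styles draw grid lines and the other three do not; the remaining differences are mostly background color and tick marks.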

The following shows one commonly used style; try the others yourself.

whitegrid

import seaborn as sns
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt

sns.set_style("whitegrid")    # set the style
# Create the data (the 20-row shape is assumed; the source shows only size=(6))
data = np.random.normal(size=(20, 6)) + np.arange(6) / 2
sns.boxplot(data=data)        # draw a box plot
plt.show()

Output:

This style makes the data values and how they relate easy to read while staying simple, so it is a good default.

Offset the axes from the plot:

# f, ax = plt.subplots()
sns.violinplot(data=data)
sns.despine(offset=10)

offset is the distance, in points, by which the axis spines are moved away from the plot.

Hide the left axis spine:

sns.despine(left=True)
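
The effect of despine() can be checked by looking at which spines are still visible afterwards. This is a minimal sketch on made-up data; the Agg backend is an assumption so that it runs without a display, and visible is a hypothetical variable name.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

rng = np.random.default_rng(0)
data = rng.normal(size=(20, 6)) + np.arange(6) / 2   # made-up data

fig, ax = plt.subplots()
sns.boxplot(data=data, ax=ax)

# By default despine() removes the top and right spines; left=True also
# removes the left one, and offset moves the remaining spines outward.
sns.despine(ax=ax, offset=10, left=True)

visible = {side: ax.spines[side].get_visible()
           for side in ("left", "bottom", "top", "right")}
print(visible)
```

Only the bottom spine remains visible here.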

Use two styles in the same script:

import seaborn as sns
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt

sns.set_style("whitegrid")
# The 20-row shape is assumed; the source shows only size=(6)
data = np.random.normal(size=(20, 6)) + np.arange(6) / 2
with sns.axes_style("darkgrid"):
    sns.boxplot(data=data)
    plt.show()
sns.boxplot(data=data)
plt.show()

Plots created inside the with block use one style; plots created outside it use the other.
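
That the style set by axes_style() is only temporary can be seen by reading one rc parameter before, inside, and after the with block. A minimal sketch, with the Agg backend assumed so it runs without a display:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend
import matplotlib.pyplot as plt
import seaborn as sns

sns.set_style("white")                   # global style: no grid
before = plt.rcParams["axes.grid"]

with sns.axes_style("darkgrid"):         # temporary style: grid on
    inside = plt.rcParams["axes.grid"]

after = plt.rcParams["axes.grid"]        # the global style is back in effect
print(before, inside, after)
```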

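Setting the plotting context:

import seaborn as sns
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt

sns.set()                     # call this first: sns.set() would reset a context chosen earlier
sns.set_context("paper")      # other contexts besides "paper" exist; see help(sns.set_context)
plt.figure(figsize=(8, 6))    # figure size
x = np.linspace(0, 14, 100)   # 14 and 100 are assumed: the original values were lost in extraction
for i in range(1, 7):
    plt.plot(x, np.sin(x + i * .5) * (7 - i))
plt.show()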
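
The contexts differ mainly in how large text and lines are drawn. As a rough illustration, sns.plotting_context() returns the parameters for a context without applying them, so the font sizes can be compared directly; font_sizes is a hypothetical variable name.

```python
import seaborn as sns

contexts = ["paper", "notebook", "talk", "poster"]

# plotting_context(name) returns the scaled rc parameters for that context
# without changing the global settings.
font_sizes = {name: sns.plotting_context(name)["font.size"] for name in contexts}
print(font_sizes)
```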
   

2. Color palette
    • Color is important.
    • color_palette() accepts any color supported by Matplotlib.
    • color_palette() called with no arguments returns the current default colors.
    • set_palette() sets the colors for all plots.

The six default color cycle themes are deep, muted, pastel, bright, dark and colorblind.

Circular color palettes

When you have more than six categories to distinguish, the simplest approach is to pick evenly spaced colors around a circular color space, so that the hue changes while lightness and saturation stay constant. This is what most functions fall back to when they need more colors than the current default color cycle provides.

The most common way to do this is to use the HLS color space, which is a simple transformation of RGB values.
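
The idea of evenly spaced hues can be sketched with Python's standard colorsys module. This is not seaborn's implementation; hls_circle is a hypothetical helper, and the default lightness and saturation values are chosen only for illustration.

```python
import colorsys

def hls_circle(n_colors, lightness=0.6, saturation=0.65):
    """Return n_colors RGB tuples with evenly spaced hues (hypothetical helper)."""
    palette = []
    for k in range(n_colors):
        hue = k / n_colors  # evenly spaced positions around the hue circle
        palette.append(colorsys.hls_to_rgb(hue, lightness, saturation))
    return palette

palette = hls_circle(8)
for rgb in palette:
    print(tuple(round(c, 3) for c in rgb))
```

Every color in the result has the same lightness and saturation; only the hue differs.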

import seaborn as sns
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt

sns.palplot(sns.color_palette("hls", 8))
plt.show()

Output:

import seaborn as sns
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt

# The 20-row shape is assumed; the source shows only size=(8)
data = np.random.normal(size=(20, 8)) + np.arange(8) / 2
sns.boxplot(data=data, palette=sns.color_palette("hls", 8))
plt.show()

The hls_palette() function controls the lightness and saturation of the colors:

    • l: lightness
    • s: saturation
sns.palplot(sns.hls_palette(8, l=.7, s=.9))

Using XKCD to name colors
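
As a small illustration of named colors, seaborn exposes the xkcd color survey names through the xkcd_rgb dictionary and the xkcd_palette() function. The particular color names below are just examples.

```python
import seaborn as sns

# xkcd_rgb maps xkcd color names to hex strings.
print(sns.xkcd_rgb["pale red"])

# xkcd_palette() builds a palette from a list of such names.
colors = ["windows blue", "amber", "greyish", "faded green", "dusty purple"]
palette = sns.xkcd_palette(colors)
print(palette)
```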

Sequential color palettes

The color changes with the data; for example, the more important the value, the darker the color.

sns.palplot(sns.color_palette("Blues"))

Output:

To reverse a gradient, add the _r suffix to the palette name:

sns.palplot(sns.color_palette("BuGn_r"))
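
The effect of the _r suffix can be checked numerically by comparing the first and last color of a palette and of its reversed variant. brightness is a hypothetical helper that uses the mean of the RGB channels as a crude measure.

```python
import seaborn as sns

def brightness(rgb):
    """Mean of the R, G and B channels (hypothetical helper, crude measure)."""
    return sum(rgb) / 3

blues = sns.color_palette("Blues", 6)
blues_r = sns.color_palette("Blues_r", 6)

print([round(brightness(c), 2) for c in blues])    # starts light, ends dark
print([round(brightness(c), 2) for c in blues_r])  # starts dark, ends light
```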

Palettes with a linear change in hue and brightness, using cubehelix_palette():

sns.palplot(sns.cubehelix_palette(8, start=.75, rot=-.150))

light_palette() and dark_palette() build custom sequential palettes from a single color:

sns.palplot(sns.light_palette("green"))

The palette above goes from light to dark.

The one below is reversed, going from dark to light:

sns.palplot(sns.light_palette("navy", reverse=True))

A palette can also be returned as a colormap with as_cmap=True:

x, y = np.random.multivariate_normal([0, 0], [[1, -.5], [-.5, 1]], size=300).T
pal = sns.dark_palette("green", as_cmap=True)
sns.kdeplot(x=x, y=y, cmap=pal)    # seaborn >= 0.11; older versions take x and y positionally

Output:

3. Univariate analysis plots

%matplotlib inline
import numpy as np
import pandas as pd
from scipy import stats, integrate
import matplotlib.pyplot as plt
import seaborn as sns

sns.set(color_codes=True)
np.random.seed(sum(map(ord, "distributions")))

First import the libraries and seed the random number generator; the data below is drawn from a Gaussian distribution.

Then draw a histogram:

# Note: distplot() is deprecated since seaborn 0.11 in favor of histplot() and displot()
x = np.random.normal(size=100)
sns.distplot(x, kde=False)

sns.distplot(x, bins=20, kde=False)    # bins sets the number of histogram bins
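
What an integer bins value means can be shown with NumPy alone: the range of the data is split into that many equal-width intervals. A minimal sketch on made-up data, assuming the histogram is computed the same way np.histogram does it:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=100)   # made-up data

# bins=20 gives 20 equal-width bins between min(x) and max(x),
# so there are 21 bin edges.
counts, edges = np.histogram(x, bins=20)
print(len(counts), len(edges))
print(counts.sum())
```

More bins give narrower bars, so bins controls the bar width only indirectly.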

To overlay a fitted parametric distribution on the data:

x = np.random.gamma(6, size=200)
sns.distplot(x, kde=False, fit=stats.gamma)

Generate data from a mean vector and a covariance matrix:

mean, cov = [0, 1], [(1, .5), (.5, 1)]    # mean is the mean vector, cov the covariance matrix
data = np.random.multivariate_normal(mean, cov, 200)    # generate 200 samples
df = pd.DataFrame(data, columns=["x", "y"])    # a pandas DataFrame
df    # display it
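
As a sanity check on what mean and cov control, the sample statistics of a large draw should land close to the values passed in. The sample size of 20000 is an assumption made only so that the estimates are tight.

```python
import numpy as np

mean, cov = [0, 1], [(1, .5), (.5, 1)]

rng = np.random.default_rng(0)
sample = rng.multivariate_normal(mean, cov, size=20000)  # large sample, only for the check

print(sample.mean(axis=0))             # close to [0, 1]
print(np.cov(sample, rowvar=False))    # close to [[1, .5], [.5, 1]]
```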

Look at the joint distribution of the two variables (scatter plot):

sns.jointplot(x="x", y="y", data=df)

Output:

With a lot of data the points become too dense to read; a hexbin plot shows the distribution better:

x, y = np.random.multivariate_normal(mean, cov, 1000).T
with sns.axes_style("white"):    # set the plot style
    sns.jointplot(x=x, y=y, kind="hex", color="k")    # kind="hex"

4. Multivariate analysis plots

iris = sns.load_dataset("iris")    # load the data
sns.pairplot(iris)

Output:

There are four variables. Each diagonal panel involves a single variable, so it shows that variable's histogram; each off-diagonal panel is a scatter plot of a pair of variables.

regplot() and lmplot() can both draw regression relationships; regplot() is recommended.

tips = sns.load_dataset("tips")    # load the tips dataset used below
sns.regplot(x="total_bill", y="tip", data=tips)

Output:

When a variable takes only integer values, a plain regression plot is hard to read, for example:

sns.regplot(data=tips, x="size", y="tip")

Output:

We can add a small amount of random jitter to it:

sns.regplot(x="size", y="tip", data=tips, x_jitter=.05)
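
What the jitter does can be sketched with NumPy: uniform noise of the given size is added to the x values so that points sharing an integer value no longer sit on top of each other. The data below is made up, and the sketch is an illustration of the idea rather than seaborn's own code. According to the seaborn documentation, the noise only affects how the scatter points are drawn, not the fitted regression.

```python
import numpy as np

rng = np.random.default_rng(0)
size = np.array([1, 2, 2, 3, 4, 4, 4, 6])   # integer-valued x variable (made-up data)

x_jitter = 0.05
# Add uniform noise in [-x_jitter, x_jitter] to every x value.
jittered = size + rng.uniform(-x_jitter, x_jitter, len(size))

print(jittered)
```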

Output:

Outliers

Violin plots

First import the libraries and load the data:

import numpy as np
import pandas as pd
import matplotlib as mpl
import matplotlib.pyplot as plt
import seaborn as sns

sns.set(style="whitegrid", color_codes=True)
np.random.seed(sum(map(ord, "categorical")))
titanic = sns.load_dataset("titanic")
tips = sns.load_dataset("tips")
iris = sns.load_dataset("iris")

A basic strip plot:

sns.stripplot(x="day", y="total_bill", data=tips)

Output:

Points can overlap here, which makes the plot harder to read.

We can add jitter:

sns.stripplot(x="day", y="total_bill", data=tips, jitter=True)    # jitter=True

Output:

That is still not ideal, so we can use a swarm plot instead:

sns.swarmplot(x="day", y="total_bill", data=tips)

The points in the resulting plot are spread evenly to the left and right:

You can also color the points by a categorical variable:

sns.swarmplot(x="day", y="total_bill", hue="sex", data=tips)

Output:

Box plot
    • IQR is the interquartile range: the distance between the first quartile (Q1) and the third quartile (Q3).
    • Let N = 1.5 × IQR. A value greater than Q3 + N or less than Q1 - N is treated as an outlier.
sns.boxplot(x="day", y="total_bill", hue="time", data=tips)
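
The outlier rule above can be worked through by hand with NumPy. The data is made up, and the quartiles are computed with np.percentile using its default linear interpolation.

```python
import numpy as np

values = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])   # made-up data

q1, q3 = np.percentile(values, [25, 75])
iqr = q3 - q1
n = 1.5 * iqr

# Anything above Q3 + N or below Q1 - N counts as an outlier.
outliers = values[(values > q3 + n) | (values < q1 - n)]
print(q1, q3, iqr)     # 3.25 7.75 4.5
print(outliers)        # [100]
```

Here N = 6.75, so the limits are -3.5 and 14.5, and only the value 100 falls outside them.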

Output:

The points drawn beyond the whiskers are the outliers.

Violin plot (shows the distribution):

sns.violinplot(x="total_bill", y="day", hue="time", data=tips)

Output:

Split by time like this, the plot is neither intuitive nor easy to read, so we can instead write:

sns.violinplot(x="day", y="total_bill", hue="sex", data=tips, split=True)

Setting split=True draws the two hue levels as the two halves of each violin, which is easier on the eye.

The central tendency of the values can be shown with a bar plot:

sns.barplot(x="sex", y="survived", hue="class", data=titanic)

A point plot is better at showing how the values change between categories:

sns.pointplot(x="sex", y="survived", hue="class", data=titanic)    # hue selects the grouping variable

A point plot can also be made more readable by setting a few parameters:

sns.pointplot(x="class", y="survived", hue="sex", data=titanic,
              palette={"male": "g", "female": "m"},
              markers=["^", "o"], linestyles=["-", "--"])

Output:

Wide-form data

sns.boxplot(data=iris, orient="h")

orient="h" draws the boxes horizontally.
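
"Wide-form" means one column per variable, as opposed to "long-form" data with one row per observation. The difference can be illustrated with pandas on a tiny made-up table; the column names measurement and cm are hypothetical.

```python
import pandas as pd

# Wide form: one column per variable (made-up values).
wide = pd.DataFrame({
    "sepal_length": [5.1, 4.9, 4.7],
    "sepal_width": [3.5, 3.0, 3.2],
})

# Long form: one row per observation, with the variable name in its own column.
long = wide.melt(var_name="measurement", value_name="cm")
print(long)
```

With wide-form data only data= needs to be passed and each numeric column gets its own box; with long-form data the columns are named through x= and y=.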

Multi-panel categorical plots

factorplot() combines the plot types above; the kind of plot is passed as a parameter. (In seaborn 0.9 and later this function is named catplot(), and its size parameter is named height.)

sns.factorplot(x="day", y="total_bill", hue="smoker", data=tips)
sns.factorplot(x="day", y="total_bill", hue="smoker", data=tips, kind="bar")    # kind sets the plot type
# sns.factorplot(x="day", y="total_bill", hue="smoker", ...    (the rest of this call is cut off in the source)

Output:

sns.factorplot(x="time", y="total_bill", hue="smoker",
               col="day", data=tips, kind="box", size=4, aspect=.5)    # set the panel height and aspect ratio

About factorplot

seaborn.factorplot(x=None, y=None, hue=None, data=None, row=None, col=None, col_wrap=None, estimator=<function mean>, ci=95, n_boot=1000, units=None, order=None, hue_order=None, row_order=None, col_order=None, kind='point', size=4, aspect=1, orient=None, color=None, palette=None, legend=True, legend_out=True, sharex=True, sharey=True, margin_titles=False, facet_kws=None, **kwargs)

Parameters:

    • x, y, hue: names of variables in the dataset
    • data: the dataset (a DataFrame)
    • row, col: additional categorical variables used to tile the panels (variable names)
    • col_wrap: maximum number of panels per row (integer)
    • estimator: function mapping a vector to a scalar, applied within each category
    • ci: size of the confidence interval (float or None)
    • n_boot: number of bootstrap iterations used when computing confidence intervals (integer)
    • units: identifier of the sampling units, used for a multilevel bootstrap and for repeated-measures designs (variable name or vector data)
    • order, hue_order: order of the categorical levels (lists of strings)
    • row_order, col_order: order of the panel rows and columns (lists of strings)
    • kind: plot type: point (default), bar, count, box, violin, strip or swarm
    • size: height of each panel in inches (scalar)
    • aspect: aspect ratio of each panel (scalar)
    • orient: orientation, "v" or "h"
    • color: a Matplotlib color
    • palette: a seaborn color palette or a dictionary
    • legend: whether to draw a legend for hue (True/False)
    • legend_out: whether to extend the figure and draw the legend outside the plot, at the center right (True/False)
    • share{x,y}: whether the panels share axes (True/False)
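
For readers on a recent seaborn, the same kind of multi-panel plot can be sketched with catplot(), the name this function has carried since seaborn 0.9, with height in place of size. The DataFrame below is made up so that the sketch needs no download, and the Agg backend is an assumption so that it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend
import numpy as np
import pandas as pd
import seaborn as sns

rng = np.random.default_rng(0)
n = 80
df = pd.DataFrame({                      # made-up data in the shape of the tips dataset
    "time": rng.choice(["Lunch", "Dinner"], size=n),
    "smoker": rng.choice(["Yes", "No"], size=n),
    "day": rng.choice(["Sat", "Sun"], size=n),
    "total_bill": rng.gamma(4.0, 5.0, size=n),
})

# One box-plot panel per value of "day"; height and aspect size each panel.
g = sns.catplot(x="time", y="total_bill", hue="smoker", col="day",
                data=df, kind="box", height=4, aspect=.5)
print(g.axes.shape)
```

The returned object is a FacetGrid with one row and one column of axes per day.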

5. Using FacetGrid and plotting multiple variables

First import the libraries:

import numpy as np
import pandas as pd
import seaborn as sns
from scipy import stats
import matplotlib as mpl
import matplotlib.pyplot as plt

sns.set(style="ticks")
np.random.seed(sum(map(ord, "axis_grids")))

First look at the data:

tips = sns.load_dataset("tips")
tips.head()

First instantiate the grid:

g = sns.FacetGrid(tips, col="time")

g = sns.FacetGrid(tips, col="time")
g.map(plt.hist, "tip")    # histogram with tip on the x-axis

g = sns.FacetGrid(tips, col="sex", hue="smoker")
g.map(plt.scatter, "total_bill", "tip", alpha=.7)    # alpha sets the transparency
g.add_legend()    # add a legend (on the far right)

g = sns.FacetGrid(tips, row="smoker", col="time", margin_titles=True)
# fit_reg=False means the regression line is not drawn; x_jitter sets the jitter range
g.map(sns.regplot, "size", "total_bill", color=".1", fit_reg=False, x_jitter=.1)

# size and aspect set the height and aspect ratio of each panel
# (size is named height in seaborn >= 0.9)
g = sns.FacetGrid(tips, col="day", size=4, aspect=.5)
g.map(sns.barplot, "sex", "total_bill")    # x first, then y

To specify the order of the panels:

from pandas import Categorical

ordered_days = tips.day.value_counts().index
print(ordered_days)    # CategoricalIndex(['Sat', 'Sun', 'Thur', 'Fri'], ...)
ordered_days = Categorical(['Thur', 'Fri', 'Sat', 'Sun'])    # set the order
g = sns.FacetGrid(tips, row="day", row_order=ordered_days,
                  size=1.7, aspect=4)    # size is named height in seaborn >= 0.9
g.map(sns.boxplot, "total_bill")

pal = dict(Lunch="seagreen", Dinner="gray")
g = sns.FacetGrid(tips, hue="time", palette=pal, size=5)    # palette sets the colors
g.map(plt.scatter, "total_bill", "tip", s=50, alpha=.7, linewidth=.5, edgecolor="white")    # s sets the marker size
g.add_legend()

g = sns.FacetGrid(tips, hue="sex", palette="Set1", size=5, hue_kws={"marker": ["^", "v"]})    # marker shapes
g.map(plt.scatter, "total_bill", "tip", s=100, linewidth=.5, edgecolor="white")
g.add_legend()

with sns.axes_style("white"):    # set the style
    g = sns.FacetGrid(tips, row="sex", col="smoker", margin_titles=True, size=2.5)
g.map(plt.scatter, "total_bill", "tip", color="#334488", edgecolor="white", lw=.5)
g.set_axis_labels("Total bill (US Dollars)", "Tip")    # x-axis and y-axis labels
# Tick values to show; 30, 50 and 10 are assumed because the lists are cut off in the source
g.set(xticks=[10, 30, 50], yticks=[2, 6, 10])
g.fig.subplots_adjust(wspace=.02, hspace=.02)    # spacing between the subplots

iris = sns.load_dataset("iris")
g = sns.PairGrid(iris)    # plot every pair of variables
g.map(plt.scatter)

g = sns.PairGrid(iris)
g.map_diag(plt.hist)          # plot type for the diagonal panels
g.map_offdiag(plt.scatter)    # plot type for the off-diagonal panels

g = sns.PairGrid(iris, hue="species")
g.map_diag(plt.hist)
g.map_offdiag(plt.scatter)
g.add_legend()

If you don't want to plot all the features, you can select the ones you need:

g = sns.PairGrid(iris, vars=["sepal_length", "sepal_width"], hue="species")    # choose the features to plot
g.map(plt.scatter)

g = sns.PairGrid(tips, hue="size", palette="GnBu_d")    # use a gradient palette
g.map(plt.scatter, s=50, edgecolor="white")
g.add_legend()

6. Heatmaps

First import the libraries:

%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np; np.random.seed(0)
import seaborn as sns; sns.set()

Generate some random data:

uniform_data = np.random.rand(3, 3)
"""
[[0.0187898   0.6176355   0.61209572]
 [0.616934    0.94374808  0.6818203 ]
 [0.3595079   0.43703195  0.6976312 ]]
"""
heatmap = sns.heatmap(uniform_data)

Output:

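ax = sns.heatmap(uniform_data, vmin=0.2, vmax=0.5)    # set the lower and upper limits of the color scale

normal_data = np.random.randn(3, 3)    # random values, some of them negative
print(normal_data)
ax = sns.heatmap(normal_data, center=0)    # center the colormap at 0

flights = sns.load_dataset("flights")    # a dataset provided by the library
flights.head()

# Keyword arguments work across pandas versions; positional ones fail in pandas 2.x
flights = flights.pivot(index="month", columns="year", values="passengers")
print(flights)
ax = sns.heatmap(flights)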
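
What pivot() does to the table can be seen on a tiny hand-written DataFrame laid out like the flights data: one row per (year, month) pair goes in, and a month-by-year grid comes out. The values here are for illustration only.

```python
import pandas as pd

long = pd.DataFrame({                    # hand-written long-form table
    "year": [1949, 1949, 1950, 1950],
    "month": ["Jan", "Feb", "Jan", "Feb"],
    "passengers": [112, 118, 115, 126],
})

# Rows become months, columns become years, cells hold the passenger counts.
table = long.pivot(index="month", columns="year", values="passengers")
print(table)
```

A grid in this shape is what heatmap() expects: one cell per row and column label.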

Output:

If you want the numbers to show up:

ax = sns.heatmap(flights, annot=True, fmt="d")    # annot shows the numbers; fmt sets the number format
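
The annotations are ordinary Matplotlib text objects, one per cell, which makes the effect of annot and fmt easy to inspect. A minimal sketch on made-up integer data, with the Agg backend assumed so that it runs without a display; labels is a hypothetical variable name.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

counts = np.array([[1, 20, 300],
                   [4, 50, 600]])        # made-up integer data

fig, ax = plt.subplots()
sns.heatmap(counts, annot=True, fmt="d", ax=ax)   # fmt="d" formats the values as integers

labels = sorted(t.get_text() for t in ax.texts)
print(labels)
```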

Add lines between the cells to make them easier to tell apart:

ax = sns.heatmap(flights, linewidths=.5)    # width of the lines between cells

Custom colors:

ax = sns.heatmap(flights, cmap="YlGnBu")
