Seaborn introduction
We have already covered some of Matplotlib, but producing really attractive plots with Matplotlib alone is often tedious.
Seaborn builds a higher-level API on top of Matplotlib's core library, which makes it easy to draw better-looking graphics. Its visual appeal comes mainly from more pleasing color schemes and more refined styling of plot elements.
Seaborn installation
pip install seaborn
If the command above fails, you can install directly from the GitHub repository instead:
sudo pip install git+https://github.com/mwaskom/seaborn.git
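Once the installation finishes, a quick way to confirm that Seaborn can be imported is to print its version from Python. This is only a minimal sanity check and is not part of the original walkthrough:

```python
# Minimal sanity check: import Seaborn and report the installed version.
import seaborn as sns

version = sns.__version__
print("Seaborn version:", version)
```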
Seaborn basics
Seaborn makes it very easy to improve the look of a plot. Let's compare a plot before and after. Here is the plain Matplotlib version:
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 1, 100)
fig = plt.figure()
ax = fig.add_subplot(1, 1, 1)
ax.set_title('This is Title')
# Arguments: x data, y data, format string ('b.' = blue point markers), and the legend label, which supports LaTeX
ax.plot(x, x ** 2, 'b.', label=r'$y = x^{2}$')
# Automatically generate the legend
ax.legend()
# Set both the x and y axis ranges to 0-1
ax.axis([0, 1, 0, 1])
plt.show()
Seaborn: visualization of data processing
Now let's improve it. This takes just two lines:
import seaborn as sns
sns.set()
Here is the complete code:
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# Apply Seaborn's default theme to all subsequent Matplotlib plots
sns.set()
x = np.linspace(0, 1, 100)
fig = plt.figure()
ax = fig.add_subplot(1, 1, 1)
ax.set_title('This is Title')
# Arguments: x data, y data, format string ('b.' = blue point markers), and the legend label, which supports LaTeX
ax.plot(x, x ** 2, 'b.', label=r'$y = x^{2}$')
# Automatically generate the legend
ax.legend()
# Set both the x and y axis ranges to 0-1
ax.axis([0, 1, 0, 1])
plt.show()
You can see that the background becomes a light gray grid and the font is slightly adjusted.
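To see concretely what changed, one option is to inspect Matplotlib's rcParams before and after the call. The sketch below is only an illustration: the two rcParams keys were picked as examples of settings the theme touches, and sns.reset_orig() is used to start from, and return to, the original Matplotlib settings.

```python
import matplotlib.pyplot as plt
import seaborn as sns

sns.reset_orig()  # start from the original Matplotlib settings
grid_before = plt.rcParams["axes.grid"]
face_before = plt.rcParams["axes.facecolor"]

sns.set()  # apply Seaborn's default theme
grid_after = plt.rcParams["axes.grid"]
face_after = plt.rcParams["axes.facecolor"]

print("axes.grid:     ", grid_before, "->", grid_after)
print("axes.facecolor:", face_before, "->", face_after)

sns.reset_orig()  # restore the original settings
```

The grid being switched on and the axes face color changing correspond to the gray grid background seen in the figure.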
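The sns.set() call above uses the default parameters. In the Seaborn version used here they are as follows (defaults can differ between releases, so check the documentation of your installed version):
sns.set(context='notebook', style='darkgrid', palette='deep', font='sans-serif', font_scale=1, color_codes=False, rc=None)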
Among them:
The context='' parameter controls the scale of plot elements such as fonts and line widths. Its values are {paper, notebook, talk, poster}, ordered by size as poster > talk > notebook > paper.
The style='' parameter controls the default style. Its values are {darkgrid, whitegrid, dark, white, ticks}.
The palette='' parameter sets the default color palette. Options include {deep, muted, bright, pastel, dark, colorblind}, among others.
Of the remaining parameters, font='' sets the font family, font_scale= scales the font size, and color_codes= remaps Matplotlib's shorthand color codes such as 'r' to colors from the palette.
I won't demonstrate each of them here.
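As a brief illustration of just the context parameter, the sketch below uses sns.plotting_context() to look up the base font size each context would apply, without changing any global settings. It then applies one non-default combination of the parameters described above. The particular combination is arbitrary and chosen only for the example.

```python
import seaborn as sns

# plotting_context(name) returns the rc parameters that a context would apply.
sizes = {
    name: sns.plotting_context(name)["font.size"]
    for name in ("paper", "notebook", "talk", "poster")
}
for name, size in sizes.items():
    print(f"{name:>8}: font.size = {size}")

# Apply a non-default combination (an arbitrary choice for illustration).
sns.set(context="talk", style="whitegrid", palette="muted", font_scale=1.2)

# Restore the original Matplotlib settings afterwards.
sns.reset_orig()
```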
Let's take a look at several commonly used Seaborn APIs
seaborn.lmplot
seaborn.lmplot() is a very useful function: it automatically fits a regression when drawing a two-dimensional scatter plot. Seaborn's regplot can also do the fitting, but lmplot is the higher-level interface. In the words of the official documentation:
This function combines regplot() and FacetGrid. It is intended as a convenient interface to fit regression models across conditional subsets of a dataset.
import numpy as np
import pandas as pd
from pandas import Series, DataFrame
# Load plotting modules
import matplotlib.pyplot as plt
import seaborn as sns
# Create data: a linear trend y = 2x plus Gaussian noise (mean 10, std 30)
data_x = np.arange(1, 100, 2)
data_y = 2 * data_x + np.random.normal(10, 30, len(data_x))
data = DataFrame({'x': data_x, 'y': data_y})
# Draw the scatter plot with a fitted regression line
sns.lmplot(x='x', y='y', data=data)
plt.show()
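The practical difference between regplot and lmplot can be seen from what each returns. In this sketch (the seeded random generator and the Agg backend are my own choices so that it runs repeatably without a display), regplot draws onto an Axes we supply, while lmplot creates its own figure and hands back a FacetGrid:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

rng = np.random.default_rng(0)  # seeded only so the sketch is repeatable
x = np.arange(1, 100, 2)
data = pd.DataFrame({"x": x, "y": 2 * x + rng.normal(10, 30, len(x))})

# regplot is axes-level: it draws onto an existing Axes and returns it.
fig, ax = plt.subplots()
returned_ax = sns.regplot(x="x", y="y", data=data, ax=ax)

# lmplot is figure-level: it creates its own figure and returns a FacetGrid.
grid = sns.lmplot(x="x", y="y", data=data)

print(type(returned_ax).__name__, type(grid).__name__)
plt.close("all")
```

Because lmplot manages a whole FacetGrid, it is the one that accepts options such as hue, col, and aspect.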
Beyond this basic usage, we can also set the aspect ratio of the figure and fit regressions to several groups of data at once. Here is a simple example.
In this example:
hue splits the data by label and draws each group in its own color
aspect sets the width-to-height ratio of the resulting figure
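import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
# Create data (x is float so the noisy values assigned below fit the column dtype)
data_x = np.arange(1, 100, 2, dtype=float)
data_y = 2 * data_x + np.random.normal(10, 30, len(data_x))
data_label = np.random.randint(low=0, high=2, size=50)
data = pd.DataFrame({'x': data_x, 'y': data_y, 'label': data_label})
# Rescale and perturb x for the rows with label 1 so the two groups follow different trends
temp = data[data['label'] == 1]['x'].values
data.loc[data['label'] == 1, 'x'] = temp * 2 + np.random.normal(10, 30, len(temp))
# Draw one regression per label
sns.lmplot(x='x', y='y', hue='label', data=data, aspect=1.5)
plt.show()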
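A related variation is to put each group in its own panel rather than in its own color. The sketch below uses lmplot's col parameter for that, and reads back the figure size to show how aspect relates panel width to height. The height value, the seed, and the Agg backend are assumptions made for this example only (height is the parameter name in Seaborn 0.9 and later).

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

rng = np.random.default_rng(1)  # seeded only so the sketch is repeatable
x = np.arange(1, 100, 2, dtype=float)
data = pd.DataFrame({
    "x": x,
    "y": 2 * x + rng.normal(10, 30, len(x)),
    "label": rng.integers(0, 2, size=len(x)),
})

# col= draws one panel per label value instead of coloring by hue.
grid = sns.lmplot(x="x", y="y", col="label", data=data, height=4, aspect=1.5)

n_rows, n_cols = grid.axes.shape
width, height = grid.axes.flat[0].figure.get_size_inches()
print("panels:", n_rows, "x", n_cols)
print("figure size in inches:", width, "x", height)
plt.close("all")
```

Each panel is height inches tall and height * aspect inches wide, so the whole figure is n_cols times that width.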
seaborn.PairGrid
seaborn.PairGrid() plots the pairwise relationships between the variables in a dataset. It is very useful in practice; for example, it makes it easy to spot strongly correlated features during data analysis.
import numpy as np
import pandas as pd
from pandas import Series, DataFrame
# Load plotting modules
import matplotlib.pyplot as plt
import seaborn as sns
# Create data
data_1 = np.arange(1, 100, 2)
data_2 = 2 * data_1 + np.random.normal(10, 30, len(data_1))
data_3 = np.random.random(len(data_1))
col = np.random.choice(['1a', '2b', '3c'], len(data_1))  # category column, used for classification
data = DataFrame({'x': data_1, 'y': data_2, 'z': data_3, 'col': col})
# Draw a scatter plot for every pair of numeric variables
sns.PairGrid(data=data).map(plt.scatter)
plt.show()
You can see that the three numeric variables are paired automatically and nine scatter plots are drawn. We can also color the points by category:
import numpy as np
import pandas as pd
from pandas import Series, DataFrame
# Load plotting modules
import matplotlib.pyplot as plt
import seaborn as sns
# Create data
data_1 = np.arange(1, 100, 2)
data_2 = 2 * data_1 + np.random.normal(10, 30, len(data_1))
data_3 = np.random.random(len(data_1))
col = np.random.choice(['1a', '2b', '3c'], len(data_1))  # category column, used for classification
data = DataFrame({'x': data_1, 'y': data_2, 'z': data_3, 'col': col})
# Draw the pairwise scatter plots, colored by the category column
sns.PairGrid(data=data, hue='col').map(plt.scatter)
plt.show()
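In the grids above, the diagonal panels plot each variable against itself, which is not very informative. One common variation, sketched here under my own choice of plotting functions, is to use map_diag and map_offdiag so the diagonal shows histograms instead. The correlation matrix printed at the end is a numerical counterpart to the visual search for strongly correlated features; the seed and the Agg backend are assumptions so the sketch runs repeatably without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

rng = np.random.default_rng(2)  # seeded only so the sketch is repeatable
data_1 = np.arange(1, 100, 2)
data = pd.DataFrame({
    "x": data_1,
    "y": 2 * data_1 + rng.normal(10, 30, len(data_1)),
    "z": rng.random(len(data_1)),
    "col": rng.choice(["1a", "2b", "3c"], len(data_1)),
})

grid = sns.PairGrid(data=data)  # the non-numeric 'col' column is left out
grid.map_diag(plt.hist)         # univariate histograms on the diagonal
grid.map_offdiag(plt.scatter)   # pairwise scatter plots everywhere else

# Numerical counterpart of the grid: pairwise Pearson correlations.
corr = data[["x", "y", "z"]].corr()
print(corr.round(2))
plt.close("all")
```

Here x and y should show a clearly positive correlation because y is built from x, while z is independent noise.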
seaborn.JointGrid
Finally, let's look at combining univariate and bivariate plots in a single figure.
import numpy as np
import pandas as pd
from pandas import Series, DataFrame
# Load plotting modules
import matplotlib.pyplot as plt
import seaborn as sns
# Create data
data_x = np.arange(1, 100, 2) + np.random.normal(5, 20, 50)
data_y = 2 * data_x + np.random.normal(10, 30, len(data_x))
data = DataFrame({'x': data_x, 'y': data_y})
# Draw the joint plot with regplot and the marginal plots with histplot
# (the original used sns.distplot, which is deprecated since Seaborn 0.11)
sns.JointGrid(x='x', y='y', data=data).plot(sns.regplot, sns.histplot)
plt.show()
The final sns.JointGrid().plot() call in the code sets the plot types. Here the bivariate plot in the center is a scatter plot with a fitted regression line, and the univariate plots on the top and right are histograms, which gives the following result.