Operating system: Windows
Python: 3.5
The previous section described the libraries needed for data analysis and mining. The most important of them are pandas and matplotlib.
Pandas: Mainly used for data analysis, calculation and statistics, such as the mean and the variance.
Matplotlib: Mainly used together with pandas to generate plots. The two are often used in combination.
Pandas:
The methods in the image above apply to DataFrame or Series objects.
For the differences between DataFrame and Series, refer to the data structure introduction on the official website.
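As a quick illustration of the difference, here is a minimal sketch. The names scores and table and all the values are made up for this example and do not come from the original post.

```python
import pandas as pd

# A Series is a one-dimensional labelled array.
scores = pd.Series([90, 85, 70], index=["a", "b", "c"], name="score")

# A DataFrame is a two-dimensional table; each of its columns is a Series.
table = pd.DataFrame(
    {"score": [90, 85, 70], "age": [20, 21, 19]},
    index=["a", "b", "c"],
)

# Selecting a single column from a DataFrame gives back a Series.
column = table["score"]

print(type(scores).__name__, scores.shape)
print(type(table).__name__, table.shape)
print(type(column).__name__, column.shape)
```

Most of the statistical methods discussed here (describe(), sum(), var() and so on) exist on both types. On a DataFrame they work column by column.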
Back to the usage shown in the diagram above.
Explain:
Read the data from Excel into data. Then filter the data (this step is optional) and call data.describe() directly. This works because data is a DataFrame.
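The steps above can be sketched roughly as follows. The original Excel file is not available here, so the sketch builds a small made-up DataFrame in its place. The column name sales and the filter thresholds are assumptions for illustration only. With a real file, the first step would be a pd.read_excel call instead.

```python
import pandas as pd

# Hypothetical data standing in for the Excel sheet.
# With a real file this would come from pd.read_excel(...) instead.
data = pd.DataFrame({"sales": [120.0, 95.0, 3000.0, 110.0, 130.0]})

# Optional filter: keep only rows inside an assumed plausible range.
data = data[(data["sales"] > 50) & (data["sales"] < 1000)]

# describe() returns count, mean, std, min, 25%, 50%, 75% and max per column.
statistics = data.describe()
print(statistics)
```

The filter here drops the 3000.0 row as an outlier, so the statistics are computed over the remaining four rows.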
Other methods, such as data.sum() and data.var(), are called in the same way, as follows
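A minimal sketch of these calls, using made-up numbers (the column names sales and cost are hypothetical):

```python
import pandas as pd

# Hypothetical data; in the original post it comes from an Excel file.
data = pd.DataFrame({"sales": [2.0, 4.0, 6.0], "cost": [1.0, 1.0, 4.0]})

total = data.sum()     # sum of each column
variance = data.var()  # sample variance of each column (ddof=1 by default)

print(total)
print(variance)
```

Each call returns a Series indexed by column name, so a single value can be read with, for example, total["sales"].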
These are only the built-in statistical methods. Other statistics, such as the range or the interquartile range, have to be computed separately. Here is how:
Explain:
statistics is the result of data.describe(), not data itself.
statistics.loc['AA'] = statistics.loc['75%'] - statistics.loc['25%']  # interquartile range
The label 'AA' in statistics.loc['AA'] can be any name you choose. This is how a new statistic is added.
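Putting this together, a small self-contained sketch might look like the following. The data and the row labels range and iqr are chosen for this example. The original uses the label 'AA'.

```python
import pandas as pd

# Hypothetical data standing in for the Excel sheet.
data = pd.DataFrame({"sales": [1.0, 2.0, 3.0, 4.0, 5.0]})

# Start from the summary table, not from the raw data.
statistics = data.describe()

# Add new rows computed from the existing ones.
statistics.loc["range"] = statistics.loc["max"] - statistics.loc["min"]  # range
statistics.loc["iqr"] = statistics.loc["75%"] - statistics.loc["25%"]    # interquartile range

print(statistics)
```

Assigning to statistics.loc with a label that does not exist yet appends a new row, which is why any name can be used.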
Pandas cumulative statistics functions
Usage:
Results:
This sums every 2 consecutive rows, because window=2 is set. The first value is NaN, because the first row does not have 2 rows to add up.
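The original code is not shown, so the following is only a sketch of the same idea. It uses the .rolling(window=2).sum() form from current pandas versions. Older versions exposed this as a separate rolling function, and which one the original used is an assumption. The values are made up.

```python
import pandas as pd

values = pd.Series([1, 2, 3, 4])

# Rolling sum over 2 consecutive rows; the first row has no predecessor,
# so its result is NaN.
rolled = values.rolling(window=2).sum()

# Cumulative sum over all rows so far, for comparison.
cumulative = values.cumsum()

print(rolled)
print(cumulative)
```

With window=2 the result is NaN, 3, 5, 7, while cumsum() gives 1, 3, 6, 10.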
Matplotlib drawing:
Before drawing, some basic setup is usually needed:
import matplotlib.pyplot as plt  # import the plotting library
plt.rcParams['font.sans-serif'] = ['SimHei']  # display Chinese labels correctly
plt.rcParams['axes.unicode_minus'] = False  # display the minus sign correctly
plt.figure(figsize=(7, 5))  # create the figure; figsize=(7, 5) sets its width and height in inches
To use the default size, simply call: plt.figure()
When combined with pandas, the usage is:
data.plot(kind='bar')
The kind parameter specifies the plot type: line (line chart), bar (vertical bar chart), barh (horizontal bar chart), hist (histogram), box (box plot), kde (density plot), area (area chart), pie (pie chart), scatter (scatter plot).
data is a DataFrame or Series.
This is the basic way to draw plots with pandas and matplotlib together.
Example:
This generates a chart from the Excel data.
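Since the original screenshot is missing, here is a rough sketch of such a bar chart. The data is made up in place of the Excel file, and the Agg backend is an assumption so that the code also runs without a display. In an interactive session the matplotlib.use line can be dropped and plt.show() called at the end.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plt
import pandas as pd

plt.rcParams["axes.unicode_minus"] = False  # display the minus sign correctly

# Hypothetical data standing in for the Excel sheet.
data = pd.DataFrame({"sales": [120, 95, 110]}, index=["Jan", "Feb", "Mar"])

# DataFrame.plot creates its own figure, so the size is passed here directly.
ax = data.plot(kind="bar", figsize=(7, 5))
ax.set_title("Monthly sales")
ax.set_ylabel("sales")

# plt.show() would display the chart in an interactive session.
```

data.plot returns a matplotlib Axes object, so the usual matplotlib calls (titles, labels, saving the figure) can be applied to the result.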
For a pie chart:
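A pie chart can be sketched in the same way with kind='pie'. The Series below and its labels are made up, and the Agg backend is again an assumption so the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical shares; a pie chart needs one value per slice, so a Series fits well.
data = pd.Series([50, 30, 20], index=["A", "B", "C"], name="share")

# autopct is passed through to matplotlib and prints the percentage on each slice.
ax = data.plot(kind="pie", autopct="%.1f%%", figsize=(5, 5))

# plt.show() would display the chart in an interactive session.
```

The index of the Series provides the slice labels. For a DataFrame, the column to draw has to be chosen, for example with the y argument.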
If matplotlib is used on its own, the usage is different from the above.
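For comparison, a minimal sketch of matplotlib used without pandas might look like this. The data is made up and the Agg backend is an assumption so the code runs without a display. The data is passed to the plotting function directly instead of calling a plot method on a DataFrame.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plt

# Hypothetical data as plain Python lists.
x = [1, 2, 3, 4]
y = [v ** 2 for v in x]

fig = plt.figure(figsize=(7, 5))         # create the figure first
line, = plt.plot(x, y, label="y = x^2")  # then draw onto it
plt.xlabel("x")
plt.ylabel("y")
plt.legend()

# plt.show() would display the chart in an interactive session.
```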
Reference