Python Advanced (40): Data visualization with Matplotlib
Preface
Matplotlib is an open source project written in Python that provides Python with a data plotting package. This article covers the core objects of the Matplotlib API and explains how to use them to draw. Matplotlib's object system is rigorous and interesting, and it gives users a great deal of room to work with: once you are familiar with the core objects, you can easily customize a figure. The object system is also an excellent example of computer graphics design, so even if you are not a Python programmer, you can learn some general drawing principles from this text.
Matplotlib uses NumPy for array operations and calls a series of other Python libraries for hardware interaction. The core of Matplotlib is a set of drawing APIs composed of objects.
To meet the requirements of a graduation thesis, the author analyzed drug purchasing patterns on an online purchasing platform ("instant delivery") and visualized users' drug purchase information. After learning the basics of Python, the author decided to use Python together with Matplotlib to draw line, bar, and pie charts: line or bar charts for each platform user's monthly purchases (the trend in drug quantity), and pie charts for the types of medicine purchased. The relevant knowledge is organized as follows:
1. Environment preparation
First, install the required packages by running the following commands:
pip install numpy
pip install matplotlib
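After installation, a quick check such as the following can confirm that both packages import correctly. This is only a minimal sketch; the version numbers it prints depend on your environment.

```python
import numpy as np
import matplotlib

# Collect the installed versions; the exact numbers depend on the environment
versions = {"numpy": np.__version__, "matplotlib": matplotlib.__version__}
for name, version in versions.items():
    print(name, version)
```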
2. Drawing a simple straight line
import numpy as np
import matplotlib.pyplot as plt
# x-axis data
x = [0, 1]
# y-axis data
y = [0, 1]
# Create a figure object
plt.figure()
# The figsize parameter sets the width and height of the figure in inches;
# the pixel size is figsize multiplied by the figure dpi (100 by default since Matplotlib 2.0)
# plt.figure(figsize=(8, 4))
# Plot on the current figure; the two positional arguments are the x and y data
plt.plot(x, y, label="line")
# Alternative call (x data, y data, legend label, line color, line width)
# plt.plot(x, y, label="$sin(x)$", color="red", linewidth=2)
# Set the x-axis label
plt.xlabel("time (s)")
# Set the y-axis label
plt.ylabel("value (m)")
# Set the title (a second plt.title call would simply replace the first)
plt.title("PyPlot First Example")
# Set the y-axis range
plt.ylim(-1.2, 1.2)
# Show the legend; it lists only the lines that were given a label
plt.legend()
# Save the image before showing it: after plt.show() returns, the figure may already be closed and the saved file would be empty
plt.savefig("easyplot.png")
# Display the image
plt.show()
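The same figure can also be built through Matplotlib's object-oriented interface, where the Figure and Axes objects are handled explicitly. The sketch below is one possible way to do this, not the author's code. It assumes the headless Agg backend and a hypothetical output file in the system temporary directory, and it also shows how figsize and dpi combine into a pixel size.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plt

# Create the Figure and a single Axes explicitly
fig, ax = plt.subplots(figsize=(8, 4), dpi=100)

# plot() returns a list of Line2D objects, one per drawn line
(line,) = ax.plot([0, 1], [0, 1], label="line", color="red", linewidth=2)

ax.set_xlabel("time (s)")
ax.set_ylabel("value (m)")
ax.set_title("PyPlot First Example")
ax.set_ylim(-1.2, 1.2)
ax.legend()

# Pixel size = size in inches * dpi
width_px, height_px = fig.get_size_inches() * fig.dpi
print(width_px, height_px)

# "easyplot_oo.png" is a hypothetical file name used only for this sketch
out_path = os.path.join(tempfile.gettempdir(), "easyplot_oo.png")
fig.savefig(out_path)
plt.close(fig)
```

Holding on to fig, ax, and line makes it possible to change individual objects later, for example with line.set_linewidth(1).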
Line chart
# Draw on the current figure (x data, y data, blue solid line, line width)
plt.plot(x, y, "b-", linewidth=1)
The following is a simple line chart of given values. Reading the plotted values from Excel has not been implemented yet, so the data are typed in by hand and the code is rather cumbersome.
# -*- coding: utf-8 -*-
import xlrd
from matplotlib import pyplot as plt
from matplotlib.ticker import MultipleLocator

# Use a font that can render Chinese labels
plt.rcParams['font.sans-serif'] = ['SimHei']
# Note: xlrd 2.0 and later read only .xls files; an .xlsx file needs xlrd < 2.0 or another reader such as openpyxl
data0 = xlrd.open_workbook('F:/11data.xlsx')
table0 = data0.sheets()[0]
source = []
# Date of each record (first column)
source.extend(table0.col_values(0))
# Distinct dates
set1 = set(source)
# Maps each date to the number of records on that date
dict1 = {}

# Fill dict1
def getDict():
    for item in set1:
        dict1.update({item: source.count(item)})
    return dict1

getDict()
# Tick labels for the 30 days of the month
group_labels = ['Day %d' % d for d in range(1, 31)]
# The plotted values are still typed in by hand
x = list(range(1, 31))
y = [26, 23, 24, 24, 6, 11, 34, 32, 33, 28, 56, 28, 16, 36, 51, 44, 35, 40, 2, 32, 56, 63, 70, 76, 60, 11, 11, 58, 65, 55]
# Values read from Excel; not used for plotting yet
labels = list(dict1.keys())
sizes = list(dict1.values())
plt.figure()
# Red solid line, plus a red circle marker at each data point
plt.plot(x, y, '-r', label='News count')
plt.plot(x, y, 'ro')
plt.xlabel("Date")
plt.ylabel("Number of news articles")
plt.title("November 2016 News Number")
# Major y-axis ticks at multiples of 10, minor ticks at multiples of 5
ax = plt.gca()
ax.yaxis.set_major_locator(MultipleLocator(10))
ax.yaxis.set_minor_locator(MultipleLocator(5))
plt.xticks(x, group_labels, rotation=45)
# plt.xlim(0, max(x))
# plt.savefig(s)
# Add the legend before showing the figure
plt.legend()
plt.show()
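The counting done in getDict, and the missing step of turning those counts into x and y values instead of typing them by hand, can be sketched with collections.Counter from the standard library. The day numbers below are made-up sample data standing in for the spreadsheet column.

```python
from collections import Counter

# Hypothetical sample standing in for table0.col_values(0): one day number per record
source = [1, 1, 2, 3, 3, 3, 5]

# Counter tallies every distinct value in a single pass
counts = Counter(source)

# Build the plot data for days 1..5; days without records count as 0
x = list(range(1, 6))
y = [counts.get(day, 0) for day in x]
print(x)  # [1, 2, 3, 4, 5]
print(y)  # [2, 1, 3, 0, 1]
```

Lists built this way could be passed straight to plt.plot(x, y, '-r').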
Pie chart
Pie charts are drawn with matplotlib.pyplot.pie(). Its parameters are:
pie(x, explode=None, labels=None,
    colors=('b', 'g', 'r', 'c', 'm', 'y', 'k', 'w'),
    autopct=None, pctdistance=0.6, shadow=False,
    labeldistance=1.1, startangle=None, radius=None,
    counterclock=True, wedgeprops=None, textprops=None,
    center=(0, 0), frame=False)
This signature comes from an older Matplotlib release; in current releases colors defaults to None, which means the active color cycle is used.
Parameter description:
x: the size of each wedge; if sum(x) > 1, the values are normalized by sum(x)
labels: the explanatory text displayed outside each wedge
explode: the distance of each wedge from the center
startangle: the angle at which drawing starts. By default the wedges are drawn counterclockwise starting from the positive x-axis; with startangle=90, drawing starts from the positive y-axis
shadow: whether to draw a shadow beneath the pie
labeldistance: the position of the labels as a multiple of the radius; a value below 1 draws them inside the pie
autopct: controls the percentage text inside each wedge; it can be a format string or a function. In '%1.1f', the two numbers give the minimum width and the number of digits after the decimal point (shorter output is padded with spaces)
pctdistance: like labeldistance, sets the position of the autopct text as a multiple of the radius
radius: the radius of the pie chart
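To see how x, startangle, and explode translate into geometry, the following sketch draws a small pie on a headless backend and inspects the resulting Wedge objects. The sizes and labels are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend
import matplotlib.pyplot as plt

sizes = [1, 1, 2]        # made-up data; the shares are 25%, 25%, 50%
labels = ["A", "B", "C"]
explode = [0, 0.1, 0]    # pull the second wedge away from the center

fig, ax = plt.subplots()
patches, texts = ax.pie(sizes, labels=labels, explode=explode, startangle=90)

# Each wedge spans (theta2 - theta1) degrees, drawn counterclockwise from startangle
spans = [round(float(w.theta2 - w.theta1), 6) for w in patches]
first_start = round(float(patches[0].theta1), 6)
print(first_start, spans)
plt.close(fig)
```

With these inputs the first wedge should begin at 90 degrees, and the three spans should be in the ratio 1:1:2.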
Return value:
If autopct is not set, pie() returns (patches, texts)
If autopct is set, pie() returns (patches, texts, autotexts)
patches: a list of matplotlib.patches.Wedge objects
texts, autotexts: lists of matplotlib.text.Text objects
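The two return shapes, and the two forms autopct accepts, can be illustrated with a short sketch. The data are made up, and whole_percent is a hypothetical helper name introduced only for this example.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend
import matplotlib.pyplot as plt

sizes = [1, 1, 2]  # made-up data

fig, (ax1, ax2, ax3) = plt.subplots(1, 3)

# Without autopct: wedges and label texts are returned
patches_plain, texts_plain = ax1.pie(sizes)

# With a format string: a third list holds the percentage texts
patches, texts, autotexts = ax2.pie(sizes, autopct="%1.1f%%")
percent_labels = [t.get_text() for t in autotexts]

# With a function: it receives the percentage as a float and returns the label
def whole_percent(pct):
    return "{:.0f} pct".format(pct)

_, _, autotexts_fn = ax3.pie(sizes, autopct=whole_percent)
function_labels = [t.get_text() for t in autotexts_fn]

print(percent_labels)
print(function_labels)
plt.close(fig)
```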
Complete code:
# coding=utf-8
__author__ = 'leilu'
# Crawler results for news about tobacco control in November 2016; the main media sources are shown as a pie chart
import xlrd
from matplotlib import pyplot as plt
# Python 3 strings are Unicode, so the Python 2 workaround reload(sys) / sys.setdefaultencoding('utf-8') is not needed
# Display Chinese labels correctly
plt.rcParams['font.sans-serif'] = ['SimHei']
# Display the minus sign correctly
plt.rcParams['axes.unicode_minus'] = False
# Read the Excel workbook (xlrd 2.0 and later read only .xls; an .xlsx file needs xlrd < 2.0 or another reader such as openpyxl)
data0 = xlrd.open_workbook('F:/11.xlsx')
# Read the first worksheet
table0 = data0.sheets()[0]
source = []
# Sources of all news items (third column)
source.extend(table0.col_values(2))
# Distinct sources
set1 = set(source)
# Maps each source to its number of news items
dict1 = {}

# Fill dict1
def getDict():
    for item in set1:
        dict1.update({item: source.count(item)})
    # Write a log of the merges to an output file
    with open("result.txt", 'w', encoding='utf-8') as f:
        # Treat sources whose first two characters match as one source
        for i in list(dict1.keys()):
            # Skip sources that have already been merged into another one
            if i not in dict1:
                continue
            for j in list(dict1.keys()):
                if j == i:
                    continue
                if i.upper()[:2] == j.upper()[:2]:
                    if len(i) > len(j):
                        # i is the longer name: fold it into j and stop comparing i
                        f.write(i + '->' + j + '\n')
                        print(str(dict1[i]) + '->' + str(dict1[j]))
                        dict1[j] += dict1.pop(i)
                        break
                    else:
                        # Otherwise fold j into i
                        f.write(j + '->' + i + '\n')
                        print(str(dict1[i]) + '<-' + str(dict1[j]))
                        dict1[i] += dict1.pop(j)
    dict1['other'] = 0
    for i in list(dict1.keys()):
        # Sources with fewer than 5 items are folded into 'other'
        if i != 'other' and dict1[i] < 5:
            dict1['other'] += dict1.pop(i)
    return dict1

getDict()
# Draw the pie chart
labels = list(dict1.keys())
sizes = list(dict1.values())
patches, l_text, p_text = plt.pie(sizes, labels=labels,
                                  labeldistance=1.2, autopct='%3.1f%%', shadow=False, startangle=90, pctdistance=1.08)
# set_size is a method, so it must be called rather than assigned to
for t in l_text:
    t.set_size(30)
for t in p_text:
    t.set_size(20)
# Use equal scaling on the x and y axes so that the pie is drawn as a circle
plt.axis('equal')
plt.legend()
plt.show()
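The merging logic in getDict can be tried out without a spreadsheet. The sketch below is not the script's pairwise comparison; it swaps in a simpler technique, grouping by prefix in a single pass, that aims at the same result. The function name merge_sources, its parameters prefix_len and min_count, and the sample counts are all hypothetical.

```python
def merge_sources(counts, prefix_len=2, min_count=5):
    """Merge sources that share a prefix, then fold rare ones into 'other'.

    prefix_len and min_count are hypothetical parameter names that mirror
    the constants 2 and 5 used in the script.
    """
    merged = {}  # prefix -> [shortest name seen so far, total count]
    for name, count in counts.items():
        key = name.upper()[:prefix_len]
        if key not in merged:
            merged[key] = [name, count]
        else:
            # Keep the shorter name as the representative and add the counts
            if len(name) < len(merged[key][0]):
                merged[key][0] = name
            merged[key][1] += count

    result = {"other": 0}
    for name, count in merged.values():
        if count < min_count:
            result["other"] += count
        else:
            result[name] = count
    return result


# Made-up sample counts
sample = {"Xinhua Net": 12, "Xinhua": 30, "People's Daily": 8, "Blog A": 2, "Forum B": 1}
merged_sample = merge_sources(sample)
print(merged_sample)
```

Returning a new dictionary avoids modifying a dictionary while iterating over it, which is what forces the list(dict1.keys()) copies in the script.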
Set line shape, color, etc.
This section is a set of notes from studying "Matplotlib for Python Developers". When plotting, you can set line parameters, including color, line style, and marker style.
1) Controlling color
The single-letter color codes are:
b: blue    c: cyan    g: green    k: black
m: magenta    r: red    w: white    y: yellow
There are four ways to specify a color:
a: the full name, such as 'red'
b: a hexadecimal string, such as '#FF00FF'
c: an RGB or RGBA tuple, such as (1, 0, 1, 1)
d: a gray intensity written as a string, such as '0.7'
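As a quick check of these notations, matplotlib.colors.to_rgba converts any accepted color specification into an RGBA tuple. The sketch below passes it one value in each of the four forms.

```python
from matplotlib.colors import to_rgba

# One color specification in each of the four notations
by_name = to_rgba("red")          # full name
by_hex = to_rgba("#FF00FF")       # hexadecimal string
by_tuple = to_rgba((1, 0, 1, 1))  # RGBA tuple
by_gray = to_rgba("0.7")          # gray intensity as a string

print(by_name)   # (1.0, 0.0, 0.0, 1.0)
print(by_hex)    # (1.0, 0.0, 1.0, 1.0)
print(by_tuple)  # (1.0, 0.0, 1.0, 1.0)
print(by_gray)   # (0.7, 0.7, 0.7, 1.0)
```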
2) Controlling line style
The correspondence between symbols and line styles is:
-    solid line
--   dashed line
-.   dash-dot line
:    dotted line
3) Controlling marker style
Many marker styles are available:
. Point marker
, Pixel marker
o Circle marker
v Triangle down marker
^ Triangle up marker
< Triangle left marker
> Triangle right marker
1 Tripod down marker
2 Tripod up marker
3 Tripod left marker
4 Tripod right marker
s Square marker
p Pentagon marker
* Star marker
h Hexagon marker
H Rotated hexagon marker
D Diamond marker
d Thin diamond marker
| Vertical line (vline symbol) marker
_ Horizontal line (hline symbol) marker
+ Plus marker
x Cross (x) marker
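The marker codes can also be looked up programmatically. As a small sketch, the MarkerStyle.markers dictionary in matplotlib.markers maps each code to a descriptive name; the selection of codes below is arbitrary.

```python
from matplotlib.markers import MarkerStyle

# MarkerStyle.markers maps each marker code to a descriptive name
for code in [".", "o", "*", "+", "x", "D"]:
    print(repr(code), MarkerStyle.markers[code])

star_name = MarkerStyle.markers["*"]
plus_name = MarkerStyle.markers["+"]
```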
The following example combines all three settings in one format string per line:
import matplotlib.pyplot as plt
import numpy as np
y = np.arange(1, 3, 0.3)
# cyan x markers with a dashed line, magenta circles with a dotted line, black pentagons with a dash-dot line
plt.plot(y, 'cx--', y + 1, 'mo:', y + 2, 'kp-.')
plt.show()
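A format string such as 'cx--' is shorthand for three separate keyword arguments. The following sketch, which assumes a headless backend, draws one line each way and compares the resulting Line2D properties.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend
import matplotlib.pyplot as plt
import numpy as np

y = np.arange(1, 3, 0.3)

fig, ax = plt.subplots()
# Format string: cyan, x markers, dashed line
(short,) = ax.plot(y, "cx--")
# The same style written with explicit keyword arguments
(explicit,) = ax.plot(y + 1, color="c", marker="x", linestyle="--")

style_short = (short.get_color(), short.get_marker(), short.get_linestyle())
style_explicit = (explicit.get_color(), explicit.get_marker(), explicit.get_linestyle())
print(style_short, style_explicit)
plt.close(fig)
```

The keyword form is longer, but it is easier to read and also accepts colors that have no single-letter code.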