Preliminary study on Matplotlib
The examples come from Eric Matthes's book "Python Crash Course" (published in Chinese as "Python Programming: From Introduction to Practice").
Plotting is done with pyplot, which is conventionally imported as import matplotlib.pyplot as plt
All of the code below was run in a Jupyter notebook.
Line chart
Let's look at a simple example
import matplotlib.pyplot as plt

in_values = [1, 2, 3, 4, 5]
squares = [1, 4, 9, 16, 25]
# The first argument is the x-axis input, the second is the corresponding y-axis output;
# linewidth sets the thickness of the line
plt.plot(in_values, squares, linewidth=4)
# Title, x-axis label, y-axis label
plt.title('Squares', fontsize=20)
plt.xlabel('Value', fontsize=12)
plt.ylabel('Square of the value', fontsize=12)
# plt.tick_params(axis='both', labelsize=15)
plt.show()
In the resulting plot, the x-axis ticks are too dense and even include decimal values.
If you want the x-axis to show only our sample values, you can use the tick_params function to change the size of the tick labels. Uncomment the second-to-last line in the code above to get the next image.
In plt.tick_params(axis='both', labelsize=15), axis='both' applies the setting to both the x-axis and the y-axis, and labelsize sets the font size of the tick labels. The larger the font, the fewer tick labels fit along an axis of the same length, and vice versa. Because this labelsize is larger than the default, both axes show fewer ticks, which suits this example better.
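If the goal is specifically to place ticks only at the sample values, one option is to set the tick positions explicitly. The sketch below is not part of the original example: it uses plt.xticks, and it selects the non-interactive Agg backend only so that it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no window is needed
import matplotlib.pyplot as plt

in_values = [1, 2, 3, 4, 5]
squares = [1, 4, 9, 16, 25]

plt.plot(in_values, squares, linewidth=4)
# Put x-axis ticks exactly at the sample values instead of relying on automatic placement
plt.xticks(in_values)
# Enlarge the tick labels on both axes, as in the example above
plt.tick_params(axis='both', labelsize=15)

# Read the tick positions back to check what will be drawn
tick_positions = [float(t) for t in plt.gca().get_xticks()]
print(tick_positions)
```

With explicit positions the tick count no longer depends on the label size, so labelsize can be chosen purely for readability.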
Scatter chart
We reuse the squares example from above, this time drawn as a scatter plot.
in_values = [1, 2, 3, 4, 5]
squares = [1, 4, 9, 16, 25]
# s is the size of the points
plt.scatter(in_values, squares, s=80)
plt.title('Squares', fontsize=20)
plt.xlabel('Value', fontsize=12)
plt.ylabel('Square of the value', fontsize=12)
plt.tick_params(axis='both', labelsize=15)
plt.show()
As you can see, apart from replacing plt.plot with plt.scatter, the rest of the code is basically unchanged.
When there are many input and output points, you can generate them with a list comprehension. You can also specify the color of the points and the color of their outlines. In the Matplotlib version used here, the default point color is blue and the outline is black (since Matplotlib 2.0 the outline defaults to the fill color).
# 1 to 100, matching the axis limits below
x_values = list(range(1, 101))
y_values = [x**2 for x in x_values]
# c sets the color of the points; edgecolors='none' removes the outline
plt.scatter(x_values, y_values, c='red', edgecolors='none', s=5)
# Axis ranges, given as one list: the first two values are the x-axis range, the last two the y-axis range
plt.axis([0, 110, 0, 11000])
plt.show()
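The list comprehension in the code above is shorthand for an explicit loop. As a small sketch, assuming 100 sample points (1 to 100) purely for illustration, the two forms produce the same list:

```python
# Assumption: 100 sample points (1 to 100), chosen for illustration
x_values = list(range(1, 101))

# List comprehension: a single expression builds the whole list
y_values = [x**2 for x in x_values]

# Equivalent explicit loop
y_loop = []
for x in x_values:
    y_loop.append(x**2)

print(y_values[:5])        # [1, 4, 9, 16, 25]
print(y_values == y_loop)  # True
```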
Colors can also be customized in RGB form by passing a tuple to the c parameter. The tuple holds three numbers in the range [0, 1], representing (R, G, B); the closer the numbers are to 0, the darker the color, and the closer to 1, the lighter. For example, c=(0, 0, 0.6) is a moderately dark blue.
It is still the squares plot; out of laziness I left out the title.
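To make the (R, G, B) tuple notation more concrete, here is a minimal sketch that converts such a tuple into the familiar '#rrggbb' hex form. The helper rgb_to_hex is hypothetical and written only for this illustration; Matplotlib has its own color conversion utilities.

```python
def rgb_to_hex(rgb):
    """Convert an (R, G, B) tuple of floats in [0, 1] to a '#rrggbb' string."""
    for component in rgb:
        if not 0 <= component <= 1:
            raise ValueError('each component must be between 0 and 1')
    # Scale each component to 0..255 and format it as two hex digits
    return '#' + ''.join(f'{round(c * 255):02x}' for c in rgb)


print(rgb_to_hex((0, 0, 0.6)))  # #000099
print(rgb_to_hex((1, 1, 1)))    # #ffffff (white: all components at their lightest)
print(rgb_to_hex((0, 0, 0)))    # #000000 (black: all components at their darkest)
```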
Color mapping
A color map is usually a gradient made up of a series of colors. In a visualization, a color map can reveal patterns in the data, for example by drawing small values in light colors and large values in dark colors.
Here is a very simple example that maps the magnitude of the y-coordinate to a color.
# 1 to 100, matching the axis limits below
x_values = list(range(1, 101))
y_values = [x**2 for x in x_values]
# Color map: shade the points from light to dark according to their y value, using blues
plt.scatter(x_values, y_values, c=y_values, cmap=plt.cm.Blues, edgecolors='none', s=5)
plt.axis([0, 110, 0, 11000])
# Instead of show(), save the picture to the directory where the file is located;
# bbox_inches='tight' trims the extra white margins
plt.savefig('squares_plot.png', bbox_inches='tight')
As you can see, the dots with a small Y-value are very light and almost invisible; as the Y value increases, the color becomes darker.
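The idea behind this mapping can be sketched in plain Python. By default, the values passed to c are rescaled linearly so that the smallest becomes 0 and the largest becomes 1, and the color map then turns each rescaled value into a color. The two-color gradient below (simple_blues) is a hypothetical stand-in for illustration only; the real Blues color map uses a more carefully designed sequence of colors.

```python
def normalize(values):
    """Linearly rescale values so the smallest maps to 0.0 and the largest to 1.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def simple_blues(t):
    """Hypothetical two-color gradient: white at t=0, dark blue at t=1."""
    light = (1.0, 1.0, 1.0)
    dark = (0.0, 0.0, 0.5)
    return tuple(a + (b - a) * t for a, b in zip(light, dark))


y_values = [x**2 for x in range(1, 6)]  # 1, 4, 9, 16, 25
shades = [simple_blues(t) for t in normalize(y_values)]
for y, shade in zip(y_values, shades):
    print(y, shade)
```

The smallest y value gets the lightest shade and the largest gets the darkest, which is why the first points of the plot are nearly invisible against a white background.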
Random Walk Simulation
Start by writing a random walk class to randomly select the direction of the move.
from random import choice


def get_step():
    """Get one step of the walk."""
    # Direction: the positive or the negative half of the axis
    direction = choice([1, -1])
    # Randomly choose a distance
    distance = choice([0, 1, 2, 3, 4])
    step = direction * distance
    return step


class RandomWalk:
    """A class that generates random walk data."""

    # Walk 5000 steps by default
    def __init__(self, num_points=5000):
        self.num_points = num_points
        self.x_values = [0]
        self.y_values = [0]

    def fill_walk(self):
        """Calculate all the points in the random walk."""
        while len(self.x_values) < self.num_points:
            x_step = get_step()
            y_step = get_step()
            # No displacement: skip
            if x_step == 0 and y_step == 0:
                continue
            # Next point = previous position + displacement; the first point is (0, 0)
            next_x = self.x_values[-1] + x_step
            next_y = self.y_values[-1] + y_step
            self.x_values.append(next_x)
            self.y_values.append(next_y)
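As a quick sanity check of the stepping rule, the sketch below restates the same logic as a standalone function and verifies a few properties of the result. The function name random_walk, the explicit random.Random instance, and the seed value are assumptions made for this illustration so that the walk is reproducible; they are not part of the class above.

```python
import random


def random_walk(num_points, rng):
    """Return the x and y coordinates of a walk with num_points points."""
    xs, ys = [0], [0]
    while len(xs) < num_points:
        # Signed step along each axis: direction (+1 or -1) times a distance of 0..4
        x_step = rng.choice([1, -1]) * rng.choice([0, 1, 2, 3, 4])
        y_step = rng.choice([1, -1]) * rng.choice([0, 1, 2, 3, 4])
        if x_step == 0 and y_step == 0:
            continue  # no movement, draw again
        xs.append(xs[-1] + x_step)
        ys.append(ys[-1] + y_step)
    return xs, ys


rng = random.Random(42)  # fixed seed (arbitrary) so the walk is reproducible
xs, ys = random_walk(1000, rng)

# Displacement between each pair of consecutive points
steps = [(x2 - x1, y2 - y1) for x1, x2, y1, y2 in zip(xs, xs[1:], ys, ys[1:])]

print(len(xs), (xs[0], ys[0]))                           # 1000 (0, 0)
print(all(step != (0, 0) for step in steps))             # True: every step moves
print(all(abs(d) <= 4 for step in steps for d in step))  # True: at most 4 per axis
```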
Start drawing
import matplotlib.pyplot as plt

rw = RandomWalk()
rw.fill_walk()
# figure must be called before plot or scatter; it sets the resolution and the image size (in inches)
# plt.figure(dpi=300, figsize=(10, 6))
# This list records the order of the points: the first element is the start of the walk, the last is the end
point_numbers = list(range(rw.num_points))
# Use a color map to shade the points: light points were visited early, dark ones later,
# which reveals the path of the walk
plt.scatter(rw.x_values, rw.y_values, c=point_numbers, cmap=plt.cm.Blues, s=1)
# Highlight the starting point
plt.scatter(0, 0, c='green', edgecolors='none', s=50)
# Highlight the end point
plt.scatter(rw.x_values[-1], rw.y_values[-1], c='red', s=50)
# Hide the axes (plt.gca() returns the current axes; in recent Matplotlib, plt.axes() would create a new one)
plt.gca().get_xaxis().set_visible(False)
plt.gca().get_yaxis().set_visible(False)
plt.show()
The generated picture is a dense cloud of dots, and it looks pretty good from a distance. The green dot is the start of the walk and the red dot is the end.
But the picture is a little blurry, so uncomment the plt.figure line that follows rw.fill_walk(). plt.figure is normally called before any drawing.
In plt.figure(dpi=300, figsize=(10, 6)), dpi=300 means 300 pixels per inch; raising this value gives a sharper picture. figsize=(10, 6) is a tuple that gives the size of the drawing window, and therefore of the picture, in inches.
A big high-definition picture. Nice, isn't it?
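The relationship between figsize, dpi, and the pixel dimensions of the figure is simple arithmetic: inches multiplied by pixels per inch. The helper pixel_size below is hypothetical and only illustrates the calculation; it does not call Matplotlib.

```python
def pixel_size(figsize, dpi):
    """Return (width, height) in pixels for a figure size in inches at a given dpi."""
    width_in, height_in = figsize
    return round(width_in * dpi), round(height_in * dpi)


print(pixel_size((10, 6), 300))  # (3000, 1800)
print(pixel_size((10, 6), 100))  # (1000, 600)
```

Tripling the dpi at the same figsize triples the pixel count along each side, which is where the extra sharpness comes from.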
Working with CSV data
We often need to analyze data provided by others, typically in one of two formats: JSON or CSV. The file sitka_weather_2014.csv contains the 2014 weather data for Sitka in the United States. Here the CSV file is processed with Matplotlib; the JSON file is handled separately, with Pygal.
Download the data sitka_weather_2014.csv
The first line of the CSV file is usually the header, and the real data starts at the second line. Let's start by looking at what data the table header contains.
import csv

filename = 'F:/jupyter notebook/matplotlib_pygal_csv_json/sitka_weather_2014.csv'
with open(filename) as f:
    reader = csv.reader(f)
    # Calling next() once returns the first row, which is the header
    header_row = next(reader)

for index, column_header in enumerate(header_row):
    print(index, column_header)
The output is as follows:
0 AKST
1 Max TemperatureF
2 Mean TemperatureF
3 Min TemperatureF
4 Max Dew PointF
5 MeanDew PointF
6 Min DewpointF
7 Max Humidity
8 Mean Humidity
9 Min Humidity
...
We are interested in the highest and lowest temperatures, so we only need the data from columns 1 and 3. The date is in column 0.
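Rather than hard-coding the column numbers, the indexes can also be looked up from the header row by name. The sketch below runs on a hypothetical two-line sample held in memory that mimics the first four columns of the real file; the data values are made up for illustration.

```python
import csv
import io

# Hypothetical sample that mimics the first columns of the real file
sample = io.StringIO(
    "AKST,Max TemperatureF,Mean TemperatureF,Min TemperatureF\n"
    "2014-1-1,46,42,37\n"
)
reader = csv.reader(sample)
header_row = next(reader)

# Map each column name to its index
column_index = {name: i for i, name in enumerate(header_row)}
date_col = column_index['AKST']
high_col = column_index['Max TemperatureF']
low_col = column_index['Min TemperatureF']
print(date_col, high_col, low_col)  # 0 1 3

first_row = next(reader)
print(first_row[date_col], first_row[high_col], first_row[low_col])  # 2014-1-1 46 37
```

Looking columns up by name keeps the code working if the column order ever changes, at the cost of depending on the exact header spelling.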
The next step is not difficult. Starting from the second line, put the highest temperature into the highs list, the lowest temperature into the lows list, and the date into the dates list. Because we want to show the dates on the x-axis, we bring in the datetime module.
import csv
import matplotlib.pyplot as plt
from datetime import datetime

filename = 'F:/jupyter notebook/matplotlib_pygal_csv_json/sitka_weather_2014.csv'
with open(filename) as f:
    reader = csv.reader(f)
    # Calling next() once returns the first row, which is the header
    header_row = next(reader)
    # Column 1 is the highest temperature. Because next() above already read one line,
    # iteration here starts from the second line, which is also the first data row.
    # reader can only be read once, so written the following way, dates would be empty
    # highs = [int(row[1]) for row in reader]
    # dates = [row[0] for row in reader]
    dates, highs, lows = [], [], []
    for row in reader:
        # Catch the exception raised when a field is empty
        try:
            date = datetime.strptime(row[0], '%Y-%m-%d')
            # Column 1: highest temperature; it is read as a string, so convert it to int
            high = int(row[1])
            # Column 3: lowest temperature
            low = int(row[3])
        except ValueError:
            # Print the raw date field, since date may not have been parsed
            print(row[0], 'missing data')
        else:
            dates.append(date)
            highs.append(high)
            lows.append(low)

# figure must be called before plot
fig = plt.figure(dpi=300, figsize=(10, 6))
# Line chart of the highest temperatures
plt.plot(dates, highs, c='red')
# Line chart of the lowest temperatures
plt.plot(dates, lows, c='blue')
# Fill the area between the two sets of y values; facecolor is the fill color and alpha
# is the transparency, where 0.1 means a very faint, almost transparent color
plt.fill_between(dates, highs, lows, facecolor='blue', alpha=0.1)
plt.title('Daily high and low temperatures - 2014', fontsize=20)
plt.xlabel('', fontsize=16)
plt.ylabel('Temperature (F)', fontsize=16)
# Draw the x-axis dates at a slant
fig.autofmt_xdate()
plt.tick_params(axis='both', labelsize=15)
plt.show()
As the plot shows, July through September are hot, but there were also some very high temperatures in May!
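Two details of the reading loop can be tried out in isolation: a csv reader can only be consumed once, and rows with empty fields are skipped through the try/except/else structure. The sketch below uses hypothetical in-memory sample data, with made-up values and one deliberately incomplete row, instead of the real file.

```python
import csv
import io
from datetime import datetime

# Hypothetical sample data; the second data row has empty temperature fields
text = (
    "AKST,Max TemperatureF,Mean TemperatureF,Min TemperatureF\n"
    "2014-1-1,46,42,37\n"
    "2014-1-2,,,\n"
    "2014-1-3,45,41,37\n"
)

reader = csv.reader(io.StringIO(text))
next(reader)                              # skip the header
first_pass = [row[0] for row in reader]   # consumes every remaining row
second_pass = [row[0] for row in reader]  # nothing is left to read
print(len(first_pass), len(second_pass))  # 3 0

dates, highs, lows, skipped = [], [], [], []
reader = csv.reader(io.StringIO(text))    # start over with a fresh reader
next(reader)
for row in reader:
    try:
        date = datetime.strptime(row[0], '%Y-%m-%d')
        high = int(row[1])                # int('') raises ValueError
        low = int(row[3])
    except ValueError:
        skipped.append(row[0])
    else:
        dates.append(date)
        highs.append(high)
        lows.append(low)

print(highs, lows, skipped)               # [46, 45] [37, 37] ['2014-1-2']
```

Because the appends sit in the else branch, a row is either stored completely or not at all, so the three lists always stay the same length.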
The plotting code above contains the line date = datetime.strptime(row[0], '%Y-%m-%d'). Note that the format '%Y-%m-%d' must match the format of the string in row[0]. For example:
# The following line raises an error: time data '2017/6/22' does not match format '%Y-%m-%d'
print(datetime.strptime('2017/6/22', '%Y-%m-%d'))
print(datetime.strptime('2017-6-22', '%Y-%m-%d'))
%Y is a four-digit year, %y is a two-digit year, %m is the month as a number, and %d is the day of the month as a number.
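The format codes can be tried out with a small wrapper that returns None instead of raising when the string and the format disagree. The helper try_parse is hypothetical and exists only for this illustration.

```python
from datetime import datetime


def try_parse(text, fmt):
    """Return the parsed datetime, or None if text does not match fmt."""
    try:
        return datetime.strptime(text, fmt)
    except ValueError:
        return None


print(try_parse('2017-6-22', '%Y-%m-%d'))  # 2017-06-22 00:00:00
print(try_parse('2017/6/22', '%Y-%m-%d'))  # None: separators do not match
print(try_parse('2017/6/22', '%Y/%m/%d'))  # 2017-06-22 00:00:00
print(try_parse('17-6-22', '%y-%m-%d'))    # 2017-06-22 00:00:00 (two-digit year)
```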
by @sunhaiyu
2017.6.22