Python Data Analysis Notes: Data Visualization

Source: Internet
Author: User

Getting started with Matplotlib plotting

To draw a basic plot with Matplotlib, call the plot() function in the matplotlib.pyplot subpackage.

import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 20)  # the stop value is missing in the source; 20 is an assumption
plt.plot(x, .5 + x)
plt.plot(x, 1 + 2 * x, '--')
plt.show()

Logarithmic plots

A logarithmic plot is a plot drawn on logarithmic axes. On a logarithmic scale, equal intervals represent equal changes in the order of magnitude of the variable, which is very different from a linear scale. There are two types of logarithmic plot. A log-log plot uses a logarithmic scale on both axes; the corresponding Matplotlib function is matplotlib.pyplot.loglog(). A semi-log plot uses a linear scale on one axis and a logarithmic scale on the other; the corresponding Matplotlib functions are semilogx() and semilogy(). A power law appears as a straight line on a log-log plot, and an exponential law appears as a straight line on a semi-log plot.
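As a minimal sketch of the last point, the following draws a power law on log-log axes and an exponential on semi-log axes. The data, the coefficients, and the output file name are made up for illustration, and the Agg backend is selected only so the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to use plt.show()
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(1, 10, 50)
power_law = 3 * x ** 2               # y = a * x^k (illustrative values)
exponential = 2 * np.exp(0.5 * x)    # y = a * e^(b * x) (illustrative values)

fig, (ax_loglog, ax_semilog) = plt.subplots(1, 2, figsize=(8, 3))

# Both axes logarithmic: the power law is drawn as a straight line.
ax_loglog.loglog(x, power_law)
ax_loglog.set_title("Power law, log-log axes")

# Only the y axis logarithmic: the exponential is drawn as a straight line.
ax_semilog.semilogy(x, exponential)
ax_semilog.set_title("Exponential, semi-log axes")

fig.savefig("log_scales.png")
```

The straight lines follow from taking logarithms: log(a * x^k) is linear in log(x), and log(a * e^(b * x)) is linear in x.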

NumPy's polyfit() function fits a polynomial to data.

NumPy's polyval() function evaluates the polynomial obtained from the fit.
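A small sketch of how the two functions might be combined with a logarithmic transform: fit a straight line to the logarithm of the data, then evaluate it. The data here is hypothetical and follows an exact exponential so that the fitted coefficients are easy to check.

```python
import numpy as np

# Hypothetical data following y = 2 * exp(0.3 * x), for illustration only.
x = np.arange(0, 10)
y = 2 * np.exp(0.3 * x)

# Fit a degree-1 polynomial to log(y): log(y) = slope * x + intercept.
coeffs = np.polyfit(x, np.log(y), 1)
slope, intercept = coeffs

# Evaluate the fitted polynomial and transform back to the original scale.
fitted = np.exp(np.polyval(coeffs, x))

print(slope, np.exp(intercept))
```

polyfit() returns the coefficients from the highest degree down, so for a degree-1 fit the slope comes first and the intercept second.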


Scatter plots

A scatter plot shows the relationship between two variables in a Cartesian coordinate system. The position of each data point is given by the values of the two variables, so any relationship between them shows up as a pattern in the plot. An upward trend usually implies a positive correlation. A bubble chart is an extension of the scatter plot: each data point is drawn as a bubble, and the value of a third variable determines the relative size of the bubble, which is where the name comes from.

Matplotlib's scatter() function draws scatter plots.
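A minimal bubble-chart sketch, assuming randomly generated data: x and y carry an upward trend, and a third variable z is mapped to the marker size through the s parameter of scatter(). The scale factor of 40 and the file name are arbitrary choices.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to use plt.show()
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 30)
y = 2 * x + rng.normal(0, 2, 30)   # upward trend, so x and y are positively correlated
z = rng.uniform(1, 5, 30)          # third variable, shown as bubble size

fig, ax = plt.subplots()
# s sets the marker area in points squared; scaling by 40 keeps the bubbles visible.
bubbles = ax.scatter(x, y, s=40 * z, alpha=0.5)
ax.set_xlabel("x")
ax.set_ylabel("y")

fig.savefig("bubble_chart.png")
```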

Legends and annotations

A plot can carry the following ancillary information:

1. A legend that describes each data series in the plot. Matplotlib's legend() function displays the label given to each data series.

2. Annotations for points in the plot. Matplotlib's annotate() function creates them. An annotation consists of a label and an arrow, and the function takes several parameters that describe the style and position of both.

3. Labels for the horizontal and vertical axes. These are drawn with the xlabel() and ylabel() functions.

4. A descriptive title, usually set with Matplotlib's title() function.

5. A grid, which makes it easier to locate data points. Matplotlib's grid() function switches the grid on or off.
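The five items above can be sketched in one short script. The curves, label texts, and annotation position are chosen only for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to use plt.show()
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 100)

plt.figure()
# The label of each series is what legend() displays.
plt.plot(x, np.sin(x), label="sin(x)")
plt.plot(x, np.cos(x), "--", label="cos(x)")

# xy is the annotated point, xytext the label position, arrowprops the arrow style.
plt.annotate("maximum of sin(x)", xy=(np.pi / 2, 1.0), xytext=(3.0, 0.8),
             arrowprops=dict(arrowstyle="->"))

plt.xlabel("x")
plt.ylabel("value")
plt.title("Sine and cosine")
plt.grid(True)
plt.legend(loc="lower left")

ax = plt.gca()
plt.savefig("annotated_plot.png")
```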

Three-dimensional plots

Axes3D is a class provided by Matplotlib for drawing three-dimensional plots. Seeing how this class is used also helps in understanding the object-oriented Matplotlib API. Matplotlib's Figure class is the top-level container for all plot elements.

1. Create a Figure object

fig = plt.figure()

2. Create an Axes3D object from the Figure object (recent Matplotlib releases recommend fig.add_subplot(projection='3d') for this step)

ax = Axes3D(fig)

3. Create the coordinate matrices with NumPy's meshgrid() function

X, Y = np.meshgrid(x, y)

4. Draw the data with the plot_surface() method of the Axes3D class

ax.plot_surface(X, Y, Z)

5. Following the naming convention of the object-oriented API, setter methods start with set_ and end with the name of the corresponding procedural (pyplot) function, as follows:

ax.set_xlabel('Year')
ax.set_ylabel('Log1')
ax.set_zlabel('Log2')
ax.set_title('66666')
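Put together, the five steps might look like the sketch below. The surface function, the grid range, and the labels are made up for illustration, and the axes are created with fig.add_subplot(projection="3d"), which is an assumption about the installed Matplotlib version rather than the call shown in the notes.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to use plt.show()
import matplotlib.pyplot as plt
import numpy as np
from mpl_toolkits.mplot3d import Axes3D  # registers the 3D projection on older Matplotlib

# 1. Create a Figure object.
fig = plt.figure()

# 2. Create 3D axes on the figure.
ax = fig.add_subplot(projection="3d")

# 3. Build the coordinate matrices.
x = np.linspace(-3, 3, 40)
y = np.linspace(-3, 3, 40)
X, Y = np.meshgrid(x, y)
Z = np.exp(-(X ** 2 + Y ** 2) / 4)  # hypothetical surface

# 4. Draw the surface.
ax.plot_surface(X, Y, Z)

# 5. Use the set_ methods for labels and title.
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_zlabel("z")
ax.set_title("Gaussian bump")

fig.savefig("surface.png")
```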

Plotting with pandas

The plot() methods of the pandas Series and DataFrame classes wrap the corresponding Matplotlib functions.

To create a semi-log plot, pass the additional logy parameter:

df.plot(logy=True)

To create a scatter plot, set the kind parameter to 'scatter' and specify the two columns to plot. If the loglog parameter is also set to True, the result is a log-log plot:

df[df['gpu_trans_count'] > 0].plot(kind='scatter', x='trans_count', y='gpu_trans_count', loglog=True)
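The notes do not show how df is built, so the sketch below uses a small hypothetical DataFrame with the same column names. The values are invented and only serve to make both calls runnable.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to use plt.show()
import numpy as np
import pandas as pd

# Hypothetical data: counts that double every year, with one zero GPU entry.
years = np.arange(2000, 2010)
df = pd.DataFrame(
    {
        "trans_count": 1e6 * 2.0 ** np.arange(10),
        "gpu_trans_count": np.r_[0.0, 5e5 * 2.0 ** np.arange(1, 10)],
    },
    index=years,
)

# Semi-log plot: linear x axis (the index), logarithmic y axis.
ax_semilog = df.plot(logy=True)

# Log-log scatter plot; rows with a zero GPU count are dropped first
# because zero cannot be shown on a logarithmic axis.
ax_scatter = df[df["gpu_trans_count"] > 0].plot(
    kind="scatter", x="trans_count", y="gpu_trans_count", loglog=True
)

ax_scatter.figure.savefig("pandas_scatter.png")
```

Both calls return a Matplotlib Axes object, so labels, titles, and legends can be adjusted afterwards with the usual set_ methods.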

Lag plots

A lag plot is a scatter plot of a time series against the same series shifted along the time axis by a given lag.

The lag_plot() function draws lag plots. In current pandas releases it is found in pandas.plotting; older versions exposed it in the pandas.tools.plotting subpackage.

lag_plot(df['trans_count'])
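A minimal runnable sketch, assuming a current pandas release where lag_plot() is imported from pandas.plotting. The series is a hypothetical random walk standing in for df['trans_count'].

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to use plt.show()
import numpy as np
import pandas as pd
from pandas.plotting import lag_plot

# Hypothetical random walk; consecutive values are strongly related.
rng = np.random.default_rng(0)
series = pd.Series(np.cumsum(rng.normal(size=200)))

# Scatter plot of each value against the value one step later.
ax = lag_plot(series, lag=1)

ax.figure.savefig("lag_plot.png")
```

Points clustered along the diagonal suggest that each value depends on the previous one, while a structureless cloud suggests the series is close to random.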

Autocorrelation plots

An autocorrelation plot shows the autocorrelation of time series data at different lags. Autocorrelation is the correlation between a time series and the same series shifted by a lag. The autocorrelation_plot() function draws autocorrelation plots. In current pandas releases it is found in pandas.plotting; older versions exposed it in the pandas.tools.plotting subpackage.
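A short sketch, again assuming the pandas.plotting import path. The series is a hypothetical sine wave with a period of 20 samples, chosen so that the autocorrelation is easy to predict; Series.autocorr() is used alongside the plot to compute the value at individual lags.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to use plt.show()
import numpy as np
import pandas as pd
from pandas.plotting import autocorrelation_plot

# Hypothetical periodic series with a period of 20 samples.
t = np.arange(200)
series = pd.Series(np.sin(2 * np.pi * t / 20))

# Autocorrelation for every lag, drawn as a curve.
ax = autocorrelation_plot(series)

# Autocorrelation at single lags: one full period and half a period.
full_period = series.autocorr(lag=20)
half_period = series.autocorr(lag=10)

ax.figure.savefig("autocorrelation_plot.png")
```

For this series the correlation is close to 1 at a lag of one full period and close to -1 at half a period, which is the oscillating pattern an autocorrelation plot shows for periodic data.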
