Visualization of data (ii)

Source: Internet
Author: User

Article source: https://www.dataquest.io/mission/132/data-visualization-and-exploration

Data source: https://github.com/fivethirtyeight/data/blob/master/college-majors/recent-grads.csv

This article describes simple ways to explore the relationships between variables in a data set.

Raw data overview: this is a salary survey of recent college graduates. The important fields are Major (name of the major), Major_category (category of the major), Sample_size (sample size), ShareWomen (share of women), and Total (total number of people in the major).

import pandas as pd
recent_grads = pd.read_csv('recent-grads.csv')
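Before plotting, it can help to confirm that the fields listed above are present and numeric where expected. The sketch below is only an illustration: it reads a hypothetical two-row sample with invented values (not rows from recent-grads.csv) so that it runs without the real file. The same calls should work on recent_grads.

```python
import io
import pandas as pd

# Hypothetical two-row sample that imitates the layout of recent-grads.csv;
# the majors and numbers are invented for illustration.
sample_csv = io.StringIO(
    "Major,Major_category,Total,Sample_size,ShareWomen,Median\n"
    "EXAMPLE MAJOR A,Engineering,1000,20,0.25,60000\n"
    "EXAMPLE MAJOR B,Arts,2000,40,0.75,30000\n"
)
sample = pd.read_csv(sample_csv)

print(sample.columns.tolist())  # field names
print(sample.dtypes)            # data type of each column
print(sample.describe())        # summary statistics for the numeric columns
```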

Histogram

To make a histogram, first divide the range of values on the x-axis into multiple intervals (bins), then count the number of values that fall into each interval, and use that count as the y-axis value. pandas provides the pandas.DataFrame.hist() method for this.
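The binning step described above can be sketched by hand with numpy.histogram, which returns the counts and the interval edges without drawing anything. The values below are made up for illustration and are not taken from the data set.

```python
import numpy as np

# Made-up values for illustration only; not taken from recent-grads.csv.
values = np.array([22, 25, 31, 36, 40, 40, 45, 52, 60, 110])

# Split the range [min, max] into 4 equal-width intervals and count
# how many values fall into each one.
counts, edges = np.histogram(values, bins=4)

# Each interval includes its left edge; only the last one also includes
# its right edge (the maximum).
for left, right, count in zip(edges[:-1], edges[1:], counts):
    print(f"{left:.0f} to {right:.0f}: {count}")
```

These counts are the bar heights that a histogram of the same values with bins=4 would show.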

# Create a histogram of the median salary (the Median column)
recent_grads.hist('Median')

# By default hist() splits the range into 10 equal-width bins and draws grid lines; here we use 20 bins and turn the grid off
recent_grads.hist('Median', bins=20, grid=False)

# You can also draw several histograms at once. layout=(2, 1) arranges the two graphs in two rows and one column; without it, the two graphs are placed side by side in one row
columns = ['Median', 'Sample_size']
recent_grads.hist(column=columns, layout=(2, 1), grid=False)

Box plot

A box plot is a graphical summary of the data based on the five-number summary (minimum, first quartile, second quartile (median), third quartile, maximum). It also uses the interquartile range, IQR = third quartile - first quartile. For more details, please search online.
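As a rough sketch, the five-number summary and the IQR can be computed directly with pandas. The values are made up for illustration, and the quartiles use the default linear interpolation of Series.quantile(); other quartile conventions can give slightly different numbers.

```python
import pandas as pd

# Made-up values for illustration only.
values = pd.Series([1, 2, 3, 4, 5, 6, 7, 8, 9])

minimum = values.min()
q1 = values.quantile(0.25)      # first quartile
median = values.quantile(0.50)  # second quartile
q3 = values.quantile(0.75)      # third quartile
maximum = values.max()

# Interquartile range: the height of the box in a box plot.
iqr = q3 - q1

print(minimum, q1, median, q3, maximum, iqr)
```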

Box plots are drawn with the pandas.DataFrame.boxplot() method.

import matplotlib.pyplot as plt
# Select two columns of data
sample_size = recent_grads[['Sample_size', 'Major_category']]
# Draw one box per major category
sample_size.boxplot(by='Major_category')
# Rotate the x-axis tick labels 90 degrees so that they are displayed vertically
plt.xticks(rotation=90)

Combining multiple plots

To look for correlations between several variables, you can plot them on the same graph and compare how they change.

# Draw two scatter plots on the same axes (color-coded) to see whether the variables are related
import matplotlib.pyplot as plt
plt.scatter(recent_grads['Unemployment_rate'], recent_grads['Median'], color='red')
plt.scatter(recent_grads['ShareWomen'], recent_grads['Median'], color='blue')
plt.show()
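A scatter plot shows the relationship only visually. As an optional numeric check, which is not part of the original walkthrough, the Pearson correlation coefficient can be computed with Series.corr(). The sketch below uses made-up numbers, so the result says nothing about the real data; only the column names mirror recent-grads.csv.

```python
import pandas as pd

# Made-up numbers for illustration only; not taken from recent-grads.csv.
df = pd.DataFrame({
    "ShareWomen": [0.1, 0.3, 0.5, 0.7, 0.9],
    "Median": [60000, 52000, 45000, 38000, 30000],
})

# Pearson correlation: close to -1 or +1 means a strong linear relationship,
# close to 0 means little or no linear relationship.
r = df["ShareWomen"].corr(df["Median"])
print(round(r, 3))
```

With the real data, the same call would be recent_grads['ShareWomen'].corr(recent_grads['Median']).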
