In my local mysql_local_db database, I created a test table, pandastest, to learn the pandas module.
1. Create a table
CREATE TABLE pandastest (
    city VARCHAR(255),
    user_id INT(19),
    order_date DATETIME,
    amount DECIMAL(19,4),
    amount_range VARCHAR(255),
    order_number INT(19),
    last_order_date DATE,
    days_since_last_order INT(19),
    last_amount DECIMAL(19,4),
    last_order_interval VARCHAR(255),
    category INT,
    KEY city (city),
    KEY res_id (user_id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
2. Import the test data with Kettle
3. Run a SQL query to check the data
4. Write the code in PyCharm
The script uses the pandas module to read from the MySQL database, the NumPy module to create an array, and matplotlib to draw the chart.
The code is as follows:
# coding: utf-8
import sys
import MySQLdb
from datetime import datetime
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

# Set the system default encoding to utf-8 (Python 2 only)
reload(sys)
sys.setdefaultencoding('utf-8')

# Connect to MySQL; conn is passed as the second argument of pandas' read_sql
conn = MySQLdb.connect(host='127.0.0.1', user='root', passwd='password',
                       port=3306, db='local_db', charset='utf8')
sql = ("SELECT city, user_id, order_date, amount, amount_range, order_number "
       "FROM pandastest WHERE order_date < '2016-12-26' LIMIT 10000")

# read_sql returns the table (header and data). It takes two arguments:
# the SQL to run (the sql variable here, not a string literal) and con=conn
df = pd.read_sql(sql, con=conn)
conn.close()

# Practice: split the order_date column of df into year, month and day.
# The loop walks over the dates; strftime converts each date to a string
# so that it can be split on '-'
date_time = pd.DataFrame(
    [x.strftime("%Y-%m-%d").split('-') for x in df['order_date']],
    columns=['year', 'month', 'day'])

# Merge the split date columns with df side by side, joining on the index
df = pd.merge(df, date_time, right_index=True, left_index=True)
print df

# Aggregate by amount range: number of rows in each range
jinequjian = df.groupby('amount_range')['amount_range'].agg(len)
print jinequjian

# Chart font: STXihei (a Chinese font), size 11
plt.rc('font', family='STXihei', size=11)

# Create a one-dimensional array for the bar positions
a = np.array([1, 2, 3, 4])

# Horizontal bar chart. The data source is jinequjian (the amount-range
# summary); also set the color, transparency, centering and bar border
plt.barh([1, 2, 3, 4], jinequjian, color='#052B6C', alpha=0.8,
         align='center', edgecolor='white')

# y-axis title
plt.ylabel('Amount range')
# x-axis title
plt.xlabel('Number of customers')
# x-axis limits
plt.xlim(0, 8000)
# y-axis limits
plt.ylim(0, 6)
# Chart title
plt.title('Customer distribution by amount range')
# Legend and its position
plt.legend(['Number of customers'], loc='upper right')
# Background grid lines: color, style, width and transparency
plt.grid(color='#375589', linestyle='--', linewidth=2, axis='y', alpha=0.4)
# Category labels on the y-axis; they must stay consistent with the
# groups produced by the GROUP BY on amount_range
plt.yticks(a, ('(0 to 500 yuan inclusive)', '(500 to 1000 yuan inclusive)',
               '(1000 to 1500 yuan inclusive)', '(1500 yuan +)'))

# Show the chart
plt.show()
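The read, split and aggregate steps can also be tried without a MySQL server. The following is a minimal sketch, not the original script: it assumes Python 3 and a recent pandas, swaps MySQL for an in-memory SQLite database, and uses a few made-up sample rows and snake_case column names (`user_id`, `order_date`, `amount_range`) that are illustrative only. It also replaces the strftime loop and merge with the `.dt` accessor, and `agg(len)` with `size()`.

```python
import sqlite3

import pandas as pd

# Hypothetical sample rows standing in for the pandastest table.
rows = [
    ("Beijing", 1, "2016-12-01 10:00:00", 120.0, "0-500"),
    ("Shanghai", 2, "2016-12-05 11:30:00", 750.0, "500-1000"),
    ("Beijing", 3, "2016-12-20 09:15:00", 300.0, "0-500"),
    ("Shenzhen", 4, "2016-12-27 14:00:00", 1800.0, "1500+"),
]

# In-memory SQLite database used as a stand-in for the MySQL connection.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pandastest (city TEXT, user_id INTEGER, "
    "order_date TEXT, amount REAL, amount_range TEXT)"
)
conn.executemany("INSERT INTO pandastest VALUES (?, ?, ?, ?, ?)", rows)

# Pass the cutoff date as a query parameter instead of embedding it in the
# SQL string, and let pandas parse order_date into datetime values.
sql = (
    "SELECT city, user_id, order_date, amount, amount_range "
    "FROM pandastest WHERE order_date < ? LIMIT 10000"
)
df = pd.read_sql(sql, con=conn, params=("2016-12-26",),
                 parse_dates=["order_date"])
conn.close()

# Split the date into year, month and day columns without a Python loop.
df["year"] = df["order_date"].dt.year
df["month"] = df["order_date"].dt.month
df["day"] = df["order_date"].dt.day

# Count the rows in each amount range.
counts = df.groupby("amount_range").size()

print(df)
print(counts)
```

With the `.dt` accessor the year, month and day columns hold integers rather than strings, and no index-based merge is needed because the new columns are assigned directly onto `df`.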
Python 2.7: connecting pandas to MySQL for data processing (2016-12-29)