Tushare is a free, open-source Python package for accessing financial data. It handles the whole path from collecting stock and other financial data, through cleaning and processing, to storage, and gives financial analysts fast, tidy, and varied data that is easy to analyze. This greatly reduces the effort spent on sourcing data and lets analysts concentrate on researching and implementing strategies and models. Given the strengths of the pandas package in quantitative financial analysis, most of the data Tushare returns comes as pandas DataFrame objects, which are easy to analyze and visualize with pandas, NumPy, and Matplotlib.
It provides access to the following major categories of stock market data: transaction data, investment reference data, stock classification data, fundamental data, top-list (billboard) data, macroeconomic data, news and event data, and interbank interest rates. Each major category is subdivided into a number of smaller ones.
I. Installation and upgrade
As with other Python modules, Tushare can be installed with pip, with easy_install, or from the source package.
Method 1: pip install tushare
Method 2: download the package from https://pypi.python.org/pypi/tushare/ and install it
The source repository on GitHub shows that the author is very diligent and updates the package frequently, so you can also upgrade with:
pip install tushare --upgrade
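After installing or upgrading, it can help to confirm which version is actually active. The following is a minimal sketch that uses only the Python standard library (Python 3.8 or later); the helper name installed_version is made up for this example and is not part of Tushare.

```python
from importlib import metadata


def installed_version(package="tushare"):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print("tushare is not installed; run: pip install tushare")
    else:
        print("tushare", version)
```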
II. Data acquisition
The most commonly used trading data serve as examples in this summary.
1. Historical data
import tushare as ts
ts.get_hist_data('600848')  # get all daily k-line data at once
ts.get_hist_data('600848', start='2015-05-01', end='2015-06-18')  # specify a date range
ts.get_hist_data('600848', ktype='W')  # weekly k-line data
ts.get_hist_data('600848', ktype='M')  # monthly k-line data
ts.get_hist_data('600848', ktype='5')  # 5-minute k-line data
ts.get_hist_data('600848', ktype='15')  # 15-minute k-line data
ts.get_hist_data('600848', ktype='30')  # 30-minute k-line data
ts.get_hist_data('600848', ktype='60')  # 60-minute k-line data
ts.get_hist_data('sh')  # Shanghai Composite Index k-line data; the other parameters work the same as for a single stock
ts.get_hist_data('sz')  # Shenzhen Component Index k-line data
ts.get_hist_data('hs300')  # CSI 300 Index k-line data
ts.get_hist_data('sz50')  # SSE 50 Index k-line data
ts.get_hist_data('zxb')  # SME Board Index k-line data
ts.get_hist_data('cyb')  # ChiNext Index k-line data
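The ktype codes used in the calls above can be collected in a small lookup table, which helps catch a typo before a request is sent. This is only a sketch: KTYPES and hist_kwargs are hypothetical helpers rather than part of Tushare, 'D' for daily bars is assumed to be the default because the first call returns daily data without a ktype argument, and the keyword name code for the first parameter is likewise an assumption.

```python
# ktype codes taken from the calls above; 'D' (daily) is an assumption.
KTYPES = {
    "D": "daily",
    "W": "weekly",
    "M": "monthly",
    "5": "5-minute",
    "15": "15-minute",
    "30": "30-minute",
    "60": "60-minute",
}


def hist_kwargs(code, ktype="D", start=None, end=None):
    """Build keyword arguments for a history request, rejecting unknown ktype codes."""
    if ktype not in KTYPES:
        raise ValueError("unknown ktype %r, expected one of %s" % (ktype, sorted(KTYPES)))
    kwargs = {"code": code, "ktype": ktype}
    # Only include the date bounds that were actually given.
    if start is not None:
        kwargs["start"] = start
    if end is not None:
        kwargs["end"] = end
    return kwargs


print(hist_kwargs("600848", ktype="W"))
```

If the assumptions hold, the result could be passed on as ts.get_hist_data(**hist_kwargs('600848', ktype='W')).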
I do not yet understand the concept of price adjustment for rights and dividends (复权), so it is skipped here. Next, real-time data.
2. Real-time data
Get the quotes of every stock for the current trading day at once; a single stock cannot be specified:
import tushare as ts
ts.get_today_all()
Historical ticks and real-time ticks (trade-by-trade data):
import tushare as ts
df = ts.get_tick_data('600848', date='2014-01-09')  # historical ticks for one day
df.head(10)
df = ts.get_today_ticks('601333')  # ticks so far for the current day
df.head(10)
df = ts.get_realtime_quotes('000581')  # single stock symbol
df[['code', 'name', 'price', 'bid', 'ask', 'volume', 'amount', 'time']]
# symbols from a list
ts.get_realtime_quotes(['600848', '000980', '000981'])
# from a Series
ts.get_realtime_quotes(df['code'].tail(10))  # real-time quotes for 10 stocks at once
3. Market indices
import tushare as ts
df = ts.get_index()
4. New share data
Get data on newly issued shares:
import tushare as ts
ts.new_stocks()
5. Fundamental data
Fundamental data cover many of the indicators used in stock selection, such as the P/E ratio, price-to-book ratio, earnings per share, net profit, quarterly reports, accounts receivable turnover, net profit growth rate (%), current ratio, quick ratio, and cash flow ratio.
import tushare as ts
ts.get_stock_basics()
# earnings report data for Q1 2015
ts.get_report_data(2015, 1)
# profitability data for Q1 2015
ts.get_profit_data(2015, 1)
# operating capability data for Q1 2015
ts.get_operation_data(2015, 1)
# growth capability data for Q1 2015
ts.get_growth_data(2015, 1)
# debt-paying capability data for Q1 2015
ts.get_debtpaying_data(2015, 1)
# cash flow data for Q1 2015
ts.get_cashflow_data(2015, 1)
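Each of the functions above takes a year and a quarter, so collecting several periods means looping over (year, quarter) pairs. A minimal sketch of generating those pairs follows; quarters_between is a hypothetical helper written for this example and is not part of Tushare.

```python
def quarters_between(start_year, start_quarter, end_year, end_quarter):
    """Yield (year, quarter) pairs from the start period to the end period, inclusive."""
    year, quarter = start_year, start_quarter
    while (year, quarter) <= (end_year, end_quarter):
        yield year, quarter
        quarter += 1
        if quarter > 4:
            # Roll over to the first quarter of the next year.
            year, quarter = year + 1, 1


periods = list(quarters_between(2014, 3, 2015, 1))
print(periods)  # [(2014, 3), (2014, 4), (2015, 1)]
```

Each pair can then be passed on to a call such as ts.get_report_data(year, quarter).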
III. Data storage
Because Tushare returns pandas DataFrames, the data can be saved in the commonly used formats: CSV, Excel, HDF5, JSON, a MySQL relational database, and a NoSQL database.
1. to_csv method
import tushare as ts
df = ts.get_hist_data('000875')
# save everything
df.to_csv('c:/day/000875.csv')
# save selected columns only
df.to_csv('c:/day/000875.csv', columns=['open', 'high', 'low', 'close'])
Sometimes you may want to keep data of the same kind in one large file, which means appending to an existing file, for example:
import tushare as ts
import os
filename = 'c:/day/bigfile.csv'
for code in ['000875', '600848', '000981']:
    df = ts.get_hist_data(code)
    if os.path.exists(filename):
        # the file already exists: append without repeating the header row
        df.to_csv(filename, mode='a', header=False)
    else:
        df.to_csv(filename)
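The append pattern above can be tried without network access by substituting small hand-made frames for the Tushare results. The sketch below does that with pandas and a temporary directory; append_to_csv is a hypothetical helper, and the sample values are invented purely for illustration.

```python
import os
import tempfile

import pandas as pd


def append_to_csv(df, filename):
    """Append df to filename, writing the header row only when the file is new."""
    is_new = not os.path.exists(filename)
    df.to_csv(filename, mode="a", header=is_new)


# Synthetic stand-ins for the frames that ts.get_hist_data would return.
frames = [
    pd.DataFrame({"open": [10.0], "close": [10.5]}, index=pd.Index(["2015-06-18"], name="date")),
    pd.DataFrame({"open": [20.0], "close": [19.5]}, index=pd.Index(["2015-06-18"], name="date")),
]

filename = os.path.join(tempfile.mkdtemp(), "bigfile.csv")
for frame in frames:
    append_to_csv(frame, filename)

with open(filename) as handle:
    lines = handle.read().splitlines()
print(lines)  # one header line followed by one data line per frame
```

Note that the combined file does not record which stock each row came from; adding a code column before saving is one way to keep that information.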
2. to_excel method
import tushare as ts
df = ts.get_hist_data('000875')
# save everything
df.to_excel('c:/day/000875.xlsx')
# set the data position (insert starting from row 3, column 6)
df.to_excel('c:/day/000875.xlsx', startrow=2, startcol=5)
3. to_hdf method
import tushare as ts
df = ts.get_hist_data('000875')
df.to_hdf('c:/day/hdf.h5', key='000875')
Or:
import tushare as ts
from pandas import HDFStore
df = ts.get_hist_data('000875')
store = HDFStore('c:/day/store.h5')
store['000875'] = df
store.close()
4. to_json method
import tushare as ts
df = ts.get_hist_data('000875')
df.to_json('c:/day/000875.json', orient='records')
# or use the JSON string directly
print(df.to_json(orient='records'))
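One detail of orient='records' is worth seeing on a small example: each row becomes one JSON object and the index is left out, so a date index has to be moved into a column first if it should be kept. The sketch below uses a hand-made frame with invented values in place of real Tushare output.

```python
import json

import pandas as pd

# Synthetic frame with a date index, in place of real get_hist_data output.
df = pd.DataFrame(
    {"open": [10.0, 10.4], "close": [10.5, 10.2]},
    index=pd.Index(["2015-06-17", "2015-06-18"], name="date"),
)

# Without reset_index() the dates would be missing from the records.
records = json.loads(df.reset_index().to_json(orient="records"))
print(records[0])
```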
5. to_sql method
from sqlalchemy import create_engine
import tushare as ts
df = ts.get_tick_data('600848', date='2014-12-22')
engine = create_engine('mysql://user:passwd@127.0.0.1/db_name?charset=utf8')
# write to the database
df.to_sql('tick_data', engine)
# append data to an existing table
# df.to_sql('tick_data', engine, if_exists='append')
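To try the create-then-append behaviour without a MySQL server, the same to_sql calls can be pointed at an in-memory SQLite database, which pandas accepts as a plain sqlite3 connection. This is a sketch with invented tick values, not the author's setup.

```python
import sqlite3

import pandas as pd

# Synthetic tick data standing in for ts.get_tick_data output.
df = pd.DataFrame({"time": ["09:30:00", "09:30:05"], "price": [10.1, 10.2], "volume": [5, 8]})

conn = sqlite3.connect(":memory:")
df.to_sql("tick_data", conn, index=False)                      # creates the table
df.to_sql("tick_data", conn, index=False, if_exists="append")  # appends to it

row_count = pd.read_sql("SELECT COUNT(*) AS n FROM tick_data", conn)["n"].iloc[0]
print(row_count)  # 4
conn.close()
```

Without if_exists='append', the second call would raise an error because the table already exists.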
The result is shown in the following figure:
6. Writing to MongoDB
The official examples do not show a way to write to MongoDB directly, but MongoDB accepts JSON-formatted input, so an indirect route works:
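import json
import pymongo
import tushare as ts
# current PyMongo releases use MongoClient and insert_many in place of the older Connection and insert
conn = pymongo.MongoClient('127.0.0.1', port=27017)
df = ts.get_tick_data('600848', date='2014-12-22')
conn.db.tickdata.insert_many(json.loads(df.to_json(orient='records')))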
IV. Plotting
Everything so far has been groundwork; here is something more substantial. Tushare's output is already well structured, so it can be plotted directly with pandas, as follows:
import tushare as ts
import pandas as pd
df = ts.get_hist_data('600415', start='2015-04-01', end='2015-06-18')
# plot all columns
df.plot()
# plot only the daily high
df.high.plot()
# plot four series and set the line colors
# (older pandas releases exposed plot_params as pd.plot_params)
with pd.plotting.plot_params.use('x_compat', True):
    df.open.plot(color='g')
    df.close.plot(color='y')
    df.high.plot(color='r')
    df.low.plot(color='b')
# set the figure size and show a background grid
with pd.plotting.plot_params.use('x_compat', True):
    df.high.plot(color='r', figsize=(10, 4), grid=True)
    df.low.plot(color='b', figsize=(10, 4), grid=True)
The code above draws four charts; only the fourth is shown here:
By default, this approach only displays the chart and does not save it. To save the chart to a chosen location, use Matplotlib's savefig function:
import matplotlib.pyplot as plt
import tushare as ts
import pandas as pd
fig = plt.gcf()  # the figure that the plots below draw on
df = ts.get_hist_data('600415', start='2015-04-01', end='2015-06-18')
with pd.plotting.plot_params.use('x_compat', True):
    df.high.plot(color='r', figsize=(10, 4), grid=True)
    df.low.plot(color='b', figsize=(10, 4), grid=True)
fig.savefig('f:/graph.png')