Using the ARIMA algorithm to predict time series

Last Update:2018-07-28 Source: Internet

Author: User

This article takes the Hongyong China fund (fund code 968006 on fund.jrj.com.cn) as an example: it crawls the fund's net value data and uses the ARIMA algorithm to forecast the time series.

Crawl data

# Crawl the fund's net value history (fund code 968006)
from bs4 import BeautifulSoup
import requests

# Request headers copied from a browser session (header names are case-insensitive)
headers = {'Accept': 'text/javascript, application/javascript, */*; q=0.01',
           'Accept-Encoding': 'gzip, deflate',
           'Accept-Language': 'zh-CN,zh;q=0.8',
           'Connection': 'keep-alive',
           'Cookie': 'vjuids=148cf0186.15e03abf2ac.0.c311af0ddaa6c; Advs=358187b0bd1a65; asl=17431,000pn,7010519170105191; JRJ_UID=15060593555978DJCIWMVNB; jrj_z3_newsid=723; ADVC=35686F6CAEEDF3; wt_fpc=id=2ef30c6a0af7eaf3a501506059355507:lv=1506063782501:ss=1506063782501; Channelcode=3763bexx; ylbcode=24s2az96; vjlast=1503300154.1506059356.23; hm_lvt_a07bde197b7bf109a325eebaee445939=1506059356; hm_lpvt_a07bde197b7bf109a325eebaee445939=1506063783',
           'Host': 'fund.jrj.com.cn',
           'Referer': 'http://fund.jrj.com.cn/archives,968006,jjjz.shtml',
           'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.90 Safari/537.36',
           'X-Requested-With': 'XMLHttpRequest'}

# Query parameters; the key casing is reproduced as it appears in the source
# and should be checked against the live API
params = {'Fundcode': '968006',
          'obj': 'obj',
          'Date': 2017}

r = requests.get('http://fund.jrj.com.cn/json/archives/history/netvalue?', params=params, headers=headers)
r.encoding = 'utf-8'
mydata = r.text

Storing data

import json

# Extract the JSON payload from the response string by skipping the 8-character prefix
table = mydata[8:]

# Convert the string to a Python object without manual parsing
myjson = json.loads(table)

# Extract the net value data
myjson['Fundhistorynetvalue']

from pymongo import MongoClient

db = MongoClient('localhost', 27017)['Fund']
collect = db.get_collection('hjhy')
# insert_many replaces Collection.insert, which was removed in PyMongo 4
collect.insert_many(myjson['Fundhistorynetvalue'])
print('Done')
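
The slice mydata[8:] hard-codes the length of the prefix that precedes the JSON payload. A slightly more defensive sketch is shown below. It assumes the response is a JavaScript assignment of the form var obj={...}; the helper name extract_json and the sample string are made up for illustration and are not the real response from the site.

```python
import json


def extract_json(text):
    """Return the JSON object embedded in a 'var name={...}' style string."""
    start = text.index('{')        # first character of the JSON payload
    end = text.rindex('}') + 1     # one past the last closing brace
    return json.loads(text[start:end])


# Hypothetical sample; the real response may be shaped differently.
sample = 'var obj={"Fundhistorynetvalue": [{"enddate": "2017-09-22", "accum_net": "1.4090"}]};'
parsed = extract_json(sample)
print(parsed["Fundhistorynetvalue"][0]["accum_net"])
```

Locating the braces keeps the parsing step working if the variable name passed in the obj parameter changes length.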

Extract & Process data

from pymongo import MongoClient
import pandas as pd
import time, datetime

db = MongoClient('localhost', 27017)['Fund']
data = dict()

# Map each record's date (parsed from a 'YYYY-MM-DD' string) to its accumulated net value
for item in db.get_collection('hjhy').find():
    data[datetime.datetime.fromtimestamp(time.mktime(time.strptime(item['enddate'], '%Y-%m-%d')))] = item['accum_net']
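
The date format string matters here: %Y parses a four-digit year, while %y expects only two digits. The sketch below shows the same date-keyed dictionary built with datetime.strptime alone, using two made-up records in place of the MongoDB documents (the values are illustrative, not real fund data).

```python
from datetime import datetime

# Hypothetical records shaped like the documents read from MongoDB.
records = [
    {"enddate": "2017-09-22", "accum_net": "1.4090"},
    {"enddate": "2017-09-21", "accum_net": "1.4050"},
]

# %Y matches the four-digit year in strings such as "2017-09-22".
data = {datetime.strptime(r["enddate"], "%Y-%m-%d"): r["accum_net"] for r in records}

# %y expects a two-digit year, so the same string is rejected.
try:
    datetime.strptime("2017-09-22", "%y-%m-%d")
    two_digit_ok = True
except ValueError:
    two_digit_ok = False

print(sorted(data))     # keys in chronological order
print(two_digit_ok)
```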

Predicting with the ARIMA model

1. Build Time Series

# Build the time series (the dict keys become the index)
my_series = pd.Series(data)

# Convert the values from str to float
my_series = my_series.apply(lambda x: float(x))

# Sort chronologically by date
my_series = my_series.sort_index()

2. View Trend Chart

The chart shows how the fund's price has moved since the fund was established.

# IPython magic that loads numpy and matplotlib into the interactive namespace
%pylab
# plot(my_series)
my_series.plot()

Calling plot(my_series) directly draws an extra line connecting the first and last points. Calling the object's own plot method, my_series.plot(), avoids this.

3. Perform differencing

from matplotlib import pyplot as plt

# First-order difference
fig = plt.figure()
diff1 = my_series.diff(1)
diff1.plot()

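# Second-order difference: difference the first-order difference again
# (my_series.diff(2) would instead give the lag-2 difference x[t] - x[t-2])
fig = plt.figure()
diff2 = my_series.diff(1).diff(1)
diff2.plot()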
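
Note that in pandas the argument of Series.diff is the lag (periods), so diff(2) computes x[t] - x[t-2], whereas a second-order difference is obtained by differencing twice. The plain-Python sketch below illustrates the distinction on toy data; the helper names diff and drop_none are hypothetical and only mimic the pandas behaviour.

```python
def diff(values, lag=1):
    """Return values[t] - values[t - lag], with None where no earlier value exists."""
    return [None] * lag + [values[t] - values[t - lag] for t in range(lag, len(values))]


def drop_none(values):
    """Remove the leading placeholders left by differencing."""
    return [v for v in values if v is not None]


series = [1, 4, 9, 16, 25]                      # toy data: the squares 1..5

first_order = drop_none(diff(series))           # [3, 5, 7, 9]
second_order = drop_none(diff(first_order))     # [2, 2, 2]
lag_two = drop_none(diff(series, lag=2))        # [8, 12, 16]

print(first_order, second_order, lag_two)
```

For the squares, the second-order difference is constant while the lag-2 difference still grows, so the two operations are not interchangeable.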

4. First-order difference plot

5. Second-order difference plot

6. View descriptive statistics

# Descriptive statistics of the first-order difference
diff1.dropna(inplace=True)
diff1.describe()

Each differencing step leaves an NA at the start of the series, so remember to drop the NA values. describe() then reports the descriptive statistics for diff1.

# Descriptive statistics of the second-order difference
diff2.dropna(inplace=True)
diff2.describe()

Likewise, describe() reports the descriptive statistics for diff2.

Judging from these statistics, differencing is enough to make the series roughly stationary.

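7. Determine the p and q parameter values

import statsmodels.api as sm

fig = plt.figure()

# Autocorrelation (ACF) of the first-order difference in the upper panel
ax0 = fig.add_subplot(211)
fig = sm.graphics.tsa.plot_acf(diff1, lags=30, ax=ax0)

# Partial autocorrelation (PACF) of the first-order difference in the lower panel
ax1 = fig.add_subplot(212)
fig = sm.graphics.tsa.plot_pacf(diff1, lags=30, ax=ax1)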

These are the autocorrelation (ACF) and partial autocorrelation (PACF) plots of the first-order difference. Although the first-order difference is slightly more stationary than the second-order difference, the orders still have to be read from where the plots cut off: for an MA(q) process the ACF is truncated after lag q, and for an AR(p) process the PACF is truncated after lag p.

The second-order difference is therefore chosen, and its ACF and PACF plots are produced in the same way from diff2.
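
To make the truncation rule concrete, the sample autocorrelation can be computed by hand. The sketch below is plain Python rather than statsmodels, and the MA(1) series with coefficient 0.8 is simulated data chosen for illustration, not the fund data. For such a process the autocorrelation should be clearly non-zero at lag 1 and close to zero afterwards.

```python
import random


def acf(values, max_lag):
    """Sample autocorrelation of values for lags 0..max_lag."""
    n = len(values)
    mean = sum(values) / n
    centered = [v - mean for v in values]
    denom = sum(c * c for c in centered)
    return [
        sum(centered[t] * centered[t - k] for t in range(k, n)) / denom
        for k in range(max_lag + 1)
    ]


random.seed(0)

# Simulate an MA(1) process: x[t] = e[t] + 0.8 * e[t-1]
noise = [random.gauss(0, 1) for _ in range(2001)]
ma1 = [noise[t] + 0.8 * noise[t - 1] for t in range(1, len(noise))]

rho = acf(ma1, 5)
for lag, value in enumerate(rho):
    print(lag, round(value, 3))
```

In theory the lag-1 autocorrelation of this process is 0.8 / (1 + 0.8**2), about 0.49, and all higher lags are zero; the sample values only approximate this.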

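8. Forecast

# The ARIMA class in statsmodels.tsa.arima_model was removed in statsmodels 0.13;
# newer versions provide it in statsmodels.tsa.arima.model
from statsmodels.tsa.arima_model import ARIMA

# history_price is not defined earlier in the article; it presumably refers to the net value series.
# The order appears as (2, 1) in the source, whereas ARIMA takes a three-value order (p, d, q).
model = ARIMA(history_price, (2, 1)).fit()

# Forecast the next 10 values; with this legacy API the first element of the result holds the forecasts
model.forecast(10)[0]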

Actual value

Forecast value

array([1.41013409, 1.4134152, 1.41570651, 1.41638723, 1.42131414, 1.42299673, 1.42647455, 1.42795939, 1.43099336, 1.43316138])
