Use R's Prophet package to make a simple time series forecast

Last Update:2018-07-26 Source: Internet

Author: User



In my previous post on the Prophet package I only uploaded my own code. In this post I explain how that Prophet workflow works and describe some optimizations I made to the model. The earlier project has four parts: reading the data, setting the holidays (singular points), training the model, and producing custom output. I go through each part in turn below.
First, initialize: load the packages and read the data

library(prophet)
library(dplyr)
# initialize the data
all <- read.csv('d:/rdata/zjd/ts/all.csv', na.strings = 'na', header = TRUE)
qb30 <- read.csv('d:/rdata/zjd/ts/qb30.csv', na.strings = 'na', header = TRUE)
history30n <- data.frame(ds = seq(as.Date('2016-01-01'), as.Date('2017-09-11'), by = 'd'), y = qb30$yn)
plot(history30n$ds, history30n$y)
# qb45 is not read anywhere in the code shown here; presumably it is loaded the same way as qb30
history45n <- data.frame(ds = seq(as.Date('2016-01-01'), as.Date('2017-09-11'), by = 'd'), y = qb45$yn)
There is not much to say about reading the data. I mostly use CSV files; the main point is that each series is a single column of values. For daily data you must also fix a start date. In my previous version I hard-coded the dates when creating each data set, which meant editing them every time (we have a great many financial products with different terms). I now keep the dates in variables, so I only have to change them once. The code is as follows:
alln <- read.csv('d:/rdata/zjd/ts/qball.csv', na.strings = 'na', header = TRUE)
start <- c('2017-01-01')
end <- c('2017-10-22')
# historyalln <- data.frame(ds = seq(as.Date('2017-01-01'), as.Date('2017-09-18'), by = 'd'), y = alln$yn)
history30 <- data.frame(ds = seq(as.Date(start), as.Date(end), by = 'd'), y = alln$y2031)
history45 <- data.frame(ds = seq(as.Date(start), as.Date(end), by = 'd'), y = alln$y4050)
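The start/end pattern above can be tried on its own with made-up numbers. This is only a minimal sketch, not the code from the project: `build_history` is a hypothetical helper, and the random values stand in for a column such as `alln$y2031`. It also checks that there is exactly one value per day, because a length mismatch would otherwise either raise an error or be silently recycled by `data.frame`.

```r
# Hypothetical helper (not from the original post): build a Prophet-style
# history data frame with a `ds` date column and a `y` value column.
build_history <- function(values, start, end) {
  ds <- seq(as.Date(start), as.Date(end), by = "day")
  # One value per calendar day is required.
  if (length(values) != length(ds)) {
    stop("expected ", length(ds), " values, got ", length(values))
  }
  data.frame(ds = ds, y = values)
}

start <- "2017-01-01"
end <- "2017-10-22"
n_days <- as.integer(as.Date(end) - as.Date(start)) + 1L

set.seed(1)
demo_values <- runif(n_days, min = 1000, max = 5000)  # stand-in for a CSV column
history_demo <- build_history(demo_values, start, end)

print(n_days)
print(head(history_demo, 3))
```

With these dates the history has 295 rows, which is consistent with the value x <- 295 used in the training code to mark the last historical day.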
Second, set the holidays (singular points in the data)

The point that needs explaining here is that, besides weekends, there are statutory public holidays in China, and they differ in length: National Day and the Spring Festival last 7 days or more, while the other holidays last only 3 days. Holidays of different lengths have different effects and must be treated separately. I therefore set up two holiday variables, one for short holidays lasting 3 days and one for long holidays lasting 7 days. The code is as follows:

# tibble() is used here in place of the deprecated data_frame()
holiday1 <- tibble(
  holiday = 'holiday1',
  ds = as.Date(c('2016-01-01', '2016-04-04', '2016-05-01',
                 '2016-06-09', '2016-09-15', '2017-01-01',
                 '2017-04-02', '2017-04-29', '2017-05-28')),
  lower_window = 0,
  upper_window = 3
)
holiday2 <- tibble(
  holiday = 'holiday2',
  ds = as.Date(c('2016-02-07', '2016-10-01', '2017-01-27',
                 '2017-10-01')),
  lower_window = 0,
  upper_window = 7
)
In addition, company promotions and marketing aimed at particular customer groups or products should be reflected in the holiday variables if time allows. My company, for example, runs member days, interest rate increase campaigns, interest rate reduction campaigns, new-user campaigns, invitation campaigns and other regular marketing campaigns. I entered each of these as a separate holiday variable and finally combined them all into one holiday variable:

holidays365 <- bind_rows(holiday1, holiday2, vip365, raiserate365, raise3652, raise3653, reducerate365, promotion)
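To see which days a holiday row actually covers, the window can be expanded by hand. The sketch below uses only base R; `expand_window` is a hypothetical helper written for this illustration, and it assumes (as I understand Prophet's documentation) that the effect spans every day from `lower_window` to `upper_window` around the holiday date, both ends included.

```r
# Hypothetical helper: list every date covered by a holiday row, given
# lower_window and upper_window offsets in days around the holiday date.
expand_window <- function(ds, lower_window, upper_window) {
  as.Date(ds) + seq(lower_window, upper_window)
}

small <- expand_window("2017-05-28", lower_window = 0, upper_window = 3)
large <- expand_window("2017-10-01", lower_window = 0, upper_window = 7)

print(small)
print(length(small))
print(length(large))
```

Under that assumption, upper_window = 3 covers the holiday date plus the three following days, so four days in total, and upper_window = 7 covers eight. If the effect should span exactly 3 or 7 calendar days, upper_window = 2 or 6 may be the closer setting.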
Third, train the model

To train the model and collect its output, I made some automation improvements: the parameters and the predicted values are stored in variables so that I can loop over the parameter values. The output then lists each parameter combination next to its forecast, which tells me which parameters are the most appropriate.
sea <- c(0.5)            # candidate values for seasonality.prior.scale
cha <- c(0.5, 0.6)       # candidate values for changepoint.prior.scale
hol <- c(0.5, 5, 500)    # candidate values for holidays.prior.scale
x <- 295                 # first forecast row to keep
y <- 322                 # last forecast row to keep
b3 <- matrix(0, 1, 32)   # 4 label columns + 28 forecast values
for (i in 1:length(sea)) {
  for (m in 1:length(cha)) {
    for (n in 1:length(hol)) {
      # history90 and holidays90 are not defined in the code shown here;
      # presumably they are built like history30 and holidays365 above
      n90 <- prophet(history90,
                     holidays = holidays90,
                     seasonality.prior.scale = sea[i],
                     changepoint.prior.scale = cha[m],
                     holidays.prior.scale = hol[n],
                     mcmc.samples = 100,
                     interval.width = 0.9,
                     uncertainty.samples = )  # the value is missing in the source
      future90 <- make_future_dataframe(n90, periods = )  # the value is missing in the source
      forecast90 <- predict(n90, future90)  # not shown in the source, but forecast90 is used below
      a3 <- cbind('Max', sea[i], cha[m], hol[n], t(forecast90$yhat[x:y]))
      b3 <- rbind(b3, a3)
    }
  }
}
b3 <- b3[-1, ]  # drop the placeholder first row
# one column name per kept forecast day: end, end + 1, ..., end + (y - x)
colnames(b3) <- c("Period", "Sea", "Cha", "Hol", as.character(as.Date(end) + 0:(y - x)))
write.csv(b3, file = "D:/rdata/zjd/ts/wkallc.csv")
I used relatively few parameter values above. You can add more values to each parameter vector as you see fit, but the run time multiplies accordingly. After training, you can compare the parameters with their corresponding results and choose the appropriate parameters.
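The same search can also be written without nested loops by listing every parameter combination first. The sketch below is one possible way to do that with base R's `expand.grid`; `fit_and_forecast` is a hypothetical stand-in that returns dummy numbers, so the example runs without the prophet package and says nothing about real forecasts.

```r
# Candidate values, the same as in the loop above.
sea <- c(0.5)
cha <- c(0.5, 0.6)
hol <- c(0.5, 5, 500)

# One row per parameter combination.
grid <- expand.grid(sea = sea, cha = cha, hol = hol)
print(nrow(grid))

# Hypothetical stand-in for "fit Prophet and return the forecast";
# it returns dummy numbers so the sketch runs without the prophet package.
fit_and_forecast <- function(sea, cha, hol, horizon = 28) {
  rep(sea + cha + hol, horizon)
}

# mapply returns one column per combination, so transpose to one row each.
forecasts <- t(mapply(fit_and_forecast, grid$sea, grid$cha, grid$hol))
results <- cbind(grid, forecasts)
print(dim(results))
```

The grid makes the cost visible up front: 1 x 2 x 3 values give 6 model fits, and every value added to a vector multiplies that count.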
Fourth, output custom results

I ended up with 6 parameter combinations for training the model. To forecast the transaction data for the next 30 days, however, I would have to look at six results each time and pick the most suitable one. Can this manual screening be automated as well? After validating against about a week of data (not very rigorous, please forgive me), I found that the results do not differ much, so I simply take the arithmetic mean of the six forecasts as the final result. The output can also contain outliers, such as negative values or extremely large values, which I correct directly. The code is as follows:

# column-wise mean of the 28 forecast columns (columns 5 to 32 of b3)
c3 <- cbind('mean', t(sapply(5:32, function(j) mean(as.numeric(b3[, j])))))
for (s in 1:dim(c3)[1]) {
  for (r in 2:dim(c3)[2]) {
    # c3 is a character matrix, so convert before comparing
    if (as.numeric(c3[s, r]) < 0 | as.numeric(c3[s, r]) > 500000000) {
      # replace an outlier with 80% of the previous day's value
      # (for r = 2 the previous column is the label, so the result is NA)
      c3[s, r] <- as.numeric(c3[s, r - 1]) * 0.8
    }
  }
}
colnames(c3) <- c("Period", as.character(as.Date(end) + 0:27))
write.csv(c3, file = "D:/rdata/zjd/ts/wkalld.csv")
The result is written straight to a CSV file, so that the business and operations teams can read my forecasts and prepare for the day after tomorrow or for next month.
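The averaging and outlier correction can be checked on a small example. The numbers below are made up for illustration (3 parameter sets, 5 days); the 0.8 factor and the 500000000 upper limit are the ones used in this post. Unlike the code above, this sketch keeps the values numeric and skips the first day, which has no previous value to fall back on.

```r
# Toy forecasts: 3 parameter sets (rows) x 5 days (columns); made-up numbers.
forecasts <- rbind(
  c(100, 120, -50, 130, 140),
  c(110, 130, -30, 120, 150),
  c(120, 110, -40, 140, 160)
)

# Arithmetic mean across parameter sets, one value per day.
avg <- colMeans(forecasts)

# Replace implausible averages with 80% of the previous day's value.
# The first day has no previous value, so it is left untouched here.
upper_limit <- 500000000
for (r in seq_along(avg)[-1]) {
  if (avg[r] < 0 || avg[r] > upper_limit) {
    avg[r] <- avg[r - 1] * 0.8
  }
}
print(avg)
```

Here the third day averages to -40, so it is replaced by 0.8 x 120 = 96, while the other days keep their means.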

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only.
