Use R's Prophet package to make a simple time series forecast

Source: Internet
Author: User
Tags: numeric
In a previous post on the Prophet package I only uploaded my code. In this post I explain how that code works and describe some optimizations I made to the model. My last use of Prophet had four parts: reading the data, setting the holidays (singular points), training the model, and outputting custom results. I now explain each part of that project in turn.
First, initialize: load the packages and read the data
library(prophet)
library(dplyr)
# Initialize the data
all <- read.csv('d:/rdata/zjd/ts/all.csv', na.strings = 'na', header = TRUE)
qb30 <- read.csv('d:/Rdata/zjd/ts/qb30.csv', na.strings = 'na', header = TRUE)
history30n <- data.frame(ds = seq(as.Date('2016-01-01'), as.Date('2017-09-11'), by = 'day'), y = qb30$yn)
plot(history30n$ds, history30n$y)
# qb45 is not read anywhere in this listing; it has to be loaded the same way as qb30 first
history45n <- data.frame(ds = seq(as.Date('2016-01-01'), as.Date('2017-09-11'), by = 'day'), y = qb45$yn)
A few notes on reading the data. I basically use CSV files. The main point is that each series must be a single column, and for daily data you must set a start date. In my previous version I hard-coded the dates when creating each data set, so they had to be edited in every place each time (the financial products come in many different terms). I now use variables instead, so the dates only need to be changed once. The code is as follows:
alln <- read.csv('d:/rdata/zjd/ts/qball.csv', na.strings = 'na', header = TRUE)
start <- c('2017-01-01')
end <- c('2017-10-22')
# historyalln <- data.frame(ds = seq(as.Date('2017-01-01'), as.Date('2017-09-18'), by = 'day'), y = alln$yn)
history30 <- data.frame(ds = seq(as.Date(start), as.Date(end), by = 'day'), y = alln$y2031)
history45 <- data.frame(ds = seq(as.Date(start), as.Date(end), by = 'day'), y = alln$y4050)
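As a small sanity check on this approach, the sketch below builds the date column from `start` and `end` and confirms that it has exactly one row per observation. The series `y_demo` is made-up data standing in for a column such as `alln$y2031`, and `make_history` is a hypothetical helper, not part of Prophet.

```r
# Hypothetical helper: pair a daily date sequence with an observed series.
make_history <- function(start, end, y) {
  ds <- seq(as.Date(start), as.Date(end), by = "day")
  # Fail early if the series does not cover exactly the date range.
  stopifnot(length(ds) == length(y))
  data.frame(ds = ds, y = y)
}

start <- "2017-01-01"
end   <- "2017-10-22"

# Number of days in the range, both ends included.
n_days <- as.integer(as.Date(end) - as.Date(start)) + 1

# Made-up values standing in for a real column such as alln$y2031.
y_demo <- seq_len(n_days)

history_demo <- make_history(start, end, y_demo)
```

With these dates the range contains 295 days, which is the same number as the `x <- 295` index used in the training code later on.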
Second, set the holidays (singular points in the data)
The key point when setting holidays is that, apart from weekends, there are domestic statutory holidays of different lengths: National Day and the Spring Festival last 7 days or more, while the other festivals last only 3 days. Holidays of different lengths have different effects and must be treated differently, so I set two holiday variables, one for short festivals lasting 3 days and one for long festivals lasting 7 days. The code is as follows:
# tibble() replaces the deprecated data_frame()
holiday1 <- tibble(
  holiday = 'holiday1',
  ds = as.Date(c('2016-01-01', '2016-04-04', '2016-05-01',
                 '2016-06-09', '2016-09-15', '2017-01-01',
                 '2017-04-02', '2017-04-29', '2017-05-28')),
  lower_window = 0,
  upper_window = 3
)
holiday2 <- tibble(
  holiday = 'holiday2',
  ds = as.Date(c('2016-02-07', '2016-10-01', '2017-01-27',
                 '2017-10-01')),
  lower_window = 0,
  upper_window = 7
)
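To see which dates a window actually covers, the sketch below expands a holiday date by its `lower_window` and `upper_window` in base R. `expand_window` is a hypothetical helper for illustration only. It assumes, as the Prophet documentation describes, that the window runs from `lower_window` to `upper_window` days around the date and includes the holiday date itself, so `upper_window = 3` covers four days rather than three.

```r
# Hypothetical helper: list every date covered by a holiday window.
expand_window <- function(ds, lower_window, upper_window) {
  offsets <- lower_window:upper_window
  # Add every offset to every holiday date, then sort the result.
  days <- rep(as.Date(ds), each = length(offsets)) + offsets
  sort(unique(days))
}

# Two of the dates from the holiday tables above.
small <- expand_window("2017-05-28", lower_window = 0, upper_window = 3)
large <- expand_window("2017-10-01", lower_window = 0, upper_window = 7)

print(small)
print(large)
```

If a festival should count for exactly 3 days, `upper_window = 2` may be the closer setting under this assumption.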
In addition, the company runs promotions and marketing aimed at particular groups of users or products. If time allows, try to reflect these in the holiday variables as well. My company's activities include a member day, interest rate increase activities, interest rate reduction activities, novice activities, invitation activities, and other regular marketing campaigns. I entered each of them as a separate holiday variable and finally combined them all into one holiday variable:
holidays365 <- bind_rows(holiday1, holiday2, vip365, raiserate365, raise3652, raise3653, reducerate365, promotion)
Third, train the model
For training and output I made some automation improvements. The parameters and the predicted values are stored in variables, so I can loop over the parameter values and write out each parameter combination together with its results. From that output I can tell which parameters are the most appropriate.
sea <- c(0.5)
cha <- c(0.5, 0.6)
hol <- c(0.5, 5, 500)
# x:y selects 28 forecast rows (with the start and end dates above, row 295 is the end date)
x <- 295
y <- 322
b3 <- matrix(0, 1, 32)   # placeholder row: 4 label columns + 28 forecast columns
# history90 and holidays90 are not defined in this listing; they are assumed to be built like history30 and holidays365
for (i in 1:length(sea)) {
  for (m in 1:length(cha)) {
    for (n in 1:length(hol)) {
      n90 <- prophet(history90,
                     holidays = holidays90,
                     seasonality.prior.scale = sea[i],
                     changepoint.prior.scale = cha[m],
                     holidays.prior.scale = hol[n],
                     mcmc.samples = 100,
                     interval.width = 0.9,
                     uncertainty.samples = )   # value missing in the source
      future90 <- make_future_dataframe(n90, periods = )   # value missing in the source; it must be at least y - x = 27
      # The predict step is missing from the source listing; predict() is the standard Prophet call that yields forecast90
      forecast90 <- predict(n90, future90)
      a3 <- cbind('Max', sea[i], cha[m], hol[n], t(forecast90$yhat[x:y]))
      b3 <- rbind(b3, a3)
    }
  }
}
b3 <- b3[-1, ]   # drop the placeholder row
colnames(b3) <- c("Period", "Sea", "Cha", "Hol", as.character(as.Date(end) + 0:27))
write.csv(b3, file = "D:/rdata/zjd/ts/wkallc.csv")
I used relatively few parameter values above. You can add more values to each parameter vector as you see fit, but the running time multiplies accordingly. Once you have the results, you can compare each parameter combination with its output, choose suitable parameters, and then train the final model.
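One way to keep the search manageable as more values are added is to list every combination up front with `expand.grid` and loop over its rows instead of nesting loops. This is only a sketch: `fit_and_forecast` is a hypothetical stand-in for the Prophet fit and predict calls, and it returns dummy numbers so the example runs without the package.

```r
# Candidate values, as in the listing above.
sea <- c(0.5)
cha <- c(0.5, 0.6)
hol <- c(0.5, 5, 500)

# One row per parameter combination (1 * 2 * 3 = 6 rows).
grid <- expand.grid(sea = sea, cha = cha, hol = hol)

# Hypothetical stand-in for fitting Prophet and extracting the forecast;
# it returns dummy values so the sketch runs on its own.
fit_and_forecast <- function(sea, cha, hol, horizon = 28) {
  rep(sea + cha + hol, horizon)
}

# Collect one forecast row per combination.
forecasts <- t(mapply(fit_and_forecast, grid$sea, grid$cha, grid$hol))
results <- cbind(grid, forecasts)

dim(results)
```

The number of rows in `grid` is the product of the vector lengths, which makes the cost of adding one more candidate value easy to see before any model is trained.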
Fourth, output custom results
I ended up with 6 parameter groups for training the model. To predict the transaction data for the next 30 days I would have to look at six results each time and pick the most appropriate one. Can this manual screening also be automated? After validating against the data for about a week (not very rigorous, please forgive me), I found that the results do not differ much, so I simply take the arithmetic mean, which gives me the final result directly. The output can also contain outliers, such as negative values or extremely large values, and I handle these directly in the code, as follows:
# Average each of the 28 forecast columns (columns 5 to 32 of b3) over the parameter groups
c3 <- cbind('mean', t(sapply(5:32, function(j) mean(as.numeric(b3[, j])))))
# Replace negative or extremely large values with 0.8 times the previous column
# Note: for r = 2 the previous column is the 'mean' label, so an outlier in the first forecast column would become NA
for (s in 1:dim(c3)[1]) {
  for (r in 2:dim(c3)[2]) {
    if (as.numeric(c3[s, r]) < 0 | as.numeric(c3[s, r]) > 500000000) {
      c3[s, r] <- as.numeric(c3[s, r - 1]) * 0.8
    }
  }
}
colnames(c3) <- c("Period", as.character(as.Date(end) + 0:27))
write.csv(c3, file = "D:/rdata/zjd/ts/wkalld.csv")
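The averaging and outlier step can also be written on a purely numeric matrix, which avoids the repeated `as.numeric` conversions. The sketch below uses made-up forecasts. The bounds and the 0.8 factor follow the listing above, and `clean_means` is a hypothetical name.

```r
# Made-up forecasts: 3 parameter groups (rows) x 5 days (columns).
forecasts <- rbind(
  c(100, 120, -300, 90, 110),
  c(110, 130, -250, 95, 105),
  c(120, 110, -350, 85, 115)
)

# Hypothetical helper: average over parameter groups, then replace
# out-of-range means with 0.8 times the previous (already cleaned) day.
clean_means <- function(m, lower = 0, upper = 5e8, factor = 0.8) {
  means <- colMeans(m)
  for (r in seq_along(means)) {
    out_of_range <- means[r] < lower || means[r] > upper
    # The first day has no previous value, so it is left unchanged.
    if (out_of_range && r > 1) {
      means[r] <- means[r - 1] * factor
    }
  }
  means
}

result <- clean_means(forecasts)
print(result)
```

Here the third day averages to -300, so it is replaced by 0.8 times the second day's mean of 120.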
The result is written straight to a CSV file, so that the business and operations teams can read my forecasts and prepare for the day after tomorrow or for the next month.
