R Language and Data Analysis VII: Simple exponential smoothing of time series

Last Update:2014-12-18 Source: Internet

Author: User

In the previous installments we covered the basics of time series and how to decompose them. Today we look at the simplest of the common forecasting algorithms: simple exponential smoothing. It is suited to short-term forecasts for a time series that can be described by an additive model with a constant level and no seasonality.

Simple exponential smoothing provides a way to estimate the level at the current time point. The smoothing is controlled by the parameter alpha, which lies between 0 and 1. The closer alpha is to 0, the less weight recent observations carry in the forecast.
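The recursion behind the method can be sketched in a few lines of R. This is a minimal illustration, not the routine R uses internally: the helper name ses_level and the toy vector are hypothetical, and the level is assumed to start at the first observation.

```r
# Hypothetical helper: simple exponential smoothing by hand.
# level[t] = alpha * x[t] + (1 - alpha) * level[t - 1], with level[1] = x[1].
ses_level <- function(x, alpha) {
  stopifnot(alpha >= 0, alpha <= 1, length(x) >= 1)
  level <- numeric(length(x))
  level[1] <- x[1]
  for (t in seq_along(x)[-1]) {
    level[t] <- alpha * x[t] + (1 - alpha) * level[t - 1]
  }
  level
}

x <- c(10, 12, 11, 15, 14)          # toy data, for illustration only
low  <- ses_level(x, alpha = 0.1)   # reacts slowly to new observations
high <- ses_level(x, alpha = 0.9)   # follows the latest observation closely
print(rbind(x, low, high))
```

With a small alpha the estimated level barely moves away from the starting value, while with a large alpha it tracks each new observation.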

As an example we analyse the total annual rainfall in inches for London from 1813 to 1912. First we read the data and plot the series:

rain <- scan("http://robjhyndman.com/tsdldata/hurst/precip1.dat", skip=1)
rainseries <- ts(rain, start=c(1813))
plot.ts(rainseries)

The plot shows that the series stays at roughly the same level and that the random fluctuations are roughly constant in size over time, so the series can be described by an additive model and simple exponential smoothing is appropriate. We use the HoltWinters() function in R. To make it perform simple exponential smoothing we set the parameters beta=FALSE and gamma=FALSE. The fitted model is stored in a variable named rainseriesforecasts.
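The fitting call itself is not shown in the text. As a rough sketch of what it might look like, the snippet below fits the same kind of model to the built-in Nile series, which is only an offline stand-in and not the article's data. With the rainfall data, the call would presumably be HoltWinters(rainseries, beta=FALSE, gamma=FALSE), assigned to rainseriesforecasts.

```r
# Offline stand-in for rainseries: the built-in Nile annual flow series.
series <- datasets::Nile

# beta = FALSE turns off the trend term and gamma = FALSE the seasonal term,
# leaving simple exponential smoothing; alpha is estimated from the data.
fit <- HoltWinters(series, beta = FALSE, gamma = FALSE)

print(fit$alpha)          # estimated smoothing parameter
print(head(fit$fitted))   # in-sample forecasts (xhat) and estimated level
```

The fitted values start at the second observation, because the first observation is used to initialise the level.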

HoltWinters() reports an estimated alpha of about 0.024. This is very close to 0, which means the forecasts draw on older as well as recent observations and the estimated level changes slowly. By default HoltWinters() only produces forecasts for the period covered by the original time series. These in-sample forecasts are stored in the element named fitted, so we can retrieve them with rainseriesforecasts$fitted.

We can also plot the forecasts against the observed values to see how well they fit:

plot(rainseriesforecasts)

From the plot and the small value of alpha we can see that the forecasts are much smoother than the observed series. As a measure of forecast accuracy, R provides the sum of squared errors (SSE) of the in-sample forecast errors, available as rainseriesforecasts$SSE. We can also specify the initial value of the level. A common choice is the first value of the series, here 23.56 for the year 1813:

HoltWinters(rainseries, beta=FALSE, gamma=FALSE, l.start=23.56)
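To make the SSE and the start value more concrete, here is a small sketch that recomputes the SSE by hand and fits the model again with an explicit l.start. It assumes the built-in Nile series as an offline stand-in for rainseries, so the numbers it prints are not those of the rainfall data.

```r
series <- datasets::Nile   # offline stand-in for rainseries
fit <- HoltWinters(series, beta = FALSE, gamma = FALSE)

# One-step-ahead in-sample errors; the fitted values start at the second
# observation, so window() aligns the observed series with them.
xhat <- fit$fitted[, "xhat"]
errors <- window(series, start = start(xhat)) - xhat

sse_by_hand <- sum(errors^2)
print(c(by_hand = sse_by_hand, reported = fit$SSE))

# Setting the initial level to the first observation explicitly.
fit_start <- HoltWinters(series, beta = FALSE, gamma = FALSE,
                         l.start = series[1])
print(c(default = unname(fit$alpha), explicit = unname(fit_start$alpha)))
```

For a level-only model HoltWinters() already starts from the first observation by default, so the two fits should agree; l.start matters when a different starting level is wanted.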

To forecast future periods we need the forecast package. Suppose we want to predict rainfall for the next 8 years:

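library("forecast")
# forecast() dispatches to the HoltWinters method; recent versions of the
# package may not export forecast.HoltWinters() under that name
rainseriesforecasts2 <- forecast(rainseriesforecasts, h=8)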

The forecast.HoltWinters() method also gives 80% and 95% prediction intervals. To make the results easier to read, we plot the forecasts:

plot(rainseriesforecasts2)

The blue line shows the forecast rainfall for 1913-1920, the dark gray shaded area is the 80% prediction interval, and the light gray shaded area is the 95% prediction interval.
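As a cross-check that does not depend on the forecast package, base R's predict() method for HoltWinters objects can produce point forecasts and prediction bounds as well. The sketch below again assumes the built-in Nile series as an offline stand-in for rainseries, and its bounds need not match those of the forecast package exactly.

```r
series <- datasets::Nile   # offline stand-in for rainseries
fit <- HoltWinters(series, beta = FALSE, gamma = FALSE)

# Eight steps ahead with 95% prediction bounds, using base R only.
pred <- predict(fit, n.ahead = 8, prediction.interval = TRUE, level = 0.95)
print(pred)   # columns: fit (point forecast), upr and lwr (bounds)

# Simple exponential smoothing has no trend or seasonal term, so every
# point forecast equals the last estimated level.
print(unique(round(as.numeric(pred[, "fit"]), 6)))
```

This is why the forecast line in the plot is flat: only the prediction interval changes with the horizon.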

The forecast object also contains the forecast errors (residuals), which let us assess whether the forecasts could be improved: if successive forecast errors are correlated, simple exponential smoothing can probably be improved upon by another forecasting technique. To check for correlation we use acf() to draw a correlogram of the in-sample forecast errors for lags 1-20:

# lag.max sets the maximum lag to display; na.action=na.pass lets acf()
# run even if the residuals contain a leading NA
acf(rainseriesforecasts2$residuals, lag.max=20, na.action=na.pass)

The correlogram shows that the autocorrelation at lag 3 just touches the significance bounds. To test whether there is significant evidence of non-zero autocorrelation at lags 1-20, we can use the Ljung-Box test, available in R as Box.test().
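The test call is not shown in the text. A minimal sketch, assuming the built-in Nile series as an offline stand-in for rainseries, could look like the following. With the article's data the call would presumably be Box.test(rainseriesforecasts2$residuals, lag=20, type="Ljung-Box").

```r
series <- datasets::Nile   # offline stand-in for rainseries
fit <- HoltWinters(series, beta = FALSE, gamma = FALSE)

# In-sample one-step-ahead forecast errors from the fitted model.
errors <- residuals(fit)

# Ljung-Box test of the null hypothesis that the autocorrelations
# at lags 1 to 20 are all zero.
lb <- Box.test(errors, lag = 20, type = "Ljung-Box")
print(lb)
```

The printed result contains the test statistic (X-squared), the degrees of freedom and the p-value.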

The Ljung-Box statistic is 17.4 and the p-value is 0.6, so we cannot reject the null hypothesis that the autocorrelations of the forecast errors at lags 1-20 are zero.

In other words, there is little evidence of non-zero autocorrelation in the in-sample forecast errors at lags 1-20.

PS: a large p-value is consistent with a purely random (white noise) sequence; a small p-value indicates that the sequence is not purely random.

Besides the check above, we should also verify that the forecast errors are normally distributed with mean 0 and constant variance. To check the variance, we draw a time plot of the in-sample forecast errors:

plot.ts(rainseriesforecasts2$residuals)

The plot suggests that the forecast errors fluctuate around 0 with roughly constant variance over the whole period. To check normality more concretely we need a little code. First we define the function plotforecasterrors:

plotforecasterrors <- function(forecasterrors) {
  # drop missing values so sd() and hist() work on the remaining errors
  forecasterrors <- forecasterrors[!is.na(forecasterrors)]
  # make a red histogram of the forecast errors:
  mysd <- sd(forecasterrors)
  hist(forecasterrors, col="red", freq=FALSE)
  # freq=FALSE ensures the area under the histogram = 1
  # generate normally distributed data with mean 0 and standard deviation mysd
  mynorm <- rnorm(10000, mean=0, sd=mysd)
  myhist <- hist(mynorm, plot=FALSE)
  # plot the normal curve as a blue line on top of the histogram of forecast errors:
  points(myhist$mids, myhist$density, type="l", col="blue", lwd=2)
}

Save the function to a file, then load and call it from the console:

source("plotforecasterrors.R")
plotforecasterrors(rainseriesforecasts2$residuals)

The histogram shows that the forecast errors are roughly normally distributed with mean 0 and constant variance. Together with the Ljung-Box result, this suggests that simple exponential smoothing is an adequate model for this series and probably cannot be improved upon.

That is all for today. I hope it helps. The next installment continues with exponential smoothing methods.

This article is an English version of an article originally published in Chinese on aliyun.com and is provided for information purposes only.
