1. Autoregression
As with regression analysis, the purpose of time series analysis is prediction. In regression we have simple and multivariate regression; in time series we have autoregression, which likewise divides into first-order and higher-order models. The idea is the same, but where regression relates an independent variable to a dependent variable, autoregression relates the series to its own time-lagged values.
Let's take a look at the first-order autoregressive model AR(1), i.e., Y_t = b * Y_{t-1} + u_t.
Earlier we discussed the importance of stationarity for time series, so the question is whether a series generated by such a first-order autoregression is stationary. The answer: the first-order autoregressive sequence is stationary when the absolute value of the autoregressive coefficient b is less than 1.
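To see this concretely, here is a minimal sketch (an illustration added for this point, not one of the original examples): we simulate one AR(1) series with b = 0.8 and one with b = 1.1, and the second diverges while the first stays bounded.
# stationarity illustration: |b| < 1 vs |b| > 1
set.seed(1)
n = 200
y_stable = rep(0, n); y_explosive = rep(0, n)
for (t in 2:n) {
  y_stable[t]    = 0.8 * y_stable[t-1] + rnorm(1)    # |0.8| < 1: stationary
  y_explosive[t] = 1.1 * y_explosive[t-1] + rnorm(1) # |1.1| > 1: diverges
}
range(y_stable)     # stays bounded around 0
range(y_explosive)  # grows to a huge magnitude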
Now let's construct a time series that satisfies a first-order autoregression.
2. Generating a first-order autoregressive sequence
We're going to generate a time series whose autoregressive equation is as follows:
Y_t = 0.8 * Y_{t-1} + c
Here c is the error term, which we represent with white noise, i.e., draws from a standard normal distribution. The specific R code is as follows:
# example 5
set.seed(1234)        # set the random seed
n = 50                # length of the series
y1 = rep(0, n)        # initialize the y1 time series
for (t in 2:n) {      # compute the y1 series from the autoregressive equation
  y1[t] = 0.8 * y1[t-1] + rnorm(1)
}
plot(y1, type = 'o')  # plot the series
In fact, we can use R's built-in functions to generate an autoregressive sequence much more quickly:
# example 6
y1 = arima.sim(n = 50, list(ar = 0.8))  # built-in R function; list gives the autoregressive coefficient of each order -- with only a first-order term there is just one value, 0.8
plot(y1, type = 'o')
Then we look at the plot of the autocorrelation coefficients; very simply, as before: acf(y1). This gives the autocorrelation plot.
Here we can see that the first-order autocorrelation coefficient is relatively large and roughly matches the 0.8 in our model; with more data, the estimate would be closer. Note that the blue dashed lines in the plot mark the significance bounds: autocorrelation coefficients that exceed the blue dashed lines are generally considered significant, while those that stay within them would usually fail a significance test.
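As a side note, the blue dashed lines drawn by acf() are approximate 95% confidence bounds, roughly ±1.96/√n; a small sketch (using n = 50 as in our example) computes the bound directly:
# approximate 95% significance bound used by acf()'s dashed lines
n = 50
bound = qnorm(0.975) / sqrt(n)  # about 0.277 for n = 50
bound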
3. AR model estimation
We constructed the time series ourselves above, but if we are handed an existing time series, how do we estimate its model? In other words, how do we obtain its autoregressive equation? We could read off the coefficient of each lag from the ACF and assemble a higher-order autoregressive model, but that is not rigorous; there is a more practical approach.
In fact, estimating an AR model is, plainly speaking, the process of estimating linear regression coefficients. In linear regression we use the least squares method; for the AR models of time series we introduce two methods: Yule-Walker and OLS (ordinary least squares).
# example 7
ar(y1, method = "yule-walker")
ar(y1, method = "ols")
Running these in R, we can see that each method reports the order it selected and the corresponding autoregressive coefficients.
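The fitted object also exposes these values directly. For example, a quick sketch (assuming the y1 series from example 6):
fit = ar(y1, method = "yule-walker")
fit$order  # the order selected (by AIC)
fit$ar     # the estimated autoregressive coefficients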
Note that the OLS method is not very accurate; prefer Yule-Walker where possible.
4. A demo with R's built-in functions
In fact, R has ready-made tools for everything we did above; a very short piece of code achieves it all. We use a second-order autoregressive model as the example.
Model: Y_t = 0.7 * Y_{t-1} - 0.5 * Y_{t-2} + c
Again, c is the error term.
y2 = arima.sim(n = 100, list(ar = c(0.7, -0.5)))
plot(y2, type = 'o')
pacf(y2)$acf[1:5]
We can see the time series and its partial autocorrelation coefficients.
Here we should distinguish the acf and pacf functions. pacf is what we use for higher-order AR: its first bar represents the correlation coefficient at lag one, whereas in acf the first bar represents the series' correlation with itself, which is of course 1. This is only a superficial difference; the deeper distinction is explained in part 5 below.
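We can confirm this superficial difference numerically; a quick check reusing y2 (plot = FALSE suppresses the figure and returns the values):
acf(y2, plot = FALSE)$acf[1]   # lag 0: always exactly 1
pacf(y2, plot = FALSE)$acf[1]  # lag 1: the first partial autocorrelation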
Then we estimate the model with the model estimation function in R.
arima(y2, order = c(2, 0, 0))
In this way we can see the autoregressive coefficients. If we add include.mean = FALSE to the call, the mean term, which appears as the intercept in the output, is dropped.
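For instance, a minimal sketch: save the fit and extract the coefficients with coef().
fit = arima(y2, order = c(2, 0, 0), include.mean = FALSE)
coef(fit)  # estimated ar1 and ar2 coefficients; no intercept because include.mean = FALSE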
5. ACF and PACF
Some differences between ACF and PACF were mentioned above. In fact, anyone who has studied calculus knows the partial derivative; the "partial" here is a similar concept.
When we use the ACF on a first-order AR model, we find that the correlation between the lag-2 series and the original series is also quite large. That correlation between Y_{t-2} and Y_t is caused indirectly, through the correlation of Y_{t-2} with Y_{t-1}; ACF does not account for this. PACF, on the other hand, eliminates the indirect effect through Y_{t-1} when computing the correlation between Y_t and Y_{t-2}.
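The built-in ARMAacf() function makes this concrete (a sketch added here for illustration): for an AR(1) with coefficient 0.8, the theoretical ACF decays geometrically while the PACF is zero after lag 1.
ARMAacf(ar = 0.8, lag.max = 3)               # ACF at lags 0..3: 1, 0.8, 0.64, 0.512
ARMAacf(ar = 0.8, lag.max = 3, pacf = TRUE)  # PACF at lags 1..3: 0.8, 0, 0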
Compare the two at a glance:
acf(y1)
pacf(y1)
[Figure: the ACF plot of y1, followed by the PACF plot of y1]