The Least Squares Method for AR Models (Machine Learning)

Source: Internet
Author: User

AR (autoregressive) model: a model that predicts the present or future value of a variable from past values of that same variable. Because the prediction depends only on the variable itself and not on any other variables, it is called autoregressive.

Mathematical definition: Assume the AR model has order p. For a time series with observed values {x[1], x[2], ..., x[n]}, the predicted value x[t] at time t is given by the autoregressive equation:

x[t] = a[1]*x[t-1] + a[2]*x[t-2] + ... + a[p]*x[t-p] + u[t],  1 <= p < n,  p < t <= n

where {a[1], a[2], ..., a[p]} is the parameter sequence and u[t] is white noise distributed as N(0, σ^2).

The mathematical model shows that AR(p) is a linear predictor: the value at time t is predicted from the previous p observations of x. In that sense it is similar to interpolation, and its purpose is to extend the usable data.
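To make the linear form concrete, here is a minimal one-step sketch with purely hypothetical AR(3) coefficients (the real coefficients for the example below are estimated by least squares later in the article):

# Hypothetical AR(3) coefficients and the three most recent observations (illustrative only)
a = [0.5, 0.3, 0.1]            # a[1], a[2], a[3]
recent = [6.45, 6.46, 6.50]    # x[t-1], x[t-2], x[t-3]
# One-step prediction: x[t] is approximately a[1]*x[t-1] + a[2]*x[t-2] + a[3]*x[t-3] (noise term omitted)
x_t = sum(ai * xi for ai, xi in zip(a, recent))
print(x_t)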

The AR model is used to fit and predict stationary time series. Given a time series, the modeling steps are as follows:

1. Determine whether the time series is stationary; methods such as the ACF test or the ADF unit root test can be used (see the ADF sketch after this list).

2. If the series is stationary, go directly to step 3. If it is non-stationary, difference it until it becomes stationary, then go to step 3.

3. Estimate the AR model parameters (Burg algorithm, least squares method, autocorrelation method, etc.) and select the order (according to the AIC criterion, SC criterion, FPE criterion, etc.).

4. Test the goodness of fit of the AR model obtained in step 3; the main check is whether the residual series is N(0, σ^2) white noise.

5. Use the AR model for prediction.
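As a quick illustration of step 1, the ADF unit root test can be run directly; this is a minimal sketch that assumes the statsmodels package is installed (the example below instead uses a hand-rolled ACF check):

import numpy
from statsmodels.tsa.stattools import adfuller

numpy.random.seed(0)
white_noise = numpy.random.randn(100)                 # stationary by construction
random_walk = numpy.cumsum(numpy.random.randn(100))   # non-stationary (unit root)

for name, series in [("white noise", white_noise), ("random walk", random_walk)]:
    p_value = adfuller(series)[1]
    # A p-value below 0.05 rejects the unit-root hypothesis, i.e. suggests stationarity
    print(name, p_value)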

The following example walks through the modeling process.

The mortality rate of the national population for 1978-2014 (data from http://www.stats.gov.cn/tjsj/ndsj/):

[6.25 6.28 6.34 6.36 6.60 6.90 6.82 6.78
 6.86 6.72 6.64 6.54 6.67 6.70 6.64 6.64
 6.49 6.57 6.56 6.51 6.50 6.46 6.45 6.43
 6.41 6.40 6.42 6.50 6.81 6.93 7.06 7.08
 7.11 7.14 7.15 7.16 7.16]
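
For the code that follows, the series can be stored as a NumPy array (the variable name sample is simply the convention used in the snippets below):

import numpy

# Mortality rate series, 1978-2014 (37 observations)
sample = numpy.array([6.25, 6.28, 6.34, 6.36, 6.60, 6.90, 6.82, 6.78,
                      6.86, 6.72, 6.64, 6.54, 6.67, 6.70, 6.64, 6.64,
                      6.49, 6.57, 6.56, 6.51, 6.50, 6.46, 6.45, 6.43,
                      6.41, 6.40, 6.42, 6.50, 6.81, 6.93, 7.06, 7.08,
                      7.11, 7.14, 7.15, 7.16, 7.16])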


1. Determine whether it is a stationary sequence
Let mean(x) and var(x) be the mean and variance of the series {x}. Whether the series is stationary can be judged from the autocorrelation coefficients (ACF):
For a sample {x} of length n, the ACF at lag k is: ACF(k) = Σ (x[i] - mean(x)) * (x[i+k] - mean(x)) / (n * var(x)), summed over 0 <= i < n-k, for 0 <= k < n
The Python code is as follows:

import numpy
import math

# Compute the ACF for a single lag k
def auto_relate_coef(data, avg, s2, k):
    ef = 0.
    for i in range(0, len(data) - k):
        ef = ef + (data[i] - avg) * (data[i + k] - avg)
    ef = ef / len(data) / s2
    return ef

# Compute the ACF for every lag k from 0 to n-1
def auto_relate_coefs(sample):
    efs = []
    data = []
    avg = numpy.mean(sample)
    s2 = numpy.var(sample)
    array = sample.reshape(1, -1)
    for x in array.flat:
        data.append(x)
    for k in range(0, len(data)):
        ef = auto_relate_coef(data, avg, s2, k)
        efs.append(ef)
    return efs
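
As a usage sketch (assuming sample is the NumPy array defined above):

acf = auto_relate_coefs(sample)
# For a stationary series the coefficients should fall towards 0 fairly quickly as k grows
for k in range(5):
    print(k, round(acf[k], 3))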
The autocorrelation coefficients of the 1978-2014 mortality series are shown in the figure below:

For a stationary time series, the ACF decays to 0 much faster as the lag k increases than it does for a non-stationary series. On this basis, the 1978-2014 mortality series can be regarded as stationary.

2. Calculation of the AR model parameters and selection of the order
From the AR(p) equation above, the fitted values {y[p+1], y[p+2], ..., y[n]} are:
y[p+1] = a[p]*x[1] + a[p-1]*x[2] + ... + a[1]*x[p]
y[p+2] = a[p]*x[2] + a[p-1]*x[3] + ... + a[1]*x[p+1]
...
y[n] = a[p]*x[n-p] + a[p-1]*x[n-p+1] + ... + a[1]*x[n-1]
Writing these equations in matrix form:
Y[n-p, 1] = X[n-p, p] dot A[p, 1]
where [row, col] denotes a matrix with row rows and col columns, and dot denotes matrix multiplication.
The transpose of X is written XT, and the inverse of a matrix is written with a trailing I.
According to the principle of least squares, the parameter estimate is:
A = (XT dot X)I dot XT dot Y
From this formula it is straightforward to write code that estimates the parameters of a p-order AR model and computes the fitted values:
def ar_least_square(sample, p):
    # Build the (n-p) x p design matrix X, one row of p lagged observations per equation
    matrix_x = numpy.zeros((sample.size - p, p))
    matrix_x = numpy.matrix(matrix_x)
    array = sample.reshape(sample.size)
    j = 0
    for i in range(0, sample.size - p):
        matrix_x[i, 0:p] = array[j:j + p]
        j = j + 1
    # Target vector Y holds the observations x[p+1]..x[n]
    matrix_y = numpy.array(array[p:sample.size])
    matrix_y = matrix_y.reshape(sample.size - p, 1)
    matrix_y = numpy.matrix(matrix_y)
    # fi is the parameter vector: A = (XT dot X)I dot XT dot Y
    fi = numpy.dot(numpy.dot(numpy.dot(matrix_x.T, matrix_x).I, matrix_x.T), matrix_y)
    # Fitted values, with the first p observations copied through unchanged
    matrix_y = numpy.dot(matrix_x, fi)
    matrix_y = numpy.row_stack((array[0:p].reshape(p, 1), matrix_y))
    return fi, matrix_y
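
A usage sketch under the same assumptions (sample as defined above):

# Fit an AR(18) model to the mortality series
fi, fitted = ar_least_square(sample, 18)
print(fi)                                      # 18 coefficients, ordered a[p], ..., a[1]
print(numpy.asarray(fitted).reshape(-1)[-1])   # fitted value for the last year, 2014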

Knowing how to estimate the parameters is not enough; we also need to choose an optimal value of p for the AR model, that is, to determine the order.
The general procedure for choosing the order is:
(1) Determine an upper limit for p, usually a fraction of the series length n (such as n/2) or a multiple of ln(n).
(2) Starting from p = 1 and without exceeding that upper limit, pick the optimal p according to a chosen criterion.
In this example the upper limit on p is set to n/2 = 18, and the order is selected with the AIC (Akaike information criterion) and SC (Schwarz criterion); the smaller the criterion value, the better the order.
AIC = 2*p + n*ln(σ^2)
SC = p*ln(n) + n*ln(σ^2)
where σ^2 is the variance of the residuals between the observed and fitted values.
# AIC and SC of an AR(p) model, given its residual series rss
def ar_aic(rss, p):
    n = rss.size
    s2 = numpy.var(rss)
    return 2 * p + n * math.log(s2)

def ar_sc(rss, p):
    n = rss.size
    s2 = numpy.var(rss)
    return p * math.log(n) + n * math.log(s2)
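
Putting the pieces together, the order search can be sketched as a loop over p from 1 to 18 that refits the model and prints both criteria (again assuming sample, ar_least_square, ar_aic and ar_sc as defined above):

max_p = sample.size // 2          # upper limit, 18 in this example
for p in range(1, max_p + 1):
    fi, fitted = ar_least_square(sample, p)
    fitted = numpy.asarray(fitted).reshape(-1)
    residuals = sample[p:] - fitted[p:]       # residuals for t > p
    print(p, ar_aic(residuals, p), ar_sc(residuals, p))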

AIC and SC in this example:


It can be seen that both AIC and SC reach their minimum at p = 18, and that both values change sharply at p = 19.
Now look at the fit of the AR(p) model at p = 10, 18 and 19 (the red solid line is the observed series and the blue dashed line is the fitted series).
p = 10:

p = 18:

p = 19:

From the three plots it is intuitively clear that AR(18) fits best: the fitted series is almost identical to the observations. AR(10) is not as good as AR(18), but its deviation stays within an acceptable range; AR(19) simply falls apart, deviating far too much.
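
A minimal plotting sketch for such a comparison, assuming matplotlib is available and sample and ar_least_square are defined as above (line colors and styles follow the description in the text):

import matplotlib.pyplot as plt

years = range(1978, 1978 + sample.size)
fi, fitted = ar_least_square(sample, 18)
plt.plot(years, sample, 'r-', label='observed')                                 # red solid line
plt.plot(years, numpy.asarray(fitted).reshape(-1), 'b--', label='AR(18) fit')   # blue dashed line
plt.legend()
plt.show()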

3. Fitting test
Rearranging the AR equation gives:
u[t] = x[t] - a[1]*x[t-1] - a[2]*x[t-2] - ... - a[p]*x[t-p]
If u[t] is white noise obeying N(0, σ^2), the AR(p) model can be considered acceptable.
In this example, the residuals u[t] of the AR(18) model have mean 1.06*10^-6 and variance 4.2*10^-4.
The autocorrelation coefficients of u[t] are shown below:

The residuals approximately obey N(0, σ^2), so the AR(18) model can be used for fitting and prediction.
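
A sketch of this residual check, reusing the functions defined earlier:

p = 18
fi, fitted = ar_least_square(sample, p)
residuals = sample[p:] - numpy.asarray(fitted).reshape(-1)[p:]
print(numpy.mean(residuals), numpy.var(residuals))   # should be close to 0 and small
acf_res = auto_relate_coefs(residuals)
# For white noise the autocorrelations beyond lag 0 should stay close to 0
print([round(v, 3) for v in acf_res[:6]])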

Summary:
This example uses the least squares method to estimate the AR model parameters, and the resulting AR(18) model fits well. The drawback is that the least squares approach involves a large number of matrix multiplications, which is time-consuming. Besides AR models, MA, ARMA and ARIMA models can also be used to fit and predict stationary time series; the modeling steps are essentially the same, and ARMA and ARIMA generally fit better than plain AR or MA models.
