A time series (or dynamic series) is a sequence of values of the same statistical indicator arranged in the chronological order in which they occur. The main purpose of time series analysis is to predict the future from historical data.
The components of a time series are the long-term trend, seasonal variation, cyclical variation, and irregular variation:
- Long-term trend (T): the overall direction of change over a long period, driven by some fundamental factor.
- Seasonal variation (S): regular, periodic change within a year as the seasons change.
- Cyclical variation (C): wave-like fluctuation that repeats over a period of several years.
- Irregular variation (I): change with no regular pattern, covering two types: strictly random variation and irregular, abrupt changes of large magnitude.
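Purely as an illustration of these components (and not as part of the LSTM pipeline in the rest of this article), the sketch below decomposes a small synthetic monthly series into trend, seasonal, and irregular parts using a moving average and per-month means; the synthetic data and the specific smoothing choices are assumptions made only for this example.

import numpy as np

# Illustrative additive decomposition y = trend + seasonal + residual on synthetic data.
months = np.arange(48)                                    # 4 years of monthly observations
y = 0.5 * months + 10 * np.sin(2 * np.pi * months / 12)   # linear trend + yearly seasonality
y = y + np.random.normal(0, 1, size=48)                   # irregular component

# Trend (T): centred 12-month moving average.
trend = np.convolve(y, np.ones(12) / 12.0, mode='same')
# Seasonal (S): average detrended value for each calendar month.
detrended = y - trend
seasonal = np.tile(np.array([detrended[m::12].mean() for m in range(12)]), 4)
# Irregular (I): whatever remains. Cyclical variation (C) would need many years
# of data and is simply folded into the trend in this toy example.
residual = y - trend - seasonal
print(trend[:3], seasonal[:3], residual[:3])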
(1) Raw time series data (only 18 rows are listed)
1455.219971
1399.420044
1402.109985
1403.449951
1441.469971
1457.599976
1438.560059
1432.25
1449.680054
1465.150024
1455.140015
1455.900024
1445.569946
1441.359985
1401.530029
1410.030029
1404.089966
1398.560059
(2) Processing the data into the format the LSTM expects
To make the data format easier to follow, a few print statements are added to the code; the comments next to them show the corresponding output values.
def load_data(filename, seq_len):
    f = open(filename, 'rb').read()
    data = f.split('\n')
    print('data len:', len(data))          # 4172
    print('sequence len:', seq_len)        # 50

    sequence_length = seq_len + 1
    result = []
    for index in range(len(data) - sequence_length):
        # windows of length seq_len + 1; the last value of each window is the label
        result.append(data[index:index + sequence_length])
    print('result len:', len(result))                    # 4121
    print('result shape:', np.array(result).shape)       # (4121, 51)

    # note: the values are still strings here; the full script below converts
    # them to floats inside normalise_windows
    result = np.array(result)

    # split into train / test sets
    row = int(round(0.9 * result.shape[0]))
    train = result[:row, :]
    np.random.shuffle(train)
    x_train = train[:, :-1]
    y_train = train[:, -1]
    x_test = result[row:, :-1]
    y_test = result[row:, -1]
    x_train = np.reshape(x_train, (x_train.shape[0], x_train.shape[1], 1))
    x_test = np.reshape(x_test, (x_test.shape[0], x_test.shape[1], 1))

    print('X_train shape:', x_train.shape)   # (3709, 50, 1)
    print('y_train shape:', y_train.shape)   # (3709,)
    print('X_test shape:', x_test.shape)     # (412, 50, 1)
    print('y_test shape:', y_test.shape)     # (412,)

    return [x_train, y_train, x_test, y_test]
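A minimal sketch of what the windowing loop produces, using a made-up toy list rather than the sp500 data: each window holds seq_len + 1 consecutive values, the first seq_len values become the input sequence and the last value becomes the label.

import numpy as np

# Toy illustration of the sliding-window construction in load_data
data = [10, 11, 12, 13, 14, 15]
seq_len = 2
sequence_length = seq_len + 1
result = np.array([data[i:i + sequence_length]
                   for i in range(len(data) - sequence_length)])
print(result)
# [[10 11 12]
#  [11 12 13]
#  [12 13 14]]
x, y = result[:, :-1], result[:, -1]   # inputs and labels
print(x.shape, y.shape)                # (3, 2) (3,)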
(3) LSTM model
This article uses the Keras deep learning framework; readers using other frameworks such as Theano or TensorFlow can proceed similarly.
Keras LSTM official documentation
The structure of the LSTM can be customised, for example a stacked LSTM or a bidirectional LSTM; a sketch of a bidirectional variant is given after the stacked model below.
def build_model(layers):   # layers = [1, 50, 100, 1]
    model = Sequential()
    # stacked LSTM
    model.add(LSTM(input_dim=layers[0], output_dim=layers[1], return_sequences=True))
    model.add(Dropout(0.2))
    model.add(LSTM(layers[2], return_sequences=False))
    model.add(Dropout(0.2))
    model.add(Dense(output_dim=layers[3]))
    model.add(Activation('linear'))

    start = time.time()
    model.compile(loss='mse', optimizer='rmsprop')
    print('Compilation Time:', time.time() - start)
    return model
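As mentioned above, a bidirectional LSTM is another possible architecture. The following is only a sketch of such a variant, not used in the rest of this article; it assumes a Keras version that provides the Bidirectional wrapper and uses the newer Keras argument style (units, input_shape) rather than the output_dim style used above.

# Sketch only: a bidirectional variant of the model above.
# Assumes a Keras build that ships keras.layers.Bidirectional (Keras 2-style arguments).
from keras.models import Sequential
from keras.layers import LSTM, Bidirectional, Dropout, Dense, Activation

def build_bidirectional_model(seq_len, layers):   # e.g. seq_len = 50, layers = [1, 50, 100, 1]
    model = Sequential()
    # The wrapper runs the first LSTM over the window forwards and backwards
    # and concatenates the two hidden sequences.
    model.add(Bidirectional(LSTM(layers[1], return_sequences=True),
                            input_shape=(seq_len, layers[0])))
    model.add(Dropout(0.2))
    model.add(LSTM(layers[2], return_sequences=False))
    model.add(Dropout(0.2))
    model.add(Dense(layers[3]))
    model.add(Activation('linear'))
    model.compile(loss='mse', optimizer='rmsprop')
    return model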
(4) LSTM training and forecasting
1. Direct forecast
def predict_point_by_point(model, data):
    predicted = model.predict(data)
    print('predicted shape:', np.array(predicted).shape)   # (412, 1)
    predicted = np.reshape(predicted, (predicted.size,))
    return predicted
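A usage sketch: each point-by-point prediction is made from a window of true values, so this mode only measures one-step-ahead fit. The RMSE computation below is added purely for illustration and assumes that model, X_test, and y_test come from the training step in the full script, with y_test already numeric (as it is after normalise_windows).

# Usage sketch (illustrative): evaluate the point-by-point predictions with RMSE.
import numpy as np

predictions = predict_point_by_point(model, X_test)    # shape (412,)
y_true = np.asarray(y_test, dtype='float32')           # assumes labels are numeric
rmse = np.sqrt(np.mean((predictions - y_true) ** 2))
print('point-by-point RMSE:', rmse)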
2. Rolling forecast
def predict_sequence_full(model, data, window_size):   # data is x_test
    curr_frame = data[0]   # shape (50, 1)
    predicted = []
    for i in xrange(len(data)):
        # e.g. x = np.array([[[1], [2], [3]], [[4], [5], [6]]]); x.shape == (2, 3, 1)
        # x[0, 0] == array([1]); x[:, np.newaxis, :, :].shape == (2, 1, 3, 1)
        predicted.append(model.predict(curr_frame[newaxis, :, :])[0, 0])
        # np.array(curr_frame[newaxis, :, :]).shape == (1, 50, 1)
        curr_frame = curr_frame[1:]
        # numpy.insert(arr, obj, values, axis=None)
        curr_frame = np.insert(curr_frame, [window_size - 1], predicted[-1], axis=0)
    return predicted
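To make the window update explicit, here is a tiny standalone illustration (toy numbers, window_size = 3) of the two lines that slide the frame forward: the oldest value is dropped and the newest prediction is appended at index window_size - 1. After window_size steps the frame therefore contains only model outputs, which is why this rolling forecast tends to drift over long horizons.

# Toy illustration of the frame update used in predict_sequence_full
import numpy as np

window_size = 3
curr_frame = np.array([[1.0], [2.0], [3.0]])                       # shape (3, 1)
new_prediction = 0.5
curr_frame = curr_frame[1:]                                         # drop the oldest value
curr_frame = np.insert(curr_frame, [window_size - 1], new_prediction, axis=0)
print(curr_frame.ravel())                                           # [2.  3.  0.5]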
3. Sliding window + rolling forecast
def predict_sequences_multiple(model, data, window_size, prediction_len):   # window_size = seq_len
    prediction_seqs = []
    for i in xrange(len(data) / prediction_len):
        curr_frame = data[i * prediction_len]
        predicted = []
        for j in xrange(prediction_len):
            predicted.append(model.predict(curr_frame[newaxis, :, :])[0, 0])
            curr_frame = curr_frame[1:]
            curr_frame = np.insert(curr_frame, [window_size - 1], predicted[-1], axis=0)
        prediction_seqs.append(predicted)
    return prediction_seqs
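A usage sketch, with concrete numbers that mirror the shapes printed in the full script below: with seq_len = 50 and prediction_len = 50, the 412 test windows yield 8 independent 50-step forecasts, and each forecast restarts from a window of true values instead of drifting indefinitely.

# Usage sketch: 8 forecasts of 50 steps each when len(X_test) == 412
import numpy as np

multiple_predictions = predict_sequences_multiple(model, X_test, window_size=50, prediction_len=50)
print(np.array(multiple_predictions).shape)   # (8, 50)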
(5) Complete code
Sample dataset: sp500.csv
# -*- coding: utf-8 -*-
from __future__ import print_function

import time
import warnings

import numpy as np
import matplotlib.pyplot as plt
from numpy import newaxis
from keras.layers.core import Dense, Activation, Dropout
from keras.layers.recurrent import LSTM
from keras.models import Sequential

warnings.filterwarnings('ignore')


def load_data(filename, seq_len, normalise_window):
    f = open(filename, 'rb').read()
    data = f.split('\n')
    print('data len:', len(data))
    print('sequence len:', seq_len)

    sequence_length = seq_len + 1
    result = []
    for index in range(len(data) - sequence_length):
        # windows of length seq_len + 1; the last value of each window is the label
        result.append(data[index:index + sequence_length])
    print('result len:', len(result))
    print('result shape:', np.array(result).shape)
    print(result[:1])

    if normalise_window:
        result = normalise_windows(result)
    print(result[:1])
    print('normalise_windows result shape:', np.array(result).shape)

    result = np.array(result)

    # split into train / test sets
    row = int(round(0.9 * result.shape[0]))
    train = result[:row, :]
    np.random.shuffle(train)
    x_train = train[:, :-1]
    y_train = train[:, -1]
    x_test = result[row:, :-1]
    y_test = result[row:, -1]
    x_train = np.reshape(x_train, (x_train.shape[0], x_train.shape[1], 1))
    x_test = np.reshape(x_test, (x_test.shape[0], x_test.shape[1], 1))

    return [x_train, y_train, x_test, y_test]


def normalise_windows(window_data):
    normalised_data = []
    for window in window_data:   # each window has shape (sequence_length,), i.e. (51,)
        normalised_window = [((float(p) / float(window[0])) - 1) for p in window]
        normalised_data.append(normalised_window)
    return normalised_data


def build_model(layers):   # layers = [1, 50, 100, 1]
    model = Sequential()
    model.add(LSTM(input_dim=layers[0], output_dim=layers[1], return_sequences=True))
    model.add(Dropout(0.2))
    model.add(LSTM(layers[2], return_sequences=False))
    model.add(Dropout(0.2))
    model.add(Dense(output_dim=layers[3]))
    model.add(Activation('linear'))

    start = time.time()
    model.compile(loss='mse', optimizer='rmsprop')
    print('Compilation Time:', time.time() - start)
    return model


# direct (point-by-point) forecast
def predict_point_by_point(model, data):
    predicted = model.predict(data)
    print('predicted shape:', np.array(predicted).shape)   # (412, 1)
    predicted = np.reshape(predicted, (predicted.size,))
    return predicted


# rolling forecast
def predict_sequence_full(model, data, window_size):   # data is x_test
    curr_frame = data[0]   # shape (50, 1)
    predicted = []
    for i in xrange(len(data)):
        predicted.append(model.predict(curr_frame[newaxis, :, :])[0, 0])
        curr_frame = curr_frame[1:]
        curr_frame = np.insert(curr_frame, [window_size - 1], predicted[-1], axis=0)
    return predicted


# sliding window + rolling forecast
def predict_sequences_multiple(model, data, window_size, prediction_len):   # window_size = seq_len
    prediction_seqs = []
    for i in xrange(len(data) / prediction_len):
        curr_frame = data[i * prediction_len]
        predicted = []
        for j in xrange(prediction_len):
            predicted.append(model.predict(curr_frame[newaxis, :, :])[0, 0])
            curr_frame = curr_frame[1:]
            curr_frame = np.insert(curr_frame, [window_size - 1], predicted[-1], axis=0)
        prediction_seqs.append(predicted)
    return prediction_seqs


def plot_results(predicted_data, true_data, filename):
    fig = plt.figure(facecolor='white')
    ax = fig.add_subplot(111)
    ax.plot(true_data, label='True Data')
    plt.plot(predicted_data, label='Prediction')
    plt.legend()
    plt.savefig(filename + '.png')   # save before show, otherwise the saved figure is blank
    plt.show()


def plot_results_multiple(predicted_data, true_data, prediction_len):
    fig = plt.figure(facecolor='white')
    ax = fig.add_subplot(111)
    ax.plot(true_data, label='True Data')
    # pad the list of predictions to shift each forecast to its correct start
    for i, data in enumerate(predicted_data):
        padding = [None for p in xrange(i * prediction_len)]
        plt.plot(padding + data, label='Prediction')
        plt.legend()
    plt.savefig('plot_results_multiple.png')   # save before show, otherwise the saved figure is blank
    plt.show()


if __name__ == '__main__':
    global_start_time = time.time()
    epochs = 1
    seq_len = 50

    print('> Loading data...')
    X_train, y_train, X_test, y_test = load_data('sp500.csv', seq_len, True)
    print('X_train shape:', X_train.shape)   # (3709, 50, 1)
    print('y_train shape:', y_train.shape)   # (3709,)
    print('X_test shape:', X_test.shape)     # (412, 50, 1)
    print('y_test shape:', y_test.shape)     # (412,)

    print('> Data Loaded. Compiling...')
    model = build_model([1, 50, 100, 1])
    model.fit(X_train, y_train, batch_size=512, nb_epoch=epochs, validation_split=0.05)

    multiple_predictions = predict_sequences_multiple(model, X_test, seq_len, prediction_len=50)
    print('multiple_predictions shape:', np.array(multiple_predictions).shape)   # (8, 50)

    full_predictions = predict_sequence_full(model, X_test, seq_len)
    print('full_predictions shape:', np.array(full_predictions).shape)   # (412,)

    point_by_point_predictions = predict_point_by_point(model, X_test)
    print('point_by_point_predictions shape:', np.array(point_by_point_predictions).shape)   # (412,)

    print('Training duration (s):', time.time() - global_start_time)

    plot_results_multiple(multiple_predictions, y_test, 50)
    plot_results(full_predictions, y_test, 'full_predictions')
    plot_results(point_by_point_predictions, y_test, 'point_by_point_predictions')
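One practical caveat: the complete script above assumes Python 2 (xrange, integer division, reading the CSV as raw text) and the Keras 1.x argument names. The mapping below is only a rough sketch of the renames that would be needed under Python 3 and Keras 2; it is not exhaustive and the exact names depend on the installed versions.

# Rough porting notes (sketch, not exhaustive) for Python 3 / Keras 2:
#   xrange(...)                              -> range(...)
#   len(data) / prediction_len               -> len(data) // prediction_len
#   open(filename, 'rb').read().split('\n')  -> open(filename, 'r').read().split('\n')
#   LSTM(input_dim=1, output_dim=50, ...)    -> LSTM(50, input_shape=(seq_len, 1), ...)
#   Dense(output_dim=1)                      -> Dense(1)
#   model.fit(..., nb_epoch=epochs, ...)     -> model.fit(..., epochs=epochs, ...)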
References
(1) https://github.com/jaungiers/LSTM-Neural-Network-for-Time-Series-Prediction