Analysis of time series prediction using LSTM model in Python

Source: Internet
Author: User
Tags: keras
Time Series Model

Time series prediction uses the characteristics of an event over a past period of time to predict its characteristics in the future. It is a relatively complex class of predictive modeling problem. Unlike regression analysis models, a time series model depends on the order of events: the same values fed to the model in a different order produce different results.
For example: predicting the next week's stock price changes from a stock's daily prices over the past two years, or predicting how many customers will visit a store next week from its weekly customer counts over the past two years.

RNN and LSTM models

The most powerful tool used in time series models is the recurrent neural network (RNN). In an ordinary neural network the computations are independent of one another, whereas in an RNN the result of each hidden layer depends on both the current input and the previous hidden layer result. In this way, the results computed by an RNN carry a memory of the earlier results.

The typical RNN network structure is as follows:

The right-hand side of the figure shows the network unrolled over time, which makes it easier to see how memory enters the computation. Simply put, x is the input layer, o is the output layer, s is the hidden layer, and t denotes the t-th computation step; V, W, and U are weights. The hidden state at step t is computed as s_t = f(U*x_t + W*s_{t-1}), which ties the current input to the result of the previous computation. (The original post links to further reading on RNNs here.)
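
The recurrence s_t = f(U*x_t + W*s_{t-1}) can be sketched in a few lines of NumPy. This is only an illustration: the choice of tanh for f, the layer sizes, and the random weights are assumptions, not values from the article. It also shows the order dependence mentioned earlier, since feeding the same inputs in reverse order gives a different final state.

```python
import numpy as np

def rnn_forward(xs, U, W, f=np.tanh):
    """Return the hidden state s_t for every step of the input sequence xs."""
    s = np.zeros(W.shape[0])          # s_0: no memory before the first step
    states = []
    for x in xs:
        s = f(U @ x + W @ s)          # s_t = f(U*x_t + W*s_{t-1})
        states.append(s)
    return np.stack(states)

# Hypothetical sizes: 2 input features, 4 hidden units, 3 time steps.
rng = np.random.default_rng(0)
U = rng.normal(size=(4, 2))
W = rng.normal(size=(4, 4))
xs = rng.normal(size=(3, 2))

forward = rnn_forward(xs, U, W)
reversed_order = rnn_forward(xs[::-1], U, W)
```

Comparing `forward[-1]` with `reversed_order[-1]` shows that the final hidden state depends on the order in which the same values arrive.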

Limitations of RNN:
For an RNN to achieve long-term memory, the current hidden state has to be tied to the previous n computations, that is, s_t = f(U*x_t + W1*s_{t-1} + W2*s_{t-2} + ... + Wn*s_{t-n}). In that case the amount of computation grows rapidly and model training time rises significantly. Therefore, the plain RNN model is generally not used directly for long-term memory.

LSTM model
The LSTM (Long Short-Term Memory) model is an RNN variant first proposed by Sepp Hochreiter and Juergen Schmidhuber. The classic LSTM model structure is as follows:

What distinguishes LSTM is that it adds gate nodes to each layer on top of the RNN structure. There are 3 kinds of gates: the forget gate, the input gate, and the output gate. These gates can open or close, and they decide whether the memory state of the network (the state of the previous network) reaches a threshold in this layer's output and is therefore added into the current computation of this layer. As shown in the figure, a gate node applies the sigmoid function to the network's memory state as input. If the output reaches the threshold, the gate output is multiplied with the computed result of the current layer and passed on as input to the next layer (PS: this multiplication is element-wise multiplication of matrices); if the threshold is not reached, the output is forgotten. The weights of every layer, including those of the gate nodes, are updated during each backpropagation pass of model training. A more detailed view of the LSTM decision computation is shown in the following illustration:

The memory function of the LSTM model is implemented by these gate nodes. When a gate is open, the earlier results of the model are linked into the current computation; when a gate is closed, the earlier results no longer affect the current computation. So by adjusting the gates we can control how much the early part of a sequence affects the final result. When you do not want earlier results to have an effect, for example when starting to analyze a new paragraph or chapter in natural language processing, simply close the gate. (The original post links to a more detailed explanation of LSTM here.)
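
The gate mechanism can be sketched as a single LSTM step in NumPy. This uses the commonly taught LSTM formulation, which may differ in detail from the article's figure: each gate is a sigmoid output between 0 and 1 that scales values element-wise, rather than a hard threshold. The parameter names (`Wf`, `Wi`, `Wo`, `Wc` and the matching biases), the sizes, and the random weights are all hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, p):
    """One LSTM step: returns the new hidden state h and cell (memory) state c."""
    z = np.concatenate([h_prev, x])
    f = sigmoid(p["Wf"] @ z + p["bf"])        # forget gate: how much old memory to keep
    i = sigmoid(p["Wi"] @ z + p["bi"])        # input gate: how much new candidate to store
    o = sigmoid(p["Wo"] @ z + p["bo"])        # output gate: how much memory to expose
    c_tilde = np.tanh(p["Wc"] @ z + p["bc"])  # candidate memory from the current input
    c = f * c_prev + i * c_tilde              # element-wise products, as noted in the text
    h = o * np.tanh(c)
    return h, c

# Hypothetical sizes: 2 input features, 3 hidden units.
rng = np.random.default_rng(0)
n_in, n_hid = 2, 3
params = {w: rng.normal(size=(n_hid, n_hid + n_in)) for w in ("Wf", "Wi", "Wo", "Wc")}
params.update({b: np.zeros(n_hid) for b in ("bf", "bi", "bo", "bc")})
x = rng.normal(size=n_in)
h_prev = np.zeros(n_hid)

# A very negative forget bias closes the forget gate; a very positive one opens it.
closed = dict(params, bf=np.full(n_hid, -50.0))
opened = dict(params, bf=np.full(n_hid, 50.0))

_, c_closed_a = lstm_step(x, h_prev, np.zeros(n_hid), closed)
_, c_closed_b = lstm_step(x, h_prev, np.ones(n_hid), closed)
_, c_open_a = lstm_step(x, h_prev, np.zeros(n_hid), opened)
_, c_open_b = lstm_step(x, h_prev, np.ones(n_hid), opened)
```

With the forget gate closed, the two different previous memory states give the same new memory; with it open, the difference between them is carried through unchanged.
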
The following illustration shows how the gates work: through gate control, the input variable at position 1 of the sequence influences the computed results at positions 4 and 6 of the sequence.

A solid black circle means that the node's computed result is output to the next layer or to the next computation step; a hollow circle means that the node's result was not passed into the network, or that no signal was received from the previous step.

Implement the LSTM model in Python

There are a number of packages in Python that can be called directly to build LSTM models, such as PyBrain, Keras, TensorFlow, and scikit-neuralnetwork (the original post links to more here). Here we choose Keras. (PS: on Linux or Mac, TensorFlow is strongly recommended.)

Because training an LSTM neural network can be tuned through many parameters, such as the activation function, the number of LSTM layers, and the dimensions of the input and output variables, the tuning process is quite complex. Here we use just one of the simplest application examples to describe the LSTM workflow.

Application example

Based on a customer's purchase history in a store, we predict the time of the customer's next visit to the store. The data looks like this:

Consumption time
2015-05-15 14:03:51
2015-05-15 15:32:46
2015-06-28 18:00:17
2015-07-16 21:27:18
2015-07-16 22:04:51
2015-09-08 14:59:56
...

Specific actions:
1. Conversion of raw data

First, the timestamps need to be converted into numeric values. A common approach is to convert the timestamps into time intervals, each representing the gap between two consecutive purchases by the user, and then feed these into the model for training. The transformed data is as follows:

Consumption interval
0
0
...
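
The conversion from timestamps to intervals can be sketched with the standard library. The article does not state the unit of the interval, so measuring it in whole calendar days is an assumption here, and the function name is hypothetical. The input is the sample data listed above.

```python
from datetime import datetime

def purchase_intervals_in_days(timestamps):
    """Gap in whole calendar days between each pair of consecutive purchases."""
    dates = [datetime.strptime(ts, "%Y-%m-%d %H:%M:%S").date() for ts in timestamps]
    return [(later - earlier).days for earlier, later in zip(dates, dates[1:])]

timestamps = [
    "2015-05-15 14:03:51",
    "2015-05-15 15:32:46",
    "2015-06-28 18:00:17",
    "2015-07-16 21:27:18",
    "2015-07-16 22:04:51",
    "2015-09-08 14:59:56",
]
intervals = purchase_intervals_in_days(timestamps)
```

Two purchases on the same day give an interval of 0, which is why zeros appear in the transformed data.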

2. Generate the model training dataset (determine the window length of the training set)
The window here means how many consumption intervals are used to predict the next consumption interval. Here we start with a window length of 3, that is, the intervals at t-2, t-1, and t are used to train the model, and the interval at t+1 is used to verify the result. The dataset format is as follows: X is the training input and Y is the value used for validation.
PS: a window length of 3 is not necessarily appropriate; the window length needs to be adjusted according to the model validation results.

X1    X2    X3    Y
0    0    0    ...

Note: predicting the value directly like this usually gives poor accuracy. You can instead bin the target value Y into a number of classes and convert the classes into one-hot labels, which usually trains better. For example, if Y is divided by value range into five classes (1: 0-20, 2: 20-40, 3: 40-60, 4: 60-80, 5: 80-100), the data becomes:

X1    X2    X3    Y
0    0    0    4

Y is then converted into one-hot encoding (the original post links to an explanation of one-hot encoding here):

1    0    0    0    0
0    0    0    0    1
...
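
The binning and one-hot conversion can be sketched with NumPy as below. The article does not say which class a boundary value such as 20 belongs to, so this sketch assumes half-open bins (0 up to but not including 20 is class 1, and so on) and puts anything at or above the top edge into the last class. The function name and the sample values are hypothetical.

```python
import numpy as np

def to_one_hot_classes(y, bin_width=20, n_classes=5):
    """Bin values into classes 1..n_classes and return (class labels, one-hot matrix)."""
    y = np.asarray(y)
    # Zero-based class index; clip so out-of-range values land in the first or last class.
    idx = np.clip(y // bin_width, 0, n_classes - 1).astype(int)
    one_hot = np.zeros((len(y), n_classes), dtype=int)
    one_hot[np.arange(len(y)), idx] = 1
    return idx + 1, one_hot

labels, one_hot = to_one_hot_classes([0, 54, 70, 100])
```

Each row of `one_hot` has a single 1 in the column of its class, which is the label format expected by a softmax output layer trained with categorical cross-entropy.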

3. Determining and tuning the network model structure
Here we use Python's Keras library. (Java users can look at the deeplearning4j library.) Building the network involves tuning many parameters. For example: choose the activation function of the LSTM module (tanh by default in Keras); choose the activation function of the fully connected artificial neural network that receives the LSTM output (linear by default in Keras); choose the dropout rate for the nodes of each layer (to prevent overfitting), which we set to 0.2 by default here; choose how the error is computed, where we use mean squared error; choose how the weight parameters are updated iteratively, where we use the RMSprop algorithm, which is commonly used for RNN networks; and choose the epoch count and batch size for model training (the original post links to an explanation of how these two parameters affect the model here).
Generally speaking, the more LSTM layers there are, the stronger the ability to learn higher-level temporal representations; in addition, a final ordinary neural network layer is added to reduce the dimensionality of the output. The typical structure is as follows:

If you need to train multiple sequences in the same model, you can feed each sequence into a separate LSTM module and then merge the outputs into the ordinary layer. The structure is as follows:

4. Model training and result prediction
The dataset is randomly split into a training set and a validation set at a ratio of 4:1, to guard against overfitting. Then train the model. Passing the X columns of the data in as the parameter gives the predicted values, and comparing them with the actual Y values shows how good the model is.

Implementation code

Converting the time interval sequence into the required training set format

import pandas as pd
import numpy as np

def create_interval_dataset(dataset, look_back):
    """
    :param dataset: input array of time intervals
    :param look_back: each training set feature length
    :return: convert an array of values into a dataset matrix.
    """
    dataX, dataY = [], []
    for i in range(len(dataset) - look_back):
        dataX.append(dataset[i:i+look_back])    # window of look_back intervals
        dataY.append(dataset[i+look_back])      # the interval that follows the window
    return np.asarray(dataX), np.asarray(dataY)

df = pd.read_csv("path-to-your-time-interval-file")
dataset_init = np.asarray(df)    # if only 1 column
dataX, dataY = create_interval_dataset(dataset_init, look_back=3)    # look_back is the training set sequence length

Here the input data source is a CSV file; if the input data comes from a database, see the reference linked in the original post.

LSTM network structure

import os
import random

import numpy as np
import pandas as pd
from keras.models import Sequential, model_from_json
from keras.layers import Dense, LSTM, Dropout

# Note: this code uses the Keras 1.x argument names (output_dim, dropout_U, nb_epoch).
# In Keras 2 and later they were renamed to units, recurrent_dropout and epochs.


class NeuralNetwork():
    def __init__(self, **kwargs):
        """
        :param **kwargs:
            output_dim=8: output dimension of LSTM layer;
            activation_lstm='relu': activation function for LSTM layers;
            activation_dense='relu': activation function for dense layer;
            activation_last='softmax': activation function for last layer;
            drop_out=0.2: fraction of input units to drop;
            nb_epoch=10: the number of epochs to train the model. An epoch is one forward pass and one backward pass of all the training examples;
            batch_size=32: number of samples per gradient update. The higher the batch size, the more memory space you'll need;
            loss='categorical_crossentropy': loss function;
            optimizer='rmsprop'
        """
        self.output_dim = kwargs.get('output_dim', 8)
        self.activation_lstm = kwargs.get('activation_lstm', 'relu')
        self.activation_dense = kwargs.get('activation_dense', 'relu')
        self.activation_last = kwargs.get('activation_last', 'softmax')    # softmax for multiple output
        self.dense_layer = kwargs.get('dense_layer', 2)    # at least 2 layers
        self.lstm_layer = kwargs.get('lstm_layer', 2)
        self.drop_out = kwargs.get('drop_out', 0.2)
        self.nb_epoch = kwargs.get('nb_epoch', 10)
        self.batch_size = kwargs.get('batch_size', 32)
        self.loss = kwargs.get('loss', 'categorical_crossentropy')
        self.optimizer = kwargs.get('optimizer', 'rmsprop')

    # model_path and model_weight_path are not defined anywhere in the original listing;
    # they are added here as optional arguments (an assumption) so that the code runs.
    def nn_model(self, trainX, trainY, testX, testY, model_path=None, model_weight_path=None):
        """
        :param trainX: training data set
        :param trainY: expected value of training data
        :param testX: test data set
        :param testY: expected value of test data
        :return: model after training
        """
        print("Training model is LSTM network!")
        input_dim = trainX[1].shape[1]
        output_dim = trainY.shape[1]    # one-hot label
        model = Sequential()
        # Apply an LSTM layer with x dim output and y dim input. Use the dropout parameter to avoid overfitting
        model.add(LSTM(output_dim=self.output_dim,
                       input_dim=input_dim,
                       activation=self.activation_lstm,
                       dropout_U=self.drop_out,
                       return_sequences=True))
        for i in range(self.lstm_layer - 2):
            model.add(LSTM(output_dim=self.output_dim,
                           input_dim=self.output_dim,
                           activation=self.activation_lstm,
                           dropout_U=self.drop_out,
                           return_sequences=True))
        # return_sequences should be False in the last LSTM layer to avoid input dimension incompatibility with the dense layer
        model.add(LSTM(output_dim=self.output_dim,
                       input_dim=self.output_dim,
                       activation=self.activation_lstm,
                       dropout_U=self.drop_out))
        # The garbled original appears to pass activation_last here; activation_dense is used
        # instead because the docstring defines it as the dense layer activation.
        for i in range(self.dense_layer - 1):
            model.add(Dense(output_dim=self.output_dim,
                            activation=self.activation_dense))
        model.add(Dense(output_dim=output_dim,
                        input_dim=self.output_dim,
                        activation=self.activation_last))
        # Configure the learning process
        model.compile(loss=self.loss, optimizer=self.optimizer, metrics=['accuracy'])
        # Train the model with a fixed number of epochs
        model.fit(x=trainX, y=trainY, nb_epoch=self.nb_epoch, batch_size=self.batch_size,
                  validation_data=(testX, testY))
        # Store model to JSON file
        if model_path:
            model_json = model.to_json()
            with open(model_path, "w") as json_file:
                json_file.write(model_json)
        # Store model weights to HDF5 file
        if model_weight_path:
            if os.path.exists(model_weight_path):
                os.remove(model_weight_path)
            model.save_weights(model_weight_path)    # e.g. model_weight.h5
        return model

This article only covers the structure of the LSTM network. How to normalize the data before feeding it into the network, and how to visualize the model's predictions against the actual values, need to be adapted to the actual situation. The full script is linked from the original post.
