This article explains how to use an LSTM to predict a time series, focusing on the application of LSTM. For the underlying theory, refer to the following two articles:
Understanding LSTM Networks; LSTM Learning Notes
Programming environment: Python 3.5, TensorFlow 1.0
The dataset used in this article comes from the Kesci platform and was provided by the Cloud Brain machine-learning hands-on training camp: a time-series prediction challenge on real business data.
The dataset consists of a set of related time series from the industry (about 40 groups) and external-feature time series (about 5 groups). This article uses only one group of the data for modeling.
Load the common libraries:
# Load common data-analysis libraries
import pandas as pd
import numpy as np
import tensorflow as tf
from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.preprocessing import MinMaxScaler
import matplotlib.pyplot as plt
%matplotlib inline
import warnings
warnings.filterwarnings('ignore')
Preview the data:
path = '../input/industry/industry_timeseries/timeseries_train_data/11.csv'
data11 = pd.read_csv(path, names=['year', 'month', 'day',
                                  'max_temperature', 'min_temperature',
                                  'avg_temperature', 'avg_humidity', 'output'])
data11.head()
   year  month  day  max_temperature  min_temperature  avg_temperature  avg_humidity      output
0  2015      2    1              1.9             -0.4           0.7875        75.000  814.155800
1  2015      2    2              6.2             -3.9           1.7625        77.250  704.251112
2  2015      2    3              7.8              2.0           4.2375        72.750  756.958978
3  2015      2    4              8.5             -1.2           3.0375        65.875  640.645401
4  2015      2    5              7.9             -3.6           1.8625        55.375  631.725130
Load data:
## Load the data (this article takes the first table as an example; the other tables are handled the same way)
f = open('../input/industry/industry_timeseries/timeseries_train_data/11.csv')
df = pd.read_csv(f)                # read in the data
data = df.iloc[:, 3:8].values      # take columns 3-7 (four features plus the target)
Define constants and initialize weights:
# Define constants
rnn_unit = 10       # hidden layer units
input_size = 4
output_size = 1
lr = 0.0006         # learning rate
tf.reset_default_graph()
# Input-layer and output-layer weights and biases
weights = {
    'in': tf.Variable(tf.random_normal([input_size, rnn_unit])),
    'out': tf.Variable(tf.random_normal([rnn_unit, 1]))
}
biases = {
    'in': tf.Variable(tf.constant(0.1, shape=[rnn_unit, ])),
    'out': tf.Variable(tf.constant(0.1, shape=[1, ]))
}
Split the dataset into a training set and a validation set (the last 90 days are used for validation, the rest for training):
def get_data(batch_size=60, time_step=20, train_begin=0, train_end=487):
    batch_index = []
    scaler_for_x = MinMaxScaler(feature_range=(0, 1))  # column-wise min-max scaling
    scaler_for_y = MinMaxScaler(feature_range=(0, 1))
    scaled_x_data = scaler_for_x.fit_transform(data[:, :-1])
    # MinMaxScaler expects 2-D input, so reshape the 1-D target column and flatten afterwards
    scaled_y_data = scaler_for_y.fit_transform(data[:, -1].reshape(-1, 1)).flatten()

    label_train = scaled_y_data[train_begin:train_end]
    label_test = scaled_y_data[train_end:]
    normalized_train_data = scaled_x_data[train_begin:train_end]
    normalized_test_data = scaled_x_data[train_end:]

    train_x, train_y = [], []  # training-set x and y
    for i in range(len(normalized_train_data) - time_step):
        if i % batch_size == 0:
            batch_index.append(i)
        x = normalized_train_data[i:i + time_step, :4]
        y = label_train[i:i + time_step, np.newaxis]
        train_x.append(x.tolist())
        train_y.append(y.tolist())
    batch_index.append(len(normalized_train_data) - time_step)

    size = (len(normalized_test_data) + time_step - 1) // time_step  # number of test samples
    test_x, test_y = [], []
    for i in range(size - 1):
        x = normalized_test_data[i * time_step:(i + 1) * time_step, :4]
        y = label_test[i * time_step:(i + 1) * time_step]
        test_x.append(x.tolist())
        test_y.extend(y)
    test_x.append(normalized_test_data[(size - 1) * time_step:, :4].tolist())
    test_y.extend(label_test[(size - 1) * time_step:].tolist())

    return batch_index, train_x, train_y, test_x, test_y, scaler_for_y
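As a quick sanity check (a minimal sketch; the printed shapes assume the table has 577 rows, i.e. 487 training days plus 90 validation days, which is not stated explicitly above), you can inspect what get_data() returns:
batch_index, train_x, train_y, test_x, test_y, scaler_for_y = get_data(
    batch_size=60, time_step=20, train_begin=0, train_end=487)
print(np.array(train_x).shape)   # (467, 20, 4): sliding windows of 20 time steps x 4 features
print(np.array(train_y).shape)   # (467, 20, 1): the scaled target for each window
print(batch_index[:3])           # [0, 60, 120]: the start index of each training batch
print(len(test_x), len(test_y))  # 5 non-overlapping windows covering the 90 validation days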
Define the network structure of the LSTM:
# —————————————————— Define the neural network variables ——————————————————
def lstm(X):
    batch_size = tf.shape(X)[0]
    time_step = tf.shape(X)[1]
    w_in = weights['in']
    b_in = biases['in']
    # the tensor must be reshaped to 2 dimensions for the matmul; the result is the hidden layer's input
    input = tf.reshape(X, [-1, input_size])
    input_rnn = tf.matmul(input, w_in) + b_in
    # reshape the tensor back to 3 dimensions as input to the LSTM cell
    input_rnn = tf.reshape(input_rnn, [-1, time_step, rnn_unit])
    cell = tf.contrib.rnn.BasicLSTMCell(rnn_unit)
    # cell = tf.contrib.rnn.core_rnn_cell.BasicLSTMCell(rnn_unit)
    init_state = cell.zero_state(batch_size, dtype=tf.float32)
    # output_rnn records the LSTM output at every time step; final_states is the state of the last cell
    output_rnn, final_states = tf.nn.dynamic_rnn(cell, input_rnn, initial_state=init_state, dtype=tf.float32)
    output = tf.reshape(output_rnn, [-1, rnn_unit])  # input to the output layer
    w_out = weights['out']
    b_out = biases['out']
    pred = tf.matmul(output, w_out) + b_out
    return pred, final_states
Model Training and Prediction:
# —————————————————— Train the model ——————————————————
def train_lstm(batch_size=80, time_step=15, train_begin=0, train_end=487):
    X = tf.placeholder(tf.float32, shape=[None, time_step, input_size])
    Y = tf.placeholder(tf.float32, shape=[None, time_step, output_size])
    batch_index, train_x, train_y, test_x, test_y, scaler_for_y = get_data(batch_size, time_step, train_begin, train_end)
    pred, _ = lstm(X)
    # loss function
    loss = tf.reduce_mean(tf.square(tf.reshape(pred, [-1]) - tf.reshape(Y, [-1])))
    train_op = tf.train.AdamOptimizer(lr).minimize(loss)
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        # train for 5000 iterations
        iter_time = 5000
        for i in range(iter_time):
            for step in range(len(batch_index) - 1):
                _, loss_ = sess.run([train_op, loss],
                                    feed_dict={X: train_x[batch_index[step]:batch_index[step + 1]],
                                               Y: train_y[batch_index[step]:batch_index[step + 1]]})
            if i % 100 == 0:
                print('iter:', i, 'loss:', loss_)
        #### predict ####
        test_predict = []
        for step in range(len(test_x)):
            prob = sess.run(pred, feed_dict={X: [test_x[step]]})
            predict = prob.reshape((-1))
            test_predict.extend(predict)
        # undo the min-max scaling before computing the errors (inverse_transform expects 2-D input)
        test_predict = scaler_for_y.inverse_transform(np.array(test_predict).reshape(-1, 1)).flatten()
        test_y = scaler_for_y.inverse_transform(np.array(test_y).reshape(-1, 1)).flatten()
        rmse = np.sqrt(mean_squared_error(test_predict, test_y))
        mae = mean_absolute_error(y_pred=test_predict, y_true=test_y)
        print('mae:', mae, 'rmse:', rmse)
    return test_predict
Call the train_lstm() function to complete the training and prediction process, and compute the validation errors (MAE and RMSE):
test_predict = train_lstm(batch_size=80, time_step=15, train_begin=0, train_end=487)
Results after 5,000 iterations:
iter: 3900 loss: 0.000505382
iter: 4000 loss: 0.000502154
iter: 4100 loss: 0.000503413
iter: 4200 loss: 0.00140424
iter: 4300 loss: 0.000500015
iter: 4400 loss: 0.00050004
iter: 4500 loss: 0.000498159
iter: 4600 loss: 0.000500861
iter: 4700 loss: 0.000519379
iter: 4800 loss: 0.000499999
iter: 4900
mae: 121.183626208 rmse: 162.049017904
Plot the results for analysis:
plt.figure(figsize=(24, 8))
plt.plot(data[:, -1])
plt.plot([None for _ in range(487)] + [x for x in test_predict])
plt.show()
The resulting plot (the actual series, with the predicted values overlaid on the last 90 days) is as follows:
As the figure shows, the LSTM model can basically capture the trend of the series. To keep things simple, this article puts no effort into feature engineering or parameter tuning, so it is best suited for beginners exploring the application of LSTM models to time-series problems.

PS: Normalization of the data is very important. You must make sure the training set and the validation set are scaled into the same space, otherwise the results will be very poor (a minimal sketch of this is given at the end of the article). The first time I used an LSTM, for the Tianchi rainfall-prediction competition, this step was done badly: the final predictions all came out nearly identical, and I ended up abandoning the model. I ran into the same problem at the beginning with this dataset, and solved it by putting all samples into the same space during normalization.

The dataset provides about 45 groups of data, so multi-task learning could be used to explore the relationships between the groups; I have not looked into this in detail yet.

The framework of this article is derived from: TensorFlow example: Using LSTM to predict the daily maximum price of a stock (II)
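To make the normalization point above concrete, here is a minimal sketch (the array names are made up for illustration; this is not the article's code). The idea is that one scaler is fitted once and then applied to both splits, so they share the same space; get_data() above achieves this by fitting on the full series, while the sketch shows the stricter variant that fits on the training portion only:
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# hypothetical series, split as in the article: 487 rows for training, 90 for validation
series = np.random.rand(577, 1) * 1000
train, valid = series[:487], series[487:]

scaler = MinMaxScaler(feature_range=(0, 1))
train_scaled = scaler.fit_transform(train)  # fit on the training portion only...
valid_scaled = scaler.transform(valid)      # ...and reuse the SAME fitted scaler here

# What NOT to do:
# valid_bad = MinMaxScaler().fit_transform(valid)
# A separately fitted scaler maps the validation set into a different space,
# and the predictions become meaningless after inverse_transform.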