Reproduced from the Deep Learning Daily Digest: the ConvLSTM principle and its TensorFlow implementation
This document references the paper "Convolutional LSTM Network: A Machine Learning Approach for Precipitation Nowcasting".
Today we introduce a well-known network structure, ConvLSTM. It not only has the temporal modelling ability of an LSTM, but can also capture local spatial features like a CNN, so it can be said to model spatiotemporal features.
LSTM has made great progress in areas such as speech recognition, video analysis, and sequence modeling. The traditional LSTM network consists of five components: the input gate, forget gate, cell, output gate, and hidden state, and the relationship between them can be represented by the following formulas:
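As given in the paper (σ denotes the sigmoid function):

i_t = σ(W_xi x_t + W_hi h_{t-1} + W_ci ∘ c_{t-1} + b_i)
f_t = σ(W_xf x_t + W_hf h_{t-1} + W_cf ∘ c_{t-1} + b_f)
c_t = f_t ∘ c_{t-1} + i_t ∘ tanh(W_xc x_t + W_hc h_{t-1} + b_c)
o_t = σ(W_xo x_t + W_ho h_{t-1} + W_co ∘ c_t + b_o)
h_t = o_t ∘ tanh(c_t)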
The small hollow circle "∘" in the formulas denotes element-wise multiplication of matrices, also known as the Hadamard product. This LSTM structure can also be called FC-LSTM, because its internal computations all rely on fully connected, feedforward-style operations. FC-LSTM handles temporal data very well, but for spatial data it introduces redundancy: spatial data has strong local structure, and FC-LSTM cannot capture this locality. The ConvLSTM proposed in this paper tries to solve the problem by replacing the input-to-state and state-to-state parts of FC-LSTM with convolutions. The internal structure of ConvLSTM is shown in the following figure:
As can be seen from the diagram, the input-to-gate connections are replaced by convolutions, and the state-to-state transitions are likewise carried out by convolution operations. The working principle of the new ConvLSTM can be expressed by the following formulas:
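As given in the paper:

i_t = σ(W_xi * X_t + W_hi * H_{t-1} + W_ci ∘ C_{t-1} + b_i)
f_t = σ(W_xf * X_t + W_hf * H_{t-1} + W_cf ∘ C_{t-1} + b_f)
C_t = f_t ∘ C_{t-1} + i_t ∘ tanh(W_xc * X_t + W_hc * H_{t-1} + b_c)
o_t = σ(W_xo * X_t + W_ho * H_{t-1} + W_co ∘ C_t + b_o)
H_t = o_t ∘ tanh(C_t)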
Here * denotes convolution. It is worth noting that X, C, H, i, f, and o are all three-dimensional tensors; two of their dimensions carry the spatial information of rows and columns. We can think of ConvLSTM as a model that processes feature vectors living on a two-dimensional grid: it predicts the features of a grid cell from the features of its neighboring cells. That concludes the principle section; next we implement a ConvLSTM.
But before we do that, let's take a look at how the common RNN cells in TensorFlow are designed. BasicRNNCell, GRUCell, and LSTMCell all inherit from RNNCell, and each must implement a common method, __call__(), which specifies how the input, state, and output relate to one another at each step of the loop.
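As a minimal sketch of this contract (this is illustrative only, not TensorFlow's actual dynamic_rnn implementation; batch_size, num_steps, inputs, and cell are placeholder names), the loop driving any RNNCell looks roughly like this:

state = cell.zero_state(batch_size, tf.float32)  # initial state provided by the cell
outputs = []
for t in range(num_steps):
    # each step maps (input_t, state) -> (output_t, new_state)
    output, state = cell(inputs[t], state)
    outputs.append(output)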
For BasicRNNCell, the __call__ method simply takes the input and state, applies a linear transformation to their concatenation, and passes the result through an activation function. The core code is as follows:
def __call__(self, inputs, state, scope=None):
    with _checked_scope(self, scope or "basic_rnn_cell", reuse=self._reuse):
        output = self._activation(
            _linear([inputs, state], self._num_units, True))
    return output, output
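In equation form, this is simply h_t = tanh(W [x_t, h_{t-1}] + b) (tanh is the default activation), and the same tensor is returned both as the output and as the new state.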
Correspondingly, the core code of GRUCell is as follows:
def __call__(self, inputs, state, scope=None):
    with _checked_scope(self, scope or "gru_cell", reuse=self._reuse):
        with vs.variable_scope("gates"):
            value = sigmoid(_linear(
                [inputs, state], 2 * self._num_units, True, 1.0))
            r, u = array_ops.split(
                value=value,
                num_or_size_splits=2,
                axis=1)
        with vs.variable_scope("candidate"):
            c = self._activation(_linear([inputs, r * state],
                                         self._num_units, True))
        new_h = u * state + (1 - u) * c
    return new_h, new_h
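For comparison, the code above implements the standard GRU update, where r is the reset gate and u the update gate (the gate biases are initialized to 1.0 so that the cell initially neither resets nor updates):

r_t = σ(W_r [x_t, h_{t-1}] + b_r)
u_t = σ(W_u [x_t, h_{t-1}] + b_u)
c_t = tanh(W_c [x_t, r_t ∘ h_{t-1}] + b_c)
h_t = u_t ∘ h_{t-1} + (1 - u_t) ∘ c_t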
BasicLSTMCell's core code is a little more involved, because it has more gates, and the state here is not a single variable but a combination of states (c and h). To make the matrix operations more efficient, the approach taken is to concatenate the four gate expressions from the FC-LSTM formulas above, compute them in a single multiply, split the result afterwards, and then compute c and h. Because no peephole connections are added, that is, there are no direct connections between c and i, c and f, or c and o, this LSTMCell is the BasicLSTMCell. TensorFlow also provides an LSTMCell with peephole connections; interested readers can look at the TensorFlow source code directly.
def __call__(self, inputs, state, scope=None):
    with _checked_scope(self, scope or "basic_lstm_cell", reuse=self._reuse):
        if self._state_is_tuple:
            c, h = state
        else:
            c, h = array_ops.split(value=state, num_or_size_splits=2, axis=1)
        concat = _linear([inputs, h], 4 * self._num_units, True)
        i, j, f, o = array_ops.split(value=concat, num_or_size_splits=4, axis=1)

        new_c = (c * sigmoid(f + self._forget_bias) + sigmoid(i) *
                 self._activation(j))
        new_h = self._activation(new_c) * sigmoid(o)

        if self._state_is_tuple:
            new_state = LSTMStateTuple(new_c, new_h)
        else:
            new_state = array_ops.concat([new_c, new_h], 1)
        return new_h, new_state
When writing a ConvLSTMCell, we can closely imitate the BasicLSTMCell code, except that all the variables gain extra dimensions. At the same time, comparing the FC-LSTM and ConvLSTM formulas above, whether the operation is a matrix product or a convolution, the same concatenate-then-split trick can be used to improve efficiency. Below is the code I wrote, for reference only; I have tested that it runs.
import tensorflow as tf
import numpy as np


class BasicConvLSTMCell(tf.contrib.rnn.RNNCell):
    def __init__(self, shape, num_filters, kernel_size, forget_bias=1.0,
                 input_size=None, state_is_tuple=True, activation=tf.nn.tanh,
                 reuse=None):
        self._shape = shape
        self._num_filters = num_filters
        self._kernel_size = kernel_size
        self._size = tf.TensorShape(shape + [self._num_filters])
        self._forget_bias = forget_bias
        self._state_is_tuple = state_is_tuple
        self._activation = activation
        self._reuse = reuse

    @property
    def state_size(self):
        # non-tuple state concatenates c and h along the channel axis
        return (tf.contrib.rnn.LSTMStateTuple(self._size, self._size)
                if self._state_is_tuple
                else tf.TensorShape(self._shape + [2 * self._num_filters]))

    @property
    def output_size(self):
        return self._size

    def __call__(self, inputs, state, scope=None):
        # we suppose inputs to be [time, batch_size, row, col, channel]
        with tf.variable_scope(scope or "basic_convlstm_cell", reuse=self._reuse):
            if self._state_is_tuple:
                c, h = state
            else:
                c, h = tf.split(value=state, num_or_size_splits=2, axis=3)

            inp_channel = inputs.get_shape().as_list()[-1] + self._num_filters
            out_channel = self._num_filters * 4

            # concatenate input and hidden state, then compute all four gates
            # with a single convolution (the concat-then-split trick)
            concat = tf.concat([inputs, h], axis=3)
            kernel = tf.get_variable(
                'kernel', shape=self._kernel_size + [inp_channel, out_channel])
            concat = tf.nn.conv2d(concat, filter=kernel,
                                  strides=(1, 1, 1, 1), padding='SAME')

            i, j, f, o = tf.split(value=concat, num_or_size_splits=4, axis=3)

            new_c = (c * tf.sigmoid(f + self._forget_bias) +
                     tf.sigmoid(i) * self._activation(j))
            new_h = self._activation(new_c) * tf.sigmoid(o)

            if self._state_is_tuple:
                new_state = tf.contrib.rnn.LSTMStateTuple(new_c, new_h)
            else:
                new_state = tf.concat([new_c, new_h], 3)
            return new_h, new_state


if __name__ == '__main__':
    inputs = tf.placeholder(tf.float32, [5, 2, 3, 3, 3])
    cell = BasicConvLSTMCell([3, 3], 6, [3, 3])
    outputs, state = tf.nn.dynamic_rnn(cell, inputs, dtype=inputs.dtype,
                                       time_major=True)
    with tf.Session() as sess:
        inp = np.random.normal(size=(5, 2, 3, 3, 3))
        sess.run(tf.global_variables_initializer())
        o, s = sess.run([outputs, state], feed_dict={inputs: inp})
        print(o.shape)  # (5, 2, 3, 3, 6)
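As a side note, the code above targets the TensorFlow 1.x API (tf.contrib, placeholders, sessions). Newer TensorFlow versions ship a ready-made layer, tf.keras.layers.ConvLSTM2D, that implements the same idea. A minimal sketch, with hyperparameters chosen to mirror the example above (batch size 2, 5 time steps, a 3x3 grid, 3 input channels, 6 filters):

import numpy as np
import tensorflow as tf

# batch-major, channels-last input: [batch, time, row, col, channel]
x = np.random.normal(size=(2, 5, 3, 3, 3)).astype(np.float32)
layer = tf.keras.layers.ConvLSTM2D(filters=6, kernel_size=(3, 3),
                                   padding='same', return_sequences=True)
y = layer(x)
print(y.shape)  # expected: (2, 5, 3, 3, 6)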
Related GitHub projects:
https://github.com/viorik/ConvLSTM
https://github.com/carlthome/tensorflow-convlstm-cell