Define Cell
In many RNN papers we see similar diagrams:
Each of these small rectangles represents a cell. Each cell has a slightly more complex internal structure, as shown in the following diagram:
The diagram shows the structure of a cell: it accepts input(t) and context(t-1), and then produces output(t). Take the RNN cells we stack up in our task: the output of the cell at the current step is also fed in as the input at the next step, so each cell's input and output must have the same shape. If the input has shape=(None, n) and the context(t-1) is appended to it as part of the input (so the combined input becomes shape=(None, 2n)), then the weight matrix W has shape=(2n, n).
Having said all that, the point I really want to make is this:
Don't underestimate that small cell; it is not a single neuron, but n hidden units.
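To make the shape bookkeeping concrete, here is a minimal NumPy sketch of one plain RNN step under the assumptions above (the sizes n=8 and batch_size=4 are made up for illustration):

import numpy as np

n = 8            # number of hidden units in the cell (made-up size)
batch_size = 4   # made-up batch size

x_t = np.random.randn(batch_size, n)      # input(t), shape (None, n)
h_prev = np.random.randn(batch_size, n)   # context(t-1), also shape (None, n)

# Concatenate input and context along the feature axis -> shape (None, 2n)
concat = np.concatenate([x_t, h_prev], axis=1)

# The cell's weight matrix therefore has shape (2n, n)
W = np.random.randn(2 * n, n)
b = np.zeros(n)

h_t = np.tanh(concat.dot(W) + b)          # output(t), back to shape (None, n)
print(concat.shape, W.shape, h_t.shape)   # (4, 16) (16, 8) (4, 8)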
Therefore, note that one of the parameters that must be provided when defining a cell (BasicRNNCell/BasicLSTMCell/GRUCell/RNNCell/LSTMCell) in TensorFlow is the hidden unit size (num_units).
(1) BasicLSTMCell
Inherits from: RNNCell. Aliases: class tf.contrib.rnn.BasicLSTMCell, class tf.nn.rnn_cell.BasicLSTMCell
Basic LSTM recurrent network cell.
We add forget_bias (default: 1) to the biases of the forget gate in order to reduce the scale of forgetting at the beginning of training.
It does not allow cell clipping, a projection layer, or peep-hole connections: it is the basic baseline.
For advanced models, please use the full tf.nn.rnn_cell.LSTMCell that follows.
Initialization: __init__

__init__(
    num_units,
    forget_bias=1.0,
    state_is_tuple=True,
    activation=None,
    reuse=None
)
Initialize the basic LSTM cell.
Args:
num_units: int, the number of units in the LSTM cell.
forget_bias: float, the bias added to the forget gates (see above). Must be set to 0.0 manually when restoring from CudnnLSTM-trained checkpoints.
state_is_tuple: if True, accepted and returned states are 2-tuples of the c_state and m_state. If False, they are concatenated along the column axis. The latter behavior will soon be deprecated.
activation: activation function of the inner states. Default: tanh.
reuse: (optional) Python boolean describing whether to reuse variables in an existing scope. If not True, and the existing scope already has the given variables, an error is raised.
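As a quick sanity check of these arguments, here is a small sketch (the hidden size 128 is an arbitrary choice for illustration) that builds a BasicLSTMCell and inspects its state and output sizes:

import tensorflow as tf

num_units = 128   # arbitrary hidden size for illustration

cell = tf.contrib.rnn.BasicLSTMCell(num_units,
                                    forget_bias=1.0,
                                    state_is_tuple=True)

# With state_is_tuple=True the state is an LSTMStateTuple(c, h),
# while the per-step output has size num_units.
print(cell.state_size)    # LSTMStateTuple(c=128, h=128)
print(cell.output_size)   # 128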
(2) What is state_is_tuple=True
As you can see, each LSTM cell produces two internal states at time t, c_t and h_t. Both of these states have to be recorded by TensorFlow; keep this in mind and the rest is easy to understand.
If state_is_tuple=True, then the states c_t and h_t mentioned above are recorded separately and returned in a tuple. If this parameter is not set or is set to False, the two states are concatenated along the columns and returned as [batch, 2n] (n is the number of hidden units). The official docs say the concatenated form is about to be deprecated, so we should all pass state_is_tuple=True when using an LSTM.
Because of TensorFlow version upgrades, state_is_tuple=True will become the default in later versions. For an LSTM, the state can be divided into (c_state, h_state).
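The difference is easy to see by asking each variant of the cell for a zero state; a minimal sketch, assuming made-up sizes batch_size=32 and n=64:

import tensorflow as tf

batch_size, n = 32, 64   # made-up sizes

cell_tuple  = tf.contrib.rnn.BasicLSTMCell(n, state_is_tuple=True)
cell_concat = tf.contrib.rnn.BasicLSTMCell(n, state_is_tuple=False)

s_tuple  = cell_tuple.zero_state(batch_size, tf.float32)
s_concat = cell_concat.zero_state(batch_size, tf.float32)

print(s_tuple.c.shape, s_tuple.h.shape)   # (32, 64) (32, 64): c and h kept separately
print(s_concat.shape)                     # (32, 128): c and h concatenated along columns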
(3) What is forget_bias
forget_bias (default value 1) is added to the bias of the forget gate in order to reduce the scale of forgetting at the beginning of training.

tf.nn.static_rnn
Aliases: tf.contrib.rnn.static_rnn, tf.nn.static_rnn
static_rnn(
    cell,
    inputs,
    initial_state=None,
    dtype=None,
    sequence_length=None,
    scope=None
)
Creates a recurrent neural network specified by RNNCell cell.
The simplest form of RNN network generated is:
state = cell.zero_state(...)
outputs = []
for input_ in inputs:
    output, state = cell(input_, state)
    outputs.append(output)
return (outputs, state)
Args:
cell: an instance of RNNCell.
inputs: a length T list of inputs, each a Tensor of shape [batch_size, input_size], or a nested tuple of such elements.
initial_state: (optional) an initial state for the RNN. If cell.state_size is an integer, this must be a Tensor of appropriate type and shape [batch_size, cell.state_size]. If cell.state_size is a tuple, this should be a tuple of tensors having shapes [batch_size, s] for s in cell.state_size.
dtype: (optional) the data type for the initial state and expected output. Required if initial_state is not provided or the RNN state has a heterogeneous dtype.
sequence_length: specifies the length of each sequence in inputs. An int32 or int64 vector (tensor) sized [batch_size], with values in [0, T).
scope: VariableScope for the created subgraph; defaults to "rnn".
Returns:
A pair (outputs, state) where: outputs is a length T list of outputs (one for each input), or a nested tuple of such elements; state is the final state.
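Putting the pieces together, here is a minimal static_rnn sketch; the sizes and placeholder names are made up for illustration:

import tensorflow as tf
from tensorflow.contrib import rnn

n_steps, n_inputs, n_hidden = 10, 8, 64   # made-up sizes

# static_rnn expects a length-T Python list of [batch_size, input_size] tensors.
x = tf.placeholder(tf.float32, [None, n_steps, n_inputs])
x_list = tf.unstack(x, n_steps, axis=1)   # list of n_steps tensors

cell = rnn.BasicLSTMCell(n_hidden, forget_bias=1.0)
outputs, state = rnn.static_rnn(cell, x_list, dtype=tf.float32)

print(len(outputs))        # n_steps, one output per time step
print(outputs[-1].shape)   # (?, 64), the last step's output

tf.nn.dynamic_rnn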
dynamic_rnn(
    cell,
    inputs,
    sequence_length=None,
    initial_state=None,
    dtype=None,
    parallel_iterations=None,
    swap_memory=False,
    time_major=False,
    scope=None
)
Args:
cell: an instance of RNNCell.
inputs: the RNN inputs. If time_major == False (default), this must be a Tensor of shape [batch_size, max_time, ...], or a nested tuple of such elements. If time_major == True, this must be a Tensor of shape [max_time, batch_size, ...], or a nested tuple of such elements. This may also be a (possibly nested) tuple of Tensors satisfying this property. The first two dimensions must match across all inputs, but otherwise the ranks and other shape components may differ. In this case, the input to cell at each time step will replicate the structure of these tuples, except for the time dimension (from which the time is taken). The input to cell at each time step will be a Tensor or (possibly nested) tuple of Tensors, each with dimensions [batch_size, ...].
sequence_length: (optional) an int32/int64 vector sized [batch_size]. Used to copy through the state and zero out the outputs when past a batch element's sequence length. So it is more for correctness than performance.
initial_state: (optional) an initial state for the RNN. If cell.state_size is an integer, this must be a Tensor of appropriate type and shape [batch_size, cell.state_size]. If cell.state_size is a tuple, this should be a tuple of tensors having shapes [batch_size, s] for s in cell.state_size.
dtype: (optional) the data type for the initial state and expected output. Required if initial_state is not provided or the RNN state has a heterogeneous dtype.
parallel_iterations: (default: 32). The number of iterations to run in parallel. Those operations which do not have any temporal dependency and can be run in parallel will be. This parameter trades off time for space. Values >> 1 use more memory but take less time, while smaller values use less memory but computations take longer.
swap_memory: transparently swap the tensors produced in forward inference but needed for backprop from GPU to CPU. This allows training RNNs which would typically not fit on a single GPU, with very minimal (or no) performance penalty.
time_major: the shape format of the inputs and outputs Tensors. If true, these Tensors must be shaped [max_time, batch_size, depth]. If false, these Tensors must be shaped [batch_size, max_time, depth]. Using time_major = True is a bit more efficient because it avoids transposes at the beginning and end of the RNN calculation. However, most TensorFlow data is batch-major, so by default this function accepts input and emits output in batch-major form.
scope: VariableScope for the created subgraph; defaults to "rnn".
Returns:
A pair (outputs, state) where:
outputs: the RNN output Tensor.
If time_major == False (default), this will be a Tensor shaped [batch_size, max_time, cell.output_size].
If time_major == True, this will be a Tensor shaped [max_time, batch_size, cell.output_size].
Note: if cell.output_size is a (possibly nested) tuple of integers or TensorShape objects, then outputs will be a tuple having the same structure as cell.output_size, containing Tensors whose shapes correspond to the shape data in cell.output_size.
state: the final state. If cell.state_size is an int, this will be shaped [batch_size, cell.state_size]. If it is a TensorShape, this will be shaped [batch_size] + cell.state_size. If it is a (possibly nested) tuple of ints or TensorShapes, this will be a tuple having the corresponding shapes. If the cells are LSTMCells, state will be a tuple containing an LSTMStateTuple for each cell.
If you use tf.nn.dynamic_rnn(cell, inputs), we need to determine the format of inputs. The time_major parameter of tf.nn.dynamic_rnn takes a different value for each format: if inputs is (batches, steps, inputs) ==> time_major=False; if inputs is (steps, batches, inputs) ==> time_major=True.
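As a small illustration of the two layouts (the sizes below are made up), the output shape follows the input layout:

import tensorflow as tf

batch_size, max_time, n_inputs, n_hidden = 32, 10, 8, 64   # made-up sizes

# Batch-major input: (batches, steps, inputs) ==> time_major=False (the default).
x_bm = tf.placeholder(tf.float32, [batch_size, max_time, n_inputs])
cell_bm = tf.nn.rnn_cell.BasicLSTMCell(n_hidden)
outputs_bm, _ = tf.nn.dynamic_rnn(cell_bm, x_bm, time_major=False,
                                  dtype=tf.float32, scope="batch_major")
print(outputs_bm.shape)   # (32, 10, 64) -> [batch_size, max_time, cell.output_size]

# Time-major input: (steps, batches, inputs) ==> time_major=True.
x_tm = tf.placeholder(tf.float32, [max_time, batch_size, n_inputs])
cell_tm = tf.nn.rnn_cell.BasicLSTMCell(n_hidden)
outputs_tm, _ = tf.nn.dynamic_rnn(cell_tm, x_tm, time_major=True,
                                  dtype=tf.float32, scope="time_major")
print(outputs_tm.shape)   # (10, 32, 64) -> [max_time, batch_size, cell.output_size]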
initial_state

# Use basic LSTM cell.
lstm_cell = tf.contrib.rnn.BasicLSTMCell(n_hidden_units, forget_bias=1.0, state_is_tuple=True)
# Initialize to an all-zero state

zero_state

zero_state(
    batch_size,
    dtype
)
Return zero-filled state tensor(s).
Args:
batch_size: int, float, or unit Tensor representing the batch size.
dtype: the data type to use for the state.
Returns:
If state_size is an int or TensorShape, then the return value is an N-D tensor of shape [batch_size x state_size] filled with zeros.
If state_size is a nested list or tuple, then the return value is a nested list or tuple (of the same structure) of 2-D tensors with the shapes [batch_size x s] for each s in state_size.
The difference between static_rnn and dynamic_rnn

batch_size = 128
n_inputs = 28          # MNIST data input (img shape: 28*28)
n_steps = 28           # time steps
n_hidden_units = 128   # neurons in hidden layer
n_classes = 10         # MNIST classes (0-9 digits)
The input is different:
static_rnn takes a list as input: a list of 'n_steps' tensors of shape (batch_size, n_inputs).
dynamic_rnn takes a tensor as input: a tensor of shape (batch_size, n_steps, n_inputs) or (n_steps, batch_size, n_inputs).
The output is different:
static_rnn outputs a list: a list of 'n_steps' tensors of shape (batch_size, n_hidden_units).
dynamic_rnn outputs a tensor: a tensor of shape (batch_size, n_steps, n_hidden_units) or (n_steps, batch_size, n_hidden_units).
dynamic_rnn can represent multiple layers.
dynamic_rnn allows the sequence length to differ from batch to batch, but static_rnn does not.
Sample static_rnn
"" "Recurrent neural network.
A Recurrent Neural Network (LSTM) Implementation example using TensorFlow Library. This example is using the Mnist database of handwritten digits (http://yann.lecun.com/exdb/mnist/) Links: [Long Term Memory] (http://deeplearning.cs.cmu.edu/pdfs/Hochreiter97_lstm.pdf) [Mnist Dataset] (http://yann.lecun.com/
exdb/mnist/). Author:aymeric Damien project:https://github.com/aymericdamien/tensorflow-examples/"" "from __future__ import Print_ Function Import TensorFlow as TF from tensorflow.contrib import RNN # import mnist data from Tensorflow.examples.tutoria Ls.mnist import Input_data mnist = Input_data.read_data_sets ("/tmp/data/", one_hot=true) ' to classify images using a R Ecurrent Neural Network, we consider every image row as a sequence of pixels.
Because Mnist image shape is 28*28px, we'll then handle sequences to steps for every sample. "# training Parameters learning_rate = 0.001 Training_steps = 10000 Batch_size = 128 diSplay_step = # network Parameters num_input = # mnist Data input (img shape:28*28) timesteps = # timesteps num _hidden = 128 # Hidden layer num of features num_classes = # mnist Total classes (0-9 digits) # tf Graph input X = tf. Placeholder ("float", [None, Timesteps, num_input]) Y = Tf.placeholder ("float", [None, Num_classes]) # Define weights Weig HTS = {' Out ': TF. Variable (Tf.random_normal ([Num_hidden, num_classes])} biases = {' Out ': TF. Variable (Tf.random_normal ([num_classes])} def RNN (x, Weights, biases): # Prepare Data shape to match ' RNN ' functi On requirements # Current Data input shape: (batch_size, Timesteps, n_input) # Required shape: ' timesteps ' tensors
List of shape (Batch_size, n_input) # Unstack to get a list of ' timesteps ' tensors of shape (batch_size, N_input) x = Tf.unstack (x, Timesteps, 1) # Another to Tranpose (Batch_size, Timesteps, N_input) # to a list of ' t Imesteps ' tensors of shape (batch_size,N_input) # Timesteps and Batch_size are exchanged, the matrix becomes [timesteps,batch_size,n_input] # x = tf.transpose (x, [1, 0, 2]) # and then becomes (Timesteps*batch_size, N_input), # This step can be linearly transformed into (Timesteps*batch_size, num_hidden) # x = Tf.reshape (x, [-1, N_inpu T]) # split into Timesteps (Batch_size, n_input) # x = Tf.split (0, Timesteps, x) # Define a lstm cell with Tensorflo W Lstm_cell = Rnn. Basiclstmcell (Num_hidden, forget_bias=1.0) # Initial state init_state = Lstm_cell.zero_state (Batch_size, DTYPE=TF . float32) # get lstm cell output outputs, states = Rnn.static_rnn (Lstm_cell, X, Initial_state=init_state, dtype=t F.FLOAT32) # Linear activation, using RNN inner loop last output return Tf.matmul (outputs[-1], weights[' out ']) + biases[' out '] logits = RNN (X, weights, biases) prediction = Tf.nn.softmax (logits) # Define loss and Optimizer loss_op = Tf.reduce_mean (Tf.nn.softmax_cross_entropy_with_logits (Logits=logits, labels=y)) optimizer = Tf.train.GradientDescentOpTimizer (learning_rate=learning_rate) train_op = Optimizer.minimize (loss_op) # Evaluate model (with the test logits, for Dropo UT to is disabled) correct_pred = tf.equal (Tf.argmax (prediction, 1), Tf.argmax (Y, 1)) accuracy = Tf.reduce_mean (Tf.cast (CO Rrect_pred, Tf.float32)) # Initialize the variables (i.e. assign their default value) init = Tf.global_variables_initiali Zer () # Start training with TF. Session () as Sess: # Run the initializer Sess.run (init) for step in range (1, training_steps+1): BATC h_x, batch_y = Mnist.train.next_batch (batch_size) # reshape data to get seq-elements batch_x = ba Tch_x.reshape ((Batch_size, Timesteps, Num_input)) # Run optimization op (backprop) Sess.run (train_op, Feed _dict={x:batch_x, y:batch_y}) if step% Display_step = 0 or step = 1: # Calculate Batch Loss and a
Ccuracy loss, acc = Sess.run ([Loss_op, accuracy], feed_dict={x:batch_x, Y:batch_y}) Print ("Step" + str (STEP) + ", Minibatch loss=" + \ ' {:. 4f} '. Format (loss) + ", training accuracy=" + \ "{:. 3f}". Format (ACC)) print ("Optimiza
tion finished! ") # Calculate Accuracy for 128 mnist test Images Test_len = 128 Test_data = Mnist.test.images[:test_len].reshape (-1 , Timesteps, num_input)) Test_label = Mnist.test.labels[:test_len] Print ("Testing accuracy:", \ Sess.run (A Ccuracy, Feed_dict={x:test_data, Y:test_label}))
Sample dynamic_rnn
# create a BasicRNNCell
rnn_cell = tf.nn.rnn_cell.BasicRNNCell(hidden_size)

# 'outputs' is a tensor of shape [batch_size, max_time, cell_state_size]

# defining initial state
initial_state = rnn_cell.zero_state(batch_size, dtype=tf.float32)

# 'state' is a tensor of shape [batch_size, cell_state_size]
outputs, state = tf.nn.dynamic_rnn(rnn_cell, input_data,
                                   initial_state=initial_state,
                                   dtype=tf.float32)

# create 2 LSTMCells
rnn_layers = [tf.nn.rnn_cell.LSTMCell(size) for size in [