Introduction to TensorFlow (V): Multi-layer LSTM, an easy-to-understand version


@author: Huangyongye
@creat_date: 2017-03-09

Preface: Based on my own experience implementing LSTM in TensorFlow, I found that although there are many tutorials online, most of them build on the official example that uses a multi-layer LSTM for the PTBModel language model, for example:
TensorFlow notes: multi-layer LSTM code analysis
These examples still feel too complex, so here is a relatively simple version. It is not elegant, but it is easy to understand.

If you want to understand how LSTM works (assuming you already understand ordinary RNNs), you can refer to my earlier translation:
Understanding LSTM Networks (by colah)

If you want to understand how RNNs work, you can refer to Andrej Karpathy's blog post:
The Unreasonable Effectiveness of Recurrent Neural Networks

Many readers have asked how to understand "multi-layer", so I drew a schematic diagram, hoping it helps beginners understand multi-layer RNNs better.


Figure 1: A 3-layer RNN unrolled over time steps
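To make the figure concrete, here is a minimal NumPy sketch of a 3-layer RNN unrolled over time. This is illustration only and not part of the TensorFlow model below: the cell is a plain tanh RNN rather than an LSTM, and all names such as W_x and W_h are made up. At each time step, layer 1 reads the input row, every higher layer reads the output of the layer below it at the same time step, and every layer also carries its own hidden state forward from the previous time step.

import numpy as np

input_size, hidden_size, layer_num, timestep_size = 28, 256, 3, 28

# One weight pair per layer: W_x maps the layer's input, W_h maps its previous hidden state
np.random.seed(0)
W_x = [np.random.randn(input_size if l == 0 else hidden_size, hidden_size) * 0.01
       for l in range(layer_num)]
W_h = [np.random.randn(hidden_size, hidden_size) * 0.01 for _ in range(layer_num)]
h = [np.zeros((1, hidden_size)) for _ in range(layer_num)]  # one hidden state per layer

x_seq = np.random.randn(timestep_size, 1, input_size)  # a toy input sequence

for t in range(timestep_size):
    layer_input = x_seq[t]
    for l in range(layer_num):
        # layer l at time t sees the output of layer l-1 at time t and its own state at time t-1
        h[l] = np.tanh(layer_input.dot(W_x[l]) + h[l].dot(W_h[l]))
        layer_input = h[l]

print(h[-1].shape)  # (1, 256): the top layer's output at the last time step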

This example does not cover the theory. Through it, you can learn how to implement a single-layer and a multi-layer LSTM, the format of the input and output data, and how to apply a dropout layer to an RNN.

# -*- coding: utf-8 -*-
import tensorflow as tf
import numpy as np
from tensorflow.contrib import rnn
from tensorflow.examples.tutorials.mnist import input_data

# Set GPU memory to grow on demand
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
sess = tf.Session(config=config)

# First import the data and take a look at its form
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)
print(mnist.train.images.shape)
Extracting MNIST_data/train-images-idx3-ubyte.gz
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz
(55000, 784)
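As a quick sanity check of the data format (a hypothetical snippet, assuming mnist has been loaded as above): each image is stored as a flattened 784-dimensional vector, and later we will treat it as 28 time steps with a 28-dimensional feature (one image row) per step.

img = mnist.train.images[0]   # shape (784,), one flattened image
seq = img.reshape(28, 28)     # 28 time steps, each with 28 pixel features
print(seq.shape)              # (28, 28)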
1. First, set all the parameters used in the model
lr = 1e-3
# We want to use a different batch_size during training and testing,
# so batch_size is defined as a placeholder
batch_size = tf.placeholder(tf.int32)  # note: the type must be tf.int32
# batch_size = 128

# The input feature at each time step is 28-dimensional, i.e. one row of 28 pixels is fed per step
input_size = 28
# The length of the time series is 28, i.e. each prediction needs 28 rows to be fed in
timestep_size = 28
# Number of nodes in each hidden layer
hidden_size = 256
# Number of LSTM layers
layer_num = 2
# Number of output classes; for regression prediction this would be 1
class_num = 10

_X = tf.placeholder(tf.float32, [None, 784])
y = tf.placeholder(tf.float32, [None, class_num])
keep_prob = tf.placeholder(tf.float32)
2. Start building the LSTM model; an ordinary RNN model would be built in the same way
# Restore the 784-dim input back to a 28 x 28 image
# The following steps are the key to implementing the RNN / LSTM
####################################################################
# **Step 1: RNN input shape = (batch_size, timestep_size, input_size)
X = tf.reshape(_X, [-1, 28, 28])

# **Step 2: Define a single lstm_cell; we only need to specify hidden_size,
#           and it automatically matches the dimension of the input X
lstm_cell = rnn.BasicLSTMCell(num_units=hidden_size, forget_bias=1.0, state_is_tuple=True)

# **Step 3: Add a dropout layer; usually only output_keep_prob is set
lstm_cell = rnn.DropoutWrapper(cell=lstm_cell, input_keep_prob=1.0, output_keep_prob=keep_prob)

# **Step 4: Call MultiRNNCell to implement the multi-layer LSTM
mlstm_cell = rnn.MultiRNNCell([lstm_cell] * layer_num, state_is_tuple=True)

# **Step 5: Initialize the state with zeros
init_state = mlstm_cell.zero_state(batch_size, dtype=tf.float32)

# **Step 6, method one: call dynamic_rnn() to run the network we have built
# ** When time_major == False, outputs.shape = [batch_size, timestep_size, hidden_size],
# ** so we can take h_state = outputs[:, -1, :] as the last output.
# ** state.shape = [layer_num, 2, batch_size, hidden_size],
# ** or we can take h_state = state[-1][1] as the last output.
# ** The final output dimension is [batch_size, hidden_size].
# outputs, state = tf.nn.dynamic_rnn(mlstm_cell, inputs=X, initial_state=init_state, time_major=False)
# h_state = outputs[:, -1, :]  # or h_state = state[-1][1]

# *************** To understand better how the LSTM works, we implement the
# dynamic_rnn() call of step 6 by hand ***************
# Reading the documentation you will find that RNNCell provides a __call__() method
# (see the appendix at the end), which we can use to unroll the LSTM over the time steps.
# **Step 6, method two: unroll by time step
outputs = list()
state = init_state
with tf.variable_scope('RNN'):
    for timestep in range(timestep_size):
        if timestep > 0:
            tf.get_variable_scope().reuse_variables()
        # state here keeps the state of every LSTM layer
        (cell_output, state) = mlstm_cell(X[:, timestep, :], state)
        outputs.append(cell_output)
h_state = outputs[-1]
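If you want to verify the shapes mentioned in the comments above, a small check (assuming the graph has just been built as shown) could look like this; the batch dimension shows up as ? because batch_size is a placeholder.

print(len(outputs))             # 28: one output tensor per time step
print(outputs[-1].get_shape())  # (?, 256), i.e. [batch_size, hidden_size]
print(h_state.get_shape())      # (?, 256): this is what the softmax layer will consume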
3. Set up the loss function and optimizer, then train and test the model
The following part is essentially the same as in the earlier post, Introduction to TensorFlow (III): multi-layer CNNs for MNIST classification.
# The LSTM part above outputs a tensor of shape [batch_size, hidden_size],
# and we still need a softmax layer to classify it.
# First define the softmax connection weights and bias
# out_W = tf.placeholder(tf.float32, [hidden_size, class_num], name='out_Weights')
# out_bias = tf.placeholder(tf.float32, [class_num], name='out_bias')
# Start training and testing
W = tf.Variable(tf.truncated_normal([hidden_size, class_num], stddev=0.1), dtype=tf.float32)
bias = tf.Variable(tf.constant(0.1, shape=[class_num]), dtype=tf.float32)
y_pre = tf.nn.softmax(tf.matmul(h_state, W) + bias)

# Loss and evaluation functions
cross_entropy = -tf.reduce_mean(y * tf.log(y_pre))
train_op = tf.train.AdamOptimizer(lr).minimize(cross_entropy)

correct_prediction = tf.equal(tf.argmax(y_pre, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))

sess.run(tf.global_variables_initializer())
for i in range(2000):
    _batch_size = 128
    batch = mnist.train.next_batch(_batch_size)
    if (i + 1) % 200 == 0:
        train_accuracy = sess.run(accuracy, feed_dict={
            _X: batch[0], y: batch[1], keep_prob: 1.0, batch_size: _batch_size})
        # mnist.train.epochs_completed: the number of epochs completed so far
        print "Iter%d, step %d, training accuracy %g" % (mnist.train.epochs_completed, (i + 1), train_accuracy)
    sess.run(train_op, feed_dict={
        _X: batch[0], y: batch[1], keep_prob: 0.5, batch_size: _batch_size})

# Calculate the accuracy on the test data
print "test accuracy %g" % sess.run(accuracy, feed_dict={
    _X: mnist.test.images, y: mnist.test.labels, keep_prob: 1.0,
    batch_size: mnist.test.images.shape[0]})
Iter0, step 200, training accuracy 0.851562
Iter0, step 400, training accuracy 0.960938
Iter1, step 600, training accuracy 0.984375
Iter1, step 800, training accuracy 0.960938
Iter2, step 1000, training accuracy 0.984375
Iter2, step 1200, training accuracy 0.9375
Iter3, step 1400, training accuracy 0.96875
Iter3, step 1600, training accuracy 0.984375
Iter4, step 1800, training accuracy 0.992188
Iter4, step 2000, training accuracy 0.984375
test accuracy 0.9858
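Because batch_size was declared as a placeholder, the trained graph can also be fed with a completely different batch size. As a usage sketch (hypothetical variable names, assuming the session trained above is still open), here is how you could classify a single test image:

single_img = mnist.test.images[5:6]   # keep the batch dimension: shape (1, 784)
pred = sess.run(y_pre, feed_dict={_X: single_img, keep_prob: 1.0, batch_size: 1})
print(pred.argmax(axis=1))            # predicted digit for this one image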
