Sesame HTTP: MNIST Classification with TensorFlow LSTM

Source: Internet
Author: User

This section describes how to do MNIST classification with an RNN (LSTM). Compared with a CNN, an RNN may be slower, but it can save memory.

Initialization

First we initialize some variables, such as the learning rate, the number of hidden units, the number of RNN layers, and so on:

learning_rate = 1e-3
category_num = 10
keep_prob = tf.placeholder(tf.float32, [])
batch_size = tf.placeholder(tf.int32, [])
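
The remaining hyperparameters referenced later in this article (the number of hidden units, the number of RNN layers, the per-step input size, and the training schedule) are not fully legible in the source, so the following is a minimal sketch with assumed example values:

num_units = 256            # hidden units per LSTM layer (assumed example value)
num_layer = 3              # three stacked RNN layers, as described below
time_step = 28             # one image row per time step
input_size = 28            # 28 pixels per row
total_steps = 2000         # number of training steps (assumed example value)
steps_per_validate = 100   # how often to print train accuracy (assumed)
steps_per_test = 500       # how often to print test accuracy (assumed)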

Then you need to declare the MNIST data generator:

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets('mnist_data/', one_hot=True)
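
As a quick sanity check (not part of the original snippet), the loaded dataset can be inspected like this; with one_hot=True each label is a 10-dimensional vector:

print(mnist.train.images.shape)   # (55000, 784): flattened 28 x 28 images
print(mnist.train.labels.shape)   # (55000, 10): one-hot labels
print(mnist.test.images.shape)    # (10000, 784)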

Next we declare the input data in the usual way. The input is represented by x, and the label data by y_label:

x = tf.placeholder(tf.float32, [None, 784])
y_label = tf.placeholder(tf.float32, [None, 10])

The shape of x here is [None, 784]: None means the batch size is not fixed, and 784 is the input dimension. y_label is declared in the same way, with 10 output categories.

Next we need to reshape the input x. To build an RNN sequence we split each image into multiple time steps, so time_step is set to 28 and input_size becomes 28, while the batch size stays unchanged. The result of the reshape is therefore a three-dimensional tensor:

x_shape = tf.reshape(x, [-1, time_step, input_size])
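
To make the reshape concrete, here is a small NumPy illustration (not in the original article): a batch of flattened images becomes a sequence of 28 rows, and each row is fed to the RNN as one time step.

import numpy as np

batch = np.zeros((64, 784))           # a hypothetical batch of 64 flattened images
sequence = batch.reshape(-1, 28, 28)  # -> (64, 28, 28)
print(sequence.shape)                 # 64 samples, 28 time steps, 28 features per step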

RNN Layer

Next we build the RNN model. The RNN cell used here is an LSTM cell, and we want a three-layer RNN, so we also need MultiRNNCell, whose input argument is a list of LSTM cells.

So we first declare a method for creating an LSTM cell, as follows:

def cell(num_units):
    cell = tf.nn.rnn_cell.BasicLSTMCell(num_units=num_units)
    return tf.nn.rnn_cell.DropoutWrapper(cell, output_keep_prob=keep_prob)

Dropout is also added to reduce overfitting during the training process.

Next we'll use it to build the multi-layer RNN:

cells = tf.nn.rnn_cell.MultiRNNCell([cell(num_units) for _ in range(num_layer)])

Note that a for loop is used here, and each iteration creates a new LSTM cell, instead of extending the list by multiplication. Multiplying the list would make every layer refer to the same cell object, which causes a dimension mismatch once MultiRNNCell is built, as shown below.
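
To make the pitfall explicit, here is a short sketch (the names bad_cells and good_cells are illustrative) contrasting the two constructions:

# Wrong: list multiplication repeats the same cell object, so its weights are built
# for the first layer's input size and the later layers' inputs no longer fit
shared = cell(num_units)
bad_cells = tf.nn.rnn_cell.MultiRNNCell([shared] * num_layer)

# Right: calling cell() once per layer gives each layer its own cell and weights
good_cells = tf.nn.rnn_cell.MultiRNNCell([cell(num_units) for _ in range(num_layer)])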

Next we need to declare an initial state:

h0 = cells.zero_state(batch_size, dtype=tf.float32)

Then we call the dynamic_rnn() method to build the model:

output, hs = tf.nn.dynamic_rnn(cells, inputs=x_shape, initial_state=h0)

Here inputs is the result of reshaping x, and the initial state is passed in via initial_state. There are two return values. The first, output, contains the outputs of all time steps; it is three-dimensional, with the first dimension equal to batch_size, the second equal to time_step, and the third equal to num_units. The other return value, hs, is the hidden state in tuple form; its length equals the number of RNN layers (3), and each element contains c and h, the two hidden states of the LSTM.
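
To make those shapes concrete, here is a small sketch (assuming the example value num_units = 256 from the initialization above) that prints the static shapes:

print(output.get_shape().as_list())    # [None, 28, 256]: one output per time step
print(len(hs))                         # 3: one LSTMStateTuple per RNN layer
print(hs[-1].c.get_shape().as_list())  # [None, 256]: cell state c of the top layer
print(hs[-1].h.get_shape().as_list())  # [None, 256]: hidden state h of the top layer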

Here the final classification result can be taken from the output of the last time step, so we can use:

output = output[:, -1, :]

Or we can just take h, the hidden state of the last layer, which gives the same result:

h = hs[-1].h

In this model the two are equivalent. Note, however, that in text processing the sequences may be padded to equal length, which can make the two differ.
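
As a quick check (a sketch, not in the original), the two tensors can be compared directly. With keep_prob fed as 1 the comparison evaluates to True; during training, the DropoutWrapper applies dropout to the layer outputs, so the last-step output is the dropped-out version of hs[-1].h.

# output was already sliced to the last time step above
last_output_equals_h = tf.reduce_all(tf.equal(output, hs[-1].h))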

Output Layer

Next we apply a linear transformation and compute the softmax cross-entropy loss:

w = tf.Variable(tf.truncated_normal([num_units, category_num], stddev=0.1), dtype=tf.float32)
b = tf.Variable(tf.constant(0.1, shape=[category_num]), dtype=tf.float32)
y = tf.matmul(output, w) + b
cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y_label, logits=y))

Here the loss directly uses softmax_cross_entropy_with_logits, which first computes the softmax and then the cross entropy.
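
As an illustration (not from the original article), it is conceptually equivalent to, though numerically more stable than, computing the softmax and the cross entropy by hand:

probs = tf.nn.softmax(y)
manual_cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_label * tf.log(probs), axis=1))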

Training and Evaluation

Finally we define the training and evaluation process, printing the train accuracy and test accuracy at regular intervals during training:

# Train op
train = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cross_entropy)

# Prediction
correction_prediction = tf.equal(tf.argmax(y, axis=1), tf.argmax(y_label, axis=1))
accuracy = tf.reduce_mean(tf.cast(correction_prediction, tf.float32))

# Train
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for step in range(total_steps + 1):
        batch_x, batch_y = mnist.train.next_batch(100)  # fetch a mini-batch (100 is an example value)
        sess.run(train, feed_dict={x: batch_x, y_label: batch_y, keep_prob: 0.5,
                                   batch_size: batch_x.shape[0]})
        # Train accuracy
        if step % steps_per_validate == 0:
            print('Train', step, sess.run(accuracy, feed_dict={x: batch_x, y_label: batch_y,
                                                               keep_prob: 0.5,
                                                               batch_size: batch_x.shape[0]}))
        # Test accuracy
        if step % steps_per_test == 0:
            test_x, test_y = mnist.test.images, mnist.test.labels
            print('Test', step, sess.run(accuracy, feed_dict={x: test_x, y_label: test_y,
                                                              keep_prob: 1,
                                                              batch_size: test_x.shape[0]}))

Running it directly, only a few rounds of training are needed to reach about 98% accuracy:

Train 0 0.27
Test 0 0.2223
Train … 0.87
Train … 0.91
Train … 0.94
Train … 0.94
Train … 0.99
Test … 0.9595
Train … 0.95
Train … 0.97
Train … 0.98

It can be seen that the LSTM is quite effective for the MNIST character classification task.
