Sesame HTTP: MNIST Classification with TensorFlow LSTM

Source: Internet
Author: User

This section describes how to do MNIST classification with an RNN (LSTM). Compared with a CNN, an RNN may be slower, but it can save memory.

Initialization

First we initialize some variables, such as the learning rate, the number of hidden units, the number of RNN layers, and so on:

learning_rate = 1e-3
category_num = 10
keep_prob = tf.placeholder(tf.float32, [])
batch_size = tf.placeholder(tf.int32, [])
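
The remaining hyperparameters referenced later in this article (the number of hidden units, the number of RNN layers, the per-step input size, and the training schedule) are not fully legible in the source, so the following is a minimal sketch with assumed example values:

num_units = 256            # hidden units per LSTM layer (assumed example value)
num_layer = 3              # three stacked RNN layers, as described below
time_step = 28             # one image row per time step
input_size = 28            # 28 pixels per row
total_steps = 2000         # number of training steps (assumed example value)
steps_per_validate = 100   # how often to print train accuracy (assumed)
steps_per_test = 500       # how often to print test accuracy (assumed)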

Then you need to declare the MNIST data generator:

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets('mnist_data/', one_hot=True)
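
As a quick sanity check (not part of the original snippet), the loaded dataset can be inspected like this; with one_hot=True each label is a 10-dimensional vector:

print(mnist.train.images.shape)   # (55000, 784): flattened 28 x 28 images
print(mnist.train.labels.shape)   # (55000, 10): one-hot labels
print(mnist.test.images.shape)    # (10000, 784)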

Next we declare the input data in the usual way. The input is represented by x, and the label data by y_label:

x = tf.placeholder(tf.float32, [None, 784])
y_label = tf.placeholder(tf.float32, [None, 10])

The shape of x here is [None, 784]: None means the batch size is not fixed, and 784 is the input dimension. y_label is declared in the same way, with 10 output categories.

Next we need to reshape the input x. To build an RNN sequence we split each image into multiple time steps, so time_step is set to 28 and input_size becomes 28, while the batch size stays unchanged. The result of the reshape is therefore a three-dimensional tensor:

x_shape = tf.reshape(x, [-1, time_step, input_size])
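
To make the reshape concrete, here is a small NumPy illustration (not in the original article): a batch of flattened images becomes a sequence of 28 rows, and each row is fed to the RNN as one time step.

import numpy as np

batch = np.zeros((64, 784))           # a hypothetical batch of 64 flattened images
sequence = batch.reshape(-1, 28, 28)  # -> (64, 28, 28)
print(sequence.shape)                 # 64 samples, 28 time steps, 28 features per step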

RNN Layer

Next we build the RNN model. The RNN cell used here is an LSTM cell, and we want a three-layer RNN, so we also need MultiRNNCell, whose input argument is a list of LSTM cells.

So we first declare a method for creating an LSTM cell, as follows:

def cell(num_units):
    cell = tf.nn.rnn_cell.BasicLSTMCell(num_units=num_units)
    return tf.nn.rnn_cell.DropoutWrapper(cell, output_keep_prob=keep_prob)

Dropout is also added to reduce overfitting during the training process.

Next we'll use it to build the multi-layer RNN:

cells = tf.nn.rnn_cell.MultiRNNCell([cell(num_units) for _ in range(num_layer)])

Note that a for loop is used here, and each iteration creates a new LSTM cell, instead of extending the list by multiplication. Multiplying the list would make every layer refer to the same cell object, which causes a dimension mismatch once MultiRNNCell is built, as shown below.
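
To make the pitfall explicit, here is a short sketch (the names bad_cells and good_cells are illustrative) contrasting the two constructions:

# Wrong: list multiplication repeats the same cell object, so its weights are built
# for the first layer's input size and the later layers' inputs no longer fit
shared = cell(num_units)
bad_cells = tf.nn.rnn_cell.MultiRNNCell([shared] * num_layer)

# Right: calling cell() once per layer gives each layer its own cell and weights
good_cells = tf.nn.rnn_cell.MultiRNNCell([cell(num_units) for _ in range(num_layer)])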

Next we need to declare an initial state:

h0 = cells.zero_state(batch_size, dtype=tf.float32)

Then we call the dynamic_rnn() method to build the model:

output, hs = tf.nn.dynamic_rnn(cells, inputs=x_shape, initial_state=h0)

Here inputs is the result of reshaping x, and the initial state is passed in via initial_state. There are two return values. The first, output, contains the outputs of all time steps; it is three-dimensional, with the first dimension equal to batch_size, the second equal to time_step, and the third equal to num_units. The other return value, hs, is the hidden state in tuple form; its length equals the number of RNN layers (3), and each element contains c and h, the two hidden states of the LSTM.
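
To make those shapes concrete, here is a small sketch (assuming the example value num_units = 256 from the initialization above) that prints the static shapes:

print(output.get_shape().as_list())    # [None, 28, 256]: one output per time step
print(len(hs))                         # 3: one LSTMStateTuple per RNN layer
print(hs[-1].c.get_shape().as_list())  # [None, 256]: cell state c of the top layer
print(hs[-1].h.get_shape().as_list())  # [None, 256]: hidden state h of the top layer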

Here the final classification result can be taken from the output of the last time step, so we can use:

output = output[:, -1, :]

Or we can just take h, the hidden state of the last layer, which gives the same result:

h = hs[-1].h

In this model the two are equivalent. Note, however, that in text processing the sequences may be padded to equal length, which can make the two differ.
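
As a quick check (a sketch, not in the original), the two tensors can be compared directly. With keep_prob fed as 1 the comparison evaluates to True; during training, the DropoutWrapper applies dropout to the layer outputs, so the last-step output is the dropped-out version of hs[-1].h.

# output was already sliced to the last time step above
last_output_equals_h = tf.reduce_all(tf.equal(output, hs[-1].h))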

Output Layer

Next we apply a linear transformation and compute the softmax cross-entropy loss:

w = tf.Variable(tf.truncated_normal([num_units, category_num], stddev=0.1), dtype=tf.float32)
b = tf.Variable(tf.constant(0.1, shape=[category_num]), dtype=tf.float32)
y = tf.matmul(output, w) + b
cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y_label, logits=y))

Here the loss directly uses softmax_cross_entropy_with_logits, which first computes the softmax and then the cross entropy.
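
As an illustration (not from the original article), it is conceptually equivalent to, though numerically more stable than, computing the softmax and the cross entropy by hand:

probs = tf.nn.softmax(y)
manual_cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_label * tf.log(probs), axis=1))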

Training and Evaluation

Finally we define the training and evaluation process, printing the train accuracy and test accuracy at regular intervals during training:

# Train op
train = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cross_entropy)

# Prediction
correction_prediction = tf.equal(tf.argmax(y, axis=1), tf.argmax(y_label, axis=1))
accuracy = tf.reduce_mean(tf.cast(correction_prediction, tf.float32))

# Train
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for step in range(total_steps + 1):
        batch_x, batch_y = mnist.train.next_batch(100)  # fetch a mini-batch (100 is an example value)
        sess.run(train, feed_dict={x: batch_x, y_label: batch_y, keep_prob: 0.5,
                                   batch_size: batch_x.shape[0]})
        # Train accuracy
        if step % steps_per_validate == 0:
            print('Train', step, sess.run(accuracy, feed_dict={x: batch_x, y_label: batch_y,
                                                               keep_prob: 0.5,
                                                               batch_size: batch_x.shape[0]}))
        # Test accuracy
        if step % steps_per_test == 0:
            test_x, test_y = mnist.test.images, mnist.test.labels
            print('Test', step, sess.run(accuracy, feed_dict={x: test_x, y_label: test_y,
                                                              keep_prob: 1,
                                                              batch_size: test_x.shape[0]}))

Running it directly, only a few rounds of training are needed to reach about 98% accuracy:

Train 0 0.27
Test 0 0.2223
Train … 0.87
Train … 0.91
Train … 0.94
Train … 0.94
Train … 0.99
Test … 0.9595
Train … 0.95
Train … 0.97
Train … 0.98

It can be seen that the LSTM is quite effective for the MNIST character classification task.
