Implementing a Softmax Regression Model with TensorFlow

Source: Internet
Author: User

I. Overview and complete code

TensorFlow ships with a loader for MNIST (Modified National Institute of Standards and Technology database), a very simple machine-vision dataset, and can read the data directly into the format the model expects. This program uses softmax regression to train a classifier for handwritten digit recognition.

First look at the complete code:

    import tensorflow as tf
    from tensorflow.examples.tutorials.mnist import input_data

    # Load MNIST with one-hot encoded labels and print the shape of each split.
    mnist = input_data.read_data_sets("MNIST_data", one_hot=True)
    print(mnist.train.images.shape, mnist.train.labels.shape)
    print(mnist.test.images.shape, mnist.test.labels.shape)
    print(mnist.validation.images.shape, mnist.validation.labels.shape)

    # Build the computation graph.
    x = tf.placeholder(tf.float32, [None, 784])
    W = tf.Variable(tf.zeros([784, 10]))
    b = tf.Variable(tf.zeros([10]))
    y = tf.nn.softmax(tf.matmul(x, W) + b)
    y_ = tf.placeholder(tf.float32, [None, 10])
    cross_entropy = tf.reduce_mean(
        -tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))
    train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)

    # Launch the graph in a session.
    sess = tf.InteractiveSession()            # create an InteractiveSession object
    tf.global_variables_initializer().run()   # initialize all variables
    for i in range(1000):
        batch_xs, batch_ys = mnist.train.next_batch(100)
        train_step.run({x: batch_xs, y_: batch_ys})

    # Test and validation phase.
    # Take the index of the maximum value of y and y_ along axis 1
    # and check whether the two are equal.
    correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
    # Cast the bool tensor to float32 and average it to obtain the accuracy.
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    print(accuracy.eval({x: mnist.test.images, y_: mnist.test.labels}))
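The listing relies on two data conventions: each 28x28 image is flattened into a 784-element vector, and one_hot=True turns each digit label into a 10-element vector with a single 1. The following is a minimal plain-Python sketch of both ideas. The helper name one_hot and the blank image are illustrative assumptions, not part of the TensorFlow loader.

```python
def one_hot(label, num_classes=10):
    """Return a list of zeros with a single 1.0 at position `label`."""
    if not 0 <= label < num_classes:
        raise ValueError("label out of range")
    vec = [0.0] * num_classes
    vec[label] = 1.0
    return vec


# A hypothetical blank 28x28 image, flattened row by row into one vector.
image = [[0.0] * 28 for _ in range(28)]
flat = [pixel for row in image for pixel in row]

encoded = one_hot(3)
print(len(flat))  # 784
print(encoded)    # [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
```

This is why the placeholders in the listing have shapes [None, 784] for the images and [None, 10] for the labels.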

II. Detailed explanation

First, let's take a look at the core steps of using TensorFlow for algorithm design training.

1. Define the algorithm formula, that is, the forward computation of the neural network;

2. Define the loss, select the optimizer, and specify the optimizer to optimize the loss;

3. Iteratively train the model on the training set;

4. Evaluate the accuracy of the trained model on the test set or validation set.

First, create a placeholder, which is where the input data enters the graph. The first argument of tf.placeholder is the data type (dtype) and the second is the shape of the tensor; [None, 784] means any number of rows, each a 784-dimensional vector.

Next, create Variable objects for the weights (W) and biases (b) of the softmax regression model. Unlike the tensors that carry data, a Variable persists across training iterations and is updated in each one. A Variable can be initialized with constants or with random values; here both are initialized to zeros.

Then implement the model itself, y = softmax(Wx + b). In TensorFlow this takes a single line of code: tf.nn contains a large number of neural network components, and tf.matmul is the matrix multiplication function. TensorFlow implements the forward and backward passes automatically. Once the loss is defined, it computes the gradients and applies gradient descent during training, so the model parameters are learned automatically.

Define a loss function to measure how far the model's predictions are from the true labels. Cross-entropy is the usual choice for multiclass classification. First define a placeholder, y_, for the true labels; tf.reduce_sum then sums over the classes of each sample and tf.reduce_mean averages over the samples in the batch.

With the cross-entropy loss in place, define an optimization algorithm to start training. We use stochastic gradient descent (SGD) with a learning rate of 0.5. Once it is defined, TensorFlow automatically adds the operations needed for backpropagation and gradient descent to the graph. What we get back is an encapsulated training operation; we only need to feed it data on each iteration.
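To make the arithmetic behind softmax, cross-entropy, and one gradient descent step concrete, here is a small plain-Python sketch for a single sample. It is not how TensorFlow implements these operations. The toy sizes (2 features, 3 classes) and all values are assumptions chosen for illustration, and the gradient uses the standard result that, for softmax combined with cross-entropy, the derivative of the loss with respect to each logit is the prediction minus the label.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, one_hot_label):
    """Compute -sum(y_ * log(y)) for one sample."""
    return -sum(t * math.log(p) for p, t in zip(probs, one_hot_label))


def forward(x, W, b):
    """Compute softmax(xW + b) for one input row x."""
    logits = [sum(x[i] * W[i][j] for i in range(len(x))) + b[j]
              for j in range(len(b))]
    return softmax(logits)


# Toy problem (hypothetical values): 2 input features, 3 classes.
x = [1.0, 2.0]
label = [0.0, 0.0, 1.0]
W = [[0.0] * 3 for _ in range(2)]  # zero-initialized, as in the article
b = [0.0] * 3
learning_rate = 0.5

y = forward(x, W, b)
loss_before = cross_entropy(y, label)

# Gradient of the loss with respect to each logit: y_j - label_j.
grad_logits = [y[j] - label[j] for j in range(3)]
for j in range(3):
    b[j] -= learning_rate * grad_logits[j]
    for i in range(2):
        W[i][j] -= learning_rate * x[i] * grad_logits[j]

loss_after = cross_entropy(forward(x, W, b), label)
print(loss_before, loss_after)  # the loss drops after one update
```

With zero-initialized parameters every class starts with probability 1/3, so the initial loss equals ln 3; a single update already moves most of the probability onto the correct class.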

The graph can be launched only after the construction phase is complete. The first step is to create a Session or InteractiveSession object. If no arguments are passed, the Session constructor launches the default graph. When an InteractiveSession is created, it registers itself as the default session, and subsequent operations run in it by default. Data and operations in different sessions are independent of one another. Next, call the run method of TensorFlow's global variable initializer, tf.global_variables_initializer(), directly. (This initializer is a newer addition to the API, present in version 1.0.0; it fails in version 0.10.0, where tf.initialize_all_variables() plays the same role.)

At this point, everything defined above is just a computation graph. Executing the graph-construction code does not perform any computation. Computation happens only when a run method is called and data is fed in.
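The split between building a graph and running it can be illustrated with a toy example in plain Python. This is only a conceptual sketch under simplifying assumptions: the Node class and the placeholder, add, mul, and run helpers are hypothetical and are not TensorFlow's implementation.

```python
class Node:
    """A toy graph node: records an operation and its inputs, computes nothing."""

    def __init__(self, op, inputs=(), name=None):
        self.op = op
        self.inputs = inputs
        self.name = name


def placeholder(name):
    return Node("placeholder", name=name)


def add(a, b):
    return Node("add", (a, b))


def mul(a, b):
    return Node("mul", (a, b))


def run(node, feed_dict):
    """Evaluate `node` recursively, reading placeholder values from feed_dict."""
    if node.op == "placeholder":
        return feed_dict[node.name]
    left, right = (run(n, feed_dict) for n in node.inputs)
    return left + right if node.op == "add" else left * right


# Construction phase: only a description of the computation is built.
a = placeholder("a")
b = placeholder("b")
out = add(mul(a, b), b)  # represents a * b + b; nothing is computed yet

# Execution phase: values are fed in and the result is computed.
result = run(out, {"a": 3.0, "b": 4.0})
print(result)  # 16.0
```

The same graph object can be evaluated again with different fed values, which mirrors how one train_step op is run repeatedly with a new mini-batch each time.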

Next, iteratively execute the training operation train_step. On each iteration, 100 samples are drawn at random from the training set to form a mini-batch, which is fed to the placeholders.
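Drawing a mini-batch can be sketched in plain Python as follows. The function sample_batch and the stand-in data are assumptions for illustration; the sketch simply picks distinct random indices and does not reproduce the internals of mnist.train.next_batch.

```python
import random


def sample_batch(images, labels, batch_size, rng):
    """Pick `batch_size` distinct sample indices at random and return
    the matching images and labels as two parallel lists."""
    indices = rng.sample(range(len(images)), batch_size)
    return [images[i] for i in indices], [labels[i] for i in indices]


# Hypothetical stand-in data: 1000 "images" of 4 pixels each.
rng = random.Random(0)
images = [[float(i)] * 4 for i in range(1000)]
labels = [i % 10 for i in range(1000)]

batch_xs, batch_ys = sample_batch(images, labels, 100, rng)
print(len(batch_xs), len(batch_ys))  # 100 100
```

Using the same indices for both lists keeps each image paired with its own label, which is the property the feed dictionary depends on.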

Once training is finished, you can measure the accuracy of the model. For each test sample, compare the index of the maximum value in y with that in y_, cast the boolean results to a float32 tensor, and take the mean to obtain the accuracy. Over several runs, the accuracy on the test set is about 92%, which is still a fairly good result for such a simple model.
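The accuracy calculation reduces to three small steps: take the argmax of each row, compare, and average. A plain-Python sketch on four made-up samples with three classes (all values are illustrative assumptions):

```python
def argmax(values):
    """Index of the largest element (the first one wins on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


def accuracy(predictions, one_hot_labels):
    """Fraction of rows where predicted and true class indices agree."""
    correct = [argmax(p) == argmax(t)
               for p, t in zip(predictions, one_hot_labels)]
    as_float = [1.0 if c else 0.0 for c in correct]  # bool -> float
    return sum(as_float) / len(as_float)             # mean


predictions = [[0.1, 0.7, 0.2],
               [0.8, 0.1, 0.1],
               [0.3, 0.3, 0.4],
               [0.2, 0.5, 0.3]]
labels = [[0, 1, 0],
          [1, 0, 0],
          [0, 1, 0],
          [0, 1, 0]]

acc = accuracy(predictions, labels)
print(acc)  # 0.75
```

Three of the four predicted classes match the labels, so the mean of the 0/1 values is 0.75.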

III. Additional notes

1. The Session and InteractiveSession classes

Given product = tf.matmul(matrix1, matrix2), call the session's run() method to execute the matrix multiplication op, passing product as the argument. Here product represents the output of the matrix multiplication op, so passing it to run() indicates that we want to fetch that output. The whole execution is automatic: the session supplies all the inputs each op requires, and ops are usually executed concurrently. The call run(product) triggers the execution of three ops in the graph (two constant ops and one matrix multiplication op). The returned result is a NumPy ndarray object.

When you are finished with a Session object, close it with sess.close() to release its resources. Instead of calling close explicitly, you can use a with block, which closes the session automatically.

    with tf.Session() as sess:
        result = sess.run([product])
        print(result)
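The automatic closing comes from Python's context manager protocol rather than anything specific to TensorFlow. The sketch below uses a hypothetical ToySession class (an assumption for illustration, not TensorFlow's Session) to show that close is called when the with block exits.

```python
class ToySession:
    """Hypothetical stand-in for a session that owns a resource."""

    def __init__(self):
        self.closed = False

    def run(self, fn):
        if self.closed:
            raise RuntimeError("session is closed")
        return fn()

    def close(self):
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.close()   # runs on normal exit and when an exception is raised
        return False   # do not swallow exceptions


with ToySession() as sess:
    result = sess.run(lambda: 2 * 3)

print(result, sess.closed)  # 6 True
```

Because __exit__ also runs when the block raises, the with form releases the resource in cases where a manual close call would be skipped.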

For convenience in interactive Python environments such as IPython, you can use InteractiveSession in place of Session, and Tensor.eval() and Operation.run() in place of Session.run(). This avoids having to keep the session in a variable.

    # Enter an interactive TensorFlow session.
    import tensorflow as tf
    sess = tf.InteractiveSession()

    x = tf.Variable([1.0, 2.0])
    a = tf.constant([3.0, 3.0])

    # Initialize 'x' using the run() method of its initializer op.
    x.initializer.run()

    # Add an op that subtracts 'a' from 'x', then run it and print the result.
    # (tf.sub was renamed tf.subtract in TensorFlow 1.0.)
    sub = tf.subtract(x, a)
    print(sub.eval())
    # ==> [-2. -1.]

2. tf.reduce_sum

tf.reduce_xxx is a family of operations that perform a mathematical computation on a tensor while reducing it along one or more dimensions.

tf.reduce_sum(input_tensor, reduction_indices=None, keep_dims=False, name=None)

What it does: computes the sum of the elements of input_tensor along the dimensions given in reduction_indices. Unless keep_dims is True, the rank of the tensor is reduced by 1 for each entry in reduction_indices; if keep_dims is True, the reduced dimensions are retained with length 1. If reduction_indices is not given, all dimensions are reduced and a tensor with a single element is returned. The operation returns the reduced tensor.

DEMO code:

    # 'x' is [[1, 1, 1]
    #         [1, 1, 1]]
    tf.reduce_sum(x) ==> 6
    tf.reduce_sum(x, 0) ==> [2, 2, 2]
    tf.reduce_sum(x, 1) ==> [3, 3]
    tf.reduce_sum(x, 1, keep_dims=True) ==> [[3], [3]]
    tf.reduce_sum(x, [0, 1]) ==> 6
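The demo results can be reproduced without TensorFlow. The following plain-Python sketch mimics the behavior for a 2-D list of lists only; the function reduce_sum_2d is a hypothetical helper written for this illustration, not a TensorFlow API.

```python
def reduce_sum_2d(x, axis=None, keep_dims=False):
    """Sum a list-of-lists matrix over all elements (axis=None),
    down the columns (axis=0), or across each row (axis=1)."""
    if axis is None:
        total = sum(sum(row) for row in x)
        return [[total]] if keep_dims else total
    if axis == 0:
        sums = [sum(col) for col in zip(*x)]
        return [sums] if keep_dims else sums
    if axis == 1:
        sums = [sum(row) for row in x]
        return [[s] for s in sums] if keep_dims else sums
    raise ValueError("axis must be None, 0 or 1")


x = [[1, 1, 1],
     [1, 1, 1]]
print(reduce_sum_2d(x))                     # 6
print(reduce_sum_2d(x, 0))                  # [2, 2, 2]
print(reduce_sum_2d(x, 1))                  # [3, 3]
print(reduce_sum_2d(x, 1, keep_dims=True))  # [[3], [3]]
```

Reducing along axis 0 collapses the rows and leaves one value per column; reducing along axis 1 collapses the columns and leaves one value per row, which is the form used for the per-sample cross-entropy sum.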

3. tf.reduce_mean

tf.reduce_mean(input_tensor, reduction_indices=None, keep_dims=False, name=None)

What it does: computes the mean of the elements of input_tensor along the dimensions given in reduction_indices. Unless keep_dims is True, the rank of the tensor is reduced by 1 for each entry in reduction_indices; if keep_dims is True, the reduced dimensions are retained with length 1. If reduction_indices is not given, all dimensions are reduced and a tensor with a single element is returned. The operation returns the reduced tensor.

DEMO code:

    # 'x' is [[1., 1.]
    #         [2., 2.]]
    tf.reduce_mean(x) ==> 1.5
    tf.reduce_mean(x, 0) ==> [1.5, 1.5]
    tf.reduce_mean(x, 1) ==> [1., 2.]

4. tf.argmax

tf.argmax(input, dimension, name=None)

Operation Function: returns the index of the maximum value of input in the specified dimension. The return type is int64.
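The effect of the dimension argument is easiest to see on a small matrix. Below is a plain-Python sketch for 2-D lists; argmax_2d is a hypothetical helper written for this illustration, and the matrix values are made up.

```python
def argmax_2d(x, dimension):
    """Indices of the maximum values of a list-of-lists matrix.

    dimension=0 scans down each column and returns one row index per column;
    dimension=1 scans across each row and returns one column index per row."""
    if dimension == 0:
        vectors = list(zip(*x))  # columns
    elif dimension == 1:
        vectors = x              # rows
    else:
        raise ValueError("dimension must be 0 or 1")
    return [max(range(len(v)), key=lambda i: v[i]) for v in vectors]


x = [[1, 9, 3],
     [7, 2, 8]]
print(argmax_2d(x, 0))  # [1, 0, 1]
print(argmax_2d(x, 1))  # [1, 2]
```

In the classifier, each row holds the class scores of one sample, so dimension 1 yields the predicted class index per sample.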

That is all for this article. I hope it is helpful for your learning.
