Implementing the Softmax Regression Model with TensorFlow
I. Overview and complete code
TensorFlow wraps MNIST (Modified National Institute of Standards and Technology database), a very simple machine vision dataset, and can load the MNIST data directly into the expected format. This program uses Softmax Regression to train a classification model for handwritten digit recognition.
First look at the complete code:
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

# Load MNIST with one-hot labels
mnist = input_data.read_data_sets("MNIST_data", one_hot=True)
print(mnist.train.images.shape, mnist.train.labels.shape)
print(mnist.test.images.shape, mnist.test.labels.shape)
print(mnist.validation.images.shape, mnist.validation.labels.shape)

# Build the computation graph
x = tf.placeholder(tf.float32, [None, 784])
W = tf.Variable(tf.zeros([784, 10]))
B = tf.Variable(tf.zeros([10]))
y = tf.nn.softmax(tf.matmul(x, W) + B)
y_ = tf.placeholder(tf.float32, [None, 10])
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)

# Launch the graph in a session
sess = tf.InteractiveSession()  # create an InteractiveSession object
tf.global_variables_initializer().run()  # initialize all variables
for i in range(1000):
    batch_xs, batch_ys = mnist.train.next_batch(100)
    train_step.run({x: batch_xs, y_: batch_ys})

# Testing and validation phase
# Take the index of the maximum of y and y_ along axis 1 and check whether they are equal
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
# Cast the bool tensor to a float32 tensor and average it to obtain the accuracy
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print(accuracy.eval({x: mnist.test.images, y_: mnist.test.labels}))
II. Detailed explanation
First, let's look at the core steps of designing and training an algorithm with TensorFlow:
1. Define the algorithm formula, that is, the forward computation of the neural network;
2. Define the loss, select an optimizer, and have the optimizer minimize the loss;
3. Train the model iteratively on the training set;
4. Evaluate the accuracy of the trained model on the test set or validation set.
Create a placeholder, which is where the input data enters the graph. The first argument is the data type (dtype) and the second is the shape of the tensor; [None, 784] means any number of samples, each with 784 values.
Next, create the Variable objects for the weights (W) and biases (B) of the Softmax Regression model. Unlike the tensors that carry data, a Variable persists across training iterations and is updated in each one. Variables can be initialized to constants or to random values; here both are initialized to zeros.
Next, implement the model y = softmax(Wx + B). In TensorFlow this takes a single line of code: tf.nn contains a large number of neural network components, and tf.matmul is the matrix multiplication function. TensorFlow implements the forward and backward passes of the model automatically. Once the loss is defined, it computes the gradients and performs gradient descent during training, so the model parameters are learned automatically.
Define a loss function to describe how inaccurately the model classifies the data. Cross entropy is the usual choice for multiclass classification problems. First define a placeholder y_ that receives the true labels; tf.reduce_sum and tf.reduce_mean then perform the summation and the averaging.
Having constructed the cross-entropy loss, define an optimization algorithm to start training. We use stochastic gradient descent (SGD) with the learning rate set to 0.5. Once it is defined, TensorFlow automatically adds the many operations needed for backpropagation and gradient descent. What we get back is an encapsulated training operation that only needs to be fed data on each iteration.
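To make the model formula and the loss concrete, the forward computation can be worked through by hand. The following is a minimal plain-Python sketch, not TensorFlow code: the helper names softmax, forward and cross_entropy are made up for illustration, and the toy sizes (3 features, 2 classes) are assumptions standing in for MNIST's 784 and 10.

```python
import math

def softmax(logits):
    """Turn a list of raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def forward(x, W, B):
    """Compute softmax(xW + B) for a single input vector x."""
    logits = [
        sum(x[i] * W[i][j] for i in range(len(x))) + B[j]
        for j in range(len(B))
    ]
    return softmax(logits)

def cross_entropy(label, probs):
    """Cross entropy between a one-hot label and predicted probabilities."""
    return -sum(t * math.log(p) for t, p in zip(label, probs))

# Toy sizes (3 features, 2 classes) instead of MNIST's 784 and 10.
x = [1.0, 2.0, 3.0]
W = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]  # zeros, like the initial W in the article
B = [0.0, 0.0]
label = [1.0, 0.0]  # one-hot: the true class is 0

probs = forward(x, W, B)
print(probs)                        # [0.5, 0.5]
print(cross_entropy(label, probs))  # about 0.693 (= ln 2)
```

With all-zero parameters every class is equally likely, so the loss starts at ln(number of classes); training then lowers it.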
The graph can be launched only after the construction phase is complete. The first step in launching the graph is to create a Session or InteractiveSession object. If no arguments are passed, the Session constructor launches the default graph. Creating an InteractiveSession registers it as the default session, so subsequent operations run in that session by default. Data and operations in different sessions are independent of each other. Next, call the run method of TensorFlow's global variable initializer, tf.global_variables_initializer() (this initializer appears to be a new feature in version 1.0.0; it does not work in version 0.10.0).
At this point, everything defined above is only a computation graph. Executing that code does not perform any computation; the computation happens only when a run method is called and data is fed in.
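The build-first, run-later idea can be illustrated without TensorFlow. The following is a toy sketch under stated assumptions: Node, placeholder, add and mul are hypothetical helpers written for this example and do not reflect how TensorFlow is implemented internally.

```python
class Node:
    """A hypothetical graph node: it stores how to compute, not the result."""

    def __init__(self, fn, *inputs):
        self.fn = fn
        self.inputs = inputs

    def run(self, feed):
        # Evaluate the inputs first, then apply this node's function.
        args = [inp.run(feed) for inp in self.inputs]
        return self.fn(feed, *args)

def placeholder(name):
    # The value is looked up in the feed only when run() is called.
    return Node(lambda feed: feed[name])

def add(a, b):
    return Node(lambda feed, u, v: u + v, a, b)

def mul(a, b):
    return Node(lambda feed, u, v: u * v, a, b)

x = placeholder("x")
y = add(mul(x, x), x)  # describes x*x + x; nothing is computed yet

print(y.run({"x": 3}))   # 12
print(y.run({"x": 10}))  # 110
```

The same graph object is reused with different fed values, which is the pattern train_step.run follows on each iteration.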
Next, the training operation train_step can be executed iteratively. On each iteration, 100 samples are randomly drawn from the training set to form a mini-batch, which is fed to the placeholders.
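Drawing a random mini-batch can be sketched in plain Python as follows. The function sample_batch and the stand-in data are hypothetical; this only illustrates the idea of random sampling and is not a description of how mnist.train.next_batch works internally.

```python
import random

def sample_batch(images, labels, batch_size, rng):
    """Draw batch_size distinct random (image, label) pairs."""
    indices = rng.sample(range(len(images)), batch_size)
    batch_xs = [images[i] for i in indices]
    batch_ys = [labels[i] for i in indices]
    return batch_xs, batch_ys

# Stand-in data: 1000 fake "images" of 4 pixels each, with labels 0-9.
rng = random.Random(0)
images = [[float(i)] * 4 for i in range(1000)]
labels = [i % 10 for i in range(1000)]

batch_xs, batch_ys = sample_batch(images, labels, 100, rng)
print(len(batch_xs), len(batch_ys))  # 100 100
```

Images and labels are selected with the same indices, so each sample stays paired with its own label.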
After iterative training completes, you can verify the accuracy of the model. For each test sample, compare the indices of the maximum values of y and y_, cast the resulting booleans to a float32 tensor, and take the mean to obtain the accuracy. Over multiple runs, the accuracy on the test set is about 92%, which is a fairly satisfactory result.
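The accuracy calculation described above can be traced by hand on a few samples. This is a plain-Python sketch with made-up probabilities; the helpers argmax and accuracy are illustrative and only mirror the roles of tf.argmax, tf.equal, tf.cast and tf.reduce_mean.

```python
def argmax(row):
    """Index of the largest value in a list (the first one wins on ties)."""
    return max(range(len(row)), key=lambda i: row[i])

def accuracy(predictions, labels):
    """Fraction of rows whose predicted class matches the one-hot label."""
    correct = [argmax(p) == argmax(t) for p, t in zip(predictions, labels)]
    # Turning booleans into floats and averaging them gives the accuracy.
    return sum(float(c) for c in correct) / len(correct)

# Made-up probabilities for 4 samples and 3 classes.
predictions = [
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.3, 0.3, 0.4],
    [0.6, 0.3, 0.1],
]
labels = [
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
    [0, 1, 0],
]

print(accuracy(predictions, labels))  # 0.75
```

Three of the four predicted classes match their labels, so the mean of [1, 1, 1, 0] is 0.75.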
III. Supplementary notes
1. Session class and InteractiveSession class
Given product = tf.matmul(matrix1, matrix2), call the session's run() method to execute the matrix multiplication op, passing product as the argument. Here product represents the output of the matrix multiplication op, and passing it tells the method that we want that output back. The whole execution is automatic: the session supplies all the inputs the op requires, and ops are usually executed concurrently. The call run(product) triggers the execution of three ops in the graph (two constant ops and one matrix multiplication op). The returned result is a numpy ndarray object.
When you are finished with a Session object, close it with sess.close() to release its resources. Besides calling close explicitly, you can use a "with" block to close the session automatically.
with tf.Session() as sess:
    result = sess.run([product])
    print(result)
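Why the "with" block closes the session automatically comes down to Python's context manager protocol. The sketch below uses a hypothetical FakeSession class as a stand-in; it is not TensorFlow's Session, only an illustration of the mechanism.

```python
class FakeSession:
    """Hypothetical stand-in for a session that owns a resource."""

    def __init__(self):
        self.closed = False

    def run(self, value):
        if self.closed:
            raise RuntimeError("session is closed")
        return value

    def close(self):
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.close()  # always runs, even if the block raised
        return False  # do not swallow exceptions

with FakeSession() as sess:
    result = sess.run(42)

print(result, sess.closed)  # 42 True
```

Leaving the block calls __exit__, which in turn calls close(), so no explicit sess.close() is needed.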
To make TensorFlow easier to use in interactive Python environments such as IPython, InteractiveSession can replace the Session class, and the Tensor.eval() and Operation.run() methods can replace Session.run(). This avoids having to keep the session in a variable.
# Enter an interactive TensorFlow session.
import tensorflow as tf
sess = tf.InteractiveSession()

x = tf.Variable([1.0, 2.0])
a = tf.constant([3.0, 3.0])

# Initialize 'x' using the run() method of its initializer op.
x.initializer.run()

# Add an op that subtracts 'a' from 'x', run it, and print the result.
# (tf.sub was renamed tf.subtract in TensorFlow 1.0.)
sub = tf.subtract(x, a)
print(sub.eval())
# ==> [-2. -1.]
2. tf.reduce_sum
tf.reduce_xxx is a family of operations that compute a mathematical reduction over one or more dimensions of a tensor.
tf.reduce_sum(input_tensor, reduction_indices=None, keep_dims=False, name=None)
Function: computes the sum of the elements of input_tensor along the dimensions given in reduction_indices. Unless keep_dims is True, the rank of the tensor is reduced by 1 for each entry in reduction_indices; if keep_dims is True, the reduced dimensions are retained with length 1. If reduction_indices is not given, all dimensions are reduced and a tensor with a single element is returned. The operation returns the reduced tensor.
DEMO code:
# 'x' is [[1, 1, 1]
#         [1, 1, 1]]
tf.reduce_sum(x) ==> 6
tf.reduce_sum(x, 0) ==> [2, 2, 2]
tf.reduce_sum(x, 1) ==> [3, 3]
tf.reduce_sum(x, 1, keep_dims=True) ==> [[3], [3]]
tf.reduce_sum(x, [0, 1]) ==> 6
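The effect of the reduction index and of keep_dims can be reproduced in plain Python for the two-dimensional case. The function reduce_sum_2d below is a hypothetical helper written for this illustration; it handles only nested lists with two dimensions and is not the TensorFlow implementation.

```python
def reduce_sum_2d(x, axis=None, keep_dims=False):
    """Sum a 2-D list of numbers over axis 0, axis 1, or everything (None)."""
    if axis is None:
        total = sum(sum(row) for row in x)
        return [[total]] if keep_dims else total
    if axis == 0:
        sums = [sum(col) for col in zip(*x)]  # collapse the rows
        return [sums] if keep_dims else sums
    if axis == 1:
        sums = [sum(row) for row in x]        # collapse the columns
        return [[s] for s in sums] if keep_dims else sums
    raise ValueError("axis must be None, 0, or 1")

x = [[1, 1, 1],
     [1, 1, 1]]

print(reduce_sum_2d(x))                     # 6
print(reduce_sum_2d(x, 0))                  # [2, 2, 2]
print(reduce_sum_2d(x, 1))                  # [3, 3]
print(reduce_sum_2d(x, 1, keep_dims=True))  # [[3], [3]]
```

With keep_dims=True the result keeps two levels of nesting, with the summed dimension shrunk to length 1.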
3. tf.reduce_mean
tf.reduce_mean(input_tensor, reduction_indices=None, keep_dims=False, name=None)
Function: computes the mean of the elements of input_tensor along the dimensions given in reduction_indices. Unless keep_dims is True, the rank of the tensor is reduced by 1 for each entry in reduction_indices; if keep_dims is True, the reduced dimensions are retained with length 1. If reduction_indices is not given, all dimensions are reduced and a tensor with a single element is returned. The operation returns the reduced tensor.
DEMO code:
# 'x' is [[1., 1.]
#         [2., 2.]]
tf.reduce_mean(x) ==> 1.5
tf.reduce_mean(x, 0) ==> [1.5, 1.5]
tf.reduce_mean(x, 1) ==> [1., 2.]
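In the loss used in section I, the two reductions work together: a sum over the class dimension gives one loss value per sample, and a mean over the batch turns those into a single scalar. The following plain-Python sketch shows that combination; batch_cross_entropy and the sample values are made up for illustration.

```python
import math

def batch_cross_entropy(labels, probs):
    """Mean over samples of the per-sample cross entropy."""
    per_sample = [
        # Sum over the classes of one sample (the role of reduce_sum along axis 1).
        -sum(t * math.log(p) for t, p in zip(label_row, prob_row))
        for label_row, prob_row in zip(labels, probs)
    ]
    # Average over the batch (the role of reduce_mean).
    return sum(per_sample) / len(per_sample)

labels = [[1, 0], [0, 1]]              # one-hot labels for 2 samples
probs = [[0.5, 0.5], [0.25, 0.75]]     # made-up predicted probabilities

loss = batch_cross_entropy(labels, probs)
print(loss)  # about 0.490
```

Because the labels are one-hot, each per-sample sum reduces to the negative log of the probability assigned to the true class.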
4. tf.argmax
tf.argmax(input, dimension, name=None)
Function: returns the index of the largest value of input along the specified dimension. The returned tensor has type int64.
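What "along the specified dimension" means is easiest to see on a small matrix. The helper argmax_2d below is a hypothetical plain-Python sketch for two-dimensional nested lists, not the TensorFlow op.

```python
def argmax_2d(x, dimension):
    """Indices of the maxima of a 2-D list along dimension 0 or 1."""
    if dimension == 0:
        # One result per column: which row holds the column's maximum?
        return [max(range(len(col)), key=lambda i: col[i]) for col in zip(*x)]
    if dimension == 1:
        # One result per row: which column holds the row's maximum?
        return [max(range(len(row)), key=lambda j: row[j]) for row in x]
    raise ValueError("dimension must be 0 or 1")

x = [[1, 9, 3],
     [7, 2, 8]]

print(argmax_2d(x, 0))  # [1, 0, 1]
print(argmax_2d(x, 1))  # [1, 2]
```

In the MNIST code each row holds the 10 class probabilities of one sample, so dimension 1 yields the predicted digit for each sample.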