Learn AI with Python! An Introductory TensorFlow Article!


MNIST Dataset Introduction

MNIST is an entry-level computer vision dataset consisting of images of handwritten digits:

The MNIST dataset also contains label information; the example images represent 5, 0, 4, and 1, respectively.

The official home of the MNIST dataset is Yann LeCun's website.

Automatic download

The tutorial code was first published on GitHub: https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/tutorials/mnist

Create a new input_data.py file:

# Copyright The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Functions for downloading and reading MNIST data."""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import gzip
import os
import tempfile

import numpy
from six.moves import urllib
from six.moves import xrange  # pylint: disable=redefined-builtin
import tensorflow as tf
from tensorflow.contrib.learn.python.learn.datasets.mnist import read_data_sets

The download produces the following files:

train-images-idx3-ubyte.gz: training set images (55,000 training images plus 5,000 validation images)
train-labels-idx1-ubyte.gz: training set labels (the digit corresponding to each image)
t10k-images-idx3-ubyte.gz: test set images (10,000 images)
t10k-labels-idx1-ubyte.gz: test set labels

The raw dataset is divided into two parts: a 60,000-row training set and a 10,000-row test set.

Each image is a 28x28 array of pixels, which is flattened into a vector of length 28x28 = 784. How the array is flattened (the ordering of the pixels) is unimportant, as long as every image is flattened the same way.
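As a concrete illustration, here is a minimal NumPy sketch of that flattening step (the random array stands in for a real MNIST picture):

import numpy as np

# A stand-in 28x28 image with pixel intensities in [0, 1]
image = np.random.rand(28, 28)

# Flatten into a 784-dimensional vector; NumPy's default row-major order
# is one consistent choice of pixel ordering
vector = image.reshape(784)
print(vector.shape)  # (784,)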

In the MNIST training data, mnist.train.images is a tensor of shape [55000, 784] (55,000 training images once the 5,000 validation images have been split off).

The first dimension indexes the images, and the second indexes the pixels within each image, with each value between 0 and 1.
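A quick way to check these shapes yourself, assuming the data has been downloaded to "data/" as in the full source at the end of this article:

from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets("data/", one_hot=True)
print(mnist.train.images.shape)  # (55000, 784)
print(mnist.train.labels.shape)  # (55000, 10)
print(mnist.test.images.shape)   # (10000, 784)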

Softmax regression model

We know that each MNIST image represents a digit from 0 to 9. We want to obtain the probability that a given image represents each digit.

For example, our model might conclude that an image of a 9 has an 80% probability of being a 9, a 15% probability of being an 8 (since both 8 and 9 have a loop in the upper half), a 5% probability of being a 6, and lower probabilities for the remaining digits.
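In Python terms, that judgment is simply a probability distribution over the ten digits (the numbers below are the illustrative ones from this example):

# Hypothetical model output for an image of a "9"
# digit:  0    1    2    3    4    5    6     7    8     9
probs = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.05, 0.0, 0.15, 0.80]
assert abs(sum(probs) - 1.0) < 1e-9  # a valid distribution sums to 1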

Softmax regression is a simple model that is ideal for producing a probability distribution over multiple categories for an object to be classified. For that reason, this model is often the last step of many more advanced models.

Softmax regression is broadly divided into two steps:

Step 1: add up the evidence of our input being in certain classes;
Step 2: convert that evidence into probabilities.

To tally up the evidence that a given image belongs to a particular digit class, we take a weighted sum of the image's pixel values.

If a pixel is strong evidence that the image does not belong to the class, the corresponding weight is negative; conversely, if the pixel is evidence in favor of the image belonging to the class, the weight is positive.

The following is a visual example in which blue indicates positive weights and red indicates negative ones (the shape of the blue region tends toward the shape of the digit):

We also need to add an extra bias, because the input tends to carry some interference that is independent of the class. So for a given input image $x$, the evidence for digit class $i$ can be expressed as:

$$\text{evidence}_i = \sum_j W_{i,j} x_j + b_i$$

where $W_{i,j}$ are the weights, $b_i$ is the bias for class $i$, and $j$ indexes the pixels of the input image $x$. Converting the evidence into predicted probabilities $y$ with softmax gives:

$$y = \text{softmax}(\text{evidence}), \qquad \text{softmax}(e)_i = \frac{\exp(e_i)}{\sum_j \exp(e_j)}$$

which can also be simplified into:

$$y = \text{softmax}(Wx + b)$$
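To make the softmax step concrete, here is a minimal NumPy sketch of the function itself (the evidence values are made up):

import numpy as np

def softmax(evidence):
    # Subtracting the max improves numerical stability without changing the result
    e = np.exp(evidence - np.max(evidence))
    return e / e.sum()

# Hypothetical evidence values for the 10 digit classes
evidence = np.array([0.1, 0.0, 0.2, 0.0, 0.0, 0.0, 0.3, 0.0, 1.5, 2.0])
print(softmax(evidence))  # non-negative values that sum to 1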

Feeling a little lost after reading that? The idea may have made sense at first, but once the formulas appear it can become puzzling. Yes, that is how it feels, so keep going.

Implementing the Regression Model

To perform scientific computing in Python, we often use standalone library packages such as NumPy to implement complex matrix computations. But because Python itself does not run fast enough, these operations are often implemented in more efficient languages. Doing so, however, incurs the overhead of switching languages, such as converting results back into Python objects.

TensorFlow optimizes this area by letting the flow of interacting computations you describe run entirely outside of Python, thus avoiding the overhead of language switching.

To use TensorFlow, we first need to import the library:

import tensorflow as tf

We use symbolic variables to describe the interacting computations, created as follows:

x = tf.placeholder(tf.float32, [None, 784])

Note: the placeholder function can be understood as a formal parameter; it defines the computation, and a concrete value is assigned at execution time.

The x here is not a specific value but a placeholder, to be specified when needed.

We supply this value when TensorFlow runs the computation. We want to be able to input any number of MNIST images, each flattened into a 784-dimensional vector, so we represent them as a 2-dimensional floating-point tensor of shape [None, 784] (None means the first dimension of the tensor can be of any length).
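As a tiny self-contained illustration of this placeholder-then-feed pattern (the toy tensor here is not part of the MNIST model):

import tensorflow as tf

# A toy placeholder; its value is supplied only when the graph actually runs
a = tf.placeholder(tf.float32, [None, 2])
doubled = a * 2

with tf.Session() as sess:
    print(sess.run(doubled, feed_dict={a: [[1.0, 2.0], [3.0, 4.0]]}))
    # [[2. 4.]
    #  [6. 8.]]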

We use Variables to represent the weights and biases in the model; these are modifiable. They are created as follows:

W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))

Note: tf.zeros creates a tensor with all entries equal to 0.

Both W and b are initialized as zero matrices. W has dimensions 784 x 10 because we need to convert a 784-dimensional pixel vector into evidence values over 10 categories by multiplying by the corresponding weights, and b holds the bias added to each of the 10 categories.
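A quick NumPy sanity check of those shapes (the batch of zeros is just a stand-in for real images):

import numpy as np

batch = np.zeros((100, 784))  # 100 flattened images (stand-in values)
W = np.zeros((784, 10))
b = np.zeros(10)

evidence = batch.dot(W) + b   # b broadcasts across the batch dimension
print(evidence.shape)         # (100, 10): evidence for 10 classes per image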

Now, implementing the Softmax regression model requires only one line of code:

y = tf.nn.softmax(tf.matmul(x, W) + b)

Here the matmul function computes the product of x and W; since x is a 2-dimensional matrix with one image per row, it comes first. As you can see, implementing the softmax regression model in TensorFlow is very simple.
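The next steps compute a cost function: the cross-entropy between the predictions y and the true labels. That needs a second placeholder for the labels; the full source near the end of this article defines it like this (there the label placeholder is named y_real, while the snippets here use y_):

# Placeholder for the true labels, as one-hot vectors of length 10
y_ = tf.placeholder("float", [None, 10])
# Cross-entropy cost, summed over the classes and the whole batch
cross_entropy = -tf.reduce_sum(y_ * tf.log(y))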

First, tf.log computes the logarithm of each element of y.

Next, each element of y_ is multiplied by the corresponding element of tf.log(y).

Finally, tf.reduce_sum sums all the elements of the tensor. (Note that the cross-entropy here is not for a single prediction/label pair, but the sum of the cross-entropies of all 100 images in a batch; the prediction performance over 100 data points describes our model's performance better than a single point does.)
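As a sanity check, here is the same computation in plain NumPy on a made-up batch of two examples with three classes:

import numpy as np

# Toy one-hot labels and predicted probability distributions
y_true = np.array([[0., 1., 0.],
                   [1., 0., 0.]])
y_pred = np.array([[0.1, 0.8, 0.1],
                   [0.6, 0.3, 0.1]])

# Minus the sum over both classes and examples, matching tf.reduce_sum above
print(-np.sum(y_true * np.log(y_pred)))  # -(log 0.8 + log 0.6), about 0.734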

Now that we know what we need our model to do, it is very easy to train it with TensorFlow.

Because TensorFlow has a graph describing each of your computing units, it can automatically use the backpropagation algorithm to efficiently determine how your variables affect the cost you want to minimize.

TensorFlow will then use your chosen optimization algorithm to continually modify the variables to reduce costs.

train_step = tf.train.GradientDescentOptimizer(0.01).minimize(cross_entropy)

Here, a gradient descent algorithm with a learning rate of 0.01 is used to minimize the cost function. Gradient descent is a simple procedure: it shifts each variable a little bit in the direction that reduces the cost. TensorFlow also provides many other optimization algorithms, which can be swapped in with a one-line change.
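For instance, here is a sketch of such a one-line swap using the Adam optimizer, which also ships in this TF 1.x API (the 0.001 learning rate is just an illustrative choice, not from the original article):

# Swap in a different optimizer; only this one line changes
train_step = tf.train.AdamOptimizer(0.001).minimize(cross_entropy)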

What TensorFlow actually does here is add, behind the scenes, a new set of computational operations to the graph describing your calculation, implementing the backpropagation and gradient descent algorithms. It then hands you back a single operation; when run, that operation performs a step of gradient descent training, fine-tuning your variables to steadily reduce the cost.

Before the model is trained, all parameters need to be initialized:

init = tf.initialize_all_variables()
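Note that initialize_all_variables was deprecated later in the TF 1.x series; on newer 1.x releases the equivalent call is:

init = tf.global_variables_initializer()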

You can run the model in a session and initialize it:

sess = tf.Session()
sess.run(init)

Next, the model is trained, here looping 1000 times:

for i in range(1000):
    batch_xs, batch_ys = mnist.train.next_batch(100)
    sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})

In each iteration of the loop, we take 100 random examples from the training data; this is called batching.

Then, each time train_step runs, the previously selected data is fed into the placeholders we set up, as the model's input.

Training with small batches of random data is called stochastic training; here, more specifically, stochastic gradient descent.

Ideally, we would like to use all of our data for every step of training, because that would give better results, but it is obviously computationally expensive.

So instead we use a different subset of the data each time we train; this keeps the computational cost down while still making use of the dataset's overall character.

Evaluation of the Model

So, how do we evaluate our model?

The first thing to do is identify the predicted labels. tf.argmax returns the index at which a tensor has its maximum value along a given dimension.

Since the label vectors consist of 0s and 1s, the index position of the maximum value 1 is the category label. For example, tf.argmax(y, 1) returns the label the model predicts for each input x, while tf.argmax(y_, 1) gives the correct label. We can then use tf.equal to test whether the prediction matches the true label (they match when the index positions are the same).

correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))

correct_prediction is a list of boolean values, such as [True, False, True, True].

The tf.cast() function converts it to [1, 0, 1, 1], which makes the accuracy easy to compute (here the accuracy is 0.75).

accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
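Here is a self-contained toy run of this evaluation pipeline that reproduces the [True, False, True, True] example above (three classes instead of ten, with made-up values):

import tensorflow as tf

# Hypothetical predictions and one-hot labels for 4 examples, 3 classes
preds = tf.constant([[0.8, 0.1, 0.1],
                     [0.2, 0.7, 0.1],
                     [0.1, 0.2, 0.7],
                     [0.3, 0.4, 0.3]])
labels = tf.constant([[1., 0., 0.],
                      [0., 0., 1.],
                      [0., 0., 1.],
                      [0., 1., 0.]])

correct = tf.equal(tf.argmax(preds, 1), tf.argmax(labels, 1))
acc = tf.reduce_mean(tf.cast(correct, "float"))

with tf.Session() as sess:
    print(sess.run(correct))  # [ True False  True  True]
    print(sess.run(acc))      # 0.75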

Finally, we get the accuracy of the model on the test set:

print(sess.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels}))

Because the softmax regression model is relatively simple, its accuracy on the test set is only about 91%, which is not a very good result.

With some simple optimizations the accuracy can reach 97%, and the best current models reach 99.7%. (See the results of many models run on the MNIST dataset.)

Source

The complete code for handwritten digit recognition with the softmax model is given below. The comments have been improved for easier reading, and output of the training process has been added:

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets("data/", one_hot=True)
print("Download done!")

# Set the weights and biases as the optimization variables, initialized to 0
weights = tf.Variable(tf.zeros([784, 10]))
biases = tf.Variable(tf.zeros([10]))

# Build the model
x = tf.placeholder("float", [None, 784])
# The model's predicted values
y = tf.nn.softmax(tf.matmul(x, weights) + biases)
# The true values
y_real = tf.placeholder("float", [None, 10])
# Cross-entropy between the predicted and true values
cross_entropy = -tf.reduce_sum(y_real * tf.log(y))
# Use the gradient descent optimizer to minimize the cross-entropy
train_step = tf.train.GradientDescentOptimizer(0.01).minimize(cross_entropy)
# Check whether the predicted and true values agree
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_real, 1))
# Count correct predictions and take the mean to get the accuracy
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))

# Start training
init = tf.initialize_all_variables()
sess = tf.Session()
sess.run(init)
for i in range(5000):
    # Randomly select 100 examples for training, i.e. stochastic gradient descent
    batch_xs, batch_ys = mnist.train.next_batch(100)
    # Run train_step, feeding the selected data into the placeholders via feed_dict
    sess.run(train_step, feed_dict={x: batch_xs, y_real: batch_ys})
    if i % 100 == 0:
        # Evaluate the model every 100 training steps
        print("Step " + str(i) + ", Training Accuracy " +
              str(sess.run(accuracy, feed_dict={x: mnist.test.images, y_real: mnist.test.labels})))

print("Accuracy on test dataset:",
      sess.run(accuracy, feed_dict={x: mnist.test.images, y_real: mnist.test.labels}))

The result of the final execution:

Run it several times and you will find the result differs slightly each time, but it is basically around 91%.

If you increase the number of training iterations to 100,000, you will find the accuracy rises to about 98%.

Training process:

Well, that is TensorFlow's "Hello World". We spent a lot of time introducing the softmax model, and that block of formulas in particular may have been hard to follow at first, but the feeling it leaves is clear: mathematics really matters in machine learning. To summarize:

This article trained on the MNIST dataset using the softmax regression model, focusing on the principle of the softmax model (see above) and how to use the model in TensorFlow. More important still is what can be learned from the design ideas behind this model.
