Recognizing handwritten digits with TensorFlow

Source: Internet
Author: User


TensorFlow, Google's open-source project, has overtaken Caffe and appears to be the most popular deep learning framework. With TensorFlow you build the network in code, so you get a concrete feel for what you are writing. Caffe, by contrast, generates the network from a configuration file. The code here targets TensorFlow 0.10. Statements may fail under other versions because of compatibility differences between TensorFlow releases.

You also need to install PIL: pip install Pillow

Image format:

- Images are size-normalized to fit in a 20x20 pixel box while preserving their aspect ratio.
- Each image is centered in a 28x28 image.
- Pixels are organized row-wise. Pixel values range from 0 to 255, where 0 means background (white) and 255 means foreground (black).
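
To make the layout concrete: a 28x28 image reaches the network as a flat vector of 784 values, stored row after row. The sketch below is only an illustration in plain Python; `flat_index` and `flatten_row_major` are hypothetical helper names, and the sample image is made up.

```python
ROWS, COLS = 28, 28  # MNIST image size


def flat_index(row, col, cols=COLS):
    """Position of pixel (row, col) in the row-major 784-element vector."""
    return row * cols + col


def flatten_row_major(image):
    """Flatten a list of rows into one list, row after row."""
    return [pixel for row in image for pixel in row]


# Made-up image: background everywhere (0) and one foreground pixel (255)
# at row 3, column 5.
image = [[0] * COLS for _ in range(ROWS)]
image[3][5] = 255

vector = flatten_row_major(image)
print(len(vector), flat_index(3, 5), vector[flat_index(3, 5)])
```

Reshaping such a vector to [28, 28] recovers the original rows, which is what the network code does with tf.reshape before the first convolution.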

Create a .png file with a white background and a black handwritten digit.

Below is the training code: a convolutional neural network with two convolutional layers. After training, the model is saved with tf.train.Saver.

# coding: utf-8
import tensorflow as tf

import input_data

# Load the MNIST data (downloaded into MNIST_data/ on first use).
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
print("MNIST loaded")

# Start an interactive session.
sess = tf.InteractiveSession()

# Placeholders for the flattened 28x28 images and the one-hot labels.
x = tf.placeholder("float", shape=[None, 784])
y_ = tf.placeholder("float", shape=[None, 10])

# Not used by the network below. They are kept because the prediction script
# creates the same two variables in the same order, so the automatically
# generated variable names match when the checkpoint is restored.
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))


def weight_variable(shape):
    """Create a weight variable; shape is the shape of the tensor."""
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)


def bias_variable(shape):
    """Create a bias variable; shape is the shape of the tensor."""
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial)


def conv2d(x, W):
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')


def max_pool_2x2(x):
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1],
                          strides=[1, 2, 2, 1], padding='SAME')


# First convolutional layer: 5x5 kernels, 1 input channel, 32 output channels.
W_conv1 = weight_variable([5, 5, 1, 32])
b_conv1 = bias_variable([32])

x_image = tf.reshape(x, [-1, 28, 28, 1])
h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
h_pool1 = max_pool_2x2(h_conv1)

# Second convolutional layer: 5x5 kernels, 32 input channels, 64 output channels.
W_conv2 = weight_variable([5, 5, 32, 64])
b_conv2 = bias_variable([64])

h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
h_pool2 = max_pool_2x2(h_conv2)

# Fully connected layer with 1024 units.
W_fc1 = weight_variable([7 * 7 * 64, 1024])
b_fc1 = bias_variable([1024])

h_pool2_flat = tf.reshape(h_pool2, [-1, 7 * 7 * 64])
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)

# Dropout; keep_prob is fed at run time.
keep_prob = tf.placeholder("float")
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)

# Output layer.
W_fc2 = weight_variable([1024, 10])
b_fc2 = bias_variable([10])

y_conv = tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)

cross_entropy = -tf.reduce_sum(y_ * tf.log(y_conv))
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
correct_prediction = tf.equal(tf.argmax(y_conv, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))

# Saver for the trained network parameters.
saver = tf.train.Saver()
sess.run(tf.initialize_all_variables())

for i in range(8000):
    batch = mnist.train.next_batch(50)
    if i % 100 == 0:
        train_accuracy = accuracy.eval(
            feed_dict={x: batch[0], y_: batch[1], keep_prob: 1.0})
        print("step %d, training accuracy %g" % (i, train_accuracy))
    train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})

save_path = saver.save(sess, "model_mnist.ckpt")
print("Model saved in file: %s" % save_path)

print("test accuracy %g" % accuracy.eval(
    feed_dict={x: mnist.test.images, y_: mnist.test.labels, keep_prob: 1.0}))
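
The size 7*7*64 of the first fully connected layer follows from the layer shapes. The sketch below traces them, assuming the usual rule that 'SAME' padding gives an output size of ceil(input / stride) along each spatial axis. The helper names are hypothetical and are not part of the training script.

```python
import math


def same_padding_output(size, stride):
    """Output length along one axis with 'SAME' padding: ceil(size / stride)."""
    return int(math.ceil(size / float(stride)))


def trace_shapes(size=28, channels=(32, 64)):
    """Return (height, width, channels) after each conv + 2x2 max-pool stage."""
    shapes = []
    for out_channels in channels:
        size = same_padding_output(size, 1)  # 5x5 conv, stride 1: size unchanged
        size = same_padding_output(size, 2)  # 2x2 max pool, stride 2: size halved
        shapes.append((size, size, out_channels))
    return shapes


shapes = trace_shapes()
height, width, depth = shapes[-1]
flat_size = height * width * depth
print(shapes, flat_size)
```

The last stage yields a 7x7 map with 64 channels, so each image is flattened to 3136 values before the 1024-unit layer.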

input_data.py: the following code downloads and reads the MNIST dataset. It is the official loader script from the TensorFlow MNIST tutorial.

# Copyright 2015 Google Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#   http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Functions for downloading and reading MNIST data."""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import gzip
import os

import tensorflow.python.platform

import numpy
from six.moves import urllib
from six.moves import xrange  # pylint: disable=redefined-builtin
import tensorflow as tf

SOURCE_URL = 'http://yann.lecun.com/exdb/mnist/'


def maybe_download(filename, work_directory):
  """Download the data from Yann's website, unless it's already here."""
  if not os.path.exists(work_directory):
    os.mkdir(work_directory)
  filepath = os.path.join(work_directory, filename)
  if not os.path.exists(filepath):
    filepath, _ = urllib.request.urlretrieve(SOURCE_URL + filename, filepath)
    statinfo = os.stat(filepath)
    print('Successfully downloaded', filename, statinfo.st_size, 'bytes.')
  return filepath


def _read32(bytestream):
  dt = numpy.dtype(numpy.uint32).newbyteorder('>')
  return numpy.frombuffer(bytestream.read(4), dtype=dt)[0]


def extract_images(filename):
  """Extract the images into a 4D uint8 numpy array [index, y, x, depth]."""
  print('Extracting', filename)
  with gzip.open(filename) as bytestream:
    magic = _read32(bytestream)
    if magic != 2051:
      raise ValueError(
          'Invalid magic number %d in MNIST image file: %s' %
          (magic, filename))
    num_images = _read32(bytestream)
    rows = _read32(bytestream)
    cols = _read32(bytestream)
    buf = bytestream.read(rows * cols * num_images)
    data = numpy.frombuffer(buf, dtype=numpy.uint8)
    data = data.reshape(num_images, rows, cols, 1)
    return data


def dense_to_one_hot(labels_dense, num_classes=10):
  """Convert class labels from scalars to one-hot vectors."""
  num_labels = labels_dense.shape[0]
  index_offset = numpy.arange(num_labels) * num_classes
  labels_one_hot = numpy.zeros((num_labels, num_classes))
  labels_one_hot.flat[index_offset + labels_dense.ravel()] = 1
  return labels_one_hot


def extract_labels(filename, one_hot=False):
  """Extract the labels into a 1D uint8 numpy array [index]."""
  print('Extracting', filename)
  with gzip.open(filename) as bytestream:
    magic = _read32(bytestream)
    if magic != 2049:
      raise ValueError(
          'Invalid magic number %d in MNIST label file: %s' %
          (magic, filename))
    num_items = _read32(bytestream)
    buf = bytestream.read(num_items)
    labels = numpy.frombuffer(buf, dtype=numpy.uint8)
    if one_hot:
      return dense_to_one_hot(labels)
    return labels


class DataSet(object):

  def __init__(self, images, labels, fake_data=False, one_hot=False,
               dtype=tf.float32):
    """Construct a DataSet.

    one_hot arg is used only if fake_data is true. `dtype` can be either
    `uint8` to leave the input as `[0, 255]`, or `float32` to rescale into
    `[0, 1]`.
    """
    dtype = tf.as_dtype(dtype).base_dtype
    if dtype not in (tf.uint8, tf.float32):
      raise TypeError('Invalid image dtype %r, expected uint8 or float32' %
                      dtype)
    if fake_data:
      self._num_examples = 10000
      self.one_hot = one_hot
    else:
      assert images.shape[0] == labels.shape[0], (
          'images.shape: %s labels.shape: %s' % (images.shape,
                                                 labels.shape))
      self._num_examples = images.shape[0]

      # Convert shape from [num examples, rows, columns, depth]
      # to [num examples, rows*columns] (assuming depth == 1)
      assert images.shape[3] == 1
      images = images.reshape(images.shape[0],
                              images.shape[1] * images.shape[2])
      if dtype == tf.float32:
        # Convert from [0, 255] -> [0.0, 1.0].
        images = images.astype(numpy.float32)
        images = numpy.multiply(images, 1.0 / 255.0)
    self._images = images
    self._labels = labels
    self._epochs_completed = 0
    self._index_in_epoch = 0

  @property
  def images(self):
    return self._images

  @property
  def labels(self):
    return self._labels

  @property
  def num_examples(self):
    return self._num_examples

  @property
  def epochs_completed(self):
    return self._epochs_completed

  def next_batch(self, batch_size, fake_data=False):
    """Return the next `batch_size` examples from this data set."""
    if fake_data:
      fake_image = [1] * 784
      if self.one_hot:
        fake_label = [1] + [0] * 9
      else:
        fake_label = 0
      return [fake_image for _ in xrange(batch_size)], [
          fake_label for _ in xrange(batch_size)]
    start = self._index_in_epoch
    self._index_in_epoch += batch_size
    if self._index_in_epoch > self._num_examples:
      # Finished epoch
      self._epochs_completed += 1
      # Shuffle the data
      perm = numpy.arange(self._num_examples)
      numpy.random.shuffle(perm)
      self._images = self._images[perm]
      self._labels = self._labels[perm]
      # Start next epoch
      start = 0
      self._index_in_epoch = batch_size
      assert batch_size <= self._num_examples
    end = self._index_in_epoch
    return self._images[start:end], self._labels[start:end]


def read_data_sets(train_dir, fake_data=False, one_hot=False, dtype=tf.float32):
  class DataSets(object):
    pass
  data_sets = DataSets()

  if fake_data:
    def fake():
      return DataSet([], [], fake_data=True, one_hot=one_hot, dtype=dtype)
    data_sets.train = fake()
    data_sets.validation = fake()
    data_sets.test = fake()
    return data_sets

  TRAIN_IMAGES = 'train-images-idx3-ubyte.gz'
  TRAIN_LABELS = 'train-labels-idx1-ubyte.gz'
  TEST_IMAGES = 't10k-images-idx3-ubyte.gz'
  TEST_LABELS = 't10k-labels-idx1-ubyte.gz'
  VALIDATION_SIZE = 5000

  local_file = maybe_download(TRAIN_IMAGES, train_dir)
  train_images = extract_images(local_file)

  local_file = maybe_download(TRAIN_LABELS, train_dir)
  train_labels = extract_labels(local_file, one_hot=one_hot)

  local_file = maybe_download(TEST_IMAGES, train_dir)
  test_images = extract_images(local_file)

  local_file = maybe_download(TEST_LABELS, train_dir)
  test_labels = extract_labels(local_file, one_hot=one_hot)

  validation_images = train_images[:VALIDATION_SIZE]
  validation_labels = train_labels[:VALIDATION_SIZE]
  train_images = train_images[VALIDATION_SIZE:]
  train_labels = train_labels[VALIDATION_SIZE:]

  data_sets.train = DataSet(train_images, train_labels, dtype=dtype)
  data_sets.validation = DataSet(validation_images, validation_labels,
                                 dtype=dtype)
  data_sets.test = DataSet(test_images, test_labels, dtype=dtype)

  return data_sets
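
The one-hot conversion in dense_to_one_hot relies on a flat-index trick: label k of example i lands at position i * num_classes + k of the flattened label matrix. The following is a plain-Python sketch of the same idea, written without numpy purely for illustration. It is not the code from input_data.py, and the sample labels are made up.

```python
def dense_to_one_hot(labels, num_classes=10):
    """Convert integer class labels to one-hot rows using flat indexing."""
    num_labels = len(labels)
    flat = [0.0] * (num_labels * num_classes)
    for i, label in enumerate(labels):
        # Row i starts at offset i * num_classes; the label picks the column.
        flat[i * num_classes + label] = 1.0
    # Cut the flat list back into one row per example.
    return [flat[i * num_classes:(i + 1) * num_classes]
            for i in range(num_labels)]


one_hot = dense_to_one_hot([3, 0])
print(one_hot)
```

Each row contains a single 1.0 at the index of its digit, which is the label format the training script requests with one_hot=True.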

Then test the saved model on your own image:

# import modules
import tensorflow as tf
from PIL import Image, ImageFilter


def predictint(imvalue):
    """
    This function returns the predicted integer.
    The input is the pixel values from the imageprepare() function.
    """

    # Define the model (same as when creating the model file)
    x = tf.placeholder(tf.float32, [None, 784])
    W = tf.Variable(tf.zeros([784, 10]))
    b = tf.Variable(tf.zeros([10]))

    def weight_variable(shape):
        initial = tf.truncated_normal(shape, stddev=0.1)
        return tf.Variable(initial)

    def bias_variable(shape):
        initial = tf.constant(0.1, shape=shape)
        return tf.Variable(initial)

    def conv2d(x, W):
        return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

    def max_pool_2x2(x):
        return tf.nn.max_pool(x, ksize=[1, 2, 2, 1],
                              strides=[1, 2, 2, 1], padding='SAME')

    W_conv1 = weight_variable([5, 5, 1, 32])
    b_conv1 = bias_variable([32])

    x_image = tf.reshape(x, [-1, 28, 28, 1])
    h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
    h_pool1 = max_pool_2x2(h_conv1)

    W_conv2 = weight_variable([5, 5, 32, 64])
    b_conv2 = bias_variable([64])

    h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
    h_pool2 = max_pool_2x2(h_conv2)

    W_fc1 = weight_variable([7 * 7 * 64, 1024])
    b_fc1 = bias_variable([1024])

    h_pool2_flat = tf.reshape(h_pool2, [-1, 7 * 7 * 64])
    h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)

    keep_prob = tf.placeholder(tf.float32)
    h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)

    W_fc2 = weight_variable([1024, 10])
    b_fc2 = bias_variable([10])

    y_conv = tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)

    init_op = tf.initialize_all_variables()
    saver = tf.train.Saver()

    # Load the model_mnist.ckpt file. The file is expected in the directory
    # from which this Python script is started.
    # Use the model to predict the integer. The integer is returned as a
    # one-element array.
    # Based on the documentation at
    # https://www.tensorflow.org/versions/master/how_tos/variables/index.html
    with tf.Session() as sess:
        sess.run(init_op)
        saver.restore(sess, "model_mnist.ckpt")
        # print ("Model restored.")

        prediction = tf.argmax(y_conv, 1)
        return prediction.eval(feed_dict={x: [imvalue], keep_prob: 1.0},
                               session=sess)


def imageprepare(argv):
    """
    This function returns the pixel values.
    The input is a png file location.
    """
    im = Image.open(argv).convert('L')
    width = float(im.size[0])
    height = float(im.size[1])
    newImage = Image.new('L', (28, 28), (255))  # white canvas of 28x28 pixels

    # Note: Image.ANTIALIAS was removed in Pillow 10; use Image.LANCZOS there.
    if width > height:  # check which dimension is bigger
        # Width is bigger. Width becomes 20 pixels.
        nheight = int(round((20.0 / width * height), 0))  # scale height by the same ratio
        if (nheight == 0):  # rare case but minimum is 1 pixel
            nheight = 1
        # resize and sharpen
        img = im.resize((20, nheight), Image.ANTIALIAS).filter(ImageFilter.SHARPEN)
        wtop = int(round(((28 - nheight) / 2), 0))  # calculate vertical position
        newImage.paste(img, (4, wtop))  # paste resized image on white canvas
    else:
        # Height is bigger. Height becomes 20 pixels.
        nwidth = int(round((20.0 / height * width), 0))  # scale width by the same ratio
        if (nwidth == 0):  # rare case but minimum is 1 pixel
            nwidth = 1
        # resize and sharpen
        img = im.resize((nwidth, 20), Image.ANTIALIAS).filter(ImageFilter.SHARPEN)
        wleft = int(round(((28 - nwidth) / 2), 0))  # calculate horizontal position
        newImage.paste(img, (wleft, 4))  # paste resized image on white canvas

    # newImage.save("sample.png")

    tv = list(newImage.getdata())  # get pixel values

    # normalize pixels to 0 and 1. 0 is pure white, 1 is pure black.
    tva = [(255 - x) * 1.0 / 255.0 for x in tv]
    # print(tva)
    return tva


def main(argv):
    """
    Main function.
    """
    imvalue = imageprepare(argv)
    predint = predictint(imvalue)
    print(predint[0])  # first value in list


if __name__ == "__main__":
    main('2.png')

The code I used for testing is as follows:


You can also save the image under another path and pass that path to main() to test it.

(1) Load an image of a handwritten digit.
(2) Convert the image to grayscale (mode "L").
(3) Determine which dimension of the original image is larger.
(4) Resize the image so that the larger dimension (either height or width) is 20 pixels, and scale the smaller dimension by the same ratio.
(5) Sharpen the image. This greatly improves the results.
(6) Paste the image onto a white 28x28 pixel canvas. Along the larger dimension the image is placed 4 pixels from the top or the side: the larger dimension is always 20 pixels, and 4 + 20 + 4 = 28. Along the smaller dimension the image is placed at half the difference between 28 and the scaled size.
(7) Get the pixel values of the new image (canvas plus centered image).
(8) Normalize the pixel values to the range 0 to 1 (this is also done in the TensorFlow MNIST tutorial), where 0 is white and 1 is pure black. The pixel values from step 7 are the other way round, with 255 for white and 0 for black, so they must be inverted. The following formula combines inversion and normalization: (255 - x) * 1.0 / 255.0
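
The arithmetic in steps (3), (4), (6) and (8) can be checked without loading an image. The sketch below mirrors those steps in plain Python. `fit_to_canvas` and `invert_and_normalize` are hypothetical helper names introduced only for this illustration, and the sample sizes are made up.

```python
def fit_to_canvas(width, height, box=20, canvas=28):
    """Return (new_width, new_height, left, top) for a centered paste."""
    margin = (canvas - box) // 2  # 4 pixels on each side of the 20-pixel box
    if width > height:
        # Width is the larger dimension: it becomes 20 pixels.
        new_w = box
        new_h = max(1, int(round(box * height / float(width))))
        left = margin
        top = int(round((canvas - new_h) / 2.0))
    else:
        # Height is the larger (or equal) dimension: it becomes 20 pixels.
        new_h = box
        new_w = max(1, int(round(box * width / float(height))))
        top = margin
        left = int(round((canvas - new_w) / 2.0))
    return new_w, new_h, left, top


def invert_and_normalize(pixel):
    """Map a grayscale value (255 = white, 0 = black) to 0.0 = white, 1.0 = black."""
    return (255 - pixel) / 255.0


wide = fit_to_canvas(200, 100)  # landscape image
tall = fit_to_canvas(50, 100)   # portrait image
print(wide, tall, invert_and_normalize(255), invert_and_normalize(0))
```

For the 200x100 image the result is a 20x10 patch pasted at (4, 9). For the 50x100 image it is a 10x20 patch pasted at (9, 4).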
