Using TensorFlow to Build a CNN
A convolutional neural network (CNN) takes the data of an image as input. The first layer holds the raw RGB channels; as the data passes through successive layers, the feature maps grow deeper (more channels) while their height and width shrink, and the final layer is flattened and fed into a classifier.
There are several important concepts in CNN:
- Stride
- Padding
- Pooling
Stride is the step size used when sliding over the image to extract information. Each extraction reduces the height and width but increases the depth, and the extracted small blocks are combined into a compressed cube.
Padding comes in two forms: one where the height and width shrink after extraction (VALID), and one where the output keeps the same height and width as the input (SAME).
Pooling addresses the problem that a large stride can skip over important information. To solve this, a layer called pooling is added to retain the necessary information before compression, producing the compressed layer.
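The effect of stride and padding on the output size can be checked with a small helper. This is a sketch using the standard SAME/VALID output-size formulas; `conv_output_size` is a name chosen here for illustration, not a TensorFlow function:

```python
import math

def conv_output_size(n, k, s, padding):
    """Output length along one axis of a convolution or pooling op.

    n: input length, k: kernel size, s: stride.
    'SAME' pads so the size depends only on the stride;
    'VALID' uses no padding, so the output shrinks.
    """
    if padding == 'SAME':
        return math.ceil(n / s)
    elif padding == 'VALID':
        return math.ceil((n - k + 1) / s)
    raise ValueError("padding must be 'SAME' or 'VALID'")

# A 28x28 MNIST image, 5x5 kernel, stride 1:
print(conv_output_size(28, 5, 1, 'SAME'))   # 28 -- width unchanged
print(conv_output_size(28, 5, 1, 'VALID'))  # 24 -- shrinks by k-1
# 2x2 max pooling with stride 2 halves the size:
print(conv_output_size(28, 2, 2, 'SAME'))   # 14
```

This is why the program below can use SAME padding everywhere and rely on the pooling layers alone to shrink the image.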
Building a CNN, that is, a convolutional neural network, with TensorFlow is a very simple task. I use the MNIST handwritten-digit recognition example from the official tutorial to show the code. The program is essentially the same as the official example; anyone with some background in machine learning or convolutional neural networks should be able to quickly understand the meaning of the code.
```python
# encoding=utf-8
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets('mnist_data', one_hot=True)

def weight_variable(shape):
    # Truncated normal distribution; the arguments are shape, mean
    # (default 0), and standard deviation
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)

def bias_variable(shape):
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial)

def conv2d(x, W):
    # The first and last stride entries must be 1; the middle two are the
    # horizontal and vertical steps of the convolution
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

def max_pool_2x2(x):
    # ksize is the size of the pooling window
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1],
                          padding='SAME')

x = tf.placeholder("float", shape=[None, 784])
y_ = tf.placeholder("float", shape=[None, 10])

# Reshape each image into a four-dimensional tensor: number of samples
# (-1 means inferred), height, width, number of channels
x_image = tf.reshape(x, [-1, 28, 28, 1])

# First convolution + pooling layer. The kernel shape is: patch height,
# patch width, number of input channels, number of kernels (i.e. the
# number of convolution features produced)
W_conv1 = weight_variable([5, 5, 1, 32])
b_conv1 = bias_variable([32])
h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
h_pool1 = max_pool_2x2(h_conv1)

# Second convolution + pooling layer: multi-channel convolution
# producing 64 features
W_conv2 = weight_variable([5, 5, 32, 64])
b_conv2 = bias_variable([64])
h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
h_pool2 = max_pool_2x2(h_conv2)

# The original image is 28*28; after the first round it shrinks to 14*14
# (32 maps), and after the second round to 7*7 (64 maps)
W_fc1 = weight_variable([7 * 7 * 64, 1024])
b_fc1 = bias_variable([1024])
h_pool2_flat = tf.reshape(h_pool2, [-1, 7 * 7 * 64])  # flatten; -1 = sample count
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)

# Dropout to reduce overfitting
keep_prob = tf.placeholder(tf.float32)
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)

W_fc2 = weight_variable([1024, 10])
b_fc2 = bias_variable([10])
y_conv = tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)

# Cross entropy as the loss function, minimized with the Adam optimizer
cross_entropy = -tf.reduce_sum(y_ * tf.log(y_conv))
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)

correct_prediction = tf.equal(tf.argmax(y_conv, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))

sess = tf.InteractiveSession()
sess.run(tf.initialize_all_variables())
for i in range(2000):
    batch = mnist.train.next_batch(50)
    if i % 100 == 0:
        train_accuracy = accuracy.eval(
            feed_dict={x: batch[0], y_: batch[1], keep_prob: 1.0})
        print("step %d, training accuracy %g" % (i, train_accuracy))
    train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})

print("test accuracy %g" % accuracy.eval(
    feed_dict={x: mnist.test.images[0:500],
               y_: mnist.test.labels[0:500], keep_prob: 1.0}))
```
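The shape bookkeeping in the comments above can be verified with plain arithmetic. This is a sketch; the helper functions are just labels for the shape rules, not TensorFlow objects:

```python
# Trace the tensor shape through the network as (height, width, channels)
def conv_same(shape, out_channels):
    # SAME padding with stride 1 keeps height and width unchanged
    h, w, _ = shape
    return (h, w, out_channels)

def pool_2x2(shape):
    # 2x2 max pooling with stride 2 halves height and width
    h, w, c = shape
    return (h // 2, w // 2, c)

shape = (28, 28, 1)                       # input MNIST image
shape = pool_2x2(conv_same(shape, 32))    # after conv1 + pool1: (14, 14, 32)
shape = pool_2x2(conv_same(shape, 64))    # after conv2 + pool2: (7, 7, 64)
flat = shape[0] * shape[1] * shape[2]
print(shape, flat)  # (7, 7, 64) 3136 -- matches the 7*7*64 rows of W_fc1
```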
Pay attention to the following points in the program:
1. Dimensions. TensorFlow is built around the concept of tensors, which are essentially matrices extended to higher dimensions, so dimensions are particularly important and easy to confuse.
2. Convolution. The convolution kernel is not only two-dimensional; in multi-channel convolution it is three-dimensional.
3. During the final test, loading the entire test set at once may exhaust memory. Because I use a cloud server with limited memory, I evaluate only part of the test set; if you have enough memory, you can evaluate on all of it directly.
4. The original version of the program sets the number of iterations to 20000, which takes several hours of training (without a GPU); change it as needed.
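Point 3 can also be handled by averaging the accuracy over small batches instead of feeding the whole test set at once. Below is a sketch in plain Python; `eval_batch` is a placeholder I introduce here for a call like `accuracy.eval(...)` on one slice of the test set:

```python
def batched_accuracy(num_examples, batch_size, eval_batch):
    """Average per-batch accuracies, weighted by batch size so a
    short final batch does not skew the result."""
    correct = 0.0
    for start in range(0, num_examples, batch_size):
        size = min(batch_size, num_examples - start)
        # eval_batch(start, size) stands for running accuracy.eval
        # on test images/labels [start : start + size]
        correct += eval_batch(start, size) * size
    return correct / num_examples

# Toy check: 10 examples in batches of 4 -> batch sizes 4, 4, 2
acc = batched_accuracy(10, 4, lambda start, size: 1.0 if start < 8 else 0.5)
print(acc)  # (4*1.0 + 4*1.0 + 2*0.5) / 10 = 0.9
```

The weighting by `size` matters: a plain mean of per-batch accuracies would give the short last batch the same influence as a full one.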
That is all for this article. I hope it is helpful for your learning.