tf.nn.conv2d
Given a 4-D input tensor and a filter tensor, this function computes a 2-D convolution. It is defined as:
def conv2d(input, filter, strides, padding, use_cudnn_on_gpu=None, data_format=None, name=None):
The main parameters are input, filter, strides, padding, and use_cudnn_on_gpu. They are explained one by one below.
input: the data to be convolved. It must be a 4-D tensor of shape [batch, in_height, in_width, in_channels],
giving the batch size, image height, image width, and number of input channels, respectively.
filter: the convolution kernel. It must have shape [filter_height, filter_width, in_channels, out_channels],
giving the kernel height, kernel width, number of input channels, and number of output channels, respectively.
strides: a list of length 4 giving the distance the convolution window slides along each dimension of input at each step.
padding: one of 'SAME' or 'VALID', indicating whether to keep the parts of the image edge that the filter does not fully cover. With 'SAME' the input is padded so those edge regions are preserved; with 'VALID' they are dropped.
use_cudnn_on_gpu: whether to use cuDNN acceleration. The default is True.
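To make the two padding modes concrete, the output spatial size per dimension follows the standard formulas: ceil(in / stride) for 'SAME' and ceil((in - filter + 1) / stride) for 'VALID'. The helper below is a small illustrative sketch of that arithmetic, not part of TensorFlow itself:

```python
import math

def conv_output_size(in_size, filter_size, stride, padding):
    # Output size of one spatial dimension for tf.nn.conv2d-style padding.
    if padding == 'SAME':
        # 'SAME' pads the input, so partially covered edge positions are kept.
        return math.ceil(in_size / stride)
    elif padding == 'VALID':
        # 'VALID' only places the filter where it fits entirely inside the input.
        return math.ceil((in_size - filter_size + 1) / stride)
    raise ValueError(padding)

# A 28x28 MNIST image with a 5x5 filter and stride 1:
print(conv_output_size(28, 5, 1, 'SAME'))   # 28
print(conv_output_size(28, 5, 1, 'VALID'))  # 24
```

This is why, in the network below, a 5x5 convolution with stride 1 and 'SAME' padding leaves a 28x28 image at 28x28.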
tf.nn.max_pool
Performs the max pooling operation, while tf.nn.avg_pool performs average pooling. The function is defined as:
def max_pool(value, ksize, strides, padding, data_format="NHWC", name=None):
value: a 4-D tensor of shape [batch, height, width, channels], the same as the input format in conv2d.
ksize: a list of length 4 giving the size of the pooling window.
strides: the sliding step of the pooling window, as in conv2d.
padding: same usage as in conv2d.
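As a concrete illustration of what the pooling window does, here is a minimal NumPy sketch of 2x2 max pooling with stride 2 (i.e. ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1]); the helper name np_max_pool_2x2 is made up for this sketch, and it assumes height and width are divisible by 2:

```python
import numpy as np

def np_max_pool_2x2(value):
    # Max pooling over 2x2 windows with stride 2 for a [batch, h, w, c] array.
    batch, h, w, c = value.shape
    # Split each spatial axis into (windows, window_size), then take the max
    # inside every 2x2 window.
    reshaped = value.reshape(batch, h // 2, 2, w // 2, 2, c)
    return reshaped.max(axis=(2, 4))

x = np.arange(16, dtype=np.float32).reshape(1, 4, 4, 1)
print(np_max_pool_2x2(x).shape)        # (1, 2, 2, 1)
print(np_max_pool_2x2(x)[0, :, :, 0])  # [[ 5.  7.] [13. 15.]]
```

Each 2x2 window keeps only its largest value, which is why pooling halves the spatial size at each layer of the network below.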
# Copyright TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================

"""A very simple MNIST classifier.

See extensive documentation at
http://tensorflow.org/tutorials/mnist/beginners/index.md
"""

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

# Import data
from tensorflow.examples.tutorials.mnist import input_data
import tensorflow as tf

flags = tf.app.flags
FLAGS = flags.FLAGS
flags.DEFINE_string('data_dir', '/tmp/data/', 'Directory for storing data')
# The first launch will download the data files and put them under /tmp/data.
print(FLAGS.data_dir)
mnist = input_data.read_data_sets(FLAGS.data_dir, one_hot=True)

def weight_variable(shape):
    # The initial value of the variable is drawn from a truncated normal distribution.
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)

def bias_variable(shape):
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial)

def conv2d(x, W):
    """Wraps tf.nn.conv2d: given a 4-D input and filter, computes a 2-D
    convolution with stride 1 and 'SAME' padding (see the parameter
    descriptions above)."""
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

def max_pool_2x2(x):
    """Wraps tf.nn.max_pool: max pooling over 2x2 windows with stride 2
    (see the parameter descriptions above)."""
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')

sess = tf.InteractiveSession()

x = tf.placeholder(tf.float32, [None, 784])
# Reshape the input to match the [batch, in_height, in_width, in_channels]
# format expected by conv2d.
x_image = tf.reshape(x, [-1, 28, 28, 1])

# First layer
# The convolution kernel (filter) is 5x5, with 1 input channel and 32 output
# channels, i.e. 32 feature maps. Because strides=[1, 1, 1, 1] with 'SAME'
# padding, each channel's output is the same size as the input image, so the
# convolution output is ?*28*28*32 (28x28 per channel, 32 channels, ? batches).
# After pooling with ksize=[1, 2, 2, 1], the result is ?*14*14*32.
W_conv1 = weight_variable([5, 5, 1, 32])  # patch size, input channels, output channels
b_conv1 = bias_variable([32])
h_conv1 = tf.nn.elu(conv2d(x_image, W_conv1) + b_conv1)
h_pool1 = max_pool_2x2(h_conv1)

# Second layer
# Convolution kernel 5x5, 32 input channels, 64 output channels.
# The image is ?*14*14*32 before the convolution, ?*14*14*64 after it,
# and ?*7*7*64 after pooling.
W_conv2 = weight_variable([5, 5, 32, 64])
b_conv2 = bias_variable([64])
h_conv2 = tf.nn.elu(conv2d(h_pool1, W_conv2) + b_conv2)
h_pool2 = max_pool_2x2(h_conv2)

# Third layer: a fully connected layer; input dimension 7*7*64, output dimension 1024.
W_fc1 = weight_variable([7 * 7 * 64, 1024])
b_fc1 = bias_variable([1024])
h_pool2_flat = tf.reshape(h_pool2, [-1, 7 * 7 * 64])
h_fc1 = tf.nn.elu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)

# Dropout is used here: it randomly sets some unit outputs to 0 to prevent overfitting.
keep_prob = tf.placeholder(tf.float32)
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)

# Fourth layer: input 1024-D, output 10-D, i.e. the 0~9 classification.
W_fc2 = weight_variable([1024, 10])
b_fc2 = bias_variable([10])
# Use softmax as the multi-class activation function.
y_conv = tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)

y_ = tf.placeholder(tf.float32, [None, 10])
# Loss function: cross entropy.
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y_conv), reduction_indices=[1]))
# Optimize with Adam.
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
# Compute accuracy.
correct_prediction = tf.equal(tf.argmax(y_conv, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

# Initialize variables.
sess.run(tf.initialize_all_variables())
for i in range(20000):
    batch = mnist.train.next_batch(50)
    if i % 100 == 0:
        train_accuracy = accuracy.eval(feed_dict={x: batch[0], y_: batch[1], keep_prob: 1.0})
        print("step %d, training accuracy %g" % (i, train_accuracy))
    train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})
print("test accuracy %g" % accuracy.eval(feed_dict={x: mnist.test.images, y_: mnist.test.labels, keep_prob: 1.0}))
CNN: Deep Network Example