First, the Python definition of the tf.nn.conv2d() function:
tf.nn.conv2d(input, filter, strides, padding, use_cudnn_on_gpu=None, name=None)
1. Parameter input:
The input image to be convolved; a tensor of shape [batch, in_height, in_width, in_channels], meaning [number of images in a training batch, image height, image width, number of image channels]. Note that this is a 4-D tensor whose data type is float32 or float64.
2. Parameter filter:
The convolution kernel of a CNN; a tensor of shape [filter_height, filter_width, in_channels, out_channels], meaning [kernel height, kernel width, number of image channels, number of kernels]. Its type must be the same as that of the first parameter, input. Note that filter's third dimension, in_channels, must equal the fourth dimension of input.
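The shape agreement between the two parameters can be checked directly. A small sketch (the batch size, image size, and kernel count below are made up for illustration):

```python
import numpy as np

# A hypothetical batch of 8 RGB images of size 32x32:
# shape [batch, in_height, in_width, in_channels]
inp = np.zeros((8, 32, 32, 3), dtype=np.float32)

# 16 hypothetical 5x5 convolution kernels:
# shape [filter_height, filter_width, in_channels, out_channels]
filt = np.zeros((5, 5, 3, 16), dtype=np.float32)

# filter's third dimension must equal input's fourth dimension
assert filt.shape[2] == inp.shape[3]
```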
3. Parameter strides:
The stride of the convolution along each dimension of the input; a 1-D vector of length 4. Note: be sure that strides[0] = strides[3] = 1, because a stride makes no sense in the batch and in_channels dimensions. Also, in most cases the horizontal and vertical strides are the same, i.e. strides = [1, stride, stride, 1].
4. Parameter padding:
A string that can only be "SAME" or "VALID"; this value selects the convolution mode.
When padding = "SAME" (and the stride is 1), the size (width and height) of the output image is the same as the size of the input image.
Example: take an input image of shape input = [1, 3, 3, 1] (a single 3×3 one-channel image) and a convolution kernel of shape filter = [2, 2, 1, 1] (a single 2×2 one-channel kernel).
When padding = "SAME", the function first pads the input image with zeros, then performs the convolution (a weighted sum at each position).
As a result, both the input image and the output image are 3×3.
When padding = "VALID", the function does not zero-pad the image, i.e. there is no padding at all. In this case the output image is smaller than the input image.
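The two modes can be summarized by the standard output-size formulas, sketched here as a small helper (the function name is ours, not TensorFlow's):

```python
import math

def conv2d_output_size(in_size, filter_size, stride, padding):
    """Spatial output size of a 2-D convolution along one dimension.

    These are the standard TensorFlow formulas:
      SAME:  ceil(in_size / stride)                      (zero-padding added as needed)
      VALID: ceil((in_size - filter_size + 1) / stride)  (no padding)
    """
    if padding == 'SAME':
        return math.ceil(in_size / stride)
    if padding == 'VALID':
        return math.ceil((in_size - filter_size + 1) / stride)
    raise ValueError(padding)

# The 3x3 input / 2x2 kernel example from the text, stride 1:
print(conv2d_output_size(3, 2, 1, 'SAME'))   # 3: same size as the input
print(conv2d_output_size(3, 2, 1, 'VALID'))  # 2: smaller than the input
```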
5. Parameter use_cudnn_on_gpu:
A bool indicating whether to use cuDNN acceleration; True by default.
6. Return value:
The function returns a tensor, which is what we usually call a feature map.
A further note on strides:
It is not true that padding='SAME' alone makes the output size after convolution equal to the input size; the stride matters as well. When the stride is 1, the stride has no effect on the result, but when the stride is not 1, the output size is no longer the same as the input.
import tensorflow as tf

data = tf.Variable(tf.random_normal([64, 48, 48, 3]), dtype=tf.float32)
weight = tf.Variable(tf.random_normal([5, 5, 3, 64]), dtype=tf.float32)
sess = tf.InteractiveSession()
tf.global_variables_initializer().run()
conv1 = tf.nn.conv2d(data, weight, strides=[1, 1, 1, 1], padding='SAME')
conv2 = tf.nn.conv2d(data, weight, strides=[1, 2, 2, 1], padding='SAME')
conv3 = tf.nn.conv2d(data, weight, strides=[1, 4, 4, 1], padding='SAME')
print(conv1)
print(conv2)
print(conv3)
The result is:
Tensor("Conv2D_6:0", shape=(64, 48, 48, 64), dtype=float32)
Tensor("Conv2D_7:0", shape=(64, 24, 24, 64), dtype=float32)
Tensor("Conv2D_8:0", shape=(64, 12, 12, 64), dtype=float32)
As can be seen, with SAME padding the output height and width are the input size divided by the stride (rounded up), so output size and stride are related by a simple multiple.
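That multiple relationship for the 48×48 input above can be checked directly with the SAME-padding size formula ceil(in / stride):

```python
import math

# SAME padding: out = ceil(in / stride); input height/width is 48 here
for stride in (1, 2, 4):
    print(stride, math.ceil(48 / stride))  # 1 -> 48, 2 -> 24, 4 -> 12
```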
Second, how the function is implemented
Given a tensor input of shape [batch, in_height, in_width, in_channels] and a convolution kernel filter of shape [filter_height, filter_width, in_channels, out_channels], the function tensorflow::ops::Conv2D (the C++ definition, which corresponds to the Python tf.nn.conv2d) roughly performs the following steps:
Step 1: Reshape the convolution kernel into a tensor of shape [filter_height * filter_width * in_channels, out_channels].
Step 2: Extract patches from the input into a tensor of shape [batch, out_height, out_width, filter_height * filter_width * in_channels].
Step 3: For each patch, compute:
output[b, i, j, k] =
    sum_{di, dj, q} input[b, strides[1] * i + di, strides[2] * j + dj, q] *
                    filter[di, dj, q, k]
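The three steps above can be sketched in NumPy (rather than TensorFlow's actual C++ kernels), restricted to VALID padding to keep the indexing simple; the function name is ours:

```python
import numpy as np

def conv2d_valid(inp, filt, strides=(1, 1, 1, 1)):
    """Steps 1-3 above, sketched in NumPy (VALID padding, no zero-fill)."""
    b, ih, iw, ic = inp.shape    # [batch, in_height, in_width, in_channels]
    fh, fw, _, oc = filt.shape   # [filter_height, filter_width, in_channels, out_channels]
    sh, sw = strides[1], strides[2]
    oh = (ih - fh) // sh + 1
    ow = (iw - fw) // sw + 1

    # Step 1: flatten the kernel to [filter_height * filter_width * in_channels, out_channels]
    w = filt.reshape(fh * fw * ic, oc)

    # Step 2: gather input patches into [batch, out_height, out_width, fh * fw * ic]
    patches = np.empty((b, oh, ow, fh * fw * ic), dtype=inp.dtype)
    for i in range(oh):
        for j in range(ow):
            window = inp[:, i * sh:i * sh + fh, j * sw:j * sw + fw, :]
            patches[:, i, j, :] = window.reshape(b, -1)

    # Step 3: one matrix product per patch implements the sum over di, dj, q
    return patches @ w           # shape [batch, out_height, out_width, out_channels]

# The 3x3 single-channel image and a 2x2 all-ones kernel, as in the example above:
img = np.arange(9, dtype=np.float64).reshape(1, 3, 3, 1)
ker = np.ones((2, 2, 1, 1))
print(conv2d_valid(img, ker)[0, :, :, 0])  # the four 2x2 window sums: 8, 12, 20, 24
```

Each output element is exactly the weighted sum in the Step 3 formula; the im2col-style patch matrix turns the whole convolution into one large matrix multiplication, which is why the kernel is flattened in Step 1.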