TensorFlow tf.nn.conv2d() Introduction


First, the Python definition of the tf.nn.conv2d() function:

tf.nn.conv2d(input, filter, strides, padding, use_cudnn_on_gpu=None, name=None)
1. Parameter input:

The input image to be convolved: a tensor of shape [batch, in_height, in_width, in_channels], meaning [number of images in a training batch, image height, image width, number of image channels]. Note that this is a 4-D tensor whose data type is one of float32 and float64.
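As a concrete illustration of this layout (a NumPy stand-in for a TensorFlow tensor; the sizes here are made up for the example):

```python
import numpy as np

# A batch of 8 RGB images, each 28x28:
# shape = [batch, in_height, in_width, in_channels]
batch = np.zeros((8, 28, 28, 3), dtype=np.float32)
print(batch.shape)  # (8, 28, 28, 3)
```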

2. Parameter filter:

Equivalent to the convolution kernel in a CNN: a tensor of shape [filter_height, filter_width, in_channels, out_channels], meaning [kernel height, kernel width, number of input channels, number of kernels]. Its data type must be the same as that of the first parameter, input. Note that the third dimension of filter, in_channels, must equal the fourth dimension of the parameter input.
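The channel-matching constraint can be checked directly (again a NumPy stand-in with made-up sizes):

```python
import numpy as np

# Kernel shape = [filter_height, filter_width, in_channels, out_channels]:
# 5x5 kernels reading 3 input channels and producing 64 output channels.
kernel = np.zeros((5, 5, 3, 64), dtype=np.float32)

# The constraint: the kernel's in_channels (third dimension) must equal
# the input's channel count (fourth dimension).
image = np.zeros((8, 28, 28, 3), dtype=np.float32)
assert kernel.shape[2] == image.shape[3]
```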

3. Parameter strides:
The stride of the convolution window along each dimension of the input: a 1-D vector of length 4. Note: always make sure that strides[0] = strides[3] = 1, because a stride makes no sense in the batch and in_channels dimensions. Moreover, in most cases the horizontal and vertical strides are the same, i.e. strides = [1, stride, stride, 1].

4. Parameter padding:

A string that can only be one of "SAME" and "VALID"; this value selects the padding mode of the convolution.

When padding = 'SAME' (and the stride is 1), the output image has the same size (width and height) as the input image.

Example: take an input image with shape input = [1, 3, 3, 1], i.e. a single 3×3 single-channel image, and a convolution kernel with shape filter = [2, 2, 1, 1], i.e. a single 2×2 kernel. (The original figures showing the concrete values are not reproduced here.)

With padding = 'SAME', the function first pads the input image with zeros, then performs the convolution (a weighted sum at each position). As a result, both the input image and the output image are 3×3 images.
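The SAME-padding example above can be sketched in NumPy (the input values are made up, since the article's figures are missing; TensorFlow pads the "extra" row/column on the bottom and right):

```python
import numpy as np

# Hypothetical 3x3 input and 2x2 kernel of ones.
image = np.arange(1, 10, dtype=np.float32).reshape(3, 3)
kernel = np.ones((2, 2), dtype=np.float32)

# SAME padding with stride 1: pad so the output keeps the input size.
# For a 2x2 kernel that is one extra row at the bottom, one column at the right.
padded = np.pad(image, ((0, 1), (0, 1)), mode='constant')

# Weighted sum at every position.
out = np.zeros_like(image)
for i in range(3):
    for j in range(3):
        out[i, j] = np.sum(padded[i:i+2, j:j+2] * kernel)

print(out.shape)  # (3, 3): same spatial size as the input
```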

When padding = 'VALID', the function does not zero-pad the image at all, i.e. there is no padding. In this case the output image is smaller than the input image.
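The output sizes for the two modes can be computed directly; these are the formulas TensorFlow documents for conv2d (SAME: ceil(in / stride), VALID: ceil((in - filter + 1) / stride)):

```python
import math

def conv_output_size(in_size, filter_size, stride, padding):
    """Spatial output size of tf.nn.conv2d along one dimension."""
    if padding == 'SAME':
        return math.ceil(in_size / stride)
    elif padding == 'VALID':
        return math.ceil((in_size - filter_size + 1) / stride)
    raise ValueError(padding)

print(conv_output_size(3, 2, 1, 'SAME'))   # 3: output keeps the input size
print(conv_output_size(3, 2, 1, 'VALID'))  # 2: output is smaller than the input
```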

5. Parameter use_cudnn_on_gpu:

A bool: whether to use cuDNN acceleration. True by default.

6. Return value:

The function returns a tensor, which is what we usually call the feature map.


A further note on strides:

It is not true that padding = 'SAME' alone guarantees that the output size after convolution equals the input size; the output size also depends on the stride. When strides = 1, the stride has no effect and the output matches the input. When the stride is greater than 1, the output size is no longer the same as the input.

import tensorflow as tf
data = tf.Variable(tf.random_normal([64, 48, 48, 3]), dtype=tf.float32)
weight = tf.Variable(tf.random_normal([5, 5, 3, 64]), dtype=tf.float32)
sess = tf.InteractiveSession()
tf.global_variables_initializer().run()
conv1 = tf.nn.conv2d(data, weight, strides=[1, 1, 1, 1], padding='SAME')
conv2 = tf.nn.conv2d(data, weight, strides=[1, 2, 2, 1], padding='SAME')
conv3 = tf.nn.conv2d(data, weight, strides=[1, 4, 4, 1], padding='SAME')
print(conv1)
print(conv2)
print(conv3)
The result is:

Tensor("conv2d_6:0", shape=(64, 48, 48, 64), dtype=float32)
Tensor("conv2d_7:0", shape=(64, 24, 24, 64), dtype=float32)
Tensor("conv2d_8:0", shape=(64, 12, 12, 64), dtype=float32)

As you can see, with SAME padding the spatial output size is the input size divided by the stride (rounded up): 48/1 = 48, 48/2 = 24, 48/4 = 12.

Second, how the function is implemented

Given a tensor input with shape [batch, in_height, in_width, in_channels] and a convolution kernel filter with shape [filter_height, filter_width, in_channels, out_channels], the function tensorflow::ops::Conv2D (the C++ definition, to which the Python tf.nn.conv2d corresponds) roughly performs the following steps:

Step 1: Reshape the convolution kernel into a tensor of shape [filter_height * filter_width * in_channels, output_channels].

Step 2: Extract patches from the input into a tensor of shape [batch, out_height, out_width, filter_height * filter_width * in_channels].

Step 3: Compute, for each patch:

output[b, i, j, k] =
    sum_{di, dj, q} input[b, strides[1] * i + di, strides[2] * j + dj, q] *
                    filter[di, dj, q, k]
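The three steps above (an im2col transform followed by a matrix multiply) can be sketched in NumPy; stride 1 and VALID padding are assumed here for brevity:

```python
import numpy as np

def conv2d_via_im2col(inp, flt):
    """Stride-1, VALID-padding conv2d via the three steps above."""
    batch, in_h, in_w, in_c = inp.shape
    f_h, f_w, _, out_c = flt.shape
    out_h, out_w = in_h - f_h + 1, in_w - f_w + 1

    # Step 1: flatten the kernel to [f_h * f_w * in_c, out_c].
    w = flt.reshape(f_h * f_w * in_c, out_c)

    # Step 2: gather input patches into [batch, out_h, out_w, f_h * f_w * in_c].
    patches = np.empty((batch, out_h, out_w, f_h * f_w * in_c), dtype=inp.dtype)
    for i in range(out_h):
        for j in range(out_w):
            patches[:, i, j, :] = inp[:, i:i+f_h, j:j+f_w, :].reshape(batch, -1)

    # Step 3: a single matrix multiply yields output[b, i, j, k].
    return patches @ w

x = np.random.rand(2, 5, 5, 3).astype(np.float32)
k = np.random.rand(3, 3, 3, 4).astype(np.float32)
print(conv2d_via_im2col(x, k).shape)  # (2, 3, 3, 4)
```

Reshaping patches and kernel in the same (di, dj, q) order is what lets a plain matrix multiply implement the triple sum in Step 3.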








