Learning notes TF014: convolution layers, activation functions, pooling layers, normalization layers, and high-level layers
A CNN architecture contains at least one convolution layer (tf.nn.conv2d). A single-layer CNN can only do things like edge detection; for image recognition and classification, the convolution layers are combined with other layer types that reduce overfitting, speed up training, and lower memory usage.
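A minimal sketch (my own, not from the book) of such a single-layer edge detector: a hand-made 1x2 horizontal-gradient kernel applied with tf.nn.conv2d responds only where the brightness changes from left to right, i.e. at a vertical edge. The image and kernel values are made up for illustration.

import tensorflow as tf

# A 4x4 single-channel "image": dark on the left, bright on the right.
image = tf.constant([
    [
        [[0.0], [0.0], [1.0], [1.0]],
        [[0.0], [0.0], [1.0], [1.0]],
        [[0.0], [0.0], [1.0], [1.0]],
        [[0.0], [0.0], [1.0], [1.0]]
    ]
])  # shape [batch=1, height=4, width=4, channels=1]

# Horizontal-gradient kernel: output = right pixel - left pixel.
edge_kernel = tf.constant([
    [[[-1.0]], [[1.0]]]
])  # shape [kernel_height=1, kernel_width=2, in_channels=1, out_channels=1]

edges = tf.nn.conv2d(image, edge_kernel, strides=[1, 1, 1, 1], padding="VALID")

sess = tf.Session()
print(sess.run(edges))  # non-zero only in the column where 0.0 jumps to 1.0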
TensorFlow provides several variants of the convolution operation for different purposes. tf.nn.depthwise_conv2d: used when the output of one convolution layer feeds directly into another convolution layer, for example to build architectures following "Rethinking the Inception Architecture for Computer Vision" (https://arxiv.org/abs/1512.00567). tf.nn.separable_conv2d: in large models it speeds up training without sacrificing accuracy; in small models it converges quickly but with low accuracy. tf.nn.conv2d_transpose: the kernel is applied to a new feature map, each region is filled with the kernel values, and as the kernel traverses the new image the overlapping regions are added together. See the Stanford course CS231n, Winter 2016, Lecture 13.
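A rough sketch of how these three operations are called in TF 1.x. The filter shapes follow the standard signatures; the random values, image sizes, and channel counts are arbitrary assumptions for illustration.

import tensorflow as tf

batch = tf.random_normal([1, 8, 8, 3])  # [batch, height, width, in_channels]

# Depthwise convolution: each input channel gets its own filters, giving
# in_channels * channel_multiplier output channels.
depthwise_filter = tf.random_normal([3, 3, 3, 2])  # [h, w, in_channels, channel_multiplier]
dw = tf.nn.depthwise_conv2d(batch, depthwise_filter,
                            strides=[1, 1, 1, 1], padding="SAME")

# Separable convolution: a depthwise step followed by a 1x1 pointwise step.
pointwise_filter = tf.random_normal([1, 1, 3 * 2, 4])  # [1, 1, in_ch * multiplier, out_channels]
sep = tf.nn.separable_conv2d(batch, depthwise_filter, pointwise_filter,
                             strides=[1, 1, 1, 1], padding="SAME")

# Transposed convolution: the kernel "paints" its values onto a larger feature
# map and overlapping regions are summed, here upsampling 8x8 to 16x16.
transpose_filter = tf.random_normal([3, 3, 5, 3])  # [h, w, out_channels, in_channels]
up = tf.nn.conv2d_transpose(batch, transpose_filter,
                            output_shape=[1, 16, 16, 5],
                            strides=[1, 2, 2, 1], padding="SAME")

sess = tf.Session()
print(sess.run(tf.shape(dw)))   # [1, 8, 8, 6]
print(sess.run(tf.shape(sep)))  # [1, 8, 8, 4]
print(sess.run(tf.shape(up)))   # [1, 16, 16, 5]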
An activation function is combined with the output of other layers to generate a feature map. It smooths (makes differentiable) the results of some operations and introduces non-linearity (a non-linear input/output relationship) into the neural network, so the network can capture complex variations in the input and train complex models. The main requirements for an activation function are that it be monotonic, so the output grows with the input and gradient descent can find local extrema, and differentiable, so a derivative exists at every point of its domain and gradient descent can be applied.
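A small illustration (my own, not from the book) of why the non-linearity matters: two matmul layers stacked without an activation collapse into a single linear map, while inserting tf.nn.relu between them does not.

import tensorflow as tf

# W2(W1 x) = (W2 W1) x, so two linear layers are no more expressive than one.
x = tf.constant([[1.0, -2.0]])
w1 = tf.constant([[1.0, 0.5], [-1.0, 2.0]])
w2 = tf.constant([[2.0], [1.0]])

linear_stack = tf.matmul(tf.matmul(x, w1), w2)
collapsed = tf.matmul(x, tf.matmul(w1, w2))
nonlinear_stack = tf.matmul(tf.nn.relu(tf.matmul(x, w1)), w2)

sess = tf.Session()
print(sess.run([linear_stack, collapsed]))  # identical values
print(sess.run(nonlinear_stack))            # differs once relu clips the negatives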
tf.nn.relu, the rectified linear unit (ramp function). Piecewise linear: non-negative inputs pass through unchanged, negative inputs become 0. It does not suffer from vanishing gradients. Range: [0, +∞). With high learning rates it is susceptible to saturated (dead) neurons. It loses information but performs extremely well. Given a rank-1 tensor (vector), components below 0 are set to 0 and the other components are left unchanged.
tf.sigmoid: accepts only floating-point numbers and returns values in the range [0.0, 1.0]. Large inputs return values close to 1.0; small inputs return values close to 0.0. It is suitable when the expected output lies in [0.0, 1.0]. When inputs are near saturation or change drastically, the reduced output range becomes a problem. An input of 0 gives an output of 0.5, the midpoint of the sigmoid.
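For reference, the standard logistic definition sigma(x) = 1 / (1 + exp(-x)) can be checked against tf.sigmoid; the sample inputs are arbitrary.

import tensorflow as tf

x = tf.constant([-6.0, -1.0, 0.0, 1.0, 6.0])
manual = 1.0 / (1.0 + tf.exp(-x))  # same logistic formula, written by hand

sess = tf.Session()
print(sess.run([tf.sigmoid(x), manual]))  # 0.5 at x = 0, approaching 0.0 and 1.0 at the ends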
tf.tanh, the hyperbolic tangent function, has range [-1.0, 1.0] and can output negative values. Its midpoint value is 0.0. This is useful when the next network layer expects input values that are negative, or where an input of 0.0 would cause problems.
tf.nn.dropout: sets outputs to 0.0 with a configurable probability. It is suitable for introducing a small amount of randomness during training. The keep_prob parameter specifies the probability that an output is kept (TensorFlow also scales the kept values by 1/keep_prob). The output changes on every execution; the dropped outputs are set to 0.0.
The pooling layer reduces overfitting, shrinks the input size, and improves performance. It downsamples the input while keeping the important information for subsequent layers. Reducing the size with a pooling layer is more efficient than doing so with tf.nn.conv2d.
tf.nn.max_pool: strides across the tensor, and within the region covered by the kernel only the maximum element is kept. It is suitable when the grey-level intensity of the input data is related to its importance in the image. The input is the output of a previous layer, not an image directly. The strides parameter uses image_height and image_width to traverse the input, and only the maximum element inside the receptive field is retained. Max-pooling is implemented with a receptive field (the kernel): a 2x2 receptive field gives the least downsampling in a single pass, while a 1x1 receptive field leaves the output identical to the input.
tf.nn.avg_pool: strides across the tensor and averages the values covered by the kernel over the depth. It is suitable when the whole region covered by the kernel is important and the magnitude of the values should be reduced, for example an input tensor with large width and height but small depth.
tf.nn.relu is unbounded, so normalization is useful for identifying high-frequency features. tf.nn.local_response_normalization (tf.nn.lrn), local response normalization: given a vector, each component is divided by a weighted sum of the inputs within depth_radius of it. This keeps the input within an acceptable range and takes the importance of each value into account. With the default parameters, the normalized output falls within [-1.0, 1.0].
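A sketch of what the operation computes, assuming the TF 1.x default parameters depth_radius=5, bias=1.0, alpha=1.0, beta=0.5: each component is divided by (bias + alpha * sqr_sum) ** beta, where sqr_sum is the sum of squares over the neighbouring depths. With a depth of 1 this reduces to x / sqrt(1 + x^2), which is why the output stays inside (-1.0, 1.0).

import tensorflow as tf

layer_input3 = tf.constant([[[[1.0], [2.0], [3.0]]]])
lrn = tf.nn.local_response_normalization(layer_input3)

# Manual version of the same formula; with depth 1 the neighbourhood is the value itself.
manual = layer_input3 / tf.sqrt(1.0 + layer_input3 ** 2)

sess = tf.Session()
print(sess.run([lrn, manual]))  # both results stay inside (-1.0, 1.0)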
The high-level layers (tf.contrib.layers) reduce code redundancy and follow best practices.
tf.contrib.layers.convolution2d handles weight initialization, bias initialization, creating the trainable output variables, adding the bias, and applying the activation function. The convolution kernel is a trainable variable; weight initialization fills the kernel with its values for the first run (tf.truncated_normal). The kernel height and width are given as a simple tuple. For an image input, tf.image.convert_image_dtype adjusts each component to represent a colour value; TensorFlow expects floating-point types to describe image colours, with components in [0, 1].
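A small sketch of that dtype conversion step; the pixel values are arbitrary.

import tensorflow as tf

# uint8 pixel values in [0, 255] are rescaled to floats in [0, 1],
# the range the convolution layers above expect for colour components.
image_uint8 = tf.constant([[[0, 127, 255]]], dtype=tf.uint8)
image_float = tf.image.convert_image_dtype(image_uint8, tf.float32)

sess = tf.Session()
print(sess.run(image_float))  # approximately [[[0.0, 0.498, 1.0]]]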
tf.contrib.layers.fully_connected, the fully connected layer: every input is connected to every output. The last layers of a CNN are usually fully connected. TensorFlow's fully connected layer has the form tf.matmul(features, weight) + bias; every neuron of the input tensor is connected to every neuron of the output layer.
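A sketch comparing the layer with the explicit tf.matmul(features, weight) + bias form. The weights_initializer and biases_initializer arguments and the hand-picked values are assumptions used here only so the two results can be compared.

import tensorflow as tf

features = tf.constant([[1.2, 3.4]])
weight = tf.constant([[0.1, -0.2], [0.3, 0.4]])
bias = tf.constant([0.5, 0.5])

# Explicit form described above.
manual = tf.matmul(features, weight) + bias

# Same computation through the high-level layer, with activation disabled and
# the variables initialized to the same hand-picked values.
fc = tf.contrib.layers.fully_connected(
    features, num_outputs=2, activation_fn=None,
    weights_initializer=tf.constant_initializer([[0.1, -0.2], [0.3, 0.4]]),
    biases_initializer=tf.constant_initializer(0.5))

sess = tf.Session()
sess.run(tf.global_variables_initializer())
print(sess.run([manual, fc]))  # both rows match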
The raw input has to be passed to the input layer. For object recognition and classification, the input layer is tf.nn.conv2d.
import tensorflow as tf

sess = tf.Session()

# tf.nn.relu: components < 0 are set to 0, the rest pass through unchanged.
features = tf.range(-2, 3)
print(features)
print(sess.run([features, tf.nn.relu(features)]))

# tf.sigmoid maps inputs into [0.0, 1.0]; tf.tanh maps inputs into [-1.0, 1.0].
features2 = tf.to_float(tf.range(-1, 3))
print(features2)
print(sess.run([features2, tf.sigmoid(features2)]))
print(sess.run([features2, tf.tanh(features2)]))

# tf.nn.dropout: each component is kept with probability keep_prob, otherwise set to 0.0.
features3 = tf.constant([-0.1, 0.0, 0.1, 0.2])
print(features3)
print(sess.run([features3, tf.nn.dropout(features3, keep_prob=0.5)]))

# tf.nn.max_pool: a 3x3 window over the 3x3 input keeps only the maximum element.
batch_size = 1
input_height = 3
input_width = 3
input_channels = 1
layer_input = tf.constant([
    [
        [[1.0], [0.2], [1.5]],
        [[0.1], [1.2], [1.4]],
        [[1.1], [0.4], [0.4]]
    ]
])
print(layer_input)
kernel = [batch_size, input_height, input_width, input_channels]
print(kernel)
max_pool = tf.nn.max_pool(layer_input, kernel, [1, 1, 1, 1], "VALID")
print(max_pool)
print(sess.run(max_pool))

# tf.nn.avg_pool: the same window returns the mean of the covered elements.
layer_input2 = tf.constant([
    [
        [[1.0], [1.0], [1.0]],
        [[1.0], [0.5], [0.0]],
        [[0.0], [0.0], [0.0]]
    ]
])
print(layer_input2)
avg_pool = tf.nn.avg_pool(layer_input2, kernel, [1, 1, 1, 1], "VALID")
print(avg_pool)
print(sess.run(avg_pool))

# tf.nn.local_response_normalization: each component is divided by a weighted
# sum of its neighbours along the depth dimension.
layer_input3 = tf.constant([
    [
        [[1.], [2.], [3.]]
    ]
])
print(layer_input3)
lrn = tf.nn.local_response_normalization(layer_input3)
print(lrn)
print(sess.run([layer_input3, lrn]))

# tf.contrib.layers.convolution2d: creates the kernel and bias variables and
# applies the activation function in one call.
image_input = tf.constant([
    [
        [[0., 0., 0.], [255., 255., 255.], [254., 0., 0.]],
        [[0., 191., 0.], [3., 108., 233.], [0., 191., 0.]],
        [[254., 0., 0.], [255., 255., 255.], [0., 0., 0.]]
    ]
])
print(image_input)
conv2d = tf.contrib.layers.convolution2d(
    image_input,
    num_outputs=4,
    kernel_size=(1, 1),
    activation_fn=tf.nn.relu,
    stride=(1, 1),
    trainable=True)
print(conv2d)
sess.run(tf.global_variables_initializer())
print(sess.run(conv2d))

# tf.contrib.layers.fully_connected: every input is connected to every output neuron.
features4 = tf.constant([
    [[1.2], [3.4]]
])
print(features4)
fc = tf.contrib.layers.fully_connected(features4, num_outputs=2)
print(fc)
sess.run(tf.global_variables_initializer())
print(sess.run(fc))
References:
TensorFlow for Machine Intelligence
Welcome to join me: qingxingfengzi
My public account: qingxingfengzigz
My wife Zhang Xingqing's Public Account: qingqingfeifangz