Record some small knowledge points in neural networks


1 Blob dimensions in Caffe
    • Blobs in Caffe have 4 dimensions: num, channel, height, and width;
    • When defining a network layer, the commonly used parameter num_output specifies the number of output channels;
    • For example, feed the network an input of dimension 1*3*5*5 (i.e. one 3-channel image of size 5*5). After a convolutional layer with stride 2, pad 1, kernel 2, and num_output 2, the dimension becomes 1*2*3*3;
      • If the input has n channels, Caffe convolves it with n filters and adds up the resulting n products, so the computation finally produces one output channel;
      • If num_output is m, the above computation is carried out m times, producing m output channels;
    • This is why we often see 1*1-kernel convolutional layers in some networks; GoogLeNet also uses them to reduce the number of network parameters.
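The dimension bookkeeping above can be checked with a small sketch (Caffe computes convolution output sizes with floor division):

```python
def conv_output_hw(in_size, kernel, stride, pad):
    """Spatial output size of a convolution layer, Caffe-style (floor division)."""
    return (in_size + 2 * pad - kernel) // stride + 1

# The example above: a 1x3x5x5 input through a conv layer with
# kernel 2, stride 2, pad 1, num_output 2.
out = conv_output_hw(5, kernel=2, stride=2, pad=1)
print(out)  # 3  -> the output blob is 1x2x3x3
```

The channel dimension of the output is simply num_output; only the spatial dimensions need the formula.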
2 The role of 1*1 convolution kernels
  • Enables cross-channel interaction and information integration.
    • Convolution in a CNN is mostly an operation between a multi-channel feature map and multi-channel convolution kernels (as mentioned in section 1, a multi-channel input feature map is convolved with the corresponding number of kernels and summed, outputting a single-channel feature map);
    • With a 1x1 convolution kernel, this operation becomes a linear combination of multiple feature maps, which makes it possible to change the number of channels of the feature map;
    • Followed by an activation function and placed after an ordinary convolutional layer, this implements the Network-in-Network structure.
  • Performs dimensionality reduction and expansion (changing the channel count) while cutting the number of convolution parameters.

    • In GoogLeNet, for each inception module, the original module is figure a; figure b adds 1x1 convolutions for dimensionality reduction.
    • Take GoogLeNet's inception 3a module as an example. The input feature map is 28x28x192; in the 3a module the 1x1 convolution has 64 channels, the 3x3 convolution 128 channels, and the 5x5 convolution 32 channels. With the structure of figure a, the convolution kernel parameter count is 1x1x192x64 + 3x3x192x128 + 5x5x192x32. Figure b adds 1x1 convolution layers with 96 and 16 channels before the 3x3 and 5x5 convolutions respectively, so the parameter count becomes 1x1x192x64 + (1x1x192x96 + 3x3x96x128) + (1x1x192x16 + 5x5x16x32), reduced to roughly 40% of the original.
    • GoogLeNet uses 1x1 convolutions to reduce dimensionality and obtain a more compact network structure: although it has 22 layers, its parameter count is only one-twelfth that of the 8-layer AlexNet (of course, a large part of the reason is the removal of the fully connected layers).

    • ResNet also uses 1x1 convolutions, placed both before and after the 3x3 convolution layer, not only to reduce dimensionality but also to restore it, so that the 3x3 layer's input and output channel counts are reduced and the parameter count drops further, as in the "bottleneck" structure.
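The parameter arithmetic for the inception 3a example above can be verified with a short sketch (bias terms ignored, as in the text):

```python
def conv_params(k, c_in, c_out):
    """Weight count of a k x k convolution layer (biases ignored)."""
    return k * k * c_in * c_out

# GoogLeNet inception 3a, input feature map 28x28x192.
# Figure a: no 1x1 reduction layers.
naive = conv_params(1, 192, 64) + conv_params(3, 192, 128) + conv_params(5, 192, 32)
# Figure b: 1x1 reduction layers of 96 and 16 channels before the 3x3 and 5x5.
reduced = (conv_params(1, 192, 64)
           + conv_params(1, 192, 96) + conv_params(3, 96, 128)
           + conv_params(1, 192, 16) + conv_params(5, 16, 32))
print(naive, reduced, round(reduced / naive, 2))  # 387072 157184 0.41
```

So the 1x1 reduction layers shrink the module's convolution weights to about 40% of the original count.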

  • In addition, the last time I trained an FCN on a Titan X, GPU memory blew up after 1500 iterations.

    • Presumably the network had too many connections; saving the caffemodel showed it was 1.2 GB;
    • When fine-tuning, if the loaded model is too large, consider iterating a small number of times, saving a caffemodel, and then loading that newly saved caffemodel to continue fine-tuning. The weights are also very memory-consuming: if the fine-tuning only uses a few of the layers, loading all of those huge fully connected layers is not cost-effective, so it is recommended to re-save a slimmed model as soon as possible;
    • Try 1*1 convolution kernels to see whether the number of network connections can be reduced to an acceptable range.
3 Receptive field calculation
  • The receptive field is the size of the region on the original image that a pixel on the output feature map of a given layer of the convolutional neural network maps back to.
  • For example, a 1x1 region on map 3 corresponds to a receptive field on map 2 that is the red 7x7 area, and that 7x7 region on map 2 corresponds to a receptive field on map 1 that is the blue 11x11 area; therefore the 1x1 region on map 3 has, on map 1, a receptive field that is the blue 11x11 area.
  • Calculation method
    • For a convolution/pooling layer: r_i = s_i × (r_{i+1} − 1) + k_i
    • For a neuron layer (ReLU/sigmoid/...): r_i = r_{i+1}
    • where r_i denotes the receptive field at the input of layer i, s_i the stride of layer i, and k_i its kernel size.
  • Attention
    • When calculating the receptive field, the effect of the image border is neglected, i.e. padding is not considered;
    • The receptive field of the deepest layer equals the size of its filter;
    • The calculation proceeds top-down: start from the receptive field of the deepest layer, then propagate it layer by layer back to the first layer.
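The top-down procedure above can be sketched in Python. The (kernel, stride) pairs below are hypothetical values chosen only to reproduce the 1x1 → 7x7 → 11x11 example; padding is ignored, per the note above:

```python
def receptive_field(layers):
    """Top-down receptive field. `layers` lists (kernel, stride) pairs
    from the first layer to the deepest; ReLU/sigmoid layers are omitted
    because they leave the receptive field unchanged. Padding is ignored."""
    r = 1  # start from a 1x1 region on the deepest feature map
    for kernel, stride in reversed(layers):
        r = stride * (r - 1) + kernel  # r_i = s_i * (r_{i+1} - 1) + k_i
    return r

# Hypothetical two-layer stack matching the maps in the example:
# map 1 --(kernel 5, stride 1)--> map 2 --(kernel 7, stride 1)--> map 3
print(receptive_field([(7, 1)]))          # 7  (1x1 on map 3 -> 7x7 on map 2)
print(receptive_field([(5, 1), (7, 1)]))  # 11 (1x1 on map 3 -> 11x11 on map 1)
```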
4 Caffe fine-tuning precautions
    • You may encounter an error when fine-tuning:
      • Phenomenon:[Caffe]: Check failed: ShapeEquals(proto) shape mismatch (reshape not set)
      • Cause: the loaded caffemodel assigns parameters based on the layer names in the prototxt, but if you change a layer's parameters (including but not limited to the dimensions of the input data, num_output, etc.), the caffemodel will fail to match the new network;
      • Workaround: if you modify a layer's parameters, also change that layer's name, so that loading the caffemodel will not load the pre-trained parameters for it, thus avoiding the mismatch between parameters and data; layers that do not get pre-trained parameters loaded are re-initialized.
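As a sketch of the workaround, assume the mismatched layer is an InnerProduct layer originally named fc8 whose num_output was changed; the new name fc8_new is an illustrative choice. Renaming it stops Caffe from copying the mismatched pre-trained weights, and the layer is re-initialized from its filler:

```protobuf
# Pre-trained prototxt had: layer { name: "fc8" ... num_output: 1000 }
layer {
  name: "fc8_new"           # renamed: no weights are copied from the caffemodel
  type: "InnerProduct"
  bottom: "fc7"
  top: "fc8_new"
  inner_product_param {
    num_output: 20          # changed dimension; layer is re-initialized
    weight_filler { type: "gaussian" std: 0.01 }
  }
}
```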
5 FCN-generated probability maps are grainy/mosaic
    • When using FCN for segmentation, the resulting probability map is not continuous and has a clearly grainy or mosaic appearance.
      (Figures in the original post, in order: a normal continuous probability map and a mosaic probability map.)

    • Solutions

      • The stride of a transposed (deconvolution) convolutional layer is best not set equal to the kernel size; it should preferably be smaller, e.g. half the kernel size;
      • When the tops of transposed convolution layers from several different stages are concatenated, first apply an eltwise operation between the top of each transposed convolution layer and the top of the nearest ordinary convolution layer, and only then concat; otherwise a mosaic probability map will appear at prediction time.
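The stride-vs-kernel advice can be illustrated with the Deconvolution output-size formula, output = stride × (input − 1) + kernel − 2 × pad. When stride equals kernel, adjacent kernel placements do not overlap, so each output block is produced by a single input pixel (hence the mosaic); with stride = kernel/2 the placements overlap and the upsampled map is smoother. A minimal sketch with illustrative sizes:

```python
def deconv_output_hw(in_size, kernel, stride, pad=0):
    """Spatial output size of a transposed-convolution (Deconvolution) layer."""
    return stride * (in_size - 1) + kernel - 2 * pad

# stride == kernel: non-overlapping placements, blocky 4x upsampling
print(deconv_output_hw(7, kernel=4, stride=4))          # 28
# stride == kernel / 2: half-overlapping placements, smoother 2x upsampling
print(deconv_output_hw(7, kernel=4, stride=2, pad=1))   # 14
```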
6 Recommended visualization tools
    • Netscope

      • A small project on GitHub that can visualize Caffe network files (prototxt);
      • Although Caffe has its own Python visualization interface, with this tool you edit the network structure in the left column and visualize it with a single Shift+Enter in the right column, a bit like working in a compiler;
    • 3D visualization of a convolutional neural Network

      • This is a LeNet feature visualizer: enter a digit in the box in the upper left corner and you can see what features each LeNet layer has learned.

This blog references the CaffeCN community (caffecn.cn) post "What is the role of the 1x1 convolution kernel?"

