Deep Learning Basics Series (1) | Understanding Each Layer of a Keras Model (How to Calculate the Output Size and the Number of Trainable Parameters)

Tags: keras

When studying mature network models such as VGG, Inception, and ResNet, the first question is usually: how are the parameters of each layer set? Likewise, when designing our own network model, how should we choose each layer's parameters? If the parameters are set incorrectly, the model often will not even run.

Therefore, we first need to understand what each layer of the model means, in particular its output size and number of trainable parameters. Once that is clear, when designing your own network you can first sketch the network as a flowchart on paper, set the parameters, calculate each layer's output size and trainable-parameter count, and only then implement it in code.

In Keras, whether we build a model ourselves or load a mature one, we can inspect the information of each layer with model.summary().

This article illustrates this with a simple example, based on the simple VGG-like model from the official Keras website, slightly modified. The code is as follows:

from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Flatten
from tensorflow.keras.layers import Conv2D, MaxPool2D

(train_data, train_labels), (test_data, test_labels) = keras.datasets.mnist.load_data()
train_data = train_data.reshape(-1, 28, 28, 1)
print("train data type: {}, shape: {}, dim: {}".format(type(train_data), train_data.shape, train_data.ndim))

# The first group
model = Sequential()
model.add(Conv2D(filters=32, kernel_size=(3, 3), strides=(1, 1), padding='valid', activation='relu', input_shape=(28, 28, 1)))
model.add(Conv2D(filters=32, kernel_size=(3, 3), strides=(1, 1), padding='valid', activation='relu'))
model.add(MaxPool2D(pool_size=(2, 2)))
model.add(Dropout(0.25))

# The second group
model.add(Conv2D(filters=64, kernel_size=(3, 3), strides=(1, 1), padding='valid', activation='relu'))
model.add(Conv2D(filters=64, kernel_size=(3, 3), strides=(1, 1), padding='valid', activation='relu'))
model.add(MaxPool2D(pool_size=(2, 2)))
model.add(Dropout(0.25))

# The third group
model.add(Flatten())
model.add(Dense(units=256, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(units=10, activation='softmax'))
model.summary()

The data in this example comes from MNIST: 28*28 grayscale images with a single channel. The convolution-layer parameters mean the following:

    • filters: the number of filters; each filter is convolved with the corresponding input;
    • kernel_size: the size of each filter, usually an odd value such as 1, 3, or 5; here it is set to 3*3;
    • strides: the stride, i.e. the number of pixels the filter moves across the image at each step;
    • padding: whether to pad pixels at the edge of the image. There are generally two choices: the default 'valid', meaning no padding (so the image shrinks after convolution), and 'same', which pads pixels so that the output size matches the input size.

If you choose 'valid', then assuming the input size is n * n, the filter size is f * f, and the stride is s, the output size is [(n - f)/s + 1] * [(n - f)/s + 1]; if the result is not an integer, round down.

If you choose 'same', then assuming the input size is n * n, the filter size is f * f, and the padding width at each edge is p, solving n + 2p - f + 1 = n (for stride 1) gives p = (f - 1)/2.
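The two cases above can be sketched as a small helper function (a plain-Python sketch; the name conv_output_size is ours for illustration, not part of Keras):

```python
import math

def conv_output_size(n, f, s=1, padding='valid'):
    """Output width/height of a convolution on an n*n input
    with an f*f filter and stride s."""
    if padding == 'valid':
        # no padding: floor((n - f) / s) + 1
        return math.floor((n - f) / s) + 1
    if padding == 'same':
        # padded so that, with stride 1, the output size equals the input size
        return math.ceil(n / s)
    raise ValueError("unknown padding: {}".format(padding))

# The first two conv layers of the example: 3*3 filters, stride 1, 'valid'
print(conv_output_size(28, 3))  # 26
print(conv_output_size(26, 3))  # 24
```

With 'same' padding and stride 1, conv_output_size(28, 3, padding='same') returns 28, matching the rule that the output size stays equal to the input size.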

Running the above example produces the following output:

train data type: <class 'numpy.ndarray'>, shape: (60000, 28, 28, 1), dim: 4
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d (Conv2D)              (None, 26, 26, 32)        320
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 24, 24, 32)        9248
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 12, 12, 32)        0
_________________________________________________________________
dropout (Dropout)            (None, 12, 12, 32)        0
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 10, 10, 64)        18496
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 8, 8, 64)          36928
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 4, 4, 64)          0
_________________________________________________________________
dropout_1 (Dropout)          (None, 4, 4, 64)          0
_________________________________________________________________
flatten (Flatten)            (None, 1024)              0
_________________________________________________________________
dense (Dense)                (None, 256)               262400
_________________________________________________________________
dropout_2 (Dropout)          (None, 256)               0
_________________________________________________________________
dense_1 (Dense)              (None, 10)                2570
=================================================================
Total params: 329,962
Trainable params: 329,962
Non-trainable params: 0

Let us read this output. First, the MNIST input data has shape (60000, 28, 28, 1). This is the typical NHWC layout, i.e. (number of images, height, width, number of channels).
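As a quick check of the NHWC reshape (using a zero-filled NumPy array as a stand-in for the MNIST download):

```python
import numpy as np

# stand-in for the MNIST training set: 60000 raw 28*28 grayscale images
raw = np.zeros((60000, 28, 28), dtype=np.uint8)

# add the trailing channel axis -> NHWC: (batch, height, width, channels)
nhwc = raw.reshape(-1, 28, 28, 1)
print(nhwc.shape, nhwc.ndim)  # (60000, 28, 28, 1) 4
```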

Second, note the Output Shape column of the table. It follows the same layout as the MNIST input, except that the first dimension is usually None, meaning the batch size is not yet determined; the remaining three dimensions are calculated by the rules above.

Finally, the Param # column gives the number of trainable parameters. Each type of layer calculates it differently:

    • For a convolution layer, if the filter size is f * f, the number of filters is n, the input has c channels, and bias is enabled (each filter has exactly one bias), then param = (f * f * c + 1) * n;
    • Pooling, Flatten, and Dropout layers have no trainable parameters, so their param is 0;
    • For a fully connected layer, if the input vector has size i, the output vector has size o, and bias is enabled, then param = i * o + o.
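The formulas above can be sketched as plain-Python helpers (the function names are ours, for illustration), and cross-checked against the summary table:

```python
def conv_params(f, c, n, use_bias=True):
    """Trainable parameters of a conv layer: (f*f*c + 1 bias) * n filters."""
    return (f * f * c + (1 if use_bias else 0)) * n

def dense_params(i, o, use_bias=True):
    """Trainable parameters of a fully connected layer: i*o weights + o biases."""
    return i * o + (o if use_bias else 0)

# cross-check against the Param # column of the example model
print(conv_params(3, 1, 32))    # 320    (conv2d)
print(conv_params(3, 32, 32))   # 9248   (conv2d_1)
print(conv_params(3, 32, 64))   # 18496  (conv2d_2)
print(dense_params(1024, 256))  # 262400 (dense)
print(dense_params(256, 10))    # 2570   (dense_1)
```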

The output size and trainable-parameter count can now be worked out group by group, following the three groups the code is divided into:

First group (input 28*28*1):

    • Conv2D, 32 filters of 3*3, 'valid': output (28-3+1) * (28-3+1) * 32 = 26*26*32, param = (3*3*1 + 1) * 32 = 320;
    • Conv2D, 32 filters of 3*3: output 24*24*32, param = (3*3*32 + 1) * 32 = 9248;
    • MaxPool2D 2*2: output 12*12*32, param = 0;
    • Dropout: output unchanged at 12*12*32, param = 0.

Second group:

    • Conv2D, 64 filters of 3*3: output 10*10*64, param = (3*3*32 + 1) * 64 = 18496;
    • Conv2D, 64 filters of 3*3: output 8*8*64, param = (3*3*64 + 1) * 64 = 36928;
    • MaxPool2D 2*2: output 4*4*64, param = 0;
    • Dropout: output unchanged at 4*4*64, param = 0.

Third group:

    • Flatten: output 4*4*64 = 1024, param = 0;
    • Dense, 256 units: param = 1024*256 + 256 = 262400;
    • Dropout: param = 0;
    • Dense, 10 units: param = 256*10 + 10 = 2570.

This covers the meaning of each layer of the model and the relevant calculation methods. I hope this article helps you better understand how a model is composed and how these quantities are calculated.

  
