Estimating the size of a caffemodel file from the network structure

Source: Internet
Author: User

I had thought before about how to estimate the size of a caffemodel, but had never actually worked through the calculation. Recently, on a whim, I sat down and did it. Below is my calculation method, shared here for discussion.

A caffemodel is a file produced during training. It mainly contains the W (weight) and b (bias) parameters of each layer in the network model, and also holds some other information such as the network shape. The size of a caffemodel therefore depends largely on the number of W and b parameters in the model.

The number of W and b parameters is mainly determined by two factors. The first is the network structure: for example, the number of convolution layers, the number of fully connected layers, the convolution kernel size, the number of kernels, and so on. The second is the network input. This factor needs to be considered when the network contains a fully connected layer, and I will explain it with an example below.

Here is a simple example:
Suppose the network has 10,000 W and b parameters in total, each stored as a float (4 bytes). The size of the caffemodel will then be approximately 4*10000 = 40000 bytes. In practice it is slightly larger, because in addition to the parameters the caffemodel also stores some other information, such as the network shape mentioned above.
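The rule of thumb above can be written as a tiny helper. This is only a sketch: the function name `estimate_model_bytes` is my own, and it assumes every parameter is stored as a 4-byte float, so the result is a lower bound on the file size.

```python
BYTES_PER_FLOAT = 4  # assumption: every parameter is a 32-bit float


def estimate_model_bytes(num_params, bytes_per_param=BYTES_PER_FLOAT):
    """Lower-bound size estimate: parameter count times bytes per parameter."""
    return num_params * bytes_per_param


print(estimate_model_bytes(10000))  # 40000
```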

A concrete Caffe example (MNIST, lenet_train_test.prototxt) is given below:
Drawing the network with the method in link1 gives the following model diagram (the picture is rather small; for the specific numbers, refer to lenet_train_test.prototxt):

The network mainly has two convolution layers and two fully connected layers, and can be simplified to the following diagram (both convolution kernels are 5*5 with stride 1; both pooling layers are 2*2 with stride 2):

The number of W and b parameters for each layer is calculated below. (If it is not clear how the parameter counts are obtained, see the following blog post: link2.)

conv1: number of W: 5*5*1*20 = 500; number of b: 20

conv2: number of W: 5*5*20*50 = 25000; number of b: 50

ip1: number of W: 1*1*(4*4*50)*500 = 400000; number of b: 500

ip2: number of W: 1*1*500*10 = 5000; number of b: 10
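The 4*4*50 input of ip1 comes from following the feature-map size through the layers. The sketch below traces it for a 28*28 input. The helper names are my own, and the rounding (floor for convolution, ceil for pooling) is what I understand Caffe to use; for this network both give the same result.

```python
import math


def conv_out(size, kernel, stride=1, pad=0):
    """Output width/height of a convolution layer (rounded down)."""
    return (size + 2 * pad - kernel) // stride + 1


def pool_out(size, kernel, stride, pad=0):
    """Output width/height of a pooling layer (rounded up)."""
    return math.ceil((size + 2 * pad - kernel) / stride) + 1


def lenet_pool2_size(input_size):
    """Trace the spatial size from the input through pool2."""
    s = conv_out(input_size, kernel=5, stride=1)  # conv1: 28 -> 24
    s = pool_out(s, kernel=2, stride=2)           # pool1: 24 -> 12
    s = conv_out(s, kernel=5, stride=1)           # conv2: 12 -> 8
    s = pool_out(s, kernel=2, stride=2)           # pool2: 8 -> 4
    return s


print(lenet_pool2_size(28))  # 4
```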

Adding up the parameters of the layers above gives:
(500 + 20) + (25000 + 50) + (400000 + 500) + (5000 + 10) = 431080
There are 431,080 W and b parameters in total. Each parameter is stored as a float (4 bytes), so the space required to store them is:
431080 * 4 = 1724320 bytes, which is approximately 1.64 MB.
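The same arithmetic can be checked with a few lines of Python. This is a minimal sketch with helper names of my own choosing; it assumes 4-byte floats and 1 MB = 1024*1024 bytes.

```python
def conv_params(kernel, in_channels, out_channels):
    """Return (number of W, number of b) for a square-kernel convolution layer."""
    return kernel * kernel * in_channels * out_channels, out_channels


def fc_params(in_features, out_features):
    """Return (number of W, number of b) for a fully connected layer."""
    return in_features * out_features, out_features


layers = {
    "conv1": conv_params(5, 1, 20),
    "conv2": conv_params(5, 20, 50),
    "ip1": fc_params(4 * 4 * 50, 500),
    "ip2": fc_params(500, 10),
}

total_params = sum(w + b for w, b in layers.values())
size_bytes = total_params * 4            # each parameter is a 4-byte float
size_mb = size_bytes / (1024 * 1024)

print(total_params)       # 431080
print(size_bytes)         # 1724320
print(round(size_mb, 2))  # 1.64
```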

The calculated result is close to the size of the trained caffemodel. The estimate is slightly smaller, since the file also stores some other information besides the parameters.

That basically explains how to estimate the size of a caffemodel. Earlier I left one point hanging: the number of W and b parameters depends not only on the network structure but also on the network input.
In the MNIST example above, suppose the input is not 28*28 but N*N (where N is an integer larger than 28), and denote the pool2 output as n*n (with the network structure unchanged, n is then greater than 4). The number of W parameters in the ip1 fully connected layer increases to n*n*50*500, which changes the size of the caffemodel.
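To see the effect of the input size, the sketch below recomputes the total parameter count for a given N. The function name `lenet_param_count` is my own; it assumes the same layer settings as the example (5*5 convolutions with stride 1, 2*2 pooling with stride 2) and a square, single-channel input.

```python
def lenet_param_count(input_size):
    """Total number of W and b parameters of the example network for an N*N input."""
    def conv_out(size, kernel=5):
        return size - kernel + 1            # stride 1, no padding

    def pool_out(size, kernel=2, stride=2):
        return -(-(size - kernel) // stride) + 1  # ceiling division, then +1

    n = pool_out(conv_out(pool_out(conv_out(input_size))))  # pool2 output size

    conv1 = 5 * 5 * 1 * 20 + 20
    conv2 = 5 * 5 * 20 * 50 + 50
    ip1 = n * n * 50 * 500 + 500            # the only layer that depends on the input
    ip2 = 500 * 10 + 10
    return conv1 + conv2 + ip1 + ip2


print(lenet_param_count(28))  # 431080
print(lenet_param_count(32))  # 656080
```

With a 32*32 input, pool2 outputs 5*5 instead of 4*4, so ip1 alone grows from 400,500 to 625,500 parameters while the other layers stay the same.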

As the calculations above show, the size of a network depends largely on its fully connected layers, and the first fully connected layer generally has the most connections (that is, the most parameters). Later, "Network in Network" replaced the fully connected layers with global average pooling in order to reduce the number of parameters. Interested readers can look up the paper.
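As a rough illustration of why this saves parameters, the sketch below compares the fully connected head of the LeNet example with a hypothetical replacement: a 1*1 convolution from 50 channels to 10 class maps, followed by global average pooling. This replacement head is my own assumption for illustration only; it is not the actual architecture from the paper, and it says nothing about accuracy.

```python
# Fully connected head from the example: ip1 (4*4*50 -> 500) and ip2 (500 -> 10).
fc_head = (4 * 4 * 50 * 500 + 500) + (500 * 10 + 10)

# Hypothetical head: 1*1 convolution (50 -> 10 channels) with biases,
# then global average pooling, which has no parameters at all.
gap_head = (1 * 1 * 50 * 10 + 10) + 0

print(fc_head)   # 405510
print(gap_head)  # 510
```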

That's about all.
