Neural network model compression and an application test of SqueezeNet

Last Update:2018-07-24 Source: Internet

Author: User


Deep learning has made great breakthroughs in many fields, but existing trained models are often large: models trained on datasets such as ImageNet or COCO frequently run to hundreds of megabytes. That is not a problem for mainstream computers, but it can be difficult for mobile devices or hardware-level applications. Neural network compression is therefore an important part of deep learning research and application.

In a talk (http://www.msra.cn/zh-cn/news/blogs/2017/03/tao-qin-machine-learning-20170309.aspx), Tao Qin of Microsoft mentioned that existing network compression techniques fall into four main categories.

The first is called pruning. A neural network consists mainly of layers of nodes connected by edges, and each edge carries a weight. The idea of pruning is simple: if the weight on an edge is very small, that edge is probably unimportant and can be removed. After training the large model, we check which edges have small weights, remove them, and then retrain the model on the edges that remain.
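The pruning step can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function names, the fixed magnitude threshold, and the toy weight matrix are all hypothetical, and the retraining step is reduced to a single gradient update in which removed edges are held at zero.

```python
def prune_small_weights(weights, threshold):
    """Zero out every weight whose absolute value is below the threshold.

    Returns the pruned weights and a 0/1 mask marking which edges were kept.
    """
    mask = [[1 if abs(w) >= threshold else 0 for w in row] for row in weights]
    pruned = [[w if m else 0.0 for w, m in zip(row, mrow)]
              for row, mrow in zip(weights, mask)]
    return pruned, mask


def masked_sgd_step(weights, grads, mask, lr):
    """One gradient step that updates only the edges kept by the mask."""
    return [[(w - lr * g) if m else 0.0 for w, g, m in zip(wr, gr, mr)]
            for wr, gr, mr in zip(weights, grads, mask)]


# Toy 2 x 3 weight matrix (hypothetical values).
weights = [[0.9, -0.05, 0.3],
           [-0.7, 0.02, -0.4]]

pruned, mask = prune_small_weights(weights, threshold=0.1)
kept = sum(sum(row) for row in mask)
print(f"kept {kept} of 6 edges")

# Retraining: pruned edges stay at zero even though their gradient is non-zero.
grads = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
updated = masked_sgd_step(pruned, grads, mask, lr=0.1)
```

In practice the threshold is usually chosen per layer, and pruning and retraining may be repeated several times.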

Another approach to model compression is weight sharing. Suppose two adjacent layers are fully connected and each has 1000 nodes; there are then 1000 times 1000, or 1 million, weights (parameters) between them. We can cluster these 1 million weights to see which are close to one another, and replace every weight in a cluster with the cluster mean, so that many edges (those in the same cluster) share one weight. If we cluster the 1 million numbers into 1000 clusters, the number of distinct parameter values drops from 1 million to 1000 (each edge then only needs a small index recording its cluster), which makes this a very important technique for reducing model size.
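As a minimal sketch of weight sharing, the following clusters a handful of weights with a simple one-dimensional k-means and replaces each weight by its cluster mean. The function name, the evenly spaced initialisation, and the sample weights are assumptions made for illustration; the talk does not specify a particular clustering algorithm.

```python
def kmeans_1d(values, k, iterations=20):
    """Cluster scalars into k groups; return (centroids, cluster index per value)."""
    lo, hi = min(values), max(values)
    # Spread the initial centroids evenly over the value range (assumed choice).
    centroids = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]

    def nearest(v):
        return min(range(k), key=lambda c: abs(v - centroids[c]))

    for _ in range(iterations):
        assign = [nearest(v) for v in values]
        for c in range(k):
            members = [v for v, a in zip(values, assign) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = sum(members) / len(members)
    # Final assignment against the final centroids.
    return centroids, [nearest(v) for v in values]


# Seven toy weights (hypothetical values) shared across three clusters.
weights = [0.11, 0.09, 0.10, -0.52, -0.48, 0.95, 1.05]
centroids, assign = kmeans_1d(weights, k=3)

# Each edge now stores only a cluster index; the values live in a small codebook.
shared = [centroids[a] for a in assign]
print(assign)
print([round(c, 3) for c in centroids])
```

With 1000 clusters, each index fits in 10 bits, so the storage is the per-edge indices plus a 1000-entry codebook rather than 1 million full-precision values.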

A further technique, which can be seen as taking weight sharing one step further, is quantization. The parameters of a deep neural network are stored as floating-point numbers, typically 32 bits each. Such high precision is often unnecessary: we can quantize, for example by using the integers 0 to 255 (8 bits) in place of the original 32-bit values, sacrificing some precision to reduce the space each weight needs.
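One common way to map floating-point weights onto 0 to 255 is a linear (min/max) scheme, sketched below. This particular mapping, the function names, and the sample weights are assumptions for illustration; other quantization schemes exist.

```python
def quantize_uint8(weights):
    """Map floats linearly onto integer codes 0..255.

    Returns the codes plus the scale and offset needed to decode them.
    """
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / 255 or 1.0  # fall back to 1.0 if all weights are equal
    codes = [min(255, max(0, round((w - lo) / scale))) for w in weights]
    return codes, scale, lo


def dequantize(codes, scale, lo):
    """Recover approximate float weights from the 8-bit codes."""
    return [lo + c * scale for c in codes]


# Toy weights (hypothetical values).
weights = [-1.0, -0.5, 0.0, 0.25, 1.0]
codes, scale, lo = quantize_uint8(weights)
restored = dequantize(codes, scale, lo)

# The rounding error per weight is at most half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(codes)
print(f"step = {scale:.5f}, max error = {max_error:.5f}")
```

Each weight now takes 8 bits instead of 32, a fourfold reduction, at the cost of an error of at most half a step per weight.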

The most extreme form of quantization is the fourth technique, the binary neural network. In a binary neural network the weights are not stored as floating-point numbers at all; each weight is either +1 or -1, so a weight that originally took 32 bits now needs only one bit, which greatly reduces the size of the model.
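The storage saving from binarization can be illustrated as follows. This is a sketch only: taking the sign of each trained weight, treating zero as +1, and the bit-packing layout are all assumed conventions, and the example says nothing about how such a network is trained.

```python
def binarize(weights):
    """Map each weight to +1 or -1 by its sign (zero becomes +1 by assumption)."""
    return [1 if w >= 0 else -1 for w in weights]


def pack_bits(signs):
    """Pack +1/-1 values into bytes, one bit per weight (bit 1 = +1, bit 0 = -1)."""
    packed = bytearray((len(signs) + 7) // 8)
    for i, s in enumerate(signs):
        if s == 1:
            packed[i // 8] |= 1 << (i % 8)
    return bytes(packed)


def unpack_bits(packed, count):
    """Recover the first `count` +1/-1 values from the packed bytes."""
    return [1 if (packed[i // 8] >> (i % 8)) & 1 else -1 for i in range(count)]


# Ten toy weights (hypothetical values).
weights = [0.7, -0.2, 0.05, -1.3, 0.0, 0.4, -0.9, 0.6, -0.1, 0.3]
signs = binarize(weights)
packed = pack_bits(signs)

float32_bytes = 4 * len(weights)
print(f"{float32_bytes} bytes as 32-bit floats, {len(packed)} bytes packed")
```

Ten weights that would occupy 40 bytes as 32-bit floats fit in 2 bytes once packed.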

In research on deep model compression, one of the better-known models is SqueezeNet, and most importantly it works quite well. It is a classic compressed model that targets AlexNet. Its main approach is to replace many of the original 3*3 convolutions with 1*1 convolutions, which removes a large number of parameters: a 1*1 filter has one ninth as many weights as a 3*3 filter over the same channels.
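The effect of swapping 3*3 filters for 1*1 filters can be checked with simple arithmetic. The channel counts below are hypothetical, chosen only to make the numbers concrete, and biases are ignored.

```python
def conv_weights(kernel, in_channels, out_channels):
    """Number of weights in a kernel x kernel convolution layer (biases ignored)."""
    return kernel * kernel * in_channels * out_channels


# Hypothetical layer: 64 input channels, 128 output filters.
C_IN, C_OUT = 64, 128

all_3x3 = conv_weights(3, C_IN, C_OUT)
all_1x1 = conv_weights(1, C_IN, C_OUT)

# Replacing only half of the filters with 1x1 already removes many weights.
mixed = conv_weights(3, C_IN, C_OUT // 2) + conv_weights(1, C_IN, C_OUT // 2)

print(f"all 3x3: {all_3x3}")   # 73728
print(f"all 1x1: {all_1x1}")   # 8192
print(f"half and half: {mixed}")  # 40960
```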

Of course, there are further improvements beyond this one: the operations in the expand module, the interspersed pooling operations, and processing similar to residual connections. In particular, the final fully connected layer is replaced.
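To give a rough sense of why these changes matter, the sketch below counts weights for a squeeze-then-expand block and for a classifier head without a fully connected layer. It rests on assumptions not stated in this article: that a 1*1 "squeeze" layer narrows the channels before the expand module, and that the fully connected layer is replaced by a 1*1 convolution followed by global average pooling, as in the published SqueezeNet design. All layer sizes are illustrative and biases are ignored.

```python
def conv_weights(kernel, in_channels, out_channels):
    """Number of weights in a kernel x kernel convolution layer (biases ignored)."""
    return kernel * kernel * in_channels * out_channels


def fire_weights(in_channels, squeeze, expand_1x1, expand_3x3):
    """Weights in a 1x1 squeeze layer followed by mixed 1x1 / 3x3 expand filters."""
    squeeze_w = conv_weights(1, in_channels, squeeze)
    expand_w = (conv_weights(1, squeeze, expand_1x1)
                + conv_weights(3, squeeze, expand_3x3))
    return squeeze_w + expand_w


def global_average_pool(feature_maps):
    """Average each channel's H x W map down to a single score."""
    return [sum(sum(row) for row in fmap) / (len(fmap) * len(fmap[0]))
            for fmap in feature_maps]


# Illustrative block: 96 input channels, 128 output channels either way.
plain = conv_weights(3, 96, 128)
fire = fire_weights(96, squeeze=16, expand_1x1=64, expand_3x3=64)
print(f"plain 3x3 layer: {plain}, squeeze/expand block: {fire}")

# Illustrative classifier head on a 13 x 13 x 512 feature map with 1000 classes.
H, W, CHANNELS, CLASSES = 13, 13, 512, 1000
fc_head = H * W * CHANNELS * CLASSES          # fully connected layer
conv_head = conv_weights(1, CHANNELS, CLASSES)  # 1x1 conv, then average pooling
print(f"fully connected head: {fc_head}, 1x1 conv head: {conv_head}")

# Global average pooling on two tiny 2 x 2 channel maps.
scores = global_average_pool([[[1.0, 3.0], [5.0, 7.0]],
                              [[0.0, 0.0], [2.0, 2.0]]])
print(scores)  # [4.0, 1.0]
```

Global average pooling itself has no weights, so the head's size is determined entirely by the 1*1 convolution.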

(2) test

In a test based on SqueezeNet, the trained model came to about 4.8 MB, which is already very small. According to reports online, further steps such as converting the parameters directly to an integer type can bring the size down to around 0.8 MB.

The test again uses this image containing multiple scene types:

The recognition results are as follows (recognition is very fast):

Test picture 2:

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
