Talk about how to train a well-performing deep neural network

Source: Internet
Author: User


Deep learning is hot right now: the state of the art on every dataset is constantly being refreshed, and with so much open-source code being released, it feels like anyone can climb the leaderboards.

But don't assume pushing the numbers up is that simple — otherwise, where would we publish papers and earn our bread? Suppose, though, that I don't want to write a paper; I just want to stake out a spot on the leaderboard. I see CIFAR-10 results at 95%, while the small demo that ships with Caffe gives me 78%. Caffe, are you sure you're not kidding me?

Caffe is not lying to you. Today I'd like to show you how to train a neural network whose performance approaches the published results.

Taking a CNN as an example, the recipe basically has three steps.

Step one: use leaky ReLU and dropout (see blog.kaggle.com/2015/01/02/cifar-10-competition-winners-interviews-with-dr-ben-graham-phil-culliton-zygmunt-zajac/).
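As a minimal sketch of these two building blocks, here is how leaky ReLU and (inverted) dropout can be implemented with numpy. The slope `alpha` and drop probability `p` are illustrative defaults, not values from the original post.

```python
import numpy as np

def leaky_relu(x, alpha=0.01):
    """Leaky ReLU: pass positive values through, scale negatives by alpha
    so that the unit never goes completely dead."""
    return np.where(x > 0, x, alpha * x)

def dropout(x, p=0.5, training=True, rng=None):
    """Inverted dropout: zero each unit with probability p and rescale the
    survivors by 1/(1-p), so activations keep the same expected value and
    no rescaling is needed at test time."""
    if not training or p == 0.0:
        return x
    rng = rng or np.random.default_rng()
    mask = rng.random(x.shape) >= p
    return x * mask / (1.0 - p)
```

In a Caffe prototxt the same idea corresponds to a `ReLU` layer with a nonzero `negative_slope` followed by a `Dropout` layer on the fully connected activations.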

Step two: data augmentation. Shift the images up and down, zoom in and out, shift colors toward green or toward red, invert colors, and so on — apply plenty of reasonable perturbations to the data.
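A sketch of such an augmentation pipeline, assuming an HxWx3 float image with values in [0, 1]. The perturbation magnitudes and probabilities here are my own illustrative choices, not ones given in the post.

```python
import numpy as np

def augment(img, rng=None):
    """Apply simple label-preserving perturbations: random shift,
    horizontal flip, a small color shift on one channel, and an
    occasional color inversion."""
    rng = rng or np.random.default_rng()
    out = img.copy()
    # random shift up/down and left/right by up to 2 pixels
    dy, dx = rng.integers(-2, 3, size=2)
    out = np.roll(out, (int(dy), int(dx)), axis=(0, 1))
    # random horizontal flip
    if rng.random() < 0.5:
        out = out[:, ::-1]
    # nudge the red or green channel by a small random amount
    out[..., int(rng.integers(0, 2))] += rng.uniform(-0.1, 0.1)
    # occasional color inversion
    if rng.random() < 0.1:
        out = 1.0 - out
    return np.clip(out, 0.0, 1.0)
```

In practice each training image would be run through `augment` fresh at every epoch, so the network rarely sees the exact same pixels twice.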

Step three: train with a fixed learning rate until progress stalls, take a high-accuracy solverstate as your starting point, then lower the learning rate and keep training. As a rule of thumb, decaying down to about 1e-4 is usually enough.

In fact, once you experiment more, you'll find that the real performance gain comes from step two; the others are just icing on the cake. Data augmentation is fundamental — which, of course, also exposes a weakness of the classifier itself.

Of course, someone will ask: what about the network architecture? Well, read more papers and run more experiments, and you will naturally learn to design one yourself. I don't think the architecture is the main factor, because CNNs' fatal flaw is shared by other classifiers too, and it can only be solved for all of them together.

With data augmentation I pushed MNIST to 99.58% using a very simple, crude, no-brainer architecture. On CIFAR-10, with too little augmentation, I only got 88%; reaching 90% should be quite easy. As for ImageNet — well, look at Professor Jin Lianwen's Weibo comments on Baidu's ImageNet result and you'll know what I want to say:

(Baidu pushed the metric to 4.58%. The main ingredients: (1) more money (a 144-GPU cluster); (2) a larger network (an ensemble of six 16-layer 212M models); (3) more data (tens of thousands of variations per image). — Jin Lianwen)
