ImageNet Classification with Deep Convolutional Neural Networks: Reading Notes (reproduced)


ImageNet Classification with Deep Convolutional Neural Networks: reading notes (2013-07-06 22:16:36) reprint
Tags: deep_learning imagenet Hinton Category: machine learning
(I have decided to record notes on the blog each time I read a paper.)
This paper, published at NIPS 2012 by Hinton and his students, applies deep learning to ImageNet (currently the largest image-recognition database) in response to doubts about deep learning. The results are very impressive, far better than the previous state of the art: the top-5 error rate drops from about 25% to 17%.

ImageNet currently contains about 22,000 categories of labeled images, roughly 15 million images in total. The most commonly used subset, from the LSVRC-2010 contest, contains 1,000 classes and 1.2 million training images. The 17% top-5 error rate reported in this paper is measured on this test set.

The paper gives the structure of the whole deep net:

There are eight layers in total: the first five are convolutional (CNN) layers and the last three form a fully connected network; the final layer is a softmax that produces the output decision (the number of output nodes equals the number of categories, 1000).
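As a quick sanity check on the layer geometry described above, here is a small sketch (my own, not from the paper's code) that walks the standard conv/pool output-size formula through an AlexNet-style stack. The 227x227 input size and the kernel/stride/padding values are the commonly cited ones for this architecture:

```python
def conv_out(size, kernel, stride=1, pad=0):
    # Spatial output size of a conv or pool layer:
    # floor((size - kernel + 2*pad) / stride) + 1
    return (size - kernel + 2 * pad) // stride + 1

s = 227                      # input image side (commonly cited AlexNet input size)
s = conv_out(s, 11, 4)       # conv1, 11x11 stride 4  -> 55
s = conv_out(s, 3, 2)        # overlapping max-pool   -> 27
s = conv_out(s, 5, 1, 2)     # conv2, 5x5 pad 2       -> 27
s = conv_out(s, 3, 2)        # overlapping max-pool   -> 13
s = conv_out(s, 3, 1, 1)     # conv3, 3x3 pad 1       -> 13
s = conv_out(s, 3, 1, 1)     # conv4                  -> 13
s = conv_out(s, 3, 1, 1)     # conv5                  -> 13
s = conv_out(s, 3, 2)        # overlapping max-pool   -> 6
flat = 256 * s * s           # features into the first fully connected layer
print(flat)                  # -> 9216
```

The 9216-dimensional vector then feeds the three fully connected layers (4096, 4096, 1000-way softmax).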
In terms of concrete implementation, the paper makes several improvements to the structure: 1. ReLU replaces the traditional tanh nonlinearity; 2. Two GPUs are used in parallel, reducing the data-transfer time that a multi-GPU host would otherwise need; in the structure, some adjacent layers have no connections between nodes placed on different GPUs, which improves training speed; 3. Local response normalization across the responses of adjacent kernel maps improves the recognition rate (top-5 error rate reduced by 1.2%); 4. Overlapping pooling (top-5 error rate reduced by 0.3%).
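Points 1 and 3 above can be sketched in NumPy. This is a minimal illustration, not the paper's CUDA implementation; the LRN hyperparameters (n=5, k=2, alpha=1e-4, beta=0.75) are the values reported in the paper:

```python
import numpy as np

def relu(x):
    # ReLU nonlinearity max(0, x); the paper found it trains
    # several times faster than tanh on this task.
    return np.maximum(0.0, x)

def local_response_norm(a, n=5, k=2.0, alpha=1e-4, beta=0.75):
    # Local response normalization across adjacent kernel maps:
    #   b_i = a_i / (k + alpha * sum_j a_j^2)^beta
    # where j runs over the n channels centered on channel i.
    # `a` has shape (channels, height, width).
    channels = a.shape[0]
    b = np.empty_like(a)
    for i in range(channels):
        lo = max(0, i - n // 2)
        hi = min(channels, i + n // 2 + 1)
        denom = (k + alpha * np.square(a[lo:hi]).sum(axis=0)) ** beta
        b[i] = a[i] / denom
    return b
```

Because the normalization denominator is always at least k^beta > 1 with these settings, strong responses are damped relative to their neighbors, which is the "brightness normalization" effect the paper describes.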
In addition, to reduce overfitting, the paper uses two techniques: 1. Data augmentation: horizontal reflections and translations of the training images increase the training set by a factor of 2048, and a PCA-based transformation of the RGB pixel values is used to construct additional samples (this mechanism reduces the top-5 error rate by over 1%); 2. Dropout, applied in the first two fully connected layers.
Optimization algorithm: mini-batch SGD, with 128 samples per batch, momentum = 0.9, and weight decay = 0.0005.
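The update rule the paper writes out, with those hyperparameters, can be sketched as a single step function (an illustration, not the CUDA training loop):

```python
def sgd_step(w, grad, v, lr=0.01, momentum=0.9, weight_decay=0.0005):
    # One momentum-SGD update as in the paper:
    #   v <- 0.9 * v - 0.0005 * lr * w - lr * grad
    #   w <- w + v
    v = momentum * v - weight_decay * lr * w - lr * grad
    return w + v, v
```

Note that in this formulation the weight decay is scaled by the learning rate, which is how the paper states its update.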

Weights and biases are randomly initialized (see the paper for the specific random parameters).
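For reference, the paper initializes weights from a zero-mean Gaussian with standard deviation 0.01, and sets the biases to 1 in some layers (so ReLUs start with positive inputs) and 0 in the rest. A minimal sketch of that scheme:

```python
import numpy as np

rng = np.random.default_rng(0)

def init_layer(shape, bias_one=False):
    # Weights ~ N(0, 0.01); biases 1 in selected layers (to give the
    # ReLUs positive inputs early in training), 0 elsewhere.
    w = rng.normal(0.0, 0.01, size=shape)
    b = np.ones(shape[0]) if bias_one else np.zeros(shape[0])
    return w, b
```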
Paper Link: http://books.nips.cc/papers/files/nips25/NIPS2012_0534.pdf
Source Address: http://code.google.com/p/cuda-convnet/

