Microsoft Wins ImageNet 2015 through Feedforward LSTM without Gates

Microsoft dominated the ImageNet 2015 contest with a deep neural network with many layers [1]. Congrats to Kaiming He & Xiangyu Zhang & Shaoqing Ren & Jian Sun on the great results [2]!

Their CNN layers compute G(F(x)+x), which is essentially a feedforward Long Short-Term Memory (LSTM) [3] without gates!
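To make the formula G(F(x)+x) concrete, here is a minimal sketch in plain Python. It assumes G is a ReLU and F is a small two-step transform with weight matrices W1 and W2; those choices, and all names in the code, are illustrative assumptions rather than the exact layers of [1].

```python
def relu(v):
    # Elementwise rectifier, used here as the outer function G.
    return [max(0.0, a) for a in v]

def matvec(W, v):
    # Plain matrix-vector product on nested lists.
    return [sum(w * a for w, a in zip(row, v)) for row in W]

def residual_layer(x, W1, W2):
    # Assumed form of F: F(x) = W2 * relu(W1 * x).
    f = matvec(W2, relu(matvec(W1, x)))
    # Skip connection: add the unchanged input x, then apply G.
    return relu([fi + xi for fi, xi in zip(f, x)])

x = [1.0, 2.0, 0.5]
zeros = [[0.0] * 3 for _ in range(3)]
eye = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
half = [[0.5 * e for e in row] for row in eye]

passthrough = residual_layer(x, zeros, zeros)  # F is zero, so the layer returns G(x)
adjusted = residual_layer(x, eye, half)        # F adds 0.5 * x on top of x
```

With all-zero weights, F vanishes and the non-negative input passes through unchanged, which is the ungated analogue of an LSTM cell carrying its state forward.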

Their net is similar to the very deep Highway Networks [4] (with hundreds of layers), which are feedforward LSTMs with forget gates (= gated recurrent units) [5].
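The relation between the gated and ungated variants can be sketched as follows. The code assumes the common highway formulation y = H(x)*T + x*C, with a transform gate T and a carry gate C that take values between 0 and 1; the function name and the sample numbers are hypothetical.

```python
def highway_mix(x, h, t, c):
    # x: layer input, h: transformed input H(x),
    # t: transform gate values, c: carry gate values (each in [0, 1]).
    return [hi * ti + xi * ci for xi, hi, ti, ci in zip(x, h, t, c)]

x = [1.0, 2.0]
h = [0.5, -1.0]  # stands in for H(x); any nonlinear transform would do

carry_only = highway_mix(x, h, [0.0, 0.0], [1.0, 1.0])      # gates closed: output is x
transform_only = highway_mix(x, h, [1.0, 1.0], [0.0, 0.0])  # plain layer: output is H(x)
both_open = highway_mix(x, h, [1.0, 1.0], [1.0, 1.0])       # no gating: H(x) + x
```

Fixing both gates at 1 removes the gating and leaves H(x) + x, the sum that appears inside G(F(x)+x).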

The authors mention the vanishing gradient problem, but do not mention my very first student Sepp Hochreiter (now a professor), who identified and analyzed this fundamental deep learning problem in 1991, years before anybody else did [6].

Apart from the above, I liked the paper [1] a lot. LSTM concepts keep invading CNN territory [e.g., 7a-e], also through GPU-friendly multi-dimensional LSTMs [8].

References

[1] Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun. Deep Residual Learning for Image Recognition. arXiv:1512.03385.

[2] ImageNet Large Scale Visual Recognition Challenge 2015 (ILSVRC2015): Results.

[3] S. Hochreiter, J. Schmidhuber. Long Short-Term Memory. Neural Computation, 9(8):1735-1780, 1997. Based on TR FKI-207-95, TUM (1995). Led to a lot of follow-up work, and is now heavily used by leading IT companies all over the world.

[4] R. K. Srivastava, K. Greff, J. Schmidhuber. Training Very Deep Networks. NIPS 2015; arXiv:1505.00387.

[5] F. Gers, J. Schmidhuber, F. Cummins. Learning to Forget: Continual Prediction with LSTM. Neural Computation, 12(10):2451-2471, 2000.

[6] S. Hochreiter. Untersuchungen zu dynamischen neuronalen Netzen. Diploma thesis, TU Munich, 1991. Advisor: J. Schmidhuber.

[7a] 2011: first superhuman CNNs
[7b] 2011: first human-competitive CNNs for handwriting
[7c] 2012: first CNN to win segmentation contest
[7d] 2012: first CNN to win contest on object discovery in large images
[7e] Deep Learning. Scholarpedia, 10(11):32832, 2015

[8] M. Stollenga, W. Byeon, M. Liwicki, J. Schmidhuber. Parallel Multi-Dimensional LSTM, with Application to Fast Biomedical Volumetric Image Segmentation. NIPS 2015; arXiv:1506.07452.


Source: http://people.idsia.ch/~juergen/microsoft-wins-imagenet-through-feedforward-lstm-without-gates.html
