Deeply-Supervised Nets
Fig1: Illustration of the deeply-supervised nets architecture and cost functions.
Abstract
We propose deeply-supervised nets (DSN), a method that minimizes classification error while making the hidden-layer learning process more direct and transparent. We focus on three aspects of traditional convolutional-neural-network-type (CNN-type) architectures: (1) transparency in the effect that intermediate layers have on the overall classification; (2) discriminativeness and robustness of learned features, especially in early network layers; (3) training effectiveness in the face of "exploding" and "vanishing" gradients. To address these issues, we introduce "companion" objective functions at each individual hidden layer, in addition to the overall objective function at the output layer (a strategy distinct from layer-wise pre-training). We also analyze our algorithm using techniques extended from stochastic gradient methods. The advantages of our method are evident in our experimental results on benchmark datasets, which show state-of-the-art performance on MNIST, CIFAR-10, CIFAR-100, and SVHN.
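As a rough illustration of the idea of adding a companion objective at each hidden layer to the overall output objective, the combined training loss can be sketched as below. This is a minimal sketch, not the released implementation: the function names, the use of softmax cross-entropy for every classifier, and the per-layer weights `companion_weights` are all assumptions made for illustration.

```python
import math


def softmax_cross_entropy(logits, label):
    """Cross-entropy of a softmax over `logits` against an integer class label."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def deeply_supervised_loss(output_logits, companion_logits, label, companion_weights):
    """Overall output-layer loss plus a weighted sum of companion losses.

    `companion_logits` holds one score vector per supervised hidden layer
    (hypothetical: each produced by a small classifier attached to that layer).
    `companion_weights` holds one assumed balancing weight per companion loss.
    """
    if len(companion_logits) != len(companion_weights):
        raise ValueError("need one weight per companion classifier")
    overall = softmax_cross_entropy(output_logits, label)
    companion = sum(
        w * softmax_cross_entropy(logits, label)
        for w, logits in zip(companion_weights, companion_logits)
    )
    return overall + companion


# Example: a 3-class problem with two supervised hidden layers.
loss = deeply_supervised_loss(
    output_logits=[2.0, 0.5, -1.0],
    companion_logits=[[0.3, 0.1, 0.0], [1.2, 0.4, -0.3]],
    label=0,
    companion_weights=[0.3, 0.3],
)
```

Setting every companion weight to zero recovers the ordinary single-objective loss, which makes the contribution of the hidden-layer supervision easy to isolate in an experiment.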
Experiments
Our results on several benchmark datasets are listed below:
Fig2: Test accuracy on different benchmark datasets.
Fig3: Visualization of the convolutional feature maps learned by DSN.
Code, preprocessed data and configuration files are available.
Get it on GitHub
Publication
[1] Deeply-Supervised Nets [PDF] [arXiv version]
Chen-Yu Lee*, Saining Xie*, Patrick Gallagher, Zhengyou Zhang, Zhuowen Tu
(* indicates equal contributions) In Proceedings of AISTATS 2015
An early and undocumented version was presented at the Deep Learning and Representation Learning Workshop, NIPS 2014
Disclosure
Deeply-Supervised Neural Networks, Zhuowen Tu, Chen-Yu Lee, Saining Xie,
UCSD Docket No. SD2014-313, May 22, 2014