Wide Residual Networks (WRN)
Notes based on Sergey Zagoruyko's paper "Wide Residual Networks".

Residual Networks (ResNet)
In recent years, residual networks (ResNet) have achieved good results on many benchmark datasets; the network structure is shown in the following diagram.
The network is built by stacking residual modules (residual blocks).
The skip connection lets the signal and gradient pass through directly, which helps avoid the vanishing-gradient problem.
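As a rough illustration (not the exact block used in the paper), a residual block in the Keras functional API simply adds the block input back onto the output of its convolutions; the layer sizes below are placeholders:

from keras.layers import Input, Conv2D, Activation, Add
from keras.models import Model

def simple_residual_block(x, filters):
    # main path: two 3x3 convolutions
    y = Conv2D(filters, (3, 3), padding='same', activation='relu')(x)
    y = Conv2D(filters, (3, 3), padding='same')(y)
    # skip connection: the input is added back unchanged, so the gradient
    # has a direct path around the convolution weights
    out = Add()([x, y])
    return Activation('relu')(out)

# the input must already have `filters` channels for the addition to be valid
inp = Input((32, 32, 16))
model = Model(inp, simple_residual_block(inp, 16))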
However, residual networks focus too much on pursuing depth and ignore problems within the residual module itself. As more modules are stacked, the performance of the model does not improve significantly, which suggests that some of the modules do not actually play their intended role. As the paper puts it:
"As gradient flows through the network there is nothing to force it to go through residual block weights and it can avoid learning anything during training, so it is possible that there is either only a few blocks that learn useful representations."
Therefore, the authors of the paper set out to propose a more effective way to improve the residual module:
"Our goal is to explore a much richer set of network architectures of ResNet blocks and thoroughly examine how several other different aspects besides the order of activations affect performance."

Wide Residual Network (WRN)
WRN adds a widening coefficient k to the original residual module, multiplying the number of convolution kernels in each layer. As the article explains, this allows the number of layers to be reduced without reducing the number of model parameters, and it speeds up computation.
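To make the bookkeeping concrete: a network named WRN-n-k has total depth n and widening factor k. The small helper below is an illustrative sketch (not code from the paper) relating the group widths and the depth to N, the number of residual blocks per group:

def wrn_group_widths(k):
    # the three groups of a WRN use 16*k, 32*k and 64*k convolution kernels
    return [16 * k, 32 * k, 64 * k]

def wrn_depth(N):
    # with N residual blocks per group the total depth is n = 6*N + 4,
    # the inverse of N = (n - 4) / 6 used in the code further below
    return 6 * N + 4

print(wrn_group_widths(10))  # [160, 320, 640]
print(wrn_depth(4))          # 28, i.e. WRN-28-10 when k = 10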
"In particular, we present wider deep residual networks that significantly improved, having 50 times less layers and being more than 2 times faster."
The model structure is shown in the following table.
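For reference, the structure described in the paper is roughly the following (output sizes are for 32x32 inputs; each block contains two 3x3 convolutions, and N is the number of blocks per group):

group      output size   block
conv1      32 x 32       [3x3, 16]
conv2      32 x 32       [3x3, 16*k; 3x3, 16*k] x N
conv3      16 x 16       [3x3, 32*k; 3x3, 32*k] x N
conv4      8 x 8         [3x3, 64*k; 3x3, 64*k] x N
avg-pool   1 x 1         [8 x 8]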
Experiment
The experimental results given in the paper show that wide networks with far fewer layers match or outperform very deep, thin ResNets on CIFAR-10, CIFAR-100 and SVHN.
Code
# -*- coding: utf-8 -*-
"""
Created on Tue Nov 20:43:10

@author: Sky_gao
"""
from keras.models import Model
from keras.layers import Input, Add, Activation, Dropout, Flatten, Dense
from keras.layers.convolutional import Convolution2D, MaxPooling2D, AveragePooling2D
from keras.layers.normalization import BatchNormalization
from keras import backend as K


def initial_conv(input):
    # stem: a single 3x3 convolution with 16 output channels
    x = Convolution2D(16, (3, 3), padding='same', kernel_initializer='he_normal',
                      use_bias=False)(input)
    channel_axis = 1 if K.image_data_format() == "channels_first" else -1
    x = BatchNormalization(axis=channel_axis, momentum=0.1, epsilon=1e-5,
                           gamma_initializer='uniform')(x)
    x = Activation('relu')(x)
    return x


def expand_conv(init, base, k, strides=(1, 1)):
    # widens the feature maps to base * k channels (optionally downsampling);
    # a 1x1 convolution on the skip path matches the new shape
    x = Convolution2D(base * k, (3, 3), padding='same', strides=strides,
                      kernel_initializer='he_normal', use_bias=False)(init)
    channel_axis = 1 if K.image_data_format() == "channels_first" else -1
    x = BatchNormalization(axis=channel_axis, momentum=0.1, epsilon=1e-5,
                           gamma_initializer='uniform')(x)
    x = Activation('relu')(x)
    x = Convolution2D(base * k, (3, 3), padding='same',
                      kernel_initializer='he_normal', use_bias=False)(x)
    skip = Convolution2D(base * k, (1, 1), padding='same', strides=strides,
                         kernel_initializer='he_normal', use_bias=False)(init)
    m = Add()([x, skip])
    return m


def conv1_block(input, k=1, dropout=0.0):
    # first-group residual block: two 3x3 convolutions with 16 * k channels (pre-activation)
    init = input
    channel_axis = 1 if K.image_data_format() == "channels_first" else -1
    x = BatchNormalization(axis=channel_axis, momentum=0.1, epsilon=1e-5,
                           gamma_initializer='uniform')(input)
    x = Activation('relu')(x)
    x = Convolution2D(16 * k, (3, 3), padding='same', kernel_initializer='he_normal',
                      use_bias=False)(x)
    if dropout > 0.0:
        x = Dropout(dropout)(x)
    x = BatchNormalization(axis=channel_axis, momentum=0.1, epsilon=1e-5,
                           gamma_initializer='uniform')(x)
    x = Activation('relu')(x)
    x = Convolution2D(16 * k, (3, 3), padding='same', kernel_initializer='he_normal',
                      use_bias=False)(x)
    m = Add()([init, x])
    return m


def conv2_block(input, k=1, dropout=0.0):
    # second-group residual block: two 3x3 convolutions with 32 * k channels
    init = input
    channel_axis = 1 if K.image_data_format() == "channels_first" else -1
    x = BatchNormalization(axis=channel_axis, momentum=0.1, epsilon=1e-5,
                           gamma_initializer='uniform')(input)
    x = Activation('relu')(x)
    x = Convolution2D(32 * k, (3, 3), padding='same', kernel_initializer='he_normal',
                      use_bias=False)(x)
    if dropout > 0.0:
        x = Dropout(dropout)(x)
    x = BatchNormalization(axis=channel_axis, momentum=0.1, epsilon=1e-5,
                           gamma_initializer='uniform')(x)
    x = Activation('relu')(x)
    x = Convolution2D(32 * k, (3, 3), padding='same', kernel_initializer='he_normal',
                      use_bias=False)(x)
    m = Add()([init, x])
    return m


def conv3_block(input, k=1, dropout=0.0):
    # third-group residual block: two 3x3 convolutions with 64 * k channels
    init = input
    channel_axis = 1 if K.image_data_format() == "channels_first" else -1
    x = BatchNormalization(axis=channel_axis, momentum=0.1, epsilon=1e-5,
                           gamma_initializer='uniform')(input)
    x = Activation('relu')(x)
    x = Convolution2D(64 * k, (3, 3), padding='same', kernel_initializer='he_normal',
                      use_bias=False)(x)
    if dropout > 0.0:
        x = Dropout(dropout)(x)
    x = BatchNormalization(axis=channel_axis, momentum=0.1, epsilon=1e-5,
                           gamma_initializer='uniform')(x)
    x = Activation('relu')(x)
    x = Convolution2D(64 * k, (3, 3), padding='same', kernel_initializer='he_normal',
                      use_bias=False)(x)
    m = Add()([init, x])
    return m


def create_wide_residual_network(input_dim, nb_classes=100, N=2, k=1, dropout=0.0, verbose=1):
    """
    Creates a Wide Residual Network with the specified parameters.

    :param input_dim: input shape of the network
    :param nb_classes: number of output classes
    :param N: number of residual blocks per group; compute N = (n - 4) / 6
              Example : for a depth of 16, N = (16 - 4) / 6 = 2
              Example2: for a depth of 28, N = (28 - 4) / 6 = 4
              Example3: for a depth of 40, N = (40 - 4) / 6 = 6
    :param k: width (widening factor) of the network
    :param dropout: adds dropout if the value is greater than 0.0
    :param verbose: print debug info describing the created WRN
    :return: a Keras Model
    """
    channel_axis = 1 if K.image_data_format() == "channels_first" else -1

    ip = Input(shape=input_dim)

    x = initial_conv(ip)
    nb_conv = 4

    # group 1: 16 * k channels
    x = expand_conv(x, 16, k)
    for i in range(N - 1):
        x = conv1_block(x, k, dropout)
        nb_conv += 2

    x = BatchNormalization(axis=channel_axis, momentum=0.1, epsilon=1e-5,
                           gamma_initializer='uniform')(x)
    x = Activation('relu')(x)

    # group 2: 32 * k channels, spatial downsampling by 2
    x = expand_conv(x, 32, k, strides=(2, 2))
    for i in range(N - 1):
        x = conv2_block(x, k, dropout)
        nb_conv += 2

    x = BatchNormalization(axis=channel_axis, momentum=0.1, epsilon=1e-5,
                           gamma_initializer='uniform')(x)
    x = Activation('relu')(x)

    # group 3: 64 * k channels, spatial downsampling by 2
    x = expand_conv(x, 64, k, strides=(2, 2))
    for i in range(N - 1):
        x = conv3_block(x, k, dropout)
        nb_conv += 2

    x = BatchNormalization(axis=channel_axis, momentum=0.1, epsilon=1e-5,
                           gamma_initializer='uniform')(x)
    x = Activation('relu')(x)

    x = AveragePooling2D((8, 8))(x)
    x = Flatten()(x)
    x = Dense(nb_classes, activation='softmax')(x)

    model = Model(ip, x)

    if verbose:
        print("Wide Residual Network-%d-%d created." % (nb_conv, k))
    return model


if __name__ == "__main__":
    from keras.utils import plot_model
    from keras.layers import Input
    from keras.models import Model

    init = (32, 32, 3)

    # N=2, k=2 corresponds to a WRN-16-2 (cf. the output filename below)
    wrn_28_10 = create_wide_residual_network(init, nb_classes=10, N=2, k=2, dropout=0.0)

    wrn_28_10.summary()

    plot_model(wrn_28_10, "WRN-16-2.png", show_shapes=True, show_layer_names=True)
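For completeness, a minimal training sketch (assuming the standard Keras CIFAR-10 loader; the optimizer settings and epoch count are placeholders, not the training schedule used in the paper):

from keras.datasets import cifar10
from keras.optimizers import SGD
from keras.utils import to_categorical

# load and normalize CIFAR-10, one-hot encode the labels
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
x_train = x_train.astype('float32') / 255.0
x_test = x_test.astype('float32') / 255.0
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)

model = create_wide_residual_network((32, 32, 3), nb_classes=10, N=2, k=2, dropout=0.0)
model.compile(optimizer=SGD(lr=0.1, momentum=0.9, nesterov=True),
              loss='categorical_crossentropy',
              metrics=['accuracy'])
model.fit(x_train, y_train, batch_size=128, epochs=1,
          validation_data=(x_test, y_test))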