Stacked Autoencoder

Tags: theano

Origin: the autoencoder

A single autoencoder is, at best, a souped-up version of PCA, and running it just once is not very satisfying.

So in their 2007 paper Greedy Layer-Wise Training of Deep Networks, Bengio et al., following the example of the DBN built from stacked RBMs, proposed the stacked autoencoder, which gave fresh momentum to the use of unsupervised learning in deep networks.

This is what came to be called layer-wise pre-training: each layer is pre-trained with unsupervised learning, and the resulting parameters are used to initialize the deep network in place of the traditional small random values. After pre-training is complete, those parameters serve as the starting point for supervised training.

Part I: Principle

An unsupervised network is trained in exactly the opposite way from a supervised one.

In a supervised network, the parameters W of every layer are constrained by the error function at the output layer, so the gradient of layer i's parameters depends on the gradient of layer i+1; backpropagation thus follows a pattern of "one iteration, update the whole network."

In unsupervised learning, however, the parameters W of each encoder are constrained only by that layer's input, so encoder i can be trained to completion on its own, its parameters handed over to layer i, and the data forward-propagated through those well-trained parameters to layer i+1 before the next round of training begins.

This yields the new pattern of "all iterations, update a single layer." Layer i+1 benefits enormously from it, because its input distills everything that the fully trained layer i has learned.
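To make the schedule concrete, here is a minimal sketch of the pre-training loop, assuming each single-layer encoder exposes hypothetical train_step and transform functions (stand-ins for the compiled Theano functions in the real tutorial code):

def pretrain(encoders, batches, n_epochs=15):
    layer_input = batches
    for encoder in encoders:
        # All iterations for this encoder first: train it to completion
        # on the representation produced by the layers below it.
        for epoch in range(n_epochs):
            for batch in layer_input:
                encoder.train_step(batch)
        # Only then propagate through the trained layer, producing the
        # input that the next encoder will be trained on.
        layer_input = [encoder.transform(batch) for batch in layer_input]
    return layer_input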

Part II: Code and Implementation

Main reference: http://deeplearning.net/tutorial/SdA.html

The stacked network builds each layer and its encoder inside the constructor and saves them.

When constructing the stacked network in Theano, the easiest place to slip up is the parameter hand-off between the encoder and the layer.

Recall that Python assignment makes a shallow copy (it binds a reference), and every Theano shared variable is likewise passed around as a shallow copy.
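The point is easy to verify in a few lines (a toy check, not the stacking code itself):

import numpy
import theano

w = theano.shared(numpy.zeros(3), name='w')
v = w           # assignment binds a second name to the SAME object
assert v is w   # no copy was made; both names reference one shared variable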

So the first, wrong, way to write it looks like this:

def __init__(self, rng, input, n_in, n_out, layersize):
    ...
    for i in xrange(len(layersize)):
        ...
        da.W = hidenlayer.W
        da.bout = hidenlayer.b

Then, when you call grad on da outside, you get an error complaining that the params do not match the cost function.

This is because the tensor expression for the cost is fixed at the moment the cost function is written, i.e., right when the da object is constructed, so the da.W inside that expression is the one built from random values.

After construction, the assignment rebinds da.W to different memory (a shallow copy is just a reference), so the gradient is computed against the wrong variable entirely.
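The failure is easy to reproduce in isolation (a minimal sketch, not the stacked-autoencoder code): the cost graph captures the shared variable it was built with, and rebinding the Python name afterwards does not rewrite the graph.

import numpy
import theano
import theano.tensor as T

w = theano.shared(numpy.ones(3), name='w')
cost = (w ** 2).sum()      # the graph captures THIS shared variable

w = theano.shared(numpy.zeros(3), name='w_new')  # rebinds the name only
T.grad(cost, wrt=w)        # raises DisconnectedInputError: w_new is not in the graph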

The assignment is in fact backwards; it should read:

def __init__(self, rng, input, n_in, n_out, layersize):
    ...
    for i in xrange(len(layersize)):
        ...
        hidenlayer.W = da.W
        hidenlayer.b = da.bout

Now there is no error, and after each encoder is trained, get_value confirms that the layer's values really have changed. And yet, when training encoder i+1, why does it seem to have no effect?

It genuinely has no effect, because layer i's parameters are never propagated to layer i+1.

Theano uses a dual memory design, one area on the Python side and one on the C side, and while encoder i is being trained in C code its parameters never reach layer i. But didn't we set up a shallow copy?

The trouble is that the compiled updates live in the C memory area and know nothing about the shallow-copy relationship, which exists only in the Python memory area.

The correct approach is to establish the shallow-copy relationship at da construction time: when the C code is compiled, all Python objects are reconstructed in the C memory area, and the sharing naturally carries over into it.

da = dA(rng, layerinput, inputsize, self.layersize[i], hidenlayer.W, hidenlayer.b)
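For reference, the dA class in the tutorial accepts existing shared variables through optional constructor arguments; the keyword names below follow the SdA tutorial, but treat the exact call as a sketch:

# Passing the layer's parameters into the constructor makes both objects
# hold the SAME shared variables, so the compiled updates modify the
# layer's parameters directly.
da = dA(numpy_rng=rng, input=layerinput,
        n_visible=inputsize, n_hidden=self.layersize[i],
        W=hidenlayer.W, bhid=hidenlayer.b)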

Alternatively, after encoder i has been trained, you can force its parameters into layer i in the C memory area:

updatemodel = theano.function(inputs=[], outputs=[],
                              updates=[(...)])  # e.g. (hidenlayer.W, da.W), (hidenlayer.b, da.bout)
updatemodel()
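One call to updatemodel() after encoder i finishes training copies the freshly learned values into layer i's shared variables on the C side, so the forward pass that feeds layer i+1 actually uses them.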

Theano's style is close to that of a functional language: objects and functions are all mathematical models, and once constructed they cannot be explicitly assigned to.

It is therefore pointless to assign to an object outside its constructor in Python; the effect is confined to the Python memory area, while most of the computation happens in the C memory area, so updates have to be pushed into the C memory area by hand.

updates is the bridge between the two areas: whenever a shallow-copy relationship exists in the Python memory area, values from the C memory area are synced back into the Python memory area (useful for saving parameters from Python).

The reverse, however, is not automatic: values in the Python memory area are not pushed into the C memory area (handle this with care).

The same practice extends to saving and re-importing the parameters after supervised training.
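A minimal sketch of that extension, assuming params is the list of shared variables to persist (the helper names are hypothetical): get_value and set_value go through Theano's own storage, so compiled functions see the loaded values.

import cPickle  # Python 2, as in the Theano tutorials

def save_params(params, path):
    # Pull the current values out of Theano's storage as numpy arrays.
    with open(path, 'wb') as f:
        cPickle.dump([p.get_value() for p in params], f, protocol=2)

def load_params(params, path):
    # set_value writes through to the storage that compiled functions read,
    # so the imported parameters take effect immediately.
    with open(path, 'rb') as f:
        for p, value in zip(params, cPickle.load(f)):
            p.set_value(value)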
