How are RBMs trained?

Last Update:2016-03-20 Source: Internet

Author: User

The RBM (Restricted Boltzmann Machine) is one of the foundations of deep learning. Although the principle is relatively simple, training it in practice relies on quite a few tricks. In the reference listed at the end, Hinton discloses several of these training details.

First, the input is a real-valued vector:

When the input v of an RBM is a real-valued vector, the hidden-layer output h is computed with the same formula as for a binary input: p(h=1|v) = sigm(b + v*W). Note that this formula gives the probability that h=1. The hidden-layer output we actually want is not this probability but the binary vector h itself, so the probability has to be binarized by sampling: h = p(h=1|v) > rand(size(h)), where rand draws uniform random numbers in [0, 1).
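The binarization step can be sketched in NumPy as follows. This is only an illustration under assumptions: v is a row vector, W has shape (visible units, hidden units), b is the hidden bias, and the names sigm and sample_hidden are made up for this example rather than taken from the reference.

```python
import numpy as np

def sigm(x):
    """Logistic sigmoid, applied element-wise."""
    return 1.0 / (1.0 + np.exp(-x))

def sample_hidden(v, W, b, rng):
    """Return p = p(h=1|v) and a binary sample h drawn from it."""
    p = sigm(b + v @ W)
    # A unit is set to 1 when its probability exceeds a uniform draw in [0, 1),
    # which happens with probability p.
    h = (p > rng.random(p.shape)).astype(float)
    return p, h

rng = np.random.default_rng(0)
v = np.array([[0.2, -1.3, 0.7]])          # one real-valued input, 3 visible units
W = 0.1 * rng.standard_normal((3, 4))     # toy weights: 3 visible x 4 hidden
b = np.zeros(4)                           # hidden bias
p, h = sample_hidden(v, W, b, rng)
```

Here p holds the probabilities and h holds the 0/1 vector that is passed on.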

Once the first-round hidden output h is available, the second-round input v' has to be reconstructed. Because v' is a real-valued vector, the formula used here is v' = N(c + h*W'), where W' is the transpose of W and N(.) denotes a Gaussian centered at its argument. Note that this is no longer p(v'=1|h): v' is computed directly, and because the input is a real-valued vector, no binarization is required.

Finally, we compute the second-round hidden output h' with p(h'=1|v') = sigm(b + v'*W). As above, in theory this probability still has to be binarized to obtain the actual h'.

The above is the theoretical procedure. In actual computation, however, Hinton handles two details differently: 1. h is binary; 2. both v' and h' are kept as real values, that is, they are used directly instead of being sampled. Hinton's explanation for the first point is that it prevents overfitting, and for the second that it reduces noise.
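One round of this procedure for real-valued input might look like the sketch below. It is not the reference's code: it assumes that "real-valued v'" means taking the mean c + h*W' without adding Gaussian noise, and the function name gibbs_step_real and the toy shapes are invented for illustration.

```python
import numpy as np

def sigm(x):
    """Logistic sigmoid, applied element-wise."""
    return 1.0 / (1.0 + np.exp(-x))

def gibbs_step_real(v, W, b, c, rng):
    """One v -> h -> v' -> h' round for real-valued visible units."""
    p_h = sigm(b + v @ W)
    h = (p_h > rng.random(p_h.shape)).astype(float)  # detail 1: h is binary
    v_recon = c + h @ W.T                            # detail 2: v' stays real-valued (assumed: mean, no noise)
    h_recon = sigm(b + v_recon @ W)                  # detail 2: h' stays a probability
    return h, v_recon, h_recon

rng = np.random.default_rng(0)
v = rng.standard_normal((1, 5))           # toy real-valued input, 5 visible units
W = 0.1 * rng.standard_normal((5, 3))     # toy weights: 5 visible x 3 hidden
b = np.zeros(3)                           # hidden bias
c = np.zeros(5)                           # visible bias
h, v_recon, h_recon = gibbs_step_real(v, W, b, c, rng)
```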

Second, the input is a binary vector:

This case follows directly from the analysis above, so only the formulas are listed: p(h=1|v) = sigm(b + v*W), p(v'=1|h) = sigm(c + h*W'), p(h'=1|v') = sigm(b + v'*W).

Likewise, to prevent overfitting and reduce noise, h is binary, while v' and h' are kept as real values.
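For binary input the only change is that the reconstruction goes through the sigmoid. A rough sketch, again with hypothetical names (gibbs_step_binary) and toy shapes chosen only for this example:

```python
import numpy as np

def sigm(x):
    """Logistic sigmoid, applied element-wise."""
    return 1.0 / (1.0 + np.exp(-x))

def gibbs_step_binary(v, W, b, c, rng):
    """One v -> h -> v' -> h' round for binary visible units."""
    p_h = sigm(b + v @ W)
    h = (p_h > rng.random(p_h.shape)).astype(float)  # h is sampled to 0/1
    p_v = sigm(c + h @ W.T)                          # v' kept as p(v'=1|h), not sampled
    p_h2 = sigm(b + p_v @ W)                         # h' kept as p(h'=1|v'), not sampled
    return h, p_v, p_h2

rng = np.random.default_rng(1)
v = (rng.random((1, 6)) > 0.5).astype(float)  # toy binary input, 6 visible units
W = 0.1 * rng.standard_normal((6, 4))         # toy weights: 6 visible x 4 hidden
b = np.zeros(4)                               # hidden bias
c = np.zeros(6)                               # visible bias
h, p_v, p_h2 = gibbs_step_binary(v, W, b, c, rng)
```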

Third, the output:

The output of an RBM is a binary vector, and the output of one layer serves as the input to the next layer.
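Passing data up a stack of RBMs can then be sketched like this. The layer sizes, the random toy weights, and the helper name propagate are assumptions for illustration; in a real stack each (W, b) pair would come from training that layer.

```python
import numpy as np

def sigm(x):
    """Logistic sigmoid, applied element-wise."""
    return 1.0 / (1.0 + np.exp(-x))

def propagate(v, layers, rng):
    """Feed v through a list of (W, b) layers, sampling a binary output at each."""
    x = v
    for W, b in layers:
        p = sigm(b + x @ W)
        x = (p > rng.random(p.shape)).astype(float)  # binary output feeds the next layer
    return x

rng = np.random.default_rng(2)
layers = [
    (0.1 * rng.standard_normal((6, 4)), np.zeros(4)),  # layer 1: 6 -> 4 (toy weights)
    (0.1 * rng.standard_normal((4, 2)), np.zeros(2)),  # layer 2: 4 -> 2 (toy weights)
]
v = (rng.random((1, 6)) > 0.5).astype(float)           # toy binary input
top = propagate(v, layers, rng)
```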

References:

Krizhevsky A, Hinton G E. Using very deep autoencoders for content-based image retrieval[C]//ESANN. 2011.