Today's post covers the DBN, whose key component is the RBM (Restricted Boltzmann Machine) training steps, so first here is a diagram of an RBM's structure to help with understanding.
(The picture comes from an explanatory PPT from Baidu.)
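Since the diagram may not survive reposting, here are the standard RBM formulas it depicts (textbook definitions, written with the same W, b, c names the code below uses; this formulation is added here, not from the original post):

$$E(v, h) = -b^\top v - c^\top h - h^\top W v$$

$$P(h_j = 1 \mid v) = \mathrm{sigm}\Big(c_j + \sum_i W_{ji} v_i\Big), \qquad P(v_i = 1 \mid h) = \mathrm{sigm}\Big(b_i + \sum_j W_{ji} h_j\Big)$$

where sigm(x) = 1 / (1 + e^{-x}), b is the visible bias, c is the hidden bias, and W (hidden x visible) holds the weights.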
==========================================================================================
As usual, we first look at a complete DBN example program:
This is ex2 in tests\test_example_DBN.m.
%% train dbn
dbn.sizes = [100 100];
opts.numepochs = 1;
opts.batchsize = 100;
opts.momentum  = 0;
opts.alpha     = 1;
dbn = dbnsetup(dbn, train_x, opts);    % <-- here!!!
dbn = dbntrain(dbn, train_x, opts);    % <-- here!!!

%% unfold dbn to nn
nn = dbnunfoldtonn(dbn, 10);           % <-- here!!!
nn.activation_function = 'sigm';

%% train nn
opts.numepochs = 1;
opts.batchsize = 100;
nn = nntrain(nn, train_x, train_y, opts);
[er, bad] = nntest(nn, test_x, test_y);
assert(er < 0.10, 'Too big error');
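For context: before this snippet runs, the test script loads and normalizes MNIST. In the toolbox that part looks roughly like the following (quoted from memory, assuming the bundled mnist_uint8.mat):

load mnist_uint8;                  % bundled with DeepLearnToolbox
train_x = double(train_x) / 255;   % scale pixels to [0, 1]
test_x  = double(test_x)  / 255;
train_y = double(train_y);         % labels are already one-hot, 10 classes
test_y  = double(test_y);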
The process is simple and clear: three functions, dbnsetup(), dbntrain(), and dbnunfoldtonn().
At the end, fine-tuning uses nntrain and nntest, which were already covered in part (i).
\dbn\dbnsetup.m
There is really not much to say here: it directly initializes each layer's RBM (Restricted Boltzmann Machine), layer by layer. As before, W, b and c are the parameters, while vW, vb and vc are the variables used in the momentum update. See the code:
for u = 1 : numel(dbn.sizes) - 1
    dbn.rbm{u}.alpha    = opts.alpha;       % learning rate
    dbn.rbm{u}.momentum = opts.momentum;

    dbn.rbm{u}.W  = zeros(dbn.sizes(u + 1), dbn.sizes(u));   % weights
    dbn.rbm{u}.vW = zeros(dbn.sizes(u + 1), dbn.sizes(u));   % momentum term for W

    dbn.rbm{u}.b  = zeros(dbn.sizes(u), 1);                  % visible bias
    dbn.rbm{u}.vb = zeros(dbn.sizes(u), 1);

    dbn.rbm{u}.c  = zeros(dbn.sizes(u + 1), 1);              % hidden bias
    dbn.rbm{u}.vc = zeros(dbn.sizes(u + 1), 1);
end
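One detail worth keeping in mind: before this loop runs, dbnsetup prepends the input dimension to dbn.sizes (dbn.sizes = [size(x, 2), dbn.sizes] in the toolbox, if memory serves), so rbm{u} maps sizes(u) visible units to sizes(u + 1) hidden units. A quick sanity check for the MNIST example above (hypothetical, just to illustrate the shapes):

% with 784-dimensional MNIST input and dbn.sizes = [100 100],
% dbn.sizes becomes [784 100 100] inside dbnsetup, giving two RBMs:
assert(isequal(size(dbn.rbm{1}.W), [100 784]))   % first RBM: 784 visible -> 100 hidden
assert(isequal(size(dbn.rbm{2}.W), [100 100]))   % second RBM: 100 -> 100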
\dbn\dbntrain.m
A DBN is essentially built by stacking RBMs like bricks, so training it is also very simple:
function dbn = dbntrain(dbn, x, opts)
    n = numel(dbn.rbm);
    % train each layer's RBM in turn
    dbn.rbm{1} = rbmtrain(dbn.rbm{1}, x, opts);
    for i = 2 : n
        x = rbmup(dbn.rbm{i - 1}, x);   % propagate the data up through the layer just trained
        dbn.rbm{i} = rbmtrain(dbn.rbm{i}, x, opts);
    end
end
The first thing we hit is rbmtrain() on the first layer; before each later layer is trained, rbmup is applied to the data first. rbmup is really just a single line:

    sigm(repmat(rbm.c', size(x, 1), 1) + x * rbm.W');

that is, computing h from v in the figure above, with the formula h = sigm(W*v + c).

\dbn\rbmtrain.m
Next comes the key part, rbmtrain. The code follows [1] "Learning Deep Architectures for AI" and [2] "A Practical Guide to Training Restricted Boltzmann Machines"; you can match it against the CD-1 pseudo-code in [1].
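To make the code easier to read, here is the CD-1 chain the inner loop implements, in standard notation (the subscripts match the variables v1, h1, v2, h2 below; this framing is added here, not the original author's):

$$h_1 \sim P(h \mid v_1), \qquad v_2 \sim P(v \mid h_1), \qquad h_2 = P(h = 1 \mid v_2)$$

Note that h1 and v2 are sampled binary states (hence sigmrnd), while h2 is left as probabilities (plain sigm), in line with what Hinton's practical guide recommends for the last step.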
% (earlier in rbmtrain.m: m = size(x, 1); numbatches = m / opts.batchsize;)
for i = 1 : opts.numepochs              % number of epochs
    kk = randperm(m);                   % shuffle the training examples
    err = 0;
    for l = 1 : numbatches
        batch = x(kk((l - 1) * opts.batchsize + 1 : l * opts.batchsize), :);

        % the Gibbs sampling step
        v1 = batch;
        h1 = sigmrnd(repmat(rbm.c', opts.batchsize, 1) + v1 * rbm.W');
        v2 = sigmrnd(repmat(rbm.b', opts.batchsize, 1) + h1 * rbm.W);
        h2 = sigm(repmat(rbm.c', opts.batchsize, 1) + v2 * rbm.W');

        % the contrastive divergence step;
        % this matches the CD-1 pseudo-code in "Learning Deep Architectures for AI"
        c1 = h1' * v1;
        c2 = h2' * v2;

        % for momentum, see Hinton's "A Practical Guide to Training Restricted
        % Boltzmann Machines": it records the previous update direction and blends
        % it with the current one, which can speed up learning
        rbm.vW = rbm.momentum * rbm.vW + rbm.alpha * (c1 - c2)     / opts.batchsize;
        rbm.vb = rbm.momentum * rbm.vb + rbm.alpha * sum(v1 - v2)' / opts.batchsize;
        rbm.vc = rbm.momentum * rbm.vc + rbm.alpha * sum(h1 - h2)' / opts.batchsize;

        % apply the updates
        rbm.W = rbm.W + rbm.vW;
        rbm.b = rbm.b + rbm.vb;
        rbm.c = rbm.c + rbm.vc;

        err = err + sum(sum((v1 - v2) .^ 2)) / opts.batchsize;
    end
end
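The helper sigmrnd is not shown in the excerpt; in the toolbox it amounts to a Bernoulli draw with sigmoid probabilities, roughly this (a sketch from memory, not verbatim toolbox code):

function X = sigmrnd(P)
    % sample binary states: each unit is 1 with probability sigm(P)
    X = double(1 ./ (1 + exp(-P)) > rand(size(P)));
end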
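In formulas, each minibatch therefore performs (with learning rate alpha = rbm.alpha, momentum m = rbm.momentum, batch size N = opts.batchsize):

$$\Delta W \leftarrow m \cdot \Delta W + \frac{\alpha}{N}\,(h_1^\top v_1 - h_2^\top v_2), \qquad W \leftarrow W + \Delta W$$

and analogously for b with column sums of (v1 - v2) and for c with column sums of (h1 - h2). Here h1' * v1 estimates the data statistics and h2' * v2 the one-step reconstruction statistics that CD-1 uses in place of the true model expectation.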
\dbn\dbnunfoldtonn.m
After every layer of the DBN has been trained, it is natural to hand the parameters over to one big NN; that is exactly what this function does.
function nn = dbnunfoldtonn(dbn, outputsize)
%DBNUNFOLDTONN Unfolds a DBN to a NN
%   outputsize is the number of target output labels; for MNIST it is 10.
%   The DBN is only responsible for learning features (that is, initializing
%   the weights) unsupervised; the final supervised step is still up to the NN.
    if(exist('outputsize', 'var'))
        size = [dbn.sizes outputsize];
    else
        size = [dbn.sizes];
    end
    nn = nnsetup(size);
    % use each layer's weights to initialize the NN's weights;
    % note that dbn.rbm{i}.c is used to initialize the bias term
    for i = 1 : numel(dbn.rbm)
        nn.W{i} = [dbn.rbm{i}.c dbn.rbm{i}.W];
    end
end
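Why is c concatenated in front of W? In the NN code from part (i), the feedforward prepends a column of ones to the activations, so the first column of nn.W{i} acts as the bias; paraphrasing nnff from the toolbox (not verbatim):

% paraphrase of the feedforward in nnff.m:
a = [ones(m, 1) a];            % prepend a bias column of ones
a_next = sigm(a * nn.W{i}');   % the ones multiply nn.W{i}'s first column,
                               % so that column is exactly the bias, hence [c W]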
Finally, fine-tuning trains the NN, as in part (i).

To summarize: as always, this article only lays out the learning route; the specifics still depend on the papers. The key to the DBN is the RBM, which is Hinton's baby; it involves MCMC and contrastive divergence, and I find it quite a bit harder to understand than the autoencoder. A few classic articles are recommended:
[1] An Introduction to Restricted Boltzmann Machines
[2] Learning Deep Architectures for AI (Bengio's masterpiece)
[3] A Practical Guide to Training Restricted Boltzmann Machines (the guide mentioned above; more detailed)
[4] A Learning Algorithm for Boltzmann Machines (Hinton)
"Reprint" "code-oriented" Learning deep Learning (ii) deep belief Nets (DBNs)