"Reprint" "code-oriented" Learning deep Learning (ii) deep belief Nets (DBNs)


Today's post covers DBNs. The key part is the training procedure of the Restricted Boltzmann Machine (RBM), so let's first put up an RBM structure diagram to help with understanding.

(The figure is taken from an explanatory PPT from Baidu.)
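As a quick reference (these are standard RBM facts, not specific to this toolbox): an RBM is a two-layer network with visible units v, hidden units h, weight matrix W, visible bias b, and hidden bias c, with no connections within a layer. Because of that restriction, the units of one layer are conditionally independent given the other layer, so inference reduces to

P(h = 1 | v) = sigm(W*v + c)
P(v = 1 | h) = sigm(W'*h + b)

which is exactly the computation that will show up in rbmup() and rbmtrain() below.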

==========================================================================================

As usual, we first look at a complete DBN example program:

This is ex2 in \tests\test_example_DBN.m.

% train dbn
dbn.sizes = [100 100];
opts.numepochs = 1;
opts.batchsize = 100;
opts.momentum  = 0;
opts.alpha     = 1;
dbn = dbnsetup(dbn, train_x, opts);     % here!!!
dbn = dbntrain(dbn, train_x, opts);     % here!!!

% unfold dbn to nn
nn = dbnunfoldtonn(dbn, 10);            % here!!!
nn.activation_function = 'sigm';

% train nn
opts.numepochs = 1;
opts.batchsize = 100;
nn = nntrain(nn, train_x, train_y, opts);
[er, bad] = nntest(nn, test_x, test_y);
assert(er < 0.10, 'Too big error');
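For context, ex2 assumes train_x, train_y, test_x, and test_y are already in the workspace. In the toolbox's test script they come from the bundled MNIST data; a minimal sketch of that setup (assuming the mnist_uint8.mat file that ships with DeepLearnToolbox):

load mnist_uint8;                  % uint8 images (N x 784) and one-hot labels
train_x = double(train_x) / 255;   % scale pixels to [0,1] so they can serve as Bernoulli probabilities
test_x  = double(test_x)  / 255;
train_y = double(train_y);
test_y  = double(test_y);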

The process is simple and clear: three functions, dbnsetup(), dbntrain(), and dbnunfoldtonn().

At the end, fine-tuning is done with the nntrain() and nntest() covered in part (i); see that post for details.

\dbn\dbnsetup.m

There is really not much to say here: it directly initializes each layer's RBM (Restricted Boltzmann Machine), layer by layer. As in part (i), W, b, c are the parameters, and vW, vb, vc are the variables that hold the momentum terms used in the updates. Just look at the code:
for u = 1 : numel(dbn.sizes) - 1
    dbn.rbm{u}.alpha    = opts.alpha;
    dbn.rbm{u}.momentum = opts.momentum;

    dbn.rbm{u}.W  = zeros(dbn.sizes(u + 1), dbn.sizes(u));
    dbn.rbm{u}.vW = zeros(dbn.sizes(u + 1), dbn.sizes(u));

    dbn.rbm{u}.b  = zeros(dbn.sizes(u), 1);
    dbn.rbm{u}.vb = zeros(dbn.sizes(u), 1);

    dbn.rbm{u}.c  = zeros(dbn.sizes(u + 1), 1);
    dbn.rbm{u}.vc = zeros(dbn.sizes(u + 1), 1);
end
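One detail worth knowing: before this loop runs, dbnsetup() prepends the input dimension to dbn.sizes (the toolbox does dbn.sizes = [size(x, 2) dbn.sizes] near the top of the file). So for MNIST with dbn.sizes = [100 100], the loop actually sees [784 100 100] and builds two RBMs: one with a 100x784 W (784 visible, 100 hidden units) and one with a 100x100 W.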
\dbn\dbntrain.m

A DBN is basically built by stacking RBMs like bricks, so training is also very simple:
function dbn = dbntrain(dbn, x, opts)
    n = numel(dbn.rbm);
    % train the RBM of each layer in turn
    dbn.rbm{1} = rbmtrain(dbn.rbm{1}, x, opts);
    for i = 2 : n
        x = rbmup(dbn.rbm{i - 1}, x);
        dbn.rbm{i} = rbmtrain(dbn.rbm{i}, x, opts);
    end
end
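In other words, training is greedy and layer-wise. For the [784 100 100] example above: rbm{1} is trained on the raw input x (an N x 784 matrix), then x is replaced by rbm{1}'s hidden activations via rbmup() (N x 100), and rbm{2} is trained on those. Each RBM only ever sees the output of the layer below it, and no labels are used anywhere in this stage.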
The first thing we run into is rbmtrain() on the first layer; before each later layer is trained, rbmup() is applied to the data. rbmup() is really just one line:

sigm(repmat(rbm.c', size(x, 1), 1) + x * rbm.W');

that is, the computation from v to h in the figure above, i.e. the formula sigm(W*x + c).

The following is the key part, rbmtrain():

\dbn\rbmtrain.m

The code follows [2] "Learning Deep Architectures for AI" and [3] "A Practical Guide to Training Restricted Boltzmann Machines"; you can match it against the CD-1 pseudo-code in [2]:
for i = 1 : opts.numepochs                % number of epochs
    kk = randperm(m);                     % m is the number of training cases (set earlier in the file)
    err = 0;
    for l = 1 : numbatches
        batch = x(kk((l - 1) * opts.batchsize + 1 : l * opts.batchsize), :);

        % one step of Gibbs sampling
        v1 = batch;
        h1 = sigmrnd(repmat(rbm.c', opts.batchsize, 1) + v1 * rbm.W');
        v2 = sigmrnd(repmat(rbm.b', opts.batchsize, 1) + h1 * rbm.W);
        h2 = sigm(repmat(rbm.c', opts.batchsize, 1) + v2 * rbm.W');

        % contrastive divergence: this matches the CD-1 pseudo-code
        % in "Learning Deep Architectures for AI" [2]
        c1 = h1' * v1;
        c2 = h2' * v2;

        % for momentum, see Hinton's "A Practical Guide to Training Restricted
        % Boltzmann Machines" [3]: it remembers the previous update direction and
        % combines it with the current one, which can speed up learning
        rbm.vW = rbm.momentum * rbm.vW + rbm.alpha * (c1 - c2)     / opts.batchsize;
        rbm.vb = rbm.momentum * rbm.vb + rbm.alpha * sum(v1 - v2)' / opts.batchsize;
        rbm.vc = rbm.momentum * rbm.vc + rbm.alpha * sum(h1 - h2)' / opts.batchsize;

        % apply the updates
        rbm.W = rbm.W + rbm.vW;
        rbm.b = rbm.b + rbm.vb;
        rbm.c = rbm.c + rbm.vc;

        err = err + sum(sum((v1 - v2) .^ 2)) / opts.batchsize;
    end
end
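Two small helpers do the sampling above: sigm() is the logistic function, and sigmrnd() additionally samples binary states from it. In the toolbox they live under util\; a minimal sketch of what the calls above rely on:

function X = sigm(P)
    % logistic activation: map pre-activations to probabilities in (0,1)
    X = 1 ./ (1 + exp(-P));
end

function X = sigmrnd(P)
    % Bernoulli sampling: turn each probability into a binary 0/1 state
    X = double(sigm(P) > rand(size(P)));
end

With those in place, the update is plain CD-1 with momentum: the positive statistics h1'*v1 come from the data, the negative statistics h2'*v2 from the one-step reconstruction, and their difference, scaled by alpha and averaged over the batch, drives W, b, and c.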
\dbn\dbnunfoldtonn.m

After every layer of the DBN has been trained, it is natural to hand the parameters over to one big NN, which is what this function does:
function nn = dbnunfoldtonn(dbn, outputsize)
% DBNUNFOLDTONN unfolds a DBN into a NN
%   outputsize is the target output dimension; for MNIST it is 10.
%   The DBN is only responsible for learning features, i.e. initializing
%   the weights, which is unsupervised; the final supervised step is still up to the NN.
    if(exist('outputsize', 'var'))
        size = [dbn.sizes outputsize];
    else
        size = [dbn.sizes];
    end
    nn = nnsetup(size);
    % use each layer's trained weights to initialize the NN's weights;
    % note that dbn.rbm{i}.c is used to initialize the bias term
    for i = 1 : numel(dbn.rbm)
        nn.W{i} = [dbn.rbm{i}.c dbn.rbm{i}.W];
    end
end
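Why [dbn.rbm{i}.c dbn.rbm{i}.W] in that order? In this toolbox the NN stores each layer's bias as the first column of nn.W{i} and feeds forward with a bias-augmented input, roughly sigm([ones(m,1) a] * nn.W{i}'). Concatenating the hidden bias c in front of W therefore lines the RBM parameters up exactly with the NN's convention, so the RBM's sigm(W*v + c) and the NN's forward pass compute the same thing.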
Finally, fine-tuning trains the whole NN.

To summarize: as with part (i), this article only sketches the learning route; for the specifics you still have to rely on the papers. The key to DBNs is the RBM, which is Hinton's baby; it involves MCMC and contrastive divergence, and feels noticeably harder to understand than the autoencoder. A few classic references are recommended:

[1] An Introduction to Restricted Boltzmann Machines
[2] Learning Deep Architectures for AI (Bengio's masterpiece)
[3] A Practical Guide to Training Restricted Boltzmann Machines (mentioned above, more detailed)
[4] A Learning Algorithm for Boltzmann Machines (Hinton)

"Reprint" "code-oriented" Learning deep Learning (ii) deep belief Nets (DBNs)
