Deep Learning Series (V): A simple deep learning toolkit


This section introduces a MATLAB deep learning toolbox, DeepLearnToolbox.

The code in the toolbox is simple and well suited for learning the algorithms. It covers the common network structures, including plain neural networks (NN), sparse autoencoders (SAE), convolutional autoencoders (CAE), deep belief networks (DBN, built on restricted Boltzmann machines, RBM), convolutional neural networks (CNN), and so on. Thanks to the author of the toolbox. I found this toolkit on another blogger's CSDN blog, and that blogger has already introduced most of the functions in the toolkit; you can click through to that blogger's blog.

Thanks to that blogger as well. Building on that work, I add some insights of my own; the main goal is to post some deep learning applications based on this toolkit.

Here is another good deep learning resource: the 18 hottest deep learning GitHub projects.

For the basics of the toolbox, you can first read that blogger's series of articles. If you do not understand everything, that is fine: read them, look at the applications, and understanding will come gradually. In what follows I will refer to those posts while adding more detailed explanations, so that things are as clear as possible.

Content not yet covered, such as DBN and CNN, will be discussed in later posts. Here I first introduce how to build a network with this toolbox, and then the sparse autoencoder network.

First, let us introduce the general network model. In the toolbox, find the file
DeepLearnToolbox\tests\test_example_NN.m. This test script exercises the general network model; here is its first part:

load mnist_uint8;
train_x = double(train_x) / 255;
test_x  = double(test_x)  / 255;
train_y = double(train_y);
test_y  = double(test_y);

% normalize
[train_x, mu, sigma] = zscore(train_x);
test_x = normalize(test_x, mu, sigma);

%% ex1 vanilla neural net
rand('state', 0)
nn = nnsetup([784 100 10]);
opts.numepochs = 1;    % number of full sweeps through data
opts.batchsize = 100;  % take a mean gradient step over this many samples
[nn, L] = nntrain(nn, train_x, train_y, opts);
[er, bad] = nntest(nn, test_x, test_y);
assert(er < 0.08, 'Too big error');

The test uses the MNIST handwritten-digit database, which is already packaged with the toolbox and can be loaded directly. The goal is to train on this database so that the network learns to recognize the digits.
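To get a feel for the data, a few samples can be displayed with a short script like the one below. This is a minimal sketch, assuming mnist_uint8.mat has been loaded and train_x stores one flattened 28*28 digit per row, as in the test script above:

load mnist_uint8;                          % train_x: 60000 x 784, train_y: 60000 x 10 (one-hot)
figure;
for k = 1 : 16
    subplot(4, 4, k);
    img = reshape(train_x(k, :), 28, 28)'; % transpose so the digit is upright (depends on how the data was stored)
    imshow(img, []);                       % [] rescales the uint8 pixel values for display
    [~, idx] = max(train_y(k, :));         % one-hot target -> column index (assumed: column k is digit k-1)
    title(num2str(idx - 1));
end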

After that, the database data is normalized, and so on. nnsetup builds the network and initializes many parameters. opts.numepochs = 1 is, as far as I can tell, the number of full sweeps through all the data; setting it to 1 means the data is passed over once. opts.batchsize = 100 means that at each step a random batch of 100 samples is fed in as one mini-batch. Then training and testing follow. Now look at nnsetup:

function nn = nnsetup(architecture)
%NNSETUP creates a feedforward neural network
%   nn = nnsetup(architecture) returns a neural network structure; architecture is an
%   n x 1 vector giving the number of neurons in each layer, e.g. architecture = [784 100 10]:
%   784 inputs (each handwritten digit is 28*28 = 784 pixels), a hidden layer of 100 units
%   (an arbitrary choice that can be modified), and 10 outputs (the digits 0-9).

    nn.size = architecture;
    nn.n    = numel(nn.size);

    nn.activation_function     = 'tanh_opt'; % hidden-layer activation: 'sigm' (sigmoid) or 'tanh_opt' (default tanh)
    nn.learningRate            = 2;          % learning rate; typically needs to be lower with 'sigm' and non-normalized inputs
    nn.momentum                = 0.5;        % momentum factor
    nn.scaling_learningRate    = 1;          % learning-rate scaling factor (applied each epoch)
    nn.weightPenaltyL2         = 0;          % L2 regularization
    nn.nonSparsityPenalty      = 0;          % non-sparsity penalty
    nn.sparsityTarget          = 0.05;       % sparsity target value
    nn.inputZeroMaskedFraction = 0;          % used for denoising autoencoders
    nn.dropoutFraction         = 0;          % dropout level (http://www.cs.toronto.edu/~hinton/absps/dropout.pdf)
    nn.testing                 = 0;          % internal variable; nntest sets this to one
    nn.output                  = 'sigm';     % output unit: 'sigm' (=logistic), 'softmax' or 'linear'

    for i = 2 : nn.n
        % weights and weight momentum
        nn.W{i - 1}  = (rand(nn.size(i), nn.size(i - 1) + 1) - 0.5) * 2 * 4 * sqrt(6 / (nn.size(i) + nn.size(i - 1)));
        nn.vW{i - 1} = zeros(size(nn.W{i - 1}));
        % average activations (for use with sparsity)
        nn.p{i} = zeros(1, nn.size(i));
    end
end

This function is easy to understand: it initializes the network. Much of the initialization exists to serve all the network types (CNN, DBN, and so on), and only some of it is used here. For now you only need to know the structure of the network and the parameters that control sparse coding: nn.nonSparsityPenalty and nn.sparsityTarget. This is what the last section talked about, why we want sparsity; you do not need to worry about how it is implemented, since in practice you only set a few parameters and leave the rest to the program. Also note the activation function nn.activation_function. Finally, the network weights are randomly initialized.
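As an example of how these parameters would be used in practice, the sparsity settings can simply be overwritten after nnsetup returns. This is a minimal sketch; the concrete values below are illustrative, not toolbox defaults:

nn = nnsetup([784 100 10]);
nn.activation_function = 'sigm';   % sparsity targets are usually combined with sigmoid hidden units
nn.nonSparsityPenalty  = 3;        % weight of the sparsity penalty term (illustrative value)
nn.sparsityTarget      = 0.05;     % desired average activation of the hidden units
nn.learningRate        = 1;        % often lowered when using 'sigm' (illustrative value)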

Next comes nntrain. The blogger mentioned earlier explained this part very well, with many comments; you can go and read it (and come back afterwards):
http://blog.csdn.net/dark_scope/article/details/9421061

Here is the call as a whole: [nn, L] = nntrain(nn, train_x, train_y, opts);

You can see that nntrain needs the designed network nn, the training data train_x, the corresponding target values train_y, and the additional parameters opts. The additional parameters include the number of training sweeps opts.numepochs, the size of each training block opts.batchsize, and so on. What comes out is the trained network nn. This is important: the trained nn is a structure containing everything you need, such as the weights of each layer and the training error, and this trained nn is also what nntest uses. For the details of how nntrain is implemented, see the blog post above.
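For reference, the core of nntrain is an epoch loop over shuffled mini-batches, roughly like this (a simplified sketch of the structure only, not the toolbox code verbatim):

for epoch = 1 : opts.numepochs
    kk = randperm(size(train_x, 1));               % shuffle the training samples
    numbatches = size(train_x, 1) / opts.batchsize;
    for b = 1 : numbatches
        idx     = kk((b - 1) * opts.batchsize + 1 : b * opts.batchsize);
        batch_x = train_x(idx, :);
        batch_y = train_y(idx, :);
        nn = nnff(nn, batch_x, batch_y);           % forward pass, stores activations and loss in nn
        nn = nnbp(nn);                             % backpropagation, computes the gradients
        nn = nnapplygrads(nn);                     % weight update (momentum, L2 penalty, etc.)
    end
    nn.learningRate = nn.learningRate * nn.scaling_learningRate;  % rescale the learning rate each epoch
end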

Now look at nntest, as follows:

function [er, bad] = nntest(nn, x, y)
    labels = nnpredict(nn, x);
    [~, expected] = max(y, [], 2);
    bad = find(labels ~= expected);
    er  = numel(bad) / size(x, 1);
end

nntest calls nnpredict. The function needs the test data x and the labels y; if y is given, the error rate can be computed, and if there is no y you can simply call labels = nnpredict(nn, x) to get the predicted labels.
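So when no labels are available, prediction on its own looks like this (a small usage sketch; the variable names are only illustrative):

labels = nnpredict(nn, test_x);   % column vector of predicted class indices, 1..10
digits = labels - 1;              % map the indices back to the handwritten digits 0..9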

This completes a simple walkthrough of an ordinary neural network. In our third section, MATLAB's built-in Neural Network Toolbox was used to implement similar functionality (a comparison sketch follows below). However, the built-in toolbox does not provide deep learning networks such as the sparse autoencoder. In the next section we will use the same DeepLearnToolbox to build a sparse autoencoder network.
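For comparison, roughly the same experiment with MATLAB's built-in Neural Network Toolbox might look like the sketch below. This is a rough sketch, assuming the patternnet function from that toolbox is available; the data is transposed because those functions expect one sample per column:

net  = patternnet(100);                      % one hidden layer with 100 units
net  = train(net, train_x', train_y');       % samples as columns
pred = net(test_x');
[~, labels]   = max(pred, [], 1);
[~, expected] = max(test_y', [], 1);
er = sum(labels ~= expected) / numel(expected);   % classification error rate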
