PyTorch Learning Notes: Custom Modules

Source: Internet
Author: User
Tags: pytorch theano keras

PyTorch is a Python-based deep learning library. Its source code has few layers of abstraction, a clear structure, and a moderate amount of code. Compared with the heavily engineered TensorFlow, PyTorch is an excellent deep learning framework that is easy to get started with.

For a systematic study of PyTorch, the official site provides a very good introductory tutorial as well as deep learning examples, and enthusiastic community members have shared more concise examples.

1. Overview

Unlike low-level libraries such as Theano and TensorFlow, or high-level wrappers such as Keras and Sonnet, PyTorch is a self-contained deep learning library (Figure 1).


Figure 1: Comparison of several deep learning libraries

As shown in Figure 2, PyTorch consists of three main functional blocks, from the lowest layer to the highest.


Figure 2. PyTorch main functional modules

1.1 Tensor computation engine

The tensor computation engine is similar to NumPy and MATLAB. Its basic object is the Tensor (analogous to the ndarray in NumPy or the array in MATLAB). In addition to CPU implementations of common operations, PyTorch also provides efficient GPU implementations, which is critical for deep learning.

1.2 Automatic differentiation (autograd)

As deep learning models become more complex, support for automatic differentiation is essential for a deep learning framework. PyTorch uses dynamic automatic differentiation; frameworks that take a similar approach include Chainer and DyNet. In contrast, Theano and TensorFlow use static automatic differentiation.

1.3 High-level neural network library (nn)

PyTorch also provides a high-level neural network module with common network structures such as fully connected layers, convolutions, and RNNs. It also provides common objective functions, optimizers, and parameter initialization methods.
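As a rough sketch of how these pieces fit together (written against a recent PyTorch release rather than the version this article describes), the snippet below builds a small network from built-in layers, applies a built-in initializer, and runs a few optimization steps. The layer sizes, learning rate, and random data are hypothetical values chosen only for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# A small fully connected network assembled from built-in layers.
model = nn.Sequential(
    nn.Linear(4, 8),
    nn.ReLU(),
    nn.Linear(8, 1),
)

# One of the built-in parameter initialization methods.
nn.init.xavier_uniform_(model[0].weight)

loss_fn = nn.MSELoss()                                    # objective function
optimizer = torch.optim.SGD(model.parameters(), lr=0.05)  # optimizer

x = torch.randn(16, 4)  # hypothetical batch: 16 samples, 4 features
y = torch.randn(16, 1)  # hypothetical regression targets

loss_before = loss_fn(model(x), y).item()
for _ in range(50):
    optimizer.zero_grad()        # clear gradients from the previous step
    loss = loss_fn(model(x), y)  # forward pass
    loss.backward()              # backward pass (autograd)
    optimizer.step()             # parameter update
loss_after = loss_fn(model(x), y).item()

print(loss_before, loss_after)
```

On this toy problem the loss after training should be lower than before, since the same batch is fitted repeatedly.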

Here, we focus on how to customize the neural network structure.

2. Custom Module


Figure 3. PyTorch Module

Module is the basic way PyTorch organizes a neural network. A Module contains the parameters of the model and its computation logic. A Function carries the actual computation and defines the logic of the forward and backward passes.

Module is the base class of every neural network, and all models in PyTorch must be subclasses of Module. Modules can be nested to form a tree structure: a module nests other modules by holding them as attributes.
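The nesting described above can be sketched as follows, using torch.nn from a recent PyTorch release. The class names Block and MLP and the layer sizes are hypothetical; only the mechanism (child modules stored as attributes) comes from the text.

```python
import torch
import torch.nn as nn

class Block(nn.Module):
    def __init__(self, size):
        super().__init__()
        # A child module becomes part of the tree by being assigned as an attribute.
        self.fc = nn.Linear(size, size)

    def forward(self, x):
        return torch.relu(self.fc(x))

class MLP(nn.Module):
    def __init__(self, size, out_size):
        super().__init__()
        self.block1 = Block(size)
        self.block2 = Block(size)
        self.head = nn.Linear(size, out_size)

    def forward(self, x):
        return self.head(self.block2(self.block1(x)))

model = MLP(4, 2)

# Direct children of the root module.
child_names = [name for name, _ in model.named_children()]
# Parameters are collected recursively; the dotted names mirror the tree.
param_names = [name for name, _ in model.named_parameters()]

out = model(torch.randn(3, 4))
print(child_names)
print(param_names)
```

The dotted parameter names (for example block1.fc.weight) reflect the path from the root module down to each leaf.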

Note: As of now (04/2018), this part of the PyTorch interface is not stable, and the explanation below is already inconsistent with the latest version, or even incorrect. Until the interface finally stabilizes, this content will no longer be updated; please refer directly to PyTorch's latest source code.

The following uses the simplest MLP network structure as an example to describe how to implement a custom network structure. The complete code can be found in the repo.

2.1 Function

Note: To support higher-order derivatives (i.e., the gradient of a gradient), PyTorch 0.2 introduced a new mechanism for defining a Function. If higher-order derivatives are not needed, the old method still works.

Function is the core class of PyTorch's automatic differentiation mechanism. A Function has no parameters and is stateless: in the forward pass it only receives inputs and returns the corresponding outputs, and in the backward pass it receives the gradients with respect to the outputs and returns the gradients with respect to the inputs.

Here we only focus on how to customize Function. The definition of Function is shown in source code. The following is a simplified code snippet:

class Function(object):
    def forward(self, *input):
        raise NotImplementedError

    def backward(self, *grad_output):
        raise NotImplementedError

Both the inputs and outputs of forward and backward are Tensor objects.

A Function object is callable, that is, it can be invoked with (). Both the inputs and outputs of the call are Variable objects. The following code example implements a ReLU activation function and calls it:

import torch
from torch.autograd import Function

# Legacy-style Function (instance methods), matching the interface described in this article.
class ReLUF(Function):
    def forward(self, input):
        # Save the input; backward needs it to mask the gradient.
        self.save_for_backward(input)

        output = input.clamp(min=0)
        return output

    def backward(self, output_grad):
        input, = self.saved_tensors

        # The gradient passes through where input >= 0 and is zero elsewhere.
        input_grad = output_grad.clone()
        input_grad[input < 0] = 0
        return input_grad

# Test
if __name__ == "__main__":
    from torch.autograd import Variable

    torch.manual_seed(1111)
    a = torch.randn(2, 3)

    va = Variable(a, requires_grad=True)
    vb = ReLUF()(va)
    print(va.data, vb.data)

    vb.backward(torch.ones(va.size()))
    # vb is not a leaf Variable, so only va accumulates a gradient.
    print(va.grad.data)

If backward needs an input of forward, forward must save it explicitly. In the code above, forward uses self.save_for_backward to temporarily save the input, and backward retrieves it from saved_tensors (a Python tuple).

Clearly, the inputs of forward must correspond to the outputs of backward, and the outputs of forward must correspond to the inputs of backward.
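In later PyTorch releases, a custom Function is written with static forward and backward methods that take a context object, and it is invoked through apply instead of by instantiating the class. As a minimal sketch of the same ReLU in that style (the input values are hypothetical, and this is not the code from the article's repo):

```python
import torch
from torch.autograd import Function

class ReLUF(Function):
    @staticmethod
    def forward(ctx, input):
        # Save the input on the context so backward can read it.
        ctx.save_for_backward(input)
        return input.clamp(min=0)

    @staticmethod
    def backward(ctx, grad_output):
        (input,) = ctx.saved_tensors
        # One gradient is returned for each input of forward.
        grad_input = grad_output.clone()
        grad_input[input < 0] = 0
        return grad_input

a = torch.tensor([[-1.0, 2.0, -3.0],
                  [4.0, -5.0, 6.0]], requires_grad=True)
b = ReLUF.apply(a)
b.backward(torch.ones_like(a))

print(b)
print(a.grad)
```

Here forward takes one input and backward returns one gradient, which illustrates the correspondence between the two signatures.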

Because a Function may temporarily hold input tensors, it is recommended not to reuse a Function object, to avoid problems with memory being released too early. As in the sample code, create a new ReLUF object for every call instead of creating one at initialization and calling it repeatedly in forward.

2.2 Module

Like a Function, a Module object is callable, and its inputs and outputs are also Variables. The difference is that a Module can have parameters. A Module has two main parts: parameters and computation logic (Function calls). Since the ReLU activation function has no parameters, we use the most basic fully connected layer as an example of how to customize a Module.

The computation logic of the fully connected layer is defined by the following Function:

import torch
from torch.autograd import Function

class LinearF(Function):

    def forward(self, input, weight, bias=None):
        self.save_for_backward(input, weight, bias)

        output = torch.mm(input, weight.t())
        if bias is not None:
            output += bias.unsqueeze(0).expand_as(output)

        return output

    def backward(self, grad_output):
        input, weight, bias = self.saved_tensors

        grad_input = grad_weight = grad_bias = None
        if self.needs_input_grad[0]:
            grad_input = torch.mm(grad_output, weight)
        if self.needs_input_grad[1]:
            grad_weight = torch.mm(grad_output.t(), input)
        if bias is not None and self.needs_input_grad[2]:
            grad_bias = grad_output.sum(0).squeeze(0)

        if bias is not None:
            return grad_input, grad_weight, grad_bias
        else:
            return grad_input, grad_weight

needs_input_grad is a tuple of bool values with the same length as the arguments of forward. It indicates whether each input requires a gradient, so that unnecessary computation can be skipped for inputs that do not.
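To see the flags in action, here is a small sketch in the static-method style of recent PyTorch releases (not the legacy style used elsewhere in this article). The list seen_flags is a hypothetical helper added only to record what backward observes, and the bias is made a required argument to keep the sketch short.

```python
import torch
from torch.autograd import Function

seen_flags = []  # hypothetical helper: records the flags seen inside backward

class LinearF(Function):
    @staticmethod
    def forward(ctx, input, weight, bias):
        ctx.save_for_backward(input, weight)
        return input.mm(weight.t()) + bias

    @staticmethod
    def backward(ctx, grad_output):
        input, weight = ctx.saved_tensors
        seen_flags.append(tuple(ctx.needs_input_grad))

        grad_input = grad_weight = grad_bias = None
        if ctx.needs_input_grad[0]:
            grad_input = grad_output.mm(weight)
        if ctx.needs_input_grad[1]:
            grad_weight = grad_output.t().mm(input)
        if ctx.needs_input_grad[2]:
            grad_bias = grad_output.sum(0)
        return grad_input, grad_weight, grad_bias

x = torch.randn(5, 3)                      # plain data: no gradient required
w = torch.randn(2, 3, requires_grad=True)
b = torch.randn(2, requires_grad=True)

LinearF.apply(x, w, b).sum().backward()
print(seen_flags[-1], x.grad, w.grad.shape, b.grad)
```

Because x does not require a gradient, its flag is False and the matrix product for grad_input is skipped entirely.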

The Function (here LinearF) defines the basic computation logic; the Module only needs to allocate memory for the parameters at initialization and pass the parameters to the corresponding Function object at computation time. The code is as follows:

import torch
import torch.nn as nn

class Linear(nn.Module):

    def __init__(self, in_features, out_features, bias=True):
        super(Linear, self).__init__()
        self.in_features = in_features
        self.out_features = out_features
        # torch.Tensor(...) only allocates memory; the values are uninitialized.
        self.weight = nn.Parameter(torch.Tensor(out_features, in_features))
        if bias:
            self.bias = nn.Parameter(torch.Tensor(out_features))
        else:
            self.register_parameter('bias', None)

    def forward(self, input):
        # LinearF is the Function defined above.
        return LinearF()(input, self.weight, self.bias)

Note that the memory for a parameter is maintained by a Tensor object, but the Tensor must be wrapped in a Parameter object. Parameter is a special subclass of Variable; the only difference is that requires_grad defaults to True for a Parameter. Variable is the core class of the automatic differentiation mechanism; it is not covered here, see the tutorial.

3. Custom Recurrent Neural Network (RNN)

Let us try to define a more complex Module ourselves: an RNN. Here we define only the most basic vanilla RNN (Figure 4), whose basic formula is:

$$h_t = \mathrm{ReLU}(W \cdot x_t + U \cdot h_{t-1})$$

where x_t is the input at time step t and h_{t-1} is the hidden state from the previous step.


Figure 4. RNN (source)
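To make the recurrence concrete, the following sketch unrolls the formula for two time steps on tiny tensors. The weight matrices and inputs are hypothetical values picked so the arithmetic is easy to follow by hand, and the bias terms used in the cell implementation are omitted, as in the formula.

```python
import torch

W = torch.tensor([[1.0, 0.0],
                  [0.0, -1.0]])   # input-to-hidden weights (hypothetical values)
U = torch.tensor([[0.5, 0.0],
                  [0.0, 0.5]])    # hidden-to-hidden weights (hypothetical values)

h = torch.zeros(2)                # initial hidden state h_0
xs = [torch.tensor([1.0, 2.0]),   # x_1
      torch.tensor([3.0, -4.0])]  # x_2

states = []
for x in xs:
    # h_t = ReLU(W x_t + U h_{t-1})
    h = torch.relu(W @ x + U @ h)
    states.append(h)

print(states)
```

In the first step W x_1 = [1, -2] and the ReLU clips it to [1, 0]; in the second step W x_2 = [3, 4] plus U h_1 = [0.5, 0] gives [3.5, 4].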

More complex variants such as LSTM and GRU are implemented in a very similar way.

3.1 Defining the Cell

import math

import torch
from torch.nn import Module, Parameter

class RNNCell(Module):
    def __init__(self, input_size, hidden_size):
        super(RNNCell, self).__init__()
        self.input_size = input_size
        self.hidden_size = hidden_size

        self.weight_ih = Parameter(torch.Tensor(hidden_size, input_size))
        self.weight_hh = Parameter(torch.Tensor(hidden_size, hidden_size))
        self.bias_ih = Parameter(torch.Tensor(hidden_size))
        self.bias_hh = Parameter(torch.Tensor(hidden_size))

        self.reset_parameters()

    def reset_parameters(self):
        # Initialize every parameter uniformly in [-stdv, stdv].
        stdv = 1.0 / math.sqrt(self.hidden_size)
        for weight in self.parameters():
            weight.data.uniform_(-stdv, stdv)

    def forward(self, input, h):
        # LinearF and ReLUF are the Functions defined in Section 2.
        output = LinearF()(input, self.weight_ih, self.bias_ih) + LinearF()(h, self.weight_hh, self.bias_hh)
        output = ReLUF()(output)

        return output

3.2 Defining the complete RNN

import torch
from torch.nn import Module

class RNN(Module):
    def __init__(self, input_size, hidden_size):
        super(RNN, self).__init__()
        self.input_size = input_size
        self.hidden_size = hidden_size

        self.cell = RNNCell(input_size, hidden_size)

    def forward(self, inputs, initial_state):
        # inputs is indexed as (batch, time step, feature).
        time_steps = inputs.size(1)

        state = initial_state
        outputs = []
        for t in range(time_steps):
            state = self.cell(inputs[:, t, :], state)
            outputs.append(state)

        return outputs

The complete runnable code can be found in the repo.

Discussion

PyTorch's Module structure is inherited from Torch and has also been borrowed by Keras (functional API). In some early deep learning frameworks such as Caffe, a network is composed of several layers connected in different topologies. In (Py)Torch there is no distinction between layer and network: everything is a callable Module. The inputs and outputs of a Module call are tensors (wrapped in Variables), so users can very naturally construct arbitrary directed acyclic graph (DAG) network structures.
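As one illustration of a non-chain topology, the sketch below defines a module whose input feeds two paths that are merged again (a skip connection), then uses it like any other layer inside a larger model. The class name Residual and the sizes are hypothetical, and the code targets a recent PyTorch release where tensors no longer need to be wrapped in Variable.

```python
import torch
import torch.nn as nn

class Residual(nn.Module):
    def __init__(self, size):
        super().__init__()
        self.fc1 = nn.Linear(size, size)
        self.fc2 = nn.Linear(size, size)

    def forward(self, x):
        # x is used twice: once directly and once through fc1 -> ReLU -> fc2.
        # The two paths are merged by addition, forming a small DAG.
        return x + self.fc2(torch.relu(self.fc1(x)))

# The same class serves as a "layer" here, while the Sequential is the "network";
# both are simply callable Modules.
net = nn.Sequential(Residual(4), Residual(4), nn.Linear(4, 1))

x = torch.randn(3, 4, requires_grad=True)
y = net(x)
y.sum().backward()  # gradients flow back through both paths of each block

print(y.shape, x.grad.shape)
```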

At the same time, PyTorch's autograd mechanism is wrapped relatively thinly, so it is relatively easy to customize backpropagation or modify gradients. This is very important for some algorithms.
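One simple way to modify a gradient, sketched here with the register_hook method of tensors in recent PyTorch releases, is to attach a hook that rewrites the gradient before it is stored. The clipping range and the values of w are hypothetical.

```python
import torch

w = torch.tensor([1.0, -2.0, 3.0], requires_grad=True)

# The hook receives the gradient of w and returns a replacement for it.
handle = w.register_hook(lambda grad: grad.clamp(-1.0, 1.0))

loss = (w ** 2).sum()  # d(loss)/dw = 2 * w = [2, -4, 6]
loss.backward()
handle.remove()        # detach the hook once it is no longer needed

print(w.grad)
```

Without the hook the stored gradient would be [2, -4, 6]; with it, each component is clipped to the range [-1, 1].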

In summary, purely from the point of view of implementing custom algorithms, PyTorch is a very elegant deep learning framework.
