Neural Network Package
(These notes are still a draft; I'll reorganize them once I've finished working through the whole package.)
Module
Module is an abstract, serializable class that defines all the basic methods needed to train a neural network.
A Module has two state variables: output and gradInput.
[output] forward(input)
Computes the output corresponding to the given input. Both input and output are usually tensors, with some exceptions such as the table layers (which take tables of tensors). After forward, the output state variable is updated to the new value.
It is not recommended to override this function; instead, implement updateOutput(input).
[gradInput] backward(input, gradOutput)
Performs back-propagation through the module. A forward pass is assumed to have already been done, and input must be the same as the input passed to that forward call; otherwise the computed gradients will be wrong.
Likewise, gradInput is a tensor (or, for table layers, a table of tensors).
Back-propagation computes two kinds of gradients.
This function calls the following two functions:
updateGradInput(input, gradOutput)
accGradParameters(input, gradOutput, scale)
Again, it is best not to override backward itself; override the two functions above instead.
updateOutput(input)
Computes the output from the current parameters and the given input, and stores the result in the output state variable.
updateGradInput(input, gradOutput)
Computes the gradient of the module with respect to its input. One application of this is in NLP: a word embedding is the input to later layers, but it is also a parameter, and therefore needs this gradient in order to be updated.
accGradParameters(input, gradOutput, scale)
Computes the gradient of the module with respect to its own parameters; a module without parameters can skip this step. The names of the state variables holding these gradients depend on the module.
The scale argument is a factor the parameter gradients are multiplied by before being accumulated.
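The split described above can be sketched in a few lines of plain Python (illustrative only — Torch's real implementation is a Lua class; the layer here is a made-up "multiply by one learned scalar" module):

```python
# Sketch of the nn.Module pattern: forward/backward delegate to
# updateOutput, updateGradInput, and accGradParameters.
class MulModule:
    def __init__(self, weight=2.0):
        self.weight = weight      # the module's single parameter
        self.gradWeight = 0.0     # accumulated parameter gradient
        self.output = None        # state variable: last forward result
        self.gradInput = None     # state variable: last backward result

    def updateOutput(self, input):
        # output = weight * input, element-wise over a list of numbers
        self.output = [self.weight * x for x in input]
        return self.output

    def updateGradInput(self, input, gradOutput):
        # d(output)/d(input) = weight, so gradInput = weight * gradOutput
        self.gradInput = [self.weight * g for g in gradOutput]
        return self.gradInput

    def accGradParameters(self, input, gradOutput, scale=1.0):
        # d(output)/d(weight) = input; 'scale' multiplies the
        # contribution before it is accumulated into gradWeight
        self.gradWeight += scale * sum(x * g for x, g in zip(input, gradOutput))

    def forward(self, input):
        return self.updateOutput(input)

    def backward(self, input, gradOutput, scale=1.0):
        # backward = updateGradInput + accGradParameters, as described above
        self.updateGradInput(input, gradOutput)
        self.accGradParameters(input, gradOutput, scale)
        return self.gradInput
```

Note how backward reuses the same input that forward received — this is exactly why the two calls must be given the same tensor.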
type(type[, tensorCache])
Converts all of the module's parameters to the given tensor type.
Both plain tensors and whole modules can be converted this way.
A module holds its model parameters and the gradients of those parameters.
This is very convenient — it saves us from deriving the gradients ourselves.
weight and gradWeight are the usual parameter names, and they are stored in tables.
parameters()
Returns the module's learnable parameters and their gradients. If you write a module yourself, overload this function.
[flatParameters, flatGradParameters] getParameters()
This function returns two tensors: all parameters flattened into one, and all parameter gradients flattened into one.
This function should not be overloaded.
training()
Sets the train state variable to true. The point of this is that some special modules, such as Dropout, behave differently during training and evaluation.
evaluate()
Sets train to false.
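A quick plain-Python sketch of why this flag matters (illustrative only, not Torch's actual Dropout code): a dropout-like module must randomly zero activations while training, but pass inputs through deterministically at evaluation time.

```python
import random

# Sketch of a dropout-style module toggled by training()/evaluate().
class Dropout:
    def __init__(self, p=0.5):
        self.p = p          # probability of dropping a unit
        self.train = True   # toggled by training() / evaluate()

    def training(self):
        self.train = True

    def evaluate(self):
        self.train = False

    def forward(self, input):
        if self.train:
            # drop each unit with probability p; scale survivors by
            # 1/(1-p) so the expected activation stays the same
            return [0.0 if random.random() < self.p else x / (1 - self.p)
                    for x in input]
        # evaluation: deterministic pass-through
        return list(input)
```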
findModules(typename)
Finds all modules of the given type inside the network.
Containers
This class makes it easy to build complex neural networks.
Container is the base class;
Sequential, Parallel, and Concat inherit from it, and DepthConcat in turn inherits from Concat.
Its two main methods are:
add(module)
get(index)
listModules()
Lists all the modules in the network, including the container module itself and all of its components.
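The container idea can be sketched in a few lines of plain Python (illustrative only — the real Sequential is a Lua class in the nn package; the "layers" here are just callables):

```python
# Sketch of a Sequential-like container: add() collects child modules,
# forward() chains their outputs in order.
class Sequential:
    def __init__(self):
        self.modules = []

    def add(self, module):
        self.modules.append(module)
        return self  # allow chained add() calls, as in Torch

    def get(self, index):
        # Torch containers are 1-indexed; mimic that here
        return self.modules[index - 1]

    def forward(self, input):
        out = input
        for m in self.modules:
            out = m(out)  # each child maps the previous output forward
        return out

# usage: two toy "layers" as plain functions
net = Sequential()
net.add(lambda xs: [x + 1 for x in xs])
net.add(lambda xs: [x * 2 for x in xs])
```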
Transfer functions
These include Tanh(), Sigmoid(),
and HardTanh().
These nonlinear transfer functions are applied element-wise.
HardShrink
module = nn.HardShrink(lambda)
Essentially a threshold function: inputs whose absolute value is greater than lambda pass through unchanged; everything else becomes 0.
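The rule above is simple enough to write out directly (plain-Python sketch, not the nn implementation):

```python
# HardShrink rule: keep x when |x| > lambda, otherwise output 0.
def hard_shrink(xs, lam=0.5):
    return [x if abs(x) > lam else 0.0 for x in xs]
```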
SoftShrink
Much like HardShrink, but a smoothed version: the surviving values are also shrunk toward zero by lambda.
See the official Torch documentation for the exact formula.
SoftMax
This one needs no explanation.
SoftMin
Yes, this exists too.
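For completeness, a minimal numerically stable sketch of both (plain Python; subtracting the max before exponentiating avoids overflow — SoftMin is just SoftMax of the negated input):

```python
import math

# Numerically stable softmax over a list of numbers.
def softmax(xs):
    m = max(xs)                              # shift for stability
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

# SoftMin = SoftMax applied to -x.
def softmin(xs):
    return softmax([-x for x in xs])
```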
SoftPlus
Ensures the outputs are all positive numbers.
Then there are Sigmoid and ReLU, which I won't repeat one by one.
Network Layer Types
1. Simple type
(with parameters)
Linear: performs a linear transformation
SparseLinear: a linear transformation for sparse input data
Add: adds a bias to the input data
Mul: multiplies the input data by a learned factor
CMul: component-wise multiplication
CDiv: component-wise division
Euclidean: computes the Euclidean distance from the input data to each of k stored centers (as in k-means)
WeightedEuclidean: as the name says, a weighted Euclidean distance
(without parameters)
Copy: copies the input, possibly converting its tensor type
Narrow: a narrowing (slicing) operation along a specified dimension
Replicate: similar to MATLAB's repmat
Reshape
View
Select
Max
Min
Mean
Sum
These four functions (Max, Min, Mean, Sum) operate along a specified dimension.
Exp, Abs, Power, Square, Sqrt: these are element-wise.
Normalize: L_p normalization of the input.
MM: matrix-matrix multiplication
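As a quick illustration of the Normalize entry above, this is all L_p normalization does (plain-Python sketch, not the nn implementation):

```python
# L_p normalization: divide by the L_p norm so the output has unit norm.
def normalize(xs, p=2):
    norm = sum(abs(x) ** p for x in xs) ** (1.0 / p)
    return [x / norm for x in xs]
```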
Other modules
BatchNormalization: normalizes a batch of data
Identity: the identity function; useful in ParallelTable
Dropout
SpatialDropout
Padding: pads along a given dimension
L1Penalty: adds an L1 sparsity penalty to the input
Table Layers
Tables are used to build more complex network structures.
These table layers are subclasses of the Container module.
ConcatTable: applies each member module to the same input and collects the outputs in a table.
Torch7 Study notes (ii) NN package