Why use Theano?
Deep learning is best done with a library such as Theano, mainly because of the gradient bookkeeping needed when updating parameters during backpropagation. The chain rule itself is not difficult.
But deep network architectures are complex, and the number of layers is large (15 layers is not a dream).
Even in the basic three-layer BP network the derivative chain already has length 5, and the formulas are painful to look at directly; deriving them by hand is clearly unwise.
Theano provides a gradient function, theano.grad(cost, param), that automatically computes the first derivative of an expression. The cost can be an extremely long expression,
and param can be a huge array or matrix.
Clearly, with grad we can focus on designing the forward-propagation inputs and outputs; backpropagation needs only a single grad call, and the tedious manual derivation is skipped.
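A minimal sketch of this, assuming a toy squared-error cost over a shared weight vector (all names here are illustrative):

```python
import numpy as np
import theano
import theano.tensor as T

# A tiny cost expression: squared error of a linear model.
x = T.vector('x')
w = theano.shared(np.asarray([0.5, -0.3], dtype=theano.config.floatX), name='w')
cost = ((T.dot(x, w) - 1.0) ** 2)

# One call gives the symbolic gradient of the cost w.r.t. the parameter.
g_w = T.grad(cost, w)

grad_fn = theano.function(inputs=[x], outputs=g_w)
print(grad_fn(np.asarray([1.0, 2.0], dtype=theano.config.floatX)))
```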
General architecture of Theano
Theano builds on object-oriented Python, so its neural networks are also written in an object-oriented style.
"Objects"
The idea is that each classifier in a shallow network, and each layer in a deep network, is an object.
In this object, you are given the input format, and you only need to do two things:
define the parameters, and define the output according to that format (see the sketch below).
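A minimal sketch of such a layer object, in the spirit of the Theano deep-learning tutorial's hidden layer (the class name and initialization scheme are assumptions, not fixed by the text):

```python
import numpy as np
import theano
import theano.tensor as T

class HiddenLayer(object):
    """One layer as an object: given the input, define params and output."""
    def __init__(self, rng, input, n_in, n_out):
        self.input = input
        # 1) Define the parameters as shared variables.
        W_values = np.asarray(
            rng.uniform(low=-1.0 / np.sqrt(n_in), high=1.0 / np.sqrt(n_in),
                        size=(n_in, n_out)),
            dtype=theano.config.floatX)
        self.W = theano.shared(value=W_values, name='W')
        self.b = theano.shared(value=np.zeros(n_out, dtype=theano.config.floatX),
                               name='b')
        # 2) Define the output in terms of the input.
        self.output = T.tanh(T.dot(input, self.W) + self.b)
        self.params = [self.W, self.b]
```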
"Data read-in/processing"
Read the data from a file, then make it globally available as shared data.
Theano has a peculiar shared type; an ordinary Python/NumPy value can be converted to it with theano.shared().
Much like Qt's connect mechanism or MFC's message-response functions, this throws the variable into Theano's public area.
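For instance, data loaded with NumPy can be moved into shared space like this (the file name is hypothetical):

```python
import numpy as np
import theano

# Load ordinary NumPy data, then convert it to Theano's shared type.
data = np.loadtxt('train.txt', dtype=theano.config.floatX)  # hypothetical file
train_set = theano.shared(data, name='train_set', borrow=True)

print(train_set.get_value().shape)  # the data now lives in Theano's public area
```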
This brings us to Theano's function mechanism. theano.tensor encapsulates a large number of lazy functions.
These lazy functions are not executed in Python itself; they must be run through theano.function().
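A tiny illustration: building a lazy expression and then compiling it (variable names are arbitrary):

```python
import theano
import theano.tensor as T

a = T.dscalar('a')
b = T.dscalar('b')
expr = a * b + a          # lazy: nothing is computed yet, only a graph is built

f = theano.function(inputs=[a, b], outputs=expr)  # compile, then execute
print(f(2.0, 3.0))        # 8.0
```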
theano.function() has four main argument areas:
inputs=[...]: ordinary inputs go here as a list. If there is a lot of input data, it should go in the givens area instead; in fact the inputs area does not support shared variables at all, so shared data has to move to givens anyway (see the sketch after this list).
outputs=: an ordinary or lazy expression, i.e. the work function to compute.
updates=: the parameter-update list, in the format [(original, new), (original, new), ...]; a list comprehension is convenient when a deep network has many parameter groups.
givens={x: list1[:], y: list2[:], ...}: here x and y are the variable names used in the outputs expression, and they must correspond exactly; the reason is explained below.
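Putting the four areas together, a minibatch training function might look like this (the model, learning rate, and batch size are all assumptions for illustration):

```python
import numpy as np
import theano
import theano.tensor as T

x = T.matrix('x')                         # symbolic placeholders used in the cost
y = T.vector('y')
W = theano.shared(np.zeros((5, 1), dtype=theano.config.floatX), name='W')
cost = T.mean((T.dot(x, W).flatten() - y) ** 2)
g_W = T.grad(cost, W)

# Shared data sets; an integer index selects a minibatch of 10.
train_x = theano.shared(np.random.randn(100, 5).astype(theano.config.floatX))
train_y = theano.shared(np.random.randn(100).astype(theano.config.floatX))
index = T.lscalar('index')

train = theano.function(
    inputs=[index],                        # ordinary inputs only
    outputs=cost,                          # the work expression
    updates=[(W, W - 0.01 * g_W)],         # (original, new) pairs
    givens={x: train_x[index * 10:(index + 1) * 10],
            y: train_y[index * 10:(index + 1) * 10]})  # shared-data slices

print(train(0))
```

This also shows why the givens keys must correspond: x and y inside the cost are placeholders that the compiled function fills in with slices of the shared data at run time.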
theano.function() is not executed by the Python interpreter; it is quickly compiled into C code, so each function is in effect an independent subprogram, which is why all four areas are necessary.
Because it runs as an independent subprogram, ordinary Python variables clearly do not work well inside it, so data is generally placed in shared variables.
In fact, many of tensor's lazy functions require certain variables to already be defined as shared on the Python side; the exact rule is unclear. For example, T.dot does not require shared variables, but the param passed to grad must be shared.
Since most of Theano's computation happens inside function, and function executes as compiled C, Theano gets speed close to C with the flexibility of Python.
"Architecture Build &BP Iteration"