Theano Study Notes (4) -- Derivative
Derivatives are computed with T.grad.
Here, pp() is used to print the symbolic expression of the gradient.
The third line of output prints the gradient expression simplified by the optimizer, which is much simpler than the first.
fill((x ** TensorConstant{2}), TensorConstant{1.0}) creates a tensor with the same shape as x ** 2 and fills it with 1.0.
```python
import theano.tensor as T
from theano import pp
from theano import function

x = T.dscalar('x')
y = x ** 2
gy = T.grad(y, x)
print pp(gy)
f = function([x], gy)
print f(4)
print pp(f.maker.fgraph.outputs[0])

>>> ((fill((x ** TensorConstant{2}), TensorConstant{1.0}) * TensorConstant{2}) * (x ** (TensorConstant{2} - TensorConstant{1})))
8.0
(TensorConstant{2.0} * x)
```
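As a plain-Python sanity check (not part of the original notes, and requiring no Theano), the symbolic gradient of x ** 2 at x = 4 can be compared against a central finite difference:

```python
# Central finite-difference approximation of df/dx,
# used here only to confirm the symbolic result 2 * x.
def f(x):
    return x ** 2

def numeric_grad(f, x, eps=1e-6):
    return (f(x + eps) - f(x - eps)) / (2 * eps)

print(numeric_grad(f, 4.0))  # close to 8.0, matching f(4) above
```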
The first argument of T.grad must be a scalar.
For example, compute the derivative of the logistic (sigmoid) function:
```python
import theano.tensor as T
from theano import function

x = T.dmatrix('x')
s = T.sum(1 / (1 + T.exp(-x)))
gs = T.grad(s, x)
dlogistic = function([x], gs)
print dlogistic([[0, 1], [-1, -2]])

>>> [[ 0.25        0.19661193]
 [ 0.19661193  0.10499359]]
```
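The numbers above follow from the well-known identity sigmoid'(x) = sigmoid(x) * (1 - sigmoid(x)). A minimal stdlib-only check (an addition of mine, not from the original notes):

```python
import math

# sigmoid'(x) = sigmoid(x) * (1 - sigmoid(x)); evaluate at the same inputs.
def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dsigmoid(x):
    return sigmoid(x) * (1.0 - sigmoid(x))

for x in [0, 1, -1, -2]:
    print(x, dsigmoid(x))  # 0.25, 0.19661193..., 0.19661193..., 0.10499358...
```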
Calculate the Jacobian Matrix
The Jacobian matrix is the matrix of first-order partial derivatives of a vector-valued function:
Use T.arange to generate the sequence 0 .. y.shape[0] and compute the gradient of each y[i] in a loop.
theano.scan builds this symbolic loop efficiently.
lambda is Python's built-in anonymous function.
```python
import theano
import theano.tensor as T
from theano import function

x = T.dvector('x')
y = x ** 2
J, updates = theano.scan(lambda i, y, x: T.grad(y[i], x),
                         sequences=T.arange(y.shape[0]),
                         non_sequences=[y, x])
f = function([x], J, updates=updates)
f([4, 4])

>>> [[ 8.  0.]
 [ 0.  8.]]
```
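Because y = x ** 2 acts elementwise, its Jacobian is diagonal with entries 2 * x[i]. A NumPy check of that closed form (my addition, assuming NumPy is available):

```python
import numpy as np

# For elementwise y = x ** 2, the Jacobian is diag(2 * x).
def jacobian_of_square(x):
    x = np.asarray(x, dtype=float)
    return np.diag(2 * x)

print(jacobian_of_square([4, 4]))
# [[8. 0.]
#  [0. 8.]]
```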
Calculate the Hessian Matrix
The Hessian matrix is the matrix of second-order partial derivatives of a multivariate scalar function.
It is computed just like the Jacobian, except that y is replaced by the gradient T.grad(cost, x).
```python
import theano
import theano.tensor as T
from theano import function

x = T.dvector('x')
y = x ** 2
cost = y.sum()
gy = T.grad(cost, x)
H, updates = theano.scan(lambda i, gy, x: T.grad(gy[i], x),
                         sequences=T.arange(gy.shape[0]),
                         non_sequences=[gy, x])
f = function([x], H, updates=updates)
f([4, 4])

>>> [[ 2.  0.]
 [ 0.  2.]]
```
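Here cost = sum(x ** 2) has gradient 2x, so the Hessian is the constant matrix 2I, independent of x. A NumPy check of that closed form (my addition, assuming NumPy):

```python
import numpy as np

# The Hessian of sum(x ** 2) is 2 * I for any input x.
def hessian_of_sum_square(x):
    return 2 * np.eye(len(x))

print(hessian_of_sum_square([4, 4]))
# [[2. 0.]
#  [0. 2.]]
```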
Jacobian Right Multiplication
The variable being differentiated can be a matrix as well as a vector. Use T.Rop as the right-multiplication (R-operator):
```python
import theano
import theano.tensor as T

W = T.dmatrix('W')
V = T.dmatrix('V')
x = T.dvector('x')
y = T.dot(x, W)
JV = T.Rop(y, W, V)
f = theano.function([W, V, x], JV)
print f([[1, 1], [1, 1]], [[2, 2], [2, 2]], [0, 1])

>>> [ 2.  2.]
```
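For y = dot(x, W), the R-operator has a closed form: since dy_j/dW_ij = x_i, the product (J V)_j = sum_i x_i * V_ij is simply dot(x, V). A NumPy check (my addition, assuming NumPy):

```python
import numpy as np

# Rop(dot(x, W), W, V) reduces to dot(x, V) for this linear map.
x = np.array([0.0, 1.0])
V = np.array([[2.0, 2.0], [2.0, 2.0]])
print(np.dot(x, V))  # [2. 2.], matching the Theano output above
```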
Jacobian Left Multiplication
Use T.Lop as the left-multiplication (L-operator):
```python
import theano
import theano.tensor as T

v = T.dvector('v')
x = T.dvector('x')
W = T.dmatrix('W')
y = T.dot(x, W)
VJ = T.Lop(y, W, v)
f = theano.function([v, x], VJ)
print f([2, 2], [0, 1])

>>> [[ 0.  0.]
 [ 2.  2.]]
```
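This also has a closed form: (v^T J)_ik = sum_j v_j * dy_j/dW_ik = x_i * v_k, i.e. the outer product of x and v, which is why W never needs a numeric value. A NumPy check (my addition, assuming NumPy):

```python
import numpy as np

# Lop(dot(x, W), W, v) reduces to outer(x, v) for this linear map.
x = np.array([0.0, 1.0])
v = np.array([2.0, 2.0])
print(np.outer(x, v))
# [[0. 0.]
#  [2. 2.]]
```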
Hessian Matrix Multiplied by a Vector
You can again use T.Rop, applied to the gradient:
```python
import theano
import theano.tensor as T
from theano import function

x = T.dvector('x')
v = T.dvector('v')
y = T.sum(x ** 2)
gy = T.grad(y, x)
Hv = T.Rop(gy, x, v)
f = theano.function([x, v], Hv)
print f([4, 4], [2, 2])

>>> [ 4.  4.]
```
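Since the Hessian of sum(x ** 2) is 2I, the product H v is just 2 * v for any x. A NumPy check of the result above (my addition, assuming NumPy):

```python
import numpy as np

# H = 2 * I for sum(x ** 2), so H v = 2 * v regardless of x.
v = np.array([2.0, 2.0])
print(2 * v)  # [4. 4.], matching the Theano output above
```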