T.grad(cost, wrt) generally takes two parameters: the first is the expression to be differentiated, which in a deep-learning setting is the cost function; the second, wrt (with respect to), gives the parameters of the cost function to differentiate against (colloquially, the independent variables, the x in f(x)).
The first argument of T.grad must be a scalar.
>>> import theano
>>> import theano.tensor as T
>>> x = T.dmatrix('x')
>>> y = x**2 + x
>>> gy = T.grad(y, x)
TypeError: cost must be a scalar.
>>> x = T.dmatrix('x')
>>> y = T.sum(x**2 + x)
# T.sum only serves to make the cost a scalar; the gradient of sum(x**2 + x)
# w.r.t. each element is still 2*x + 1, so the sum leaves no visible trace in the result
>>> gy = T.grad(y, x)
>>> f = theano.function([x], gy)
>>> f([[0, 1], [2, 3]])
array([[ 1.,  3.],
       [ 5.,  7.]])
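To see what T.grad actually built, the symbolic gradient can be pretty-printed with theano.pp (a minimal sketch; the exact printed expression varies with the Theano version, but algebraically it reduces to 2*x + 1):
>>> from theano import pp
>>> pp(gy)  # prints the symbolic graph of the gradient, equivalent to 2*x + 1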
In the same vein, the derivative of the sigmoid function s(x) = \frac{1}{1+e^{-x}} is
\frac{d\,s(x)}{dx} = s(x)\,(1 - s(x))
>>> x = T.dmatrix('x')
>>> s = T.sum(1. / (1. + T.exp(-x)))
>>> gs = T.grad(s, x)
>>> dlogistic = theano.function([x], gs)
>>> dlogistic([[0, 1], [-1, -2]])
array([[0.25      , 0.19661193],
       [0.19661193, 0.10499359]])
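As a quick sanity check (a minimal NumPy sketch, not part of the original session), evaluating the analytic formula s(x)(1 - s(x)) directly reproduces the same values:
>>> import numpy as np
>>> xv = np.array([[0, 1], [-1, -2]], dtype=float)
>>> sv = 1.0 / (1.0 + np.exp(-xv))   # s(x)
>>> sv * (1 - sv)                    # s(x) * (1 - s(x))
array([[0.25      , 0.19661193],
       [0.19661193, 0.10499359]])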
1. Jacobian Matrix
In vector analysis, the Jacobian matrix is the matrix of first-order partial derivatives of a vector-valued function, arranged in a certain way; its determinant is called the Jacobian determinant.
Suppose f:\, \mathbb{R}^n \rightarrow \mathbb{R}^m is a function from n-dimensional Euclidean space to m-dimensional Euclidean space (e.g. y_{m\times 1} = A_{m\times n}\,x_{n\times 1}). This function consists of m real-valued functions, y_1(x_1, \ldots, x_n), \ldots, y_m(x_1, \ldots, x_n) (that is, an m-dimensional vector whose entries are each n-ary functions). The partial derivatives of these functions form an m \times n matrix, the so-called Jacobian matrix:

J = \begin{bmatrix}
\frac{\partial y_1}{\partial x_1} & \cdots & \frac{\partial y_1}{\partial x_n} \\
\vdots & \ddots & \vdots \\
\frac{\partial y_m}{\partial x_1} & \cdots & \frac{\partial y_m}{\partial x_n}
\end{bmatrix}
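Since the cost passed to T.grad must be a scalar, a full Jacobian in Theano is typically built row by row with theano.scan, taking the gradient of one output entry at a time (a sketch under that assumption; here y = x**2, so the Jacobian is the diagonal matrix with entries 2*x):
>>> x = T.dvector('x')
>>> y = x ** 2
>>> J, updates = theano.scan(lambda i, y, x: T.grad(y[i], x),
...                          sequences=T.arange(y.shape[0]),
...                          non_sequences=[y, x])
>>> f = theano.function([x], J, updates=updates)
>>> f([4, 4])
array([[ 8.,  0.],
       [ 0.,  8.]])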