Understanding Jacobian matrices and Hessian matrices


In deep learning, the calculation of gradient vectors, Jacobian matrices, and Hessian matrices is fundamental knowledge.

In fact, a derivative is a linear transformation between linear spaces, and the Jacobian matrix is, in essence, that derivative.

For example, the derivative of a map f at a point p is a linear mapping from the tangent space at p to the tangent space at f(p). Tangent spaces are vector spaces, each with a basis, so this linear transformation can be written as a matrix. On an open subset of Euclidean space, the tangent space at each point is a copy of the space itself; on a curve, the tangent space at a point is the tangent line, isomorphic to the real axis. In the one-dimensional case, the derivative of a function is nothing more than a linear transformation from one tangent line to another: a 1x1 matrix, isomorphic to a real number. Therefore, the Jacobian matrix is essentially the matrix of this linear transformation with respect to bases of the tangent spaces, which is why a Jacobian determinant appears in front of a change of coordinates in an integral.
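In symbols, this is the change-of-variables formula for integrals (stated here for reference, for a continuously differentiable, injective map \varphi with nonvanishing Jacobian D\varphi):

\[ \int_{\varphi(U)} f(y)\, dy = \int_{U} f(\varphi(x))\, \left| \det D\varphi(x) \right| dx \]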

1. Gradient vector:

Definition:

The objective function f is scalar-valued, a function of the vector argument x = (x1, x2, ..., xn)^T.

Taking the gradient of the scalar function f with respect to the vector x yields a vector of the same dimension as x, called the gradient vector;
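The formula image appears to be missing from the original; the standard definition is:

\[ \nabla f(x) = \left( \frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \ldots, \frac{\partial f}{\partial x_n} \right)^{T} \]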

2. Jacobian Matrix:

Definition:

The objective function f is vector-valued: f(x) = (f1(x), f2(x), ..., fm(x))^T;

where the independent variable is x = (x1, x2, ..., xn)^T;

Differentiating the vector-valued function f with respect to x yields a matrix whose number of rows is the dimension m of f and whose number of columns is the dimension n of x; this matrix is called the Jacobian matrix;

Each row is the transposed gradient vector of the corresponding component function;
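Written out (the formula image seems to be missing from the original; this is the standard form):

\[ J = \frac{\partial (f_1, \ldots, f_m)}{\partial (x_1, \ldots, x_n)} = \begin{pmatrix} \frac{\partial f_1}{\partial x_1} & \cdots & \frac{\partial f_1}{\partial x_n} \\ \vdots & \ddots & \vdots \\ \frac{\partial f_m}{\partial x_1} & \cdots & \frac{\partial f_m}{\partial x_n} \end{pmatrix} \]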

"Note": A special case of the gradient vector Jacobian matrix;

When the target function is a scalar function, the Jacobian matrix is a gradient vector;

The importance of the Jacobian matrix is that it gives the best linear approximation of a differentiable function near a given point. In this sense, the Jacobian matrix is the analogue of the derivative for a multivariate, vector-valued function.
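A minimal numerical sketch of this claim (the example map f: R^2 -> R^2 and the test point are illustrative assumptions, not from the original): near a point p, f(x) is well approximated by f(p) + J(p)(x - p).

import numpy as np

def f(x):
    # Example map f: R^2 -> R^2 (an illustrative choice)
    return np.array([x[0] ** 2 + x[1], np.sin(x[0]) * x[1]])

def numerical_jacobian(f, x, eps=1e-6):
    # Approximate the m x n Jacobian of f at x by central differences.
    x = np.asarray(x, dtype=float)
    m = f(x).size
    J = np.zeros((m, x.size))
    for j in range(x.size):
        step = np.zeros_like(x)
        step[j] = eps
        J[:, j] = (f(x + step) - f(x - step)) / (2 * eps)
    return J

p = np.array([1.0, 2.0])
J = numerical_jacobian(f, p)
dx = np.array([1e-3, -2e-3])

print(f(p + dx))      # exact value
print(f(p) + J @ dx)  # linear approximation, agrees to ~1e-6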

Jacobian determinant

If m = n, then f is a map from n-dimensional space to n-dimensional space, and its Jacobian matrix is a square matrix. We can then take its determinant, called the Jacobian determinant. The Jacobian determinant at a given point provides important information about how f behaves near that point. For example, if the Jacobian determinant of a continuously differentiable function f is nonzero at a point p, then f has an inverse function near that point; this is the inverse function theorem. Furthermore, if the Jacobian determinant at p is positive, f preserves orientation at p; if it is negative, f reverses orientation. The absolute value of the Jacobian determinant gives the factor by which f scales volume at p, which is why it appears in integration by substitution (the change-of-variables method).

The orientation question can be understood by analogy: consider an object moving uniformly in the plane. If a force F acts in the same direction as the motion (same orientation), the object speeds up; by analogy, the derivative of speed, the acceleration, is positive. If F acts in the opposite direction, the object slows down, and the acceleration is negative.
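As a concrete check (the polar-coordinate map below is an illustrative choice): for (r, theta) -> (r cos theta, r sin theta), the Jacobian determinant is r, which is positive (orientation preserved) and is exactly the area-scaling factor in dx dy = r dr dtheta.

import numpy as np

r, theta = 2.0, np.pi / 6

# Jacobian of (r, theta) -> (x, y) = (r cos(theta), r sin(theta))
J = np.array([
    [np.cos(theta), -r * np.sin(theta)],
    [np.sin(theta),  r * np.cos(theta)],
])

# det J = r (cos^2 + sin^2) = r > 0: orientation preserved,
# and |det J| = r is the local area-scaling factor.
print(np.linalg.det(J))  # ~2.0, equal to r
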
3. Hessian matrix:

In fact, the Hessian matrix is the Jacobian matrix of the gradient vector g(x) = ∇f(x) with respect to the argument x.

In mathematics, the Hessian matrix (or simply the Hessian) is the square matrix of second-order partial derivatives of a real-valued function of a vector argument. The Hessian matrix is used in large-scale optimization problems solved with Newton's method.
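In formula form (the standard definition; any formula image from the original is missing):

\[ H(f)_{ij} = \frac{\partial^2 f}{\partial x_i \partial x_j}, \qquad H(f) = \begin{pmatrix} \frac{\partial^2 f}{\partial x_1^2} & \cdots & \frac{\partial^2 f}{\partial x_1 \partial x_n} \\ \vdots & \ddots & \vdots \\ \frac{\partial^2 f}{\partial x_n \partial x_1} & \cdots & \frac{\partial^2 f}{\partial x_n^2} \end{pmatrix} \]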

Application of the Hessian matrix in Newton's method

Generally speaking, Newton's method is mainly used in two settings: (1) finding the roots of equations; (2) optimization.

1) Solving equations

Not all equations have a closed-form root formula, and when one exists it may be too complex to be practical. Newton's method solves such equations iteratively.

The principle is to use the Taylor formula: expand f at x0 to first order, i.e. f(x) ≈ f(x0) + (x − x0)f′(x0).

Set f(x) = 0, i.e. f(x0) + (x − x0)f′(x0) = 0, and solve to get x = x1 = x0 − f(x0)/f′(x0). Since this is only the first-order Taylor expansion, f(x) ≈ f(x0) + (x − x0)f′(x0) holds approximately rather than exactly, so the x1 obtained here does not make f(x1) = 0; we can only say that f(x1) is closer to 0 than f(x0). This naturally suggests an iterative solution: x(n+1) = xn − f(xn)/f′(xn). Under suitable conditions the iteration converges to a point x∗ with f(x∗) = 0. The whole process is as follows:
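A minimal Python sketch of this iteration (the example equation x^2 − 2 = 0 and the starting point are illustrative assumptions, not from the original):

def newton_root(f, fprime, x0, tol=1e-10, max_iter=50):
    # Iterate x <- x - f(x) / f'(x) until the step is tiny.
    x = x0
    for _ in range(max_iter):
        step = f(x) / fprime(x)
        x -= step
        if abs(step) < tol:
            break
    return x

# Example: solve x^2 - 2 = 0, i.e. approximate sqrt(2).
root = newton_root(lambda x: x * x - 2, lambda x: 2 * x, x0=1.0)
print(root)  # ~1.41421356...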

2) Optimization

In optimization problems, linear programs can be solved with the simplex method (or fixed-point algorithms), but for nonlinear optimization Newton's method provides one approach. Suppose the task is to optimize an objective function f. Finding the extrema of f can be transformed into solving for the zeros of its derivative, f′ = 0, so the optimization problem can be viewed as an equation-solving problem (f′ = 0).
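In the multivariate case the condition is ∇f = 0, and since the Hessian is the Jacobian of the gradient (as noted above), the same Newton iteration becomes x(n+1) = xn − H(xn)⁻¹ ∇f(xn). A minimal sketch (the quadratic objective below is an illustrative assumption; an invertible, well-conditioned Hessian near the minimum is assumed):

import numpy as np

def newton_minimize(grad, hess, x0, tol=1e-10, max_iter=50):
    # Newton step: solve H(x) step = grad(x), then x <- x - step.
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        step = np.linalg.solve(hess(x), grad(x))
        x -= step
        if np.linalg.norm(step) < tol:
            break
    return x

# Example: f(x, y) = (x - 1)^2 + 10 (y + 2)^2, minimum at (1, -2).
grad = lambda v: np.array([2 * (v[0] - 1), 20 * (v[1] + 2)])
hess = lambda v: np.array([[2.0, 0.0], [0.0, 20.0]])
print(newton_minimize(grad, hess, [0.0, 0.0]))  # -> [ 1. -2.]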
