Compared with gradient descent, Newton's method has a faster convergence rate: it is second-order convergent, approximating the objective locally by a quadratic (elliptical) surface rather than a plane, so gradient descent can be viewed as the special case of Newton's method that keeps only the first-order model. For a convex quadratic optimization problem, Newton's method reaches the optimal solution in a single iteration.
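The one-iteration claim for quadratics can be checked numerically. The sketch below uses a hypothetical quadratic objective $f(x) = \frac{1}{2}x^T A x - b^T x$ (not from the text; its gradient is $Ax - b$ and its Hessian is the constant matrix $A$) and shows that a single Newton step from an arbitrary starting point lands exactly on the minimizer $A^{-1}b$:

```python
import numpy as np

# Hypothetical quadratic objective f(x) = 1/2 x^T A x - b^T x.
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])          # symmetric positive definite Hessian
b = np.array([1.0, -1.0])

def grad(x):
    return A @ x - b                # gradient of the quadratic

x0 = np.array([10.0, -7.0])         # arbitrary starting point
# One Newton step: x1 = x0 - H^{-1} g, solved as a linear system.
x1 = x0 - np.linalg.solve(A, grad(x0))

x_star = np.linalg.solve(A, b)      # exact minimizer A^{-1} b
print(np.allclose(x1, x_star))      # True: one step reaches the optimum
```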
First, the definition of the unconstrained optimization problem (see Appendix B of *Statistical Learning Methods*): $\min_{x \in \mathbb{R}^n} f(x)$
where $x^*$ denotes the minimum point of the objective function.
Assume that $f(x)$ has continuous second-order partial derivatives. If the value at the $k$-th iteration is $x^{(k)}$, the second-order Taylor expansion of the objective function at that point is:

$$f(x) = f(x^{(k)}) + g_k^T (x - x^{(k)}) + \frac{1}{2} (x - x^{(k)})^T H(x^{(k)}) (x - x^{(k)}) \qquad \cdots \text{(Eq. 1)}$$
where $g_k = \nabla f(x^{(k)})$ is the gradient of $f(x)$ at the point $x^{(k)}$, and $H(x^{(k)})$ is the Hessian matrix of $f(x)$ evaluated at $x^{(k)}$, whose entries are the second-order mixed partial derivatives of $f$. The Hessian matrix is defined as:

$$H(x) = \left[ \frac{\partial^2 f}{\partial x_i \partial x_j} \right]_{n \times n}$$
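Minimizing the quadratic model in (Eq. 1) gives the Newton step as the solution of the linear system $H(x^{(k)})\,d = -g_k$. A minimal sketch of the resulting iteration, using a hypothetical test function $f(x, y) = x^4 + y^2$ (not from the text) with its gradient and Hessian written out by hand:

```python
import numpy as np

# Hypothetical test function f(x, y) = x^4 + y^2, minimized at (0, 0).
def f(x):
    return x[0] ** 4 + x[1] ** 2

def grad(x):
    return np.array([4 * x[0] ** 3, 2 * x[1]])

def hess(x):
    return np.array([[12 * x[0] ** 2, 0.0],
                     [0.0,            2.0]])

x = np.array([1.0, 1.0])            # starting point
for _ in range(50):
    g = grad(x)
    if np.linalg.norm(g) < 1e-10:   # stop when the gradient vanishes
        break
    # Newton step: solve H(x) d = -g instead of forming H^{-1} explicitly.
    x = x + np.linalg.solve(hess(x), -g)

print(x)                            # converges toward the minimizer (0, 0)
```

Solving the linear system with `np.linalg.solve` rather than inverting the Hessian is the standard choice: it is cheaper and numerically more stable.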