Purpose of writing : Many machine learning algorithms involve iteratively solving for the extreme values of unconstrained functions, so the following notes are put together.
Gradient descent is a common algorithm for solving unconstrained optimization problems, and it has the advantage of being simple to implement.
Unconstrained optimization problem : Because the problem has no constraints, one could try a naive heuristic: keep guessing values in the solution space, and eventually the best one will be found. This already contains the idea of iteration, but such blind trial-and-error has a very high time complexity. Combine this with a known fact: a function increases fastest along the direction of its gradient. So if a minimum is required, why not move in the direction opposite to the gradient? That is the idea of the gradient descent algorithm: as its name implies, it is an optimization algorithm that iterates along the direction of gradient descent. The unconstrained problem to be solved has the form min_x f(x).
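As a quick numerical check of the descent-direction idea (a minimal sketch: the quadratic function, the sample point, and the step size below are made-up illustrations, not taken from the text), stepping against the gradient lowers the function value while stepping along it raises it:

import numpy as np

# Illustrative function (not from the text): f(x) = x1^2 + 3*x2^2.
f = lambda x: x[0] ** 2 + 3 * x[1] ** 2
grad_f = lambda x: np.array([2 * x[0], 6 * x[1]])

x = np.array([1.0, 1.0])
step = 0.1
print(f(x))                        # 4.0 at the starting point
print(f(x - step * grad_f(x)))     # 1.12: moving against the gradient decreases f
print(f(x + step * grad_f(x)))     # 9.12: moving along the gradient increases f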
Since it is an iterative algorithm, the initial point x^(0) has to be chosen by us; the iterates are then updated so that the value of the objective function keeps decreasing until convergence. Assuming the k-th iterate is x^(k), the (k+1)-th iterate is

x^(k+1) = x^(k) + λ_k p_k,  with p_k = -∇f(x^(k)),

where λ_k is the step size, determined by a one-dimensional search such that f(x^(k) + λ_k p_k) = min_{λ≥0} f(x^(k) + λ p_k).
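To show how the update rule and the one-dimensional search fit together, here is a small Python sketch; the toy objective, the backtracking rule standing in for the one-dimensional search, and all parameter values are my own illustrative assumptions:

import numpy as np

def gradient_descent(f, grad_f, x0, eps=1e-6, max_iter=1000):
    # Minimize f from x0 by iterating x <- x + lam * p with p = -grad_f(x);
    # lam is chosen by a simple backtracking rule standing in for the
    # one-dimensional search described above.
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad_f(x)
        if np.linalg.norm(g) < eps:      # stop once the gradient is small enough
            break
        p = -g                           # steepest-descent direction
        lam = 1.0
        # Shrink lam until the function value decreases sufficiently.
        while f(x + lam * p) > f(x) - 0.5 * lam * np.dot(g, g):
            lam *= 0.5
        x = x + lam * p                  # x^(k+1) = x^(k) + lam_k * p_k
    return x

# Toy problem: f(x) = (x1 - 1)^2 + (x2 + 2)^2, minimized at (1, -2).
f = lambda x: (x[0] - 1) ** 2 + (x[1] + 2) ** 2
grad_f = lambda x: np.array([2 * (x[0] - 1), 2 * (x[1] + 2)])
print(gradient_descent(f, grad_f, x0=[5.0, 5.0]))    # roughly [1., -2.]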
Example
This iteration raises the question of when it should terminate, so termination conditions are given: (1) when the norm of the gradient is smaller than a small positive threshold set beforehand, stop iterating (this corresponds to case one in the example); (2) when the difference between the function values of two successive iterations is smaller than the threshold, likewise stop iterating (case two in the example); (3) when the difference between the iterates themselves over two successive steps is smaller than the threshold, stop iterating (case three in the example). You will notice the effect of the step size on the iteration, especially in case three: if the step size is too large, the iterates overshoot and may oscillate instead of converging.
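The three stopping tests can be written down directly, and the step-size warning can be seen on a one-dimensional toy problem. This is only a sketch: the threshold eps, the function x^2, and the step sizes 0.1 and 1.1 are my own illustrative choices.

import numpy as np

def should_stop(g, f_prev, f_curr, x_prev, x_curr, eps=1e-6):
    # Stop as soon as any of the three conditions above holds.
    cond1 = np.linalg.norm(g) < eps                   # (1) gradient close to zero
    cond2 = abs(f_prev - f_curr) < eps                # (2) function value barely changes
    cond3 = np.linalg.norm(x_prev - x_curr) < eps     # (3) iterate barely moves
    return cond1 or cond2 or cond3

# Effect of the step size on f(x) = x^2 (gradient 2x) with a fixed step:
# 0.1 shrinks the iterate toward 0, while 1.1 overshoots and diverges.
for lam in (0.1, 1.1):
    x = 1.0
    for _ in range(50):
        x = x - lam * 2 * x
    print(lam, x)    # 0.1 -> about 1e-5, 1.1 -> magnitude around 9e3 and growing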
Summary : Choose an initial value, compute the gradient, and iterate using the current value and its gradient.
Note : When the objective function is convex (as in this example), the solution found by gradient descent is the global optimum; in general, however, the solution is not guaranteed to be globally optimal, and the rate of convergence is not necessarily fast.
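As a small illustration of that caveat (the quartic function, starting points, fixed step size, and iteration count below are illustrative choices, not from the reference), the same iteration can land in different local minima of a non-convex function depending on the starting point:

# Non-convex example: f(x) = x^4 - 3x^2 + x has two local minima,
# so plain gradient descent reaches different ones from different starts.
grad = lambda x: 4 * x ** 3 - 6 * x + 1
for x0 in (2.0, -2.0):
    x = x0
    for _ in range(1000):
        x = x - 0.01 * grad(x)
    print(x0, "->", round(x, 3))    # 2.0 -> about 1.131, -2.0 -> about -1.301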
Reference : Statistical Learning Methods, Hangyuan Li