Coordinate Ascent vs. Gradient Descent


Coordinate Ascent

For a detailed explanation, refer to the Stanford ML lecture notes: http://cs229.stanford.edu/notes/cs229-notes3.pdf


Gradient Descent

For more information, refer to Andrew Ng's handouts.

Now let's look at the similarities and differences between the two algorithms:

1. Coordinate ascent method: coordinate ascent and coordinate descent can be regarded as a pair; coordinate ascent is used to solve max optimization problems, while coordinate descent is used for min optimization problems, but the steps of the two are similar and the underlying principle is the same.
For example, consider the problem max f(x_1, x_2, ..., x_n), where each x_i is an independent variable. If the coordinate ascent method is applied, its steps are:
1. First choose an initial point, e.g. x_0 = (x_1, x_2, ..., x_n);
2. for dim = 1 to n:
       fix every x_i with i != dim;
       find the x_dim that maximizes f, treating x_dim as the only free variable;
   end
3. Repeat step 2 until the value of f stops changing or changes very little.

Summary: the key point is that only one dimension x_i is changed at a time, while all other dimensions are held fixed at their current values; iterating this loop eventually yields the optimal solution.
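To make the loop concrete, here is a minimal Python sketch of coordinate ascent. The 1-D maximization over each x_dim is not specified above, so a simple grid search over candidate values is assumed here; the function name, grid range, tolerance, and iteration cap are illustrative choices, not part of the original algorithm.

import numpy as np

def coordinate_ascent(f, x0, grid=None, tol=1e-8, max_iter=100):
    """Maximize f by optimizing one coordinate at a time."""
    x = np.asarray(x0, dtype=float).copy()
    if grid is None:
        # Candidate values tried for each coordinate (assumed search range).
        grid = np.linspace(-10.0, 10.0, 2001)
    f_old = f(x)
    for _ in range(max_iter):
        for dim in range(len(x)):                  # step 2: sweep each dimension
            scores = []
            for v in grid:                         # 1-D maximization over x[dim]
                x[dim] = v
                scores.append(f(x))
            x[dim] = grid[int(np.argmax(scores))]  # keep the best value found
        f_new = f(x)
        if abs(f_new - f_old) < tol:               # step 3: stop when f stabilizes
            break
        f_old = f_new
    return x

# Example: maximize f(x1, x2) = -(x1 - 1)^2 - (x2 + 2)^2, maximized at (1, -2).
f = lambda x: -(x[0] - 1.0)**2 - (x[1] + 2.0)**2
print(coordinate_ascent(f, x0=[0.0, 0.0]))         # prints approximately [ 1. -2.]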

2. The coordinate descent method follows the same procedure, except that in step 2, instead of the x_dim that maximizes f, it finds the x_dim that minimizes f.
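Since minimizing f is the same as maximizing -f, coordinate descent can reuse the sketch above; this wrapper is again only illustrative.

def coordinate_descent(f, x0, **kwargs):
    # Minimizing f is equivalent to maximizing -f (reuses the
    # coordinate_ascent sketch defined above).
    return coordinate_ascent(lambda x: -f(x), x0, **kwargs)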

3. The gradient descent method, also known as the steepest descent method, is likewise a descent method, but its main difference from coordinate descent lies in the choice of descent direction. In coordinate descent, each move is made along one coordinate axis, i.e. in a direction of the form (0, 0, 1, 0, 0) or (0, 0, 0, 1, 0) (assuming five dimensions). In gradient descent, the move is made along the gradient direction of the function at the current point; when the dimension is high, the advantage of gradient descent over coordinate descent becomes much more obvious. One of the starting points of the gradient descent method is that f decreases fastest in the direction opposite to the gradient of f. Put into words, this is easy to understand: keep searching forward along the negative gradient direction of f until the optimum is reached. Described in steps:
1. Choose an initial value, e.g. x_0 = (x_1, x_2, ..., x_n);
2. Compute the gradient f'(x_0) at this point;
3. Determine the next point: x_1 = x_0 - a * f'(x_0), where a > 0 is generally small (i.e. take a small step in the direction opposite to the gradient of f);
4. Compute f(x_1); if its difference from f(x_0) is within a given tolerance, stop; otherwise set x_0 = x_1 and repeat steps 2, 3, 4.
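These four steps translate directly into a short Python sketch. The step size a = 0.01, the tolerance, the iteration cap, and the quadratic test function are assumptions made for illustration, not values from the handout.

import numpy as np

def gradient_descent(f, grad_f, x0, a=0.01, tol=1e-8, max_iter=10000):
    """Minimize f given its gradient grad_f, starting from x0."""
    x = np.asarray(x0, dtype=float)
    f_old = f(x)                        # step 1: initial value
    for _ in range(max_iter):
        x = x - a * grad_f(x)           # steps 2-3: move against the gradient
        f_new = f(x)
        if abs(f_new - f_old) < tol:    # step 4: stop once f barely changes
            break
        f_old = f_new
    return x

# Example: minimize f(x1, x2) = (x1 - 1)^2 + (x2 + 2)^2, whose gradient is
# (2(x1 - 1), 2(x2 + 2)); the minimizer is (1, -2).
f = lambda x: (x[0] - 1.0)**2 + (x[1] + 2.0)**2
grad_f = lambda x: np.array([2.0 * (x[0] - 1.0), 2.0 * (x[1] + 2.0)])
print(gradient_descent(f, grad_f, x0=[0.0, 0.0]))   # prints approximately [ 1. -2.]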
