Deep Learning Notes: A Summary of Optimization Methods (BGD, SGD, Momentum, AdaGrad, RMSProp, Adam)

Source: Internet
Author: User

Deep Learning Notes (i): Logistic classification
Deep Learning Notes (ii): Simple neural networks, the backpropagation algorithm and its implementation
Deep Learning Notes (iii): Activation functions and loss functions
Deep Learning Notes: A summary of optimization methods
Deep Learning Notes (iv): The concept, structure and annotated code of recurrent neural networks
Deep Learning Notes (v): LSTM
Deep Learning Notes (vi): Encoder-decoder model and attention model

I have recently been reading Google's Deep Learning book and reached the part on optimization methods. Until now I had only a superficial understanding of the optimizers in TensorFlow, so after finishing that part I wrote up this summary. It mainly covers first-order gradient methods: SGD, Momentum, Nesterov Momentum, AdaGrad, RMSProp, and Adam. SGD, Momentum, and Nesterov Momentum use a manually specified learning rate, while AdaGrad, RMSProp, and Adam adjust the learning rate automatically.
Second-order methods are beyond my understanding for now.

BGD

BGD is batch gradient descent. Each training iteration uses the entire training set: the current parameters produce an estimated output ŷi for every input in the training set, each estimate is compared with the actual output yi, and the errors are collected and averaged. This average error is the basis for updating the parameters.
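
Written as a formula, the averaged error described above might look like the following. The names f (the model), L (the per-example error), and J (the average error) are assumptions introduced here for illustration; n is the number of training examples.

```latex
\hat{y}_i = f(x_i; \theta), \qquad
J(\theta) = \frac{1}{n} \sum_{i=1}^{n} L(\hat{y}_i, y_i)
```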

Specific implementation:
Required: learning rate ε, initial parameters θ
Each iteration:
1. Take the entire training set {x1, ..., xn} together with the corresponding outputs yi
2. Compute the gradient of the average error over the whole set and update the parameters: θ ← θ − ε·g, where g is that averaged gradient
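
The two steps can be sketched in plain Python. This is only a minimal illustration under assumptions that are not in the original notes: a one-dimensional linear model y ≈ w·x + b, a squared-error loss, made-up toy data, and the hypothetical function name `batch_gradient_descent`.

```python
def batch_gradient_descent(xs, ys, lr=0.1, steps=1000):
    """Fit y ~ w * x + b by batch gradient descent on the mean squared error.

    The linear model and squared-error loss are illustrative assumptions.
    """
    w, b = 0.0, 0.0  # initial parameters theta = (w, b)
    n = len(xs)
    for _ in range(steps):
        # Step 1: use the whole training set to compute predictions and errors.
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Step 2: average the per-example gradients of 0.5 * error**2 ...
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        # ... and move the parameters against the averaged gradient.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy data generated from y = 2x + 1 (made up for illustration).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = batch_gradient_descent(xs, ys)
print(round(w, 3), round(b, 3))
```

Because every update touches all n examples, the cost of one iteration grows with the size of the training set, which is the main practical drawback of BGD on large data.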
