We will inevitably run into a variety of problems when implementing back propagation. When such bugs arise, the cost function may still decrease with the number of iterations even though something is wrong in the middle, so how do we check whether our algorithm suffers from these problems?
Approximate expression of gradients
The derivative can be approximated numerically as ∂J/∂θ ≈ (J(θ + ε) − J(θ − ε)) / (2ε), using the two-sided (centered) difference rather than the one-sided difference, since it is more accurate. Usually ε is taken to be 10^-4; if it is chosen too small, floating-point round-off will cause a lot of trouble in the calculation.
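As a minimal sketch of the centered difference above (the function names here are illustrative, not from the course code):

```python
# Two-sided (centered) difference approximation of a derivative.
# J is any scalar cost function; eps is the small perturbation
# (the epsilon taken as 10^-4 in the text).
def numerical_derivative(J, theta, eps=1e-4):
    return (J(theta + eps) - J(theta - eps)) / (2 * eps)

# Example: J(theta) = theta^3, whose exact derivative is 3*theta^2.
J = lambda theta: theta ** 3
approx = numerical_derivative(J, 2.0)  # should be very close to 12.0
```

The centered difference has error on the order of ε², which is why it is preferred over the one-sided difference, whose error is on the order of ε.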
When θ is an unrolled vector, the approximate partial derivative of J(θ) with respect to θi is computed as ∂J/∂θi ≈ (J(θ1, …, θi + ε, …, θn) − J(θ1, …, θi − ε, …, θn)) / (2ε).
Approximating the derivatives with a for loop
Here θ is a vector of all the parameters in the neural network unrolled into one long vector, and we compute gradApprox by perturbing and differentiating with respect to each parameter separately.
The DVec we compute by back propagation contains the derivatives of all the parameters, and we judge whether our back propagation is correct by comparing gradApprox and DVec and checking that the two are approximately equal.
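The loop over parameters and the comparison against DVec could be sketched as follows; `grad_check` and the toy cost function are assumptions for illustration, and the analytic gradient passed in stands in for the DVec produced by back propagation:

```python
import numpy as np

def grad_check(J, grad, theta, eps=1e-4):
    """Compare an analytic gradient (playing the role of DVec)
    against the two-sided numerical approximation gradApprox."""
    grad_approx = np.zeros_like(theta)
    for i in range(theta.size):
        # Perturb only the i-th parameter in each direction.
        theta_plus = theta.copy()
        theta_plus[i] += eps
        theta_minus = theta.copy()
        theta_minus[i] -= eps
        grad_approx[i] = (J(theta_plus) - J(theta_minus)) / (2 * eps)
    dvec = grad(theta)
    # Relative difference between the two gradients; a value much
    # smaller than 1 (e.g. around 1e-7) suggests the analytic
    # gradient is correct.
    diff = np.linalg.norm(grad_approx - dvec) / (
        np.linalg.norm(grad_approx) + np.linalg.norm(dvec))
    return grad_approx, diff

# Toy check: J(theta) = sum(theta^2) has gradient 2*theta.
theta = np.array([1.0, -2.0, 3.0])
grad_approx, diff = grad_check(lambda t: np.sum(t ** 2),
                               lambda t: 2 * t, theta)
```

In a real network, θ would be all the weight matrices unrolled into one vector, and `grad` would be the back-propagation routine.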
Some issues to be aware of when implementing
After we have checked that back propagation is correct, we must turn off gradient checking before we start training. Back propagation computes the derivatives much more quickly than the numerical gradient algorithm does, so once we have verified that it is correct, we should disable the gradient-checking code before training the classifier; otherwise training will be extremely slow.
Summary
When we implement back propagation or another similarly complex gradient computation, we usually use the numerical gradient to verify that it is correct.
Neural Networks (12) -- Concrete implementation: how to verify the correctness of back propagation