Backpropagation is notoriously difficult to debug and get right, especially when the implementation contains many small, hard-to-notice errors. Here we describe a method for checking whether the derivatives computed by your code are correct. Using the gradient-checking procedure described here will greatly increase your confidence that your code is correct.
Suppose we want to minimize J(θ) as a function of θ. For this example, assume J : R → R, so θ is a single real number. One iteration of gradient descent is:

    θ := θ − α · dJ(θ)/dθ

where α is the learning rate. Suppose also that we have implemented a function g(θ) that is supposed to compute dJ(θ)/dθ, so that we perform the gradient-descent update θ := θ − α · g(θ). How can we verify that our implementation of g is correct?
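As a concrete sketch in Python, take the hypothetical objective J(θ) = (θ − 3)², whose true derivative is 2(θ − 3); the gradient-descent iteration then looks like this (the function names J and g are just illustrative choices, not part of any library):

```python
# Hypothetical 1-D example: minimize J(theta) = (theta - 3)^2.
def J(theta):
    return (theta - 3.0) ** 2

def g(theta):
    # Hand-written derivative dJ/dtheta; this is the function whose
    # correctness the gradient check below is meant to verify.
    return 2.0 * (theta - 3.0)

alpha = 0.1          # learning rate
theta = 0.0          # initial parameter value
for _ in range(100):
    theta = theta - alpha * g(theta)   # theta := theta - alpha * dJ/dtheta
```

After enough iterations, theta converges to the minimizer θ = 3 — but only because g happens to be correct, which is exactly what we want a systematic way to test.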
Recall the mathematical definition of the derivative:

    dJ(θ)/dθ = lim_{ε→0} [J(θ + ε) − J(θ − ε)] / (2ε)

Thus, for any specific value of θ, we can numerically approximate the derivative as:

    [J(θ + EPSILON) − J(θ − EPSILON)] / (2 · EPSILON)

In practice, we set EPSILON to a small constant, say around 1e-4. (EPSILON could in principle be made far smaller, but values that are too small run into floating-point rounding errors; a value of about 1e-4 is usually sufficient.)

Now, given a function g(θ) that supposedly computes dJ(θ)/dθ, we can check its correctness by verifying that:

    g(θ) ≈ [J(θ + EPSILON) − J(θ − EPSILON)] / (2 · EPSILON)

How closely these two values should agree depends on the specific form of J. But assuming EPSILON = 1e-4, you will usually find that the left-hand and right-hand sides of the formula agree to at least four significant digits (and often many more).
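A minimal sketch of this scalar check in Python, using a hypothetical objective J(θ) = θ³ whose analytic derivative is 3θ²:

```python
EPSILON = 1e-4

def J(theta):
    return theta ** 3          # hypothetical objective

def g(theta):
    return 3.0 * theta ** 2    # analytic derivative we want to verify

theta = 1.5
# Two-sided (central) difference approximation of dJ/dtheta.
numeric = (J(theta + EPSILON) - J(theta - EPSILON)) / (2.0 * EPSILON)
analytic = g(theta)
# The two values should agree to at least four significant digits.
```

The central-difference error shrinks like EPSILON², which is why the two-sided formula is preferred over the one-sided (J(θ + EPSILON) − J(θ)) / EPSILON, whose error shrinks only like EPSILON.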
Now, consider the case where θ ∈ R^n is a vector rather than a single real number (so we have n parameters to learn), and J is a function of θ. In our neural networks we used J(W, b), but we can imagine "unrolling" all of those parameters into one very long vector θ. We now generalize the derivative-checking procedure to the case where θ is a vector.
Suppose we have a function g_i(θ) that supposedly computes ∂J(θ)/∂θ_i, and we want to check that g_i outputs correct derivative values. Let

    θ^(i+) = θ + EPSILON × e_i

where e_i is the i-th basis vector (a vector of the same dimension as θ, with a 1 in the i-th position and 0 everywhere else). So θ^(i+) is identical to θ except that its i-th element has been incremented by EPSILON. Similarly, let θ^(i−) = θ − EPSILON × e_i. We can then numerically verify the correctness of g_i(θ) by checking, for each i:

    g_i(θ) ≈ [J(θ^(i+)) − J(θ^(i−))] / (2 × EPSILON)
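This component-wise check can be sketched in Python with NumPy; the quadratic objective below is a hypothetical stand-in for J, chosen because its gradient ∂J/∂θ_i = θ_i is known in closed form:

```python
import numpy as np

EPSILON = 1e-4

def J(theta):
    return 0.5 * np.sum(theta ** 2)   # hypothetical objective

def g(theta):
    return theta                      # its analytic gradient

theta = np.array([1.0, -2.0, 0.5])
numgrad = np.zeros_like(theta)
for i in range(theta.size):
    e_i = np.zeros_like(theta)        # i-th basis vector
    e_i[i] = 1.0
    # theta^(i+) and theta^(i-) differ from theta only in element i.
    numgrad[i] = (J(theta + EPSILON * e_i) - J(theta - EPSILON * e_i)) / (2.0 * EPSILON)

# Relative difference between numerical and analytic gradients;
# a very small value indicates the analytic gradient is correct.
diff = np.linalg.norm(numgrad - g(theta)) / np.linalg.norm(numgrad + g(theta))
```

Note that each of the n components requires two evaluations of J, so this check is expensive; it is a debugging tool, not something to run during actual training.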
When training a neural network with backpropagation, a correct implementation yields gradient-descent updates of the form:

    W := W − α · ∂J(W, b)/∂W,    b := b − α · ∂J(W, b)/∂b

This update appears in the pseudo-code for gradient descent in the backpropagation (BP) algorithm for the sparse autoencoder. In practice, we use the numerical derivative described above to verify that the derivatives our program computes via backpropagation are actually correct.
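As an illustration (a hypothetical toy model, not the sparse autoencoder itself), here is a single sigmoid unit with squared-error loss, comparing the chain-rule gradient that backpropagation would compute against the numerical one:

```python
import numpy as np

EPSILON = 1e-4

x = np.array([0.5, -1.0, 2.0])   # one training input (made up for the example)
y = 0.3                          # its target value

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss(w):
    # Squared-error loss of a single sigmoid unit.
    return 0.5 * (sigmoid(w @ x) - y) ** 2

def backprop_grad(w):
    # Analytic gradient via the chain rule (what backprop computes):
    # dL/dw = (a - y) * a * (1 - a) * x, where a = sigmoid(w . x).
    a = sigmoid(w @ x)
    return (a - y) * a * (1.0 - a) * x

w = np.array([0.1, 0.2, -0.3])
numgrad = np.zeros_like(w)
for i in range(w.size):
    e_i = np.zeros_like(w)
    e_i[i] = 1.0
    numgrad[i] = (loss(w + EPSILON * e_i) - loss(w - EPSILON * e_i)) / (2.0 * EPSILON)

# Once numgrad and backprop_grad(w) agree, disable the (slow) numerical
# check and train with the analytic gradient at full speed.
```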
Source: http://deeplearning.stanford.edu/wiki/index.php/Gradient_checking_and_advanced_optimization
Gradient checking for the sparse autoencoder