What value should θ be initialized to?
With logistic regression it is fine to initialize θ to 0, but in a neural network initializing every θ to 0 does not work.
What happens if θ is initialized to 0: the problem of symmetric weights
When θ is initialized to 0, the two weights on the blue lines are equal, the two weights on the red lines are equal, and the two weights on the green lines are also equal, so a1^(2) = a2^(2) and likewise δ1^(2) = δ2^(2). The gradient updates to the θ values on the two blue lines are then equal as well, so after each update a1^(2) still equals a2^(2). Even with many hidden units, they all compute the same value; to the output layer they all look like one and the same feature, which is highly redundant. This is called the problem of symmetric weights.
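To see the symmetry concretely, here is a minimal NumPy sketch (the tiny 3-2-1 network, input values, and label are made up for illustration): with all weights at zero, both hidden units produce the same activation and receive the same δ, so they stay identical after every update.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Tiny network: 3 inputs, 2 hidden units, 1 output,
# with the (bad) symmetric initialization θ = 0.
Theta1 = np.zeros((2, 3))  # input -> hidden weights
Theta2 = np.zeros((1, 2))  # hidden -> output weights

x = np.array([0.5, -1.2, 0.3])  # an arbitrary example input
y = 1.0                         # its label

# Forward pass: both hidden units see identical weights,
# so a2[0] == a2[1].
a2 = sigmoid(Theta1 @ x)
a3 = sigmoid(Theta2 @ a2)

# Backward pass: the deltas of the two hidden units are
# identical too, so the gradient keeps the weights equal.
delta3 = a3 - y
delta2 = (Theta2.T @ delta3) * a2 * (1 - a2)

print(a2)      # two equal activations
print(delta2)  # two equal deltas
```

Because activations and deltas stay pairwise equal, no amount of gradient descent can make the two hidden units learn different features.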
How θ should be initialized: symmetry breaking
rand(10,11) creates a 10×11 matrix in which each entry is a random number between 0 and 1; rescaling it, e.g. Theta = rand(10,11)*(2*INIT_EPSILON) - INIT_EPSILON, gives random values in [-ε, ε]. This ε is unrelated to the ε we discussed for gradient checking; here it only denotes a small bound, so the initial values are random but close to 0.
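The same initialization can be sketched in NumPy (the bound 0.01 for INIT_EPSILON is an assumed example value, not prescribed by the text):

```python
import numpy as np

INIT_EPSILON = 0.01  # assumed small bound for the initial weights

# np.random.rand(10, 11) plays the role of Octave's rand(10,11):
# a 10x11 matrix of uniform values in [0, 1), rescaled so every
# entry lies in [-INIT_EPSILON, INIT_EPSILON).
Theta = np.random.rand(10, 11) * 2 * INIT_EPSILON - INIT_EPSILON

print(Theta.shape)               # (10, 11)
print(Theta.min(), Theta.max())  # both close to 0
```

Because every entry is drawn independently, no two weights are equal, which breaks the symmetry described above.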
Summary
- Break symmetry by initializing θ to a matrix of random numbers close to 0.
- First verify that back propagation (which computes the derivatives) is correct via gradient checking; once it checks out, turn the check off and run gradient descent or an advanced optimization algorithm to find the θ that minimizes the cost function.
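As a sketch of the gradient check mentioned in the summary (a generic two-sided finite-difference estimate; the function name and the toy cost J = Σθ² are my own illustration, not from the text):

```python
import numpy as np

def numerical_gradient(J, theta, eps=1e-4):
    """Two-sided finite-difference estimate of dJ/dtheta, used to
    verify that a backpropagation implementation is correct."""
    grad = np.zeros_like(theta)
    for i in range(theta.size):
        tp = theta.copy(); tp.flat[i] += eps
        tm = theta.copy(); tm.flat[i] -= eps
        grad.flat[i] = (J(tp) - J(tm)) / (2 * eps)
    return grad

# Check against a toy cost whose gradient we know analytically:
# J(theta) = sum(theta^2) has gradient 2*theta.
theta = np.array([1.0, -2.0, 0.5])
J = lambda t: np.sum(t ** 2)
approx = numerical_gradient(J, theta)
print(np.allclose(approx, 2 * theta, atol=1e-6))
```

In practice the analytic gradient comes from backpropagation; once the two agree, the numerical check is disabled because it is far too slow for training.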
Neural Network (13)--Concrete implementation: random initialization