LeCun's CNN has aroused my great interest. From today on, I will read LeCun's papers and post my practical results here. 20100419
After reading the paper "Generalization and Network Design Strategies", I worked out the derivation of the network structures and BP rules described in section 5, though I needed other books to do it. I had read the Chinese edition of "Neural Network Design" before and understood it at the time. Over these two days I am preparing to implement the five networks. 20100422
CNN source code is available online, but training on MNIST is slow. It has been 2 weeks and training has still not finished.
I've been reading a good book, On Intelligence, over the past few days. Some of its descriptions of vision are very similar to CNN. 20100625
I have some ideas after reading On Intelligence. The tragedy: a sudden power failure. The network I had been training for more than two weeks is gone.
Still, I have accumulated some experience and now have at least a rough understanding of BP.
Let's look at the problem of computing the MSE. The performance index F(x) = E[eᵀe] is the expectation of the squared error, while what is actually used in training is the approximation F̂(x) = e(k)ᵀe(k), the squared error at iteration k. This substitution needs a theoretical basis; it appears to go back to the Widrow-Hoff (LMS) algorithm, which adopted it first. The advantage is that learning can be carried out using only the current error and the current input. Does this method still converge to the global MSE minimum? When the learning rate of the LMS algorithm is small enough (satisfying a bound), it is guaranteed to converge to a minimum mean-square-error solution. For the derivation, see the convergence analysis on p. 173 of Neural Network Design; the extension to multi-layer networks is discussed on p. 228.
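As a sanity check on this substitution, here is a minimal Widrow-Hoff (LMS) sketch in NumPy. It is illustrative only: the names (W_true, P, alpha) and the toy linear problem are my own, and I pick the learning rate well inside the stability bound alpha < 1/lambda_max from the convergence analysis.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear problem: targets t(k) = W_true p(k), to be learned by LMS.
W_true = np.array([[1.0, -2.0]])
P = rng.normal(size=(200, 2))           # input vectors p(k)
T = P @ W_true.T                        # targets t(k)

# Stability bound: alpha must be below 1/lambda_max, where lambda_max is
# the largest eigenvalue of the input correlation matrix R = E[p p^T].
R = (P.T @ P) / len(P)
alpha = 0.25 / np.linalg.eigvalsh(R).max()

W = np.zeros((1, 2))
for p, t in zip(P, T):
    e = t - W @ p                       # only the current error e(k) ...
    W += 2 * alpha * np.outer(e, p)     # ... and current input p(k) are used

print(np.round(W, 3))                   # W approaches W_true
```

Each update touches only e(k) and p(k), which is exactly why the approximation F̂(x) = e(k)ᵀe(k) is so convenient in practice.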
The next issues are network initialization and learning rate. The initial weights cannot be zero (because of the symmetry of a multi-layer network) and cannot be too large either; a small learning rate avoids oscillation during convergence. This is discussed on p. 230.
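The symmetry problem can be shown in a few lines. The sketch below (my own illustration, a tiny 1-hidden-layer tanh network with made-up inputs) computes one backprop gradient: with identical initial weights the two hidden units receive identical gradients and can never differentiate, while a small random initialization breaks the tie.

```python
import numpy as np

rng = np.random.default_rng(1)

def bp_grads(W1, W2, p, t):
    """One backprop pass for a tiny 1-hidden-layer tanh network."""
    a1 = np.tanh(W1 @ p)             # hidden-layer output
    a2 = W2 @ a1                     # linear output layer
    s2 = -2 * (t - a2)               # output-layer sensitivity
    s1 = (1 - a1**2) * (W2.T @ s2)   # hidden-layer sensitivity
    return np.outer(s1, p)           # gradient w.r.t. W1

p = np.array([0.5, -1.0]); t = np.array([1.0])

# Identical initial weights: both hidden units compute the same output
# and get the same gradient row, so they stay identical forever.
g_sym = bp_grads(np.ones((2, 2)), np.ones((1, 2)), p, t)

# Small random initialization breaks the symmetry.
g_rand = bp_grads(rng.uniform(-0.5, 0.5, (2, 2)),
                  rng.uniform(-0.5, 0.5, (1, 2)), p, t)

print(np.allclose(g_sym[0], g_sym[1]))     # True: gradient rows identical
print(np.allclose(g_rand[0], g_rand[1]))   # False: rows differ
```

All-zero weights are even worse: the sensitivity backpropagated to the first layer is W2ᵀs2 = 0, so the first layer gets no gradient at all.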
There is C++ CNN source code on the Web; http://www.codeproject.com/KB/library/NeuralNetRecognition.aspx is very good.
This code only recognizes 10 classes; it can recognize 26 only after modification. The main changes are in mnistdoc.cpp, which also contains the definition of the network structure. 20100706
With the output changed to 26 classes, recognition works well, reaching about 90%.
If the output is changed to 36 classes (uppercase letters and digits), it no longer works: by the time the learning rate factor (ETA) has dropped to 0.00063, the misrecognition rate is still more than 60%. Perhaps an excessively low ETA causes overfitting.
2011.4.19
I found a CNN implementation in OpenCV 2.2. CNN's convolution layer is exciting (learning feature maps similar to sparse coding is, after all, more general). I will write a simple example to see the effect of CNN; I don't know how its speed compares with the code on CodeProject.
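Before digging into any library code, it helps to see what a convolution layer actually computes. The sketch below is my own illustration (not the OpenCV API; conv2d_valid, the toy image, and the hand-picked kernel are assumptions): a kernel slides over the image and produces a feature map, and in a CNN such kernels are learned rather than hand-designed.

```python
import numpy as np

def conv2d_valid(img, kernel):
    """Valid-mode 2-D cross-correlation, as in a CNN convolution layer."""
    kh, kw = kernel.shape
    oh, ow = img.shape[0] - kh + 1, img.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(img[i:i+kh, j:j+kw] * kernel)
    return out

# Toy "image": a vertical bright bar on a dark background.
img = np.zeros((6, 6))
img[:, 2:4] = 1.0

# Hand-picked vertical-edge kernel; a CNN would learn kernels like this.
kernel = np.array([[-1.0, 0.0, 1.0]] * 3)

fmap = np.tanh(conv2d_valid(img, kernel))   # feature map after squashing
print(fmap.shape)                            # (4, 4)
```

The left edge of the bar produces strong positive responses in the feature map and the right edge strong negative ones, which is the kind of edge-like feature map a trained convolution layer tends to develop.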