Introduction: Yesterday we talked about convolutional neural networks; today I bring you a paper that can be summed up as PCA + CNN = PCANet. Let me walk you through this article.
Paper: PCANet: A Simple Deep Learning Baseline for Image Classification?
Paper Address: https://core.ac.uk/download/pdf/25018742.pdf
Article code: https://github.com/Ldpe2G/PCANet
1 Summary
I will not repeat this part; it is all covered in my previous blog post: http://www.cnblogs.com/xiaohuahua108/p/7029128.html. The gist is that, compared with traditional hand-crafted feature extraction that does not use a convolutional neural network, PCANet extracts features in a way that is convenient, fast, and accurate. The core idea is to use PCA to learn the convolution kernels of a convolutional network, and then to re-encode the pixel values through binarization and hashing. The PCANet feature-extraction flowchart:
2 PCANet Network Structure
Suppose we have N training images of size m × n, and set the filter size at every stage to k1 × k2.
Below is a detailed diagram of the PCANet pipeline:
2.1 First Layer
Around each pixel of an image we take a k1 × k2 patch. Since every pixel is sampled, an m × n image yields mn patches (after padding). Vectorizing every patch of the i-th training image gives

X_i = [x_{i,1}, x_{i,2}, ..., x_{i,mn}] ∈ R^(k1k2 × mn),

where x_{i,j} is the j-th vectorized patch. As for why this corresponds to convolution: slide a k1 × k2 window across the image one pixel at a time, and the patches you collect are exactly the receptive fields a convolution would see.
Then comes the essential centering step: subtract the patch mean from each patch, giving X̄_i = [x̄_{i,1}, ..., x̄_{i,mn}]. Doing this for all N training images and concatenating, the whole training set becomes

X = [X̄_1, X̄_2, ..., X̄_N] ∈ R^(k1k2 × Nmn).
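As a sketch of the two steps above (patch collection, then per-patch mean removal), here is a minimal NumPy version; `extract_patches` is a hypothetical helper name, and I assume odd k1, k2 with zero padding so every pixel gets a patch:

```python
import numpy as np

def extract_patches(img, k1, k2):
    """Collect one k1 x k2 patch around every pixel of img (zero-padded),
    returning one vectorized patch per column: shape (k1*k2, m*n)."""
    m, n = img.shape
    p1, p2 = k1 // 2, k2 // 2
    padded = np.pad(img, ((p1, p1), (p2, p2)))
    cols = np.empty((k1 * k2, m * n))
    idx = 0
    for i in range(m):
        for j in range(n):
            cols[:, idx] = padded[i:i + k1, j:j + k2].ravel()
            idx += 1
    return cols

img = np.arange(25.0).reshape(5, 5)
X_i = extract_patches(img, 3, 3)                      # (9, 25)
# Centering: subtract each patch's own mean (column-wise mean removal)
X_i_bar = X_i - X_i.mean(axis=0, keepdims=True)
```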
Assume the number of filters in stage i is L_i. The purpose of PCA is to find a matrix V with orthonormal columns that minimizes the reconstruction error:

min_{V ∈ R^(k1k2 × L1)} ||X − V V^T X||_F^2,  subject to V^T V = I_{L1}.
The solution is classical principal component analysis: the columns of V are the first L1 eigenvectors of the scatter matrix X X^T. The corresponding PCA filters are therefore

W_l^1 = mat_{k1,k2}(q_l(X X^T)) ∈ R^(k1 × k2),  l = 1, 2, ..., L1,

where mat(·) reshapes a vector into a k1 × k2 matrix and q_l(X X^T) denotes the l-th principal eigenvector of X X^T.
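A minimal sketch of the filter-learning step, assuming X is the mean-removed patch matrix from above (`pca_filters` is a hypothetical helper name, not from the paper's code):

```python
import numpy as np

def pca_filters(X, L1, k1, k2):
    """Return the L1 PCA filters: the leading eigenvectors of the scatter
    matrix X X^T, each reshaped into a k1 x k2 kernel."""
    vals, vecs = np.linalg.eigh(X @ X.T)   # eigh: ascending eigenvalues
    leading = vecs[:, ::-1][:, :L1]        # keep the L1 largest
    return [leading[:, l].reshape(k1, k2) for l in range(L1)]

rng = np.random.default_rng(0)
X = rng.standard_normal((9, 500))
X -= X.mean(axis=0, keepdims=True)         # stand-in for centered patches
W = pca_filters(X, 4, 3, 3)
```

Because `eigh` returns orthonormal eigenvectors, the learned filters are unit-norm and mutually orthogonal, matching the constraint V^T V = I.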
2.2 Second Layer
The second layer is basically the same as the first. First, compute the l-th mapping output of the first PCA layer:

I_i^l = I_i * W_l^1,  i = 1, 2, ..., N,

where * denotes 2-D convolution.
Here, of course, I_i is zero-padded before the convolution so that I_i^l keeps the same size as the original image.
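The zero-padded, same-size filtering between stages can be sketched as follows (written as a correlation for simplicity; a true convolution just flips the kernel first, which does not matter for learned filters):

```python
import numpy as np

def conv_same(img, W):
    """'Same'-size 2-D filtering with zero padding, as used between PCANet
    stages: the output has the same shape as img."""
    k1, k2 = W.shape
    p1, p2 = k1 // 2, k2 // 2
    padded = np.pad(img, ((p1, p1), (p2, p2)))
    out = np.empty(img.shape, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = np.sum(padded[i:i + k1, j:j + k2] * W)
    return out

# Averaging filter on an all-ones image: interior stays 1, borders shrink
I_l = conv_same(np.ones((4, 4)), np.ones((3, 3)) / 9.0)
```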
Repeating the patch-collection, centering, and PCA steps on these outputs yields the second-level PCA filters W_ℓ^2, ℓ = 1, ..., L2. Each first-layer output image is then convolved with every second-level filter, and the results are binarized.
2.3 Output Layer
The second layer's outputs are combined into a binary hash code of the image, and the number of encoded bits equals the number of filters in the second layer.
The formula is:
T_i^l = Σ_{ℓ=1}^{L2} 2^(ℓ−1) H(I_i^l * W_ℓ^2).

This maps each pixel to an integer in [0, 2^{L2} − 1], i.e. 0–255 when L2 is set to 8, as is typical. The function H(·) is the Heaviside step function (1 for positive inputs, 0 otherwise), and L2 denotes the number of filters in the second layer.
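The hashing formula can be sketched directly (note the code indexes bits from 0 while the formula indexes from 1; `binary_hash` is a hypothetical helper name):

```python
import numpy as np

def binary_hash(maps):
    """Combine L2 real-valued response maps into one integer image:
    threshold each map at zero (the Heaviside step H) and treat the
    resulting L2 bits as a binary code per pixel."""
    T = np.zeros(maps[0].shape, dtype=np.int64)
    for l, O in enumerate(maps):                  # bit l = 2**l weight
        T += (2 ** l) * (O > 0).astype(np.int64)
    return T

# Two response maps -> per-pixel codes in [0, 3]
T = binary_hash([np.array([[1.0, -1.0]]),
                 np.array([[-2.0, 3.0]])])
```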
Each of the L1 hashed output matrices T_i^l is divided into B blocks; we compute the histogram (with 2^{L2} bins) of each block and then concatenate the block histograms. Concatenating over all L1 outputs finally gives the block-wise expansion histogram feature:

f_i = [Bhist(T_i^1), ..., Bhist(T_i^{L1})]^T ∈ R^((2^{L2}) L1 B).
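A sketch of the block histogram for one hashed image, assuming non-overlapping blocks (`block_hist` is a hypothetical helper name):

```python
import numpy as np

def block_hist(T, L2, block_shape):
    """Split hashed image T into non-overlapping blocks, histogram each
    block over the 2**L2 possible codes, and concatenate the histograms."""
    bh, bw = block_shape
    feats = []
    for i in range(0, T.shape[0] - bh + 1, bh):
        for j in range(0, T.shape[1] - bw + 1, bw):
            hist, _ = np.histogram(T[i:i + bh, j:j + bw],
                                   bins=np.arange(2 ** L2 + 1))
            feats.append(hist)
    return np.concatenate(feats)

# L2 = 2 -> 4 possible codes; one 2x2 block containing each code once
f = block_hist(np.array([[0, 1], [2, 3]]), 2, (2, 2))
```

Switching to overlapping blocks would only change the loop strides; the per-block histogram itself is identical.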
Either overlapping or non-overlapping block patterns can be used for the histogram chunking, depending on the situation. The experimental results show that non-overlapping blocks suit face recognition, while overlapping blocks suit handwritten digit recognition, text recognition, object recognition, and so on. In addition, the histogram gives the features extracted by PCANet a degree of stability under certain transformations (e.g., small shifts).
3 Experimental Results
4 Summary
I think the strongest point of this paper is the use of PCA in the convolution layers, which gives the early convolutions a clear purpose and thereby improves classification performance. The idea is quite innovative.