Intensive reading notes: "PCANet: A Simple Deep Learning Baseline for Image Classification"

Tags: SVM

[

This article draws on the following blogs:

http://blog.csdn.net/orangehdc/article/details/37763933
http://my.oschina.net/Ldpe2G/blog/275922
http://blog.csdn.net/sheng_ai/article/details/39971599

]

References: [1] Tsung-Han Chan, Kui Jia, Shenghua Gao, Jiwen Lu, Zinan Zeng, and Yi Ma, "PCANet: A Simple Deep Learning Baseline for Image Classification?", 2014.

Paper link: http://arxiv.org/abs/1404.3606

MATLAB code: MATLAB codes for Download

C++ code: https://github.com/Ldpe2G/PCANet

Completed: 2014.10.26

I. Overview

The authors argue that the problem with the classical CNN is that parameter training takes too long and requires special tuning skills.

They therefore set out to find a network model that is simpler to train and that adapts to different tasks and data types; this model is PCANet. The authors also believe this basic algorithm can serve as a baseline for further research on deep learning networks.

In short, PCANet is a simplified deep learning model based on the CNN. The biggest difference between PCANet and a CNN is that the convolution kernels are computed directly by PCA rather than learned through iterative back-propagation as in a CNN.

II. Training Method

A. Feature extraction

1) First stage:

1. Choose a k1*k2 window (typically 3*3, 5*5, or 7*7) and slide it over each image to collect local features. From each m*n image, the sliding window extracts (m-k1+1)*(n-k2+1) patches of size k1*k2. (Note: the paper counts mn patches while the code uses (m-k1+1)*(n-k2+1); for convenience of writing, the count is written as mn below.) The patches are vectorized and arranged into a matrix with k1k2 rows and mn columns, each column representing one local feature patch.
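As a concrete illustration, the sliding-window step can be sketched in NumPy. This is a minimal sketch: the function name `extract_patches` and the toy 5*5 image are hypothetical, not taken from the paper's code.

```python
import numpy as np

def extract_patches(img, k1, k2):
    """Slide a k1 x k2 window over an m x n image and return each patch,
    vectorized, as one column of a (k1*k2) x ((m-k1+1)*(n-k2+1)) matrix."""
    m, n = img.shape
    cols = []
    for i in range(m - k1 + 1):
        for j in range(n - k2 + 1):
            cols.append(img[i:i + k1, j:j + k2].ravel())
    return np.array(cols).T

img = np.arange(25, dtype=float).reshape(5, 5)   # toy 5x5 "image"
X = extract_patches(img, 3, 3)
print(X.shape)  # (9, 9): k1*k2 = 9 rows, (5-3+1)*(5-3+1) = 9 patch columns
```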

2. Subtract from each column of the above matrix its mean; this completes the feature extraction for a single image.

3. Perform the above operations on all N images and place the resulting feature matrices side by side, obtaining a new data matrix X in which each column contains k1k2 elements and there are N*mn columns in total.

4. Perform PCA on the matrix X and keep the first L1 eigenvectors as filters.

5. Rearrange each of these L1 eigenvectors (each containing k1k2 elements) into a patch, which gives L1 windows of size k1*k2.
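Steps 2-5 can be sketched together in NumPy. All sizes below (N, mn, L1, and the random patch matrices) are hypothetical stand-ins; the PCA here is done via the eigendecomposition of X X^T, which is one standard way of obtaining the leading eigenvectors.

```python
import numpy as np

k1 = k2 = 3
mn, N, L1 = 100, 10, 8                      # hypothetical sizes
rng = np.random.default_rng(0)
# One (k1*k2) x mn patch matrix per image, as produced in step 1.
patch_mats = [rng.standard_normal((k1 * k2, mn)) for _ in range(N)]

# Steps 2-3: subtract each column's mean, then stack all images side by side.
X = np.hstack([P - P.mean(axis=0) for P in patch_mats])   # (k1*k2) x (N*mn)

# Step 4: eigenvectors of X X^T, largest eigenvalues first.
vals, vecs = np.linalg.eigh(X @ X.T)        # eigh returns ascending order
top = vecs[:, np.argsort(vals)[::-1][:L1]]  # (k1*k2) x L1

# Step 5: reshape each eigenvector back into a k1 x k2 filter window.
filters = [top[:, l].reshape(k1, k2) for l in range(L1)]
print(len(filters), filters[0].shape)  # 8 (3, 3)
```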

6. Then convolve every image once with each of these L1 windows.
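A minimal sketch of the filtering step: the paper actually zero-pads each image so the output keeps the m*n size, but the padding is omitted here for brevity ("valid" mode), and the averaging filter is just a hypothetical stand-in for a PCA filter.

```python
import numpy as np

def conv2d_valid(img, filt):
    """Slide the filter over the image and take the inner product at each
    position (no padding, no kernel flip)."""
    m, n = img.shape
    k1, k2 = filt.shape
    out = np.empty((m - k1 + 1, n - k2 + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + k1, j:j + k2] * filt)
    return out

img = np.ones((8, 8))
filt = np.full((3, 3), 1.0 / 9)    # stand-in filter (simple averaging)
out = conv2d_valid(img, filt)
print(out.shape)  # (6, 6)
```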

2) Second stage:

The steps are the same as in the first stage, but the workload becomes L1 times the original. Finally, L1 groups of convolution results are obtained, each group containing L2 results.

3) Output layer:

1. First, binarize each convolution result from the second stage: wherever an element of the original matrix is greater than 0, the corresponding position of the new matrix is set to 1, and otherwise to 0.

2. Each group (L1 groups in total) contains L2 binary images. Multiply these L2 binary images by weights and sum them; the weights, from small to large, are 2^0 through 2^(L2-1), corresponding to the filters in order.
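Output-layer steps 1-2 (binarization and the weighted sum) in a minimal NumPy sketch; the 5*5 map size and L2 = 4 are hypothetical values chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
L2 = 4
# One group of L2 second-stage convolution outputs (hypothetical 5x5 maps).
maps = [rng.standard_normal((5, 5)) for _ in range(L2)]

# Step 1: binarize -- 1 where the element is greater than 0, else 0.
binary = [(o > 0).astype(np.int64) for o in maps]

# Step 2: weight the L2 binary images by 2^0 .. 2^(L2-1) and sum them,
# giving one integer image with values in [0, 2^L2 - 1].
T = sum((2 ** l) * b for l, b in enumerate(binary))
print(T.shape, T.min(), T.max())
```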

3. This finally yields L1 images whose values lie in the range [0, 2^L2 - 1], and this range directly determines the range of the subsequent histogram statistics.

4. Divide each image into B blocks and compute a histogram for each block, so each block yields a vector of length 2^L2. Connecting the B blocks gives a vector of size 2^L2 * B, and connecting the vectors of all L1 images gives a feature of size 2^L2 * B * L1, which serves as the feature of the input image.
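Steps 3-4 (block-wise histograms of the hashed images) can be sketched as follows; the 8*8 hashed image and B = 4 non-overlapping blocks are hypothetical choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
L2 = 4
T = rng.integers(0, 2 ** L2, size=(8, 8))   # hypothetical hashed image

# Split into B = 4 non-overlapping 4x4 blocks; histogram each block over
# the 2^L2 possible values; concatenate into one vector of length B * 2^L2.
blocks = [T[i:i + 4, j:j + 4] for i in (0, 4) for j in (0, 4)]
hists = [np.bincount(b.ravel(), minlength=2 ** L2) for b in blocks]
feature = np.concatenate(hists)
print(feature.shape)  # (64,)
```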
B. Classifier training

Finally, all the images (one column vector per image) are arranged into a matrix to train an SVM.
III. Testing

The test process directly uses the filters obtained during training to perform convolution, then binarization and block histogram statistics, and finally feeds the resulting column vector into the trained SVM for classification.
P.S. The experiments also tried adding an absolute rectification layer after the first stage, with little effect. The authors' explanation: "the reason could be that the use of quantization plus local histogram (in the output layer) already introduces sufficient invariance and robustness in the final feature." In addition, the authors propose two variants, RandNet and LDANet, neither of which performs as well as PCANet.
