Linear Spatial Pyramid Matching Using Sparse Coding for Image Classification

Tags: svm

Introduction

Recently, SVMs using the spatial pyramid matching (SPM) kernel have been highly successful in image classification. Despite their popularity, these nonlinear SVMs have a complexity of O(n^2) to O(n^3) in training and O(n) in testing, where n is the training size, implying that it is nontrivial to scale up the algorithms to handle more than thousands of training images.

The computational cost of nonlinear SVMs is huge.

In this paper we develop an extension of the SPM method by generalizing vector quantization (VQ) to sparse coding (SC) followed by multi-scale spatial max pooling, and propose a linear SPM kernel based on SIFT sparse codes.

 

In recent years the bag-of-features (BoF) model has been extremely popular in image categorization. The method treats an image as a collection of unordered appearance descriptors extracted from local patches, quantizes them into discrete "visual words", and then computes a compact histogram representation for semantic image classification.

The method partitions an image into 2^l x 2^l segments at different scales l = 0, 1, 2, computes the BoF histogram within each of the 21 segments, and finally concatenates all the histograms to form a vector representation of the image. In the case where only the scale l = 0 is used, SPM reduces to BoF.
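To make the count explicit, the pyramid has

    \sum_{l=0}^{2} 4^l = 1 + 4 + 16 = 21

segments in total, so with a codebook of K visual words the concatenated SPM representation has dimension 21K.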

Replacing VQ with sparse coding

Furthermore, unlike the original SPM, which performs spatial pooling by computing histograms, our approach, called ScSPM, uses spatial max pooling, which is more robust to local spatial translations and more biologically plausible.

Max pooling is used in place of SPM's histogram pooling.

After sparse coding, a linear classifier can achieve good results.

Despite such popularity, SPM has to run together with nonlinear kernels, such as the intersection kernel and the chi-square kernel, in order to achieve good performance, which requires intensive computation and large storage.

Intersection kernel, chi-square kernel
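For reference, the intersection kernel on two histograms h_i, h_j is

    \kappa_{\cap}(h_i, h_j) = \sum_{k=1}^{K} \min\left(h_i(k),\; h_j(k)\right)

and one common form of the chi-square kernel is

    \kappa_{\chi^2}(h_i, h_j) = \exp\left(-\gamma \sum_{k=1}^{K} \frac{(h_i(k) - h_j(k))^2}{h_i(k) + h_j(k)}\right)

ScSPM avoids both by using the plain linear kernel \kappa(z_i, z_j) = z_i^T z_j on the pooled features, which is what makes training and testing so much cheaper.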

 

Linear SPM Using SIFT Sparse Codes

VQ

In the training phase the codebook V is learned; in the coding (test) phase, with V fixed, the coefficient vector u_m is computed for each descriptor.
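In notation consistent with the paper, VQ can be written as a constrained matrix factorization:

    \min_{U, V} \sum_{m=1}^{M} \| x_m - u_m V \|^2
    \text{subject to } \mathrm{Card}(u_m) = 1,\; |u_m| = 1,\; u_m \succeq 0,\; \forall m

where X = [x_1, ..., x_M]^T are the SIFT descriptors, V = [v_1, ..., v_K]^T is the codebook, and the constraints force each u_m to select exactly one codeword. Training (K-means) optimizes over both U and V; coding fixes V and solves for U only.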

Sparse coding relaxes VQ's hard cardinality constraint and instead adds a sparsity (L1) penalty to the loss function.
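Relaxing the cardinality constraint and adding the L1 penalty gives the sparse coding objective:

    \min_{U, V} \sum_{m=1}^{M} \| x_m - u_m V \|^2 + \lambda |u_m|
    \text{subject to } \| v_k \| \le 1,\; k = 1, \ldots, K

where |u_m| denotes the L1 norm that encourages sparsity, and the unit-norm constraint on each codeword v_k prevents the penalty from being defeated by rescaling V.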

As with VQ, the training phase learns the codebook V (now typically over-complete, with K larger than the descriptor dimension), and the test phase computes a sparse code u_m for each descriptor with V fixed.

Advantages: lower reconstruction error; the captured image features are more salient; and image patches are naturally sparse signals, so sparse coding suits them well.

Note: Local Sparse Coding

Therefore VQ, which amounts to hard-assignment voting, incurs a large quantization error; even with a nonlinear SVM the improvement is limited, and the computational cost is high.

 

In this work, we defined the pooling function F as a max pooling function on the absolute sparse codes.
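Concretely, each coordinate of the pooled feature z is the maximum absolute activation over the M descriptors in the region:

    z_j = \max \left\{ |u_{1j}|, |u_{2j}|, \ldots, |u_{Mj}| \right\}

where u_{mj} is the j-th element of the sparse code of the m-th local descriptor.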

This max pooling is claimed to have a biological basis and to be more robust to local translations.

Similar to the construction of histograms in SPM, we apply the max pooling equation above on a spatial pyramid constructed for an image.
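A minimal sketch of this pooling stage, assuming the sparse codes and the (x, y) location of each descriptor are already available; the function name scspm_pool and the grid levels are illustrative, not taken from the paper:

    import numpy as np

    def scspm_pool(codes, xy, width, height, levels=(1, 2, 4)):
        """Max-pool absolute sparse codes over a spatial pyramid.

        codes : (M, K) array of sparse codes, one row per descriptor
        xy    : (M, 2) array of descriptor (x, y) image coordinates
        Returns the concatenated feature of length K * sum(n*n for n in levels).
        """
        codes = np.abs(codes)
        pooled = []
        for n in levels:                      # n x n grid at this pyramid level
            col = np.minimum((xy[:, 0] * n / width).astype(int), n - 1)
            row = np.minimum((xy[:, 1] * n / height).astype(int), n - 1)
            cell = row * n + col              # grid cell index of each descriptor
            for c in range(n * n):
                mask = cell == c
                if mask.any():
                    pooled.append(codes[mask].max(axis=0))   # max pooling per cell
                else:
                    pooled.append(np.zeros(codes.shape[1]))  # empty cell -> zeros
        return np.concatenate(pooled)

With levels (1, 2, 4) this yields the 21-cell pyramid described above, i.e. a feature of length 21K.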


 

Cause analysis:

This success is largely due to three factors: (1) SC has much smaller quantization errors than VQ; (2) it is well known that image patches are sparse in nature, so sparse coding is especially suitable for image data; (3) the statistics computed by max pooling are more salient and robust to local translations.

 

Implementation

1. Sparse Coding

Consider the SC loss function above. With U fixed or V fixed the problem is convex, but it is non-convex in both jointly. The conventional solution is therefore to alternate: fix one variable and solve for the other. The feature-sign search algorithm of Lee et al. solves the L1-regularized coding subproblem faster.
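A rough sketch of this alternating scheme, using scikit-learn's Lasso solver as a stand-in for feature-sign search (the regularization scaling and the solver differ from the paper's; all names here are illustrative):

    import numpy as np
    from sklearn.linear_model import Lasso   # stand-in for feature-sign search

    def learn_codebook(X, K=1024, lam=0.15, iters=10, seed=0):
        """Alternating minimization for the SC objective (a sketch,
        not the paper's exact solver). X: (M, D) SIFT descriptors."""
        rng = np.random.default_rng(seed)
        V = rng.standard_normal((K, X.shape[1]))
        V /= np.linalg.norm(V, axis=1, keepdims=True)       # enforce ||v_k|| <= 1
        for _ in range(iters):
            # U-step: with V fixed, each descriptor is an independent Lasso problem
            # (note: sklearn's alpha scaling differs from lambda in the paper)
            lasso = Lasso(alpha=lam, fit_intercept=False, max_iter=500)
            U = np.stack([lasso.fit(V.T, x).coef_ for x in X])
            # V-step: with U fixed, least squares, then project onto the unit ball
            V = np.linalg.lstsq(U, X, rcond=None)[0]
            norms = np.linalg.norm(V, axis=1, keepdims=True)
            V /= np.maximum(norms, 1.0)
        return V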

Once the codebook V has been learned offline, the sparse code (feature coefficients) of a new descriptor can be computed in real time.

2. Multi-class Linear SVM

Training uses a differentiable (squared) hinge loss, so the one-against-all linear SVM can be optimized with a simple gradient-based method such as L-BFGS.
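One common formulation consistent with this setup is, per class c:

    \min_{w_c} \; \|w_c\|^2 + C \sum_{i=1}^{n} \left[ \max\left(0,\; 1 - y_i^c\, w_c^T z_i \right) \right]^2

where z_i is the pooled ScSPM feature of image i and y_i^c = +1 if image i belongs to class c, -1 otherwise. Because the squared hinge is differentiable, L-BFGS applies directly.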
