DPM object detection algorithm (excerpt from graduation thesis)



Readers, if you find mistakes (there are probably many), please feel free to point them out. The training section has not been written.

Previously written parts:

DPM (Deformable Parts Model): Principle (I)

DPM (Deformable Parts Model) Source Analysis: Detection (II)

DPM (Deformable Parts Model) Source Analysis: Training (III)

Recommended reading:

DPM: http://blog.csdn.net/masibuaa/article/category/2267527

HOG: HOG (excerpt from graduation thesis)

  1. DPM object detection algorithm

    The DPM algorithm, proposed by Felzenszwalb in 2008, is a part-based detection method that is highly robust to object deformation [13]. DPM has since become the core of many classification, segmentation and pose estimation algorithms, and Felzenszwalb himself was awarded the "Lifetime Achievement Award" by VOC.

    DPM builds on an improved HOG feature, an SVM classifier and the sliding-window detection idea. It adopts a multi-component strategy to handle the multi-view problem, and a part-model strategy based on pictorial structures to handle deformation of the object itself. In addition, the component a sample belongs to and the positions of the part models are treated as latent variables and are determined automatically by multiple-instance learning.

    This section briefly introduces the feature extraction, detection model and detection process of DPM.

  2. DPM features

    DPM uses the HOG feature, with some improvements.


    Figure 4.4 The improved HOG feature used by DPM

    As shown in Figure 4.4, the improved HOG feature of DPM drops the blocks of the original HOG and keeps only the cells. When normalizing, however, the current cell is normalized directly against the regions it forms with its 4 surrounding cells, so the effect is very similar to the original HOG feature.

    The gradient orientation can be computed as signed (0-360°) or unsigned (0-180°). Some objects are better described by signed orientations and others by unsigned ones, so DPM, as a general-purpose detection method, differs from the original HOG in combining both. If the result were vectorized directly, a single cell would have 4 × (9 + 18) = 108 feature dimensions (4 normalizations of 9 unsigned and 18 signed orientation bins), which is too many.

    Felzenszwalb extracted the unsigned-gradient features of a large number of cells, 4 × 9 = 36 dimensions per cell, and ran principal component analysis (PCA) on them. The first 11 eigenvectors turned out to contain almost all the information, but for faster computation the author derived an approximation of the PCA reduction from the visualized principal components. Specifically, the 36-dimensional vector is viewed as a 4 × 9 matrix, and summing each row and each column gives a 13-dimensional feature that basically matches the detection performance of the 36-dimensional HOG feature. To improve accuracy on objects suited to signed gradients, the author also sums the 4 normalized copies of each of the 18 signed orientations, which gives an 18-dimensional vector. The final result is the 13 + 18 = 31-dimensional feature vector shown in Figure 4.4.
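    The row and column summation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `compress_cell_features`, the input layout (4 normalizations as rows, orientation bins as columns) and the order of the output entries are choices made here, not taken from the DPM source code.

```python
def compress_cell_features(unsigned, signed):
    """Reduce one cell's 108 raw values to 31.

    unsigned: 4 x 9 nested list (4 normalizations x 9 unsigned bins).
    signed:   4 x 18 nested list (4 normalizations x 18 signed bins).
    The layout and the output order are assumptions for illustration.
    """
    assert len(unsigned) == 4 and all(len(r) == 9 for r in unsigned)
    assert len(signed) == 4 and all(len(r) == 18 for r in signed)

    # Column sums of the 4 x 9 matrix: one value per unsigned orientation (9).
    col_sums = [sum(row[k] for row in unsigned) for k in range(9)]
    # Row sums of the 4 x 9 matrix: one value per normalization (4).
    row_sums = [sum(row) for row in unsigned]
    # Signed bins summed over the 4 normalizations (18).
    signed_sums = [sum(row[k] for row in signed) for k in range(18)]

    return col_sums + row_sums + signed_sums  # 9 + 4 + 18 = 31 values


# Toy input with constant values, just to check the dimensions.
unsigned = [[1.0] * 9 for _ in range(4)]
signed = [[0.5] * 18 for _ in range(4)]
feature = compress_cell_features(unsigned, signed)
print(len(feature))
```

    The 13 values from the unsigned bins replace a projection onto the 11 leading PCA eigenvectors with plain sums, which is why the reduction is cheap to compute.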

  3. DPM detection model

    Figure 4.5 DPM pedestrian model: (a) root model, (b) part models, (c) deformation cost

    The object detection model of DPM V3 consists of two components, each made up of a root model and several part models. Figures 4.5 (a) and 4.5 (b) visualize the root model and the part models of one component. Each visualization weights the gradient orientations by the SVM coefficients, so a brighter orientation means a pedestrian is more likely to have a gradient in that direction. As shown in Figure 4.5 (a), the root model is relatively coarse and roughly depicts an upright pedestrian seen from the front or back. As shown in Figure 4.5 (b), each rectangular box is one part model. There are 6 parts in total, at twice the resolution of the root model, which gives better detail: the head, arms and other body parts are clearly visible. To reduce model complexity, both the root model and the part models are axially symmetric. Figure 4.5 (c) shows the deformation cost of the part models: the brighter the area, the higher the cost of a part being displaced there, and the cost at a part's ideal position is 0.
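    As a rough picture of how such a model could be laid out in code, the sketch below uses two small data classes. Everything in it is hypothetical: the class names, the field names, the filter sizes and the anchor offsets are invented for illustration and do not come from the DPM release.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class PartModel:
    size: Tuple[int, int]    # filter (width, height) in cells, at 2x root resolution
    anchor: Tuple[int, int]  # ideal offset from the root anchor, in part-level cells
    # Deformation cost coefficients for (dx, dy, dx^2, dy^2); hypothetical default.
    deform: Tuple[float, float, float, float] = (0.0, 0.0, 1.0, 1.0)


@dataclass
class Component:
    root_size: Tuple[int, int]  # root filter (width, height) in cells
    parts: List[PartModel] = field(default_factory=list)

    def root_size_at_part_level(self) -> Tuple[int, int]:
        # Parts live at twice the resolution of the root model.
        return (2 * self.root_size[0], 2 * self.root_size[1])


# Made-up numbers: a 5 x 15 root filter and six 6 x 6 parts per component.
anchors = [(2, 0), (0, 6), (4, 6), (2, 12), (0, 18), (4, 18)]
model = [
    Component((5, 15), [PartModel((6, 6), a) for a in anchors])
    for _ in range(2)  # two components, e.g. for different views
]
print(len(model), len(model[0].parts), model[0].root_size_at_part_level())
```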

  4. DPM detection process

    DPM uses the traditional sliding-window approach and searches at every scale by building a scale pyramid. Figure 4.6 shows pedestrian detection at one scale, that is, the process of matching the pedestrian model [13]. The response score of the root or part model at a location is the inner product of the model with the features inside the sub-window anchored at that location (the anchor point is the sub-window's top-left corner). The model can also be viewed as a filter, and the response score as a similarity measure: the more the features resemble the model, the higher the score. The left side of the figure shows detection with the root model; in the filtered map, brighter regions have higher response scores. The right side shows detection with each part model. First, the feature map is matched against the part model to obtain the filtered map. Then comes the response transformation: taking the anchor point as the reference position, it weighs the match between the part model and the features against the deformation cost of the part moving away from its ideal position, and yields the best part position and its response score.
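    The inner-product response at every anchor point can be written as a direct nested loop. The sketch below is a minimal, unoptimized version under assumptions made here: the function name `filter_response` is hypothetical, and feature maps are nested lists indexed as `[y][x]` with one feature vector per cell.

```python
def filter_response(features, model):
    """Slide `model` over `features` and return the response map.

    features: H x W grid, each entry a feature vector (list of floats).
    model:    h x w grid of weight vectors of the same length.
    response[y][x] is the inner product of the model with the
    sub-window whose top-left corner (anchor point) is (x, y).
    """
    H, W = len(features), len(features[0])
    h, w = len(model), len(model[0])
    response = []
    for y in range(H - h + 1):
        row = []
        for x in range(W - w + 1):
            score = 0.0
            for dy in range(h):
                for dx in range(w):
                    cell = features[y + dy][x + dx]
                    weights = model[dy][dx]
                    score += sum(f * m for f, m in zip(cell, weights))
            row.append(score)
        response.append(row)
    return response


# Toy 3 x 3 feature map with 1-dimensional features and a 2 x 2 model of ones.
features = [[[1.0], [2.0], [3.0]],
            [[4.0], [5.0], [6.0]],
            [[7.0], [8.0], [9.0]]]
model = [[[1.0], [1.0]],
         [[1.0], [1.0]]]
print(filter_response(features, model))  # [[12.0, 16.0], [24.0, 28.0]]
```

    Running the same function on every level of the scale pyramid gives the per-scale response maps that the rest of the detection process works on.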

    Figure 4.6 Detection process of the DPM algorithm

    The overall detection score, in the notation of [13], is

    $$\mathrm{score}(x_0, y_0, l_0) = R_{0,l_0}(x_0, y_0) + \sum_{i=1}^{n} D_{i,l_0-\lambda}\big(2(x_0, y_0) + v_i\big) + b$$

    This is the detection score at scale layer $l_0$ with $(x_0, y_0)$ as the anchor point. $R_{0,l_0}(x_0, y_0)$ is the detection score of the root model. Because the same object has multiple components and the detection scores of different component models need to be aligned, a bias term $b$ is needed. $D_{i,l_0-\lambda}(2(x_0, y_0) + v_i)$ is the response of the $i$-th part model. Because the part model has twice the resolution of the root model, it has to be matched at scale layer $l_0 - \lambda$, where $\lambda$ is the number of pyramid layers per doubling of resolution. The anchor coordinates therefore also have to be remapped to layer $l_0 - \lambda$, that is, doubled to $2(x_0, y_0)$. The offset of part model $i$ relative to the anchor point is $v_i$, so at layer $l_0 - \lambda$ the ideal position of part model $i$ is $2(x_0, y_0) + v_i$.
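    Combining the root response, the transformed part responses and the bias can be sketched as follows. The function name `detection_score` and the nested-list layout indexed as `[y][x]` are assumptions made for this example, and it assumes every ideal part position falls inside its map (real implementations typically pad the feature maps to guarantee this).

```python
def detection_score(root_response, part_maps, anchors, bias, x0, y0):
    """Score of one hypothesis with the root anchored at (x0, y0).

    root_response: root filter response at the root's scale layer.
    part_maps:     transformed part responses, one map per part,
                   computed at twice the root resolution.
    anchors:       ideal offsets (vx, vy) of each part, in part-level cells.
    bias:          offset that aligns scores across components.
    """
    score = root_response[y0][x0] + bias
    for part_map, (vx, vy) in zip(part_maps, anchors):
        # Remap the anchor to the finer layer (doubling), then add the offset.
        px, py = 2 * x0 + vx, 2 * y0 + vy
        score += part_map[py][px]
    return score


# Toy data: a 2 x 2 root response and one 4 x 4 part map shared by two parts.
root_response = [[1.0, 2.0],
                 [3.0, 4.0]]
part_map = [[10.0 * y + x for x in range(4)] for y in range(4)]
score = detection_score(root_response, [part_map, part_map],
                        [(0, 0), (1, 1)], bias=-0.5, x0=1, y0=0)
print(score)  # 2.0 + 2.0 + 13.0 - 0.5 = 16.5
```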

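    The response transformation, in the notation of [13], is

    $$D_{i,l}(x, y) = \max_{dx, dy} \big( R_{i,l}(x + dx, y + dy) - d_i \cdot \phi_d(dx, dy) \big)$$

    where $(x, y)$ is the ideal position of part model $i$ at scale layer $l$, $(dx, dy)$ is the offset relative to $(x, y)$, and $R_{i,l}(x + dx, y + dy)$ is the matching score of the part model at $(x + dx, y + dy)$. The term $d_i \cdot \phi_d(dx, dy)$ is the score lost by the offset $(dx, dy)$, with $\phi_d(dx, dy) = (dx, dy, dx^2, dy^2)$. The deformation cost coefficient $d_i$ is a parameter learned during model training. At model initialization $d_i = (0, 0, 1, 1)$, so the deformation cost is the squared Euclidean distance from the ideal position.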
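    A brute-force version of this response transformation is easy to write down. The sketch below is not how production code does it: the function name `transform_response` and the bounded search `radius` are assumptions introduced here to keep the loop small, whereas efficient implementations use a generalized distance transform that runs in linear time over all displacements.

```python
def transform_response(R, d=(0.0, 0.0, 1.0, 1.0), radius=2):
    """For every ideal position (x, y), take the best displaced match.

    R:      part filter response map, indexed as R[y][x].
    d:      deformation cost coefficients for (dx, dy, dx^2, dy^2).
    radius: hypothetical search limit on |dx| and |dy|.
    """
    H, W = len(R), len(R[0])
    D = [[float("-inf")] * W for _ in range(H)]
    for y in range(H):
        for x in range(W):
            best = float("-inf")
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < H and 0 <= xx < W:
                        cost = d[0] * dx + d[1] * dy + d[2] * dx * dx + d[3] * dy * dy
                        best = max(best, R[yy][xx] - cost)
            D[y][x] = best
    return D


# Toy response map with a single strong match in the bottom-right corner.
R = [[0.0, 0.0, 0.0],
     [0.0, 0.0, 0.0],
     [0.0, 0.0, 5.0]]
D = transform_response(R)
print(D[1][1])  # 3.0: moving by (1, 1) costs 2, and 5 - 2 beats staying at 0
```

    With the initial coefficients (0, 0, 1, 1) the strong match spreads to neighbouring ideal positions with a penalty that grows with the squared distance, so far-away positions such as the top-left corner keep their own score.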
