The realization process of hog feature extraction algorithm

Source: Internet
Author: User

In-depth study of hog algorithm principles:
I. Overview of HOG

Histograms of oriented gradients, as the name implies, a directional gradient histogram, is a way of describing a target, both as a descriptive child. Second, Hog put forward
Hog was presented by a doctor in NB in 05, with links to papersHttp://wenku.baidu.com/view/676f2351f01dc281e53af0b2. html
ThreeAdvantages
HOG has many advantages over other feature description methods. First, since the HOG is operated on the local cell of the image, it maintains a good invariance of both the geometric and optical deformations of the image, both of which appear in the larger space domain. Secondly, in the rough airspace sampling, fine direction sampling and strong local optical normalization conditions, as long as the pedestrian generally can maintain upright posture, can allow pedestrians have some subtle limb movements, these subtle actions can be ignored without affecting the detection effect. Therefore, the HOG feature is particularly suitable for human detection in images.


Iv. approximate process

The HOG feature extraction method is to put an image(the Target or scan window you want to detect):

1) grayscale (image as a three-dimensional image of x, Y, Z(grayscale));

2) using Gamma correction method to standardize the color space of the input image (normalized), the aim is to adjust the contrast of the image, to reduce the shadow and illumination changes caused by the image, and to suppress noise interference;

3) calculate the gradient (including size and orientation) of each pixel of the image, mainly to capture contour information, and further weaken the illumination interference.

4) Divide the image into small cells(e.g. 6*6 pixels /cell);

5) The descriptorof each cell can be formed by counting the gradient histogram of each cell (the number of different gradients).

6) Each cell is composed of a block(for example, 3*3 cell/block), a block in which all The HOG feature of the block is descriptorby the cell 's characteristic descriptor in series.

7) The image can be obtained by concatenating the HOG feature descriptor of all blocks within images. The HOG characteristics of (the target you want to detect) are descriptor . This is the final feature vector that can be used for classification.


The detailed procedures for each step are as follows:

(1) standardize gamma space and color space

in order to reduce the influence of illumination factors, the whole image must be normalized (normalized). In the texture intensity of the image, the proportion of local surface exposure contribution is larger, so the compression processing can effectively reduce the shadow and illumination changes in the image. Because the color information does not function very much, it is usually converted to grayscale.

Gamma Compression formula:

For example, can take gamma=1/2;

(2) Calculate image gradient

calculates the gradient of the horizontal and vertical direction of the image, and calculates the gradient direction value of each pixel position accordingly; The derivative operation can not only capture contour, silhouette and some texture information, but also weaken the influence of illumination.

The gradient of the pixel point (x, y) in the image is:

The most commonly used method is: first use the [ -1,0,1] gradient operator to do convolution operations on the original image, the x direction (horizontal direction, to the right in the positive direction) of the gradient component Gradscalx, and then using [1,0,-1] The T -gradient operator makes convolution operation on the original image, and obtains the gradient component gradscalyin the y direction (vertical direction and positive direction). Then use the above formula to calculate the gradient size and direction of the pixel point.

(3) build gradient histogram for each cell unit

The goal of the third step is to provide an encoding for the local image area while maintaining a weak sensitivity to the posture and appearance of the body object in the image.

We divide the image into "cells cell ", such as each cell is 6*6 pixels. Suppose we use a histogram of 9 bin to count this 6*6 the gradient information for pixels. That is, cell gradient Direction 360 degree into 9 directional blocks, for example: if the gradient direction of this pixel is 20-40 degree, the histogram section The count of span style= "Font-family:calibri" >2 bin is added, so that the cell each pixel in a gradient direction in the Histogram weighted projection (mapping to a fixed angle range), you can get this The gradient direction histogram of the cell is the cell >9 dimension eigenvectors (because of 9 bin ).

The pixel gradient direction is used, so what is the gradient size? The gradient size is the weighted value of the projection. For example: the gradient direction of this pixel is 20-40 degree, then its gradient size is 2(assuming AH), then the histogram of the first 2 bin count is not added one, but add two (suppose AH).

Cell cells can be rectangular (rectangular) or star-shaped (radial).

(4) combining cell units into large blocks (blocks), normalized gradient histogram in a block

Due to the change of local illumination and the change of foreground - background contrast, the variation range of gradient intensity is very large. This requires a normalization of the gradient intensity. Normalization can further compress light, shadows, and edges.

The author's approach is to combine each cell unit into a large, spatially connected interval (blocks). Thus, the HOG feature of the block is obtained by concatenating the eigenvectors of all the cells in a block . These intervals overlap, which means that the characteristics of each cell appear multiple times in the final eigenvectors with different results. We will call the Block descriptor (vector) after normalization as the HOG descriptor.

The interval has two main geometrical shapes-rectangular interval (r-hog) and annular interval (c-hog). The R-hog interval is largely a square lattice, which can be characterized by three parameters: the number of cell units in each interval, the number of pixels in each cell, and the number of histogram channels per cell.

For example: the best parameter setting for pedestrian detection is:3x3 cells / interval,6x6 pixels / cell,9 histogram channels. The characteristic number of a piece is:3*3*9;

(5) Collection of HOG features

The final step is to collect all the overlapping blocks in the detection window and combine them into the final eigenvectors for HOG .

The realization process of hog feature extraction algorithm

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.