Object Recognition and Scene Understanding (I): Overview and HOG Features

Last Update: 2018-12-03 Source: Internet

Author: User

This is a short series on object recognition and scene understanding. It covers the following three parts:

1. Object Recognition from Local Scale-Invariant Features, a feature-based object recognition algorithm. The most representative example is the SIFT feature of David G. Lowe.

The author of that work has applied for a patent on it, so I will not cover it further here.

2. Histograms of Oriented Gradients for Human Detection

Pedestrian detection based on HOG features.

3. A Discriminatively Trained, Multiscale, Deformable Part Model

DPM is one of the better object detection algorithms so far.

Within this framework, I draw on online resources as much as possible, to pool what is already available and share it here.

HOG features

Http://blog.csdn.net/carson2005/article/details/7782726

The histogram of oriented gradients (HOG) feature is a dense descriptor computed over partially overlapping local regions of the image. It forms a feature by computing a histogram of gradient directions in each local region. HOG features combined with an SVM classifier have been widely used in image recognition, especially in pedestrian detection. The HOG + SVM method for pedestrian detection was proposed by Dalal and Triggs, then at INRIA in France, at CVPR 2005. Many pedestrian detection algorithms have been proposed since, but most of them still build on the HOG + SVM idea.

The HOG feature is a local region descriptor. It builds the feature of the human body from gradient direction histograms computed over local regions, which describe the edges of the human body well. It is insensitive to illumination changes and to small offsets.

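The gradient at pixel (x, y) in the image is commonly computed with centered differences, where H(x, y) denotes the pixel value, G(x, y) the gradient magnitude, and \alpha(x, y) the gradient direction:

$$G_x(x, y) = H(x+1, y) - H(x-1, y), \qquad G_y(x, y) = H(x, y+1) - H(x, y-1)$$

$$G(x, y) = \sqrt{G_x(x, y)^2 + G_y(x, y)^2}, \qquad \alpha(x, y) = \arctan\frac{G_y(x, y)}{G_x(x, y)}$$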
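A minimal Python sketch of the per-pixel gradient may help here. It assumes centered differences, an image stored as a list of rows (`img[y][x]`), clamping at the image border, and an unsigned direction in [0, 180) degrees. The function name `pixel_gradient` and these choices are illustrative assumptions, not taken from the original article.

```python
import math

def pixel_gradient(img, x, y):
    """Return (gx, gy, magnitude, angle_deg) at pixel (x, y) of img[y][x]."""
    h, w = len(img), len(img[0])
    # Clamp neighbour indices at the border (an assumption; other rules exist).
    x_left, x_right = max(x - 1, 0), min(x + 1, w - 1)
    y_up, y_down = max(y - 1, 0), min(y + 1, h - 1)
    gx = img[y][x_right] - img[y][x_left]   # horizontal centered difference
    gy = img[y_down][x] - img[y_up][x]      # vertical centered difference
    magnitude = math.hypot(gx, gy)
    # Unsigned direction: fold the angle into [0, 180) degrees.
    angle = math.degrees(math.atan2(gy, gx)) % 180.0
    return gx, gy, magnitude, angle

# A horizontal edge: brightness increases from top to bottom.
edge = [[0, 0, 0],
        [0, 0, 0],
        [10, 10, 10]]
print(pixel_gradient(edge, 1, 1))  # (0, 10, 10.0, 90.0)
```

Using `atan2` instead of a plain arctangent of the ratio avoids a division by zero when the horizontal difference is 0.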

The HOG feature extraction process proposed by Dalal is as follows. The sample image is divided into a number of cells, and the gradient direction is divided evenly into nine bins. Within each cell, a histogram of the gradient directions of all pixels is computed, which gives a 9-dimensional feature vector. Every four adjacent cells form a block, and concatenating the feature vectors of the cells in a block gives a 36-dimensional feature vector. The block is scanned across the sample image with a stride of one cell. Finally, the features of all blocks are concatenated to obtain the feature of the human body. For example, for a 64*128 image with 8*8-pixel cells, every 2*2 cells (16*16 pixels) form a block, each with 4*9 = 36 features. With a stride of 8 pixels, there are 7 block positions in the horizontal direction and 15 in the vertical direction. That is, a 64*128 image has 36*7*15 = 3780 features in total.
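The arithmetic in this example can be checked with a short Python sketch. The function name `hog_descriptor_length` and its parameter names are hypothetical, and the defaults assume 8*8-pixel cells, 2*2-cell blocks, a stride of 8 pixels, and 9 bins, as in the 64*128 example.

```python
def hog_descriptor_length(win_w, win_h, cell=8, block_cells=2, stride=8, bins=9):
    """Return (blocks_x, blocks_y, total_features) for a detection window."""
    block_px = cell * block_cells                   # block side in pixels (16)
    blocks_x = (win_w - block_px) // stride + 1     # block positions per row
    blocks_y = (win_h - block_px) // stride + 1     # block positions per column
    per_block = block_cells * block_cells * bins    # 4 cells * 9 bins = 36
    return blocks_x, blocks_y, per_block * blocks_x * blocks_y

print(hog_descriptor_length(64, 128))  # (7, 15, 3780)
```

Because the stride is smaller than the block size, neighbouring blocks overlap by half, which is why there are 7 rather than 4 block positions across a 64-pixel width.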

In addition to the HOG feature extraction process described above, the full procedure also includes steps such as converting color images to grayscale and brightness correction. To sum up, the HOG feature calculation steps in pedestrian detection are as follows:

(1) convert the input color image into a grayscale image;

(2) normalize the input image with gamma correction. The purpose is to adjust the contrast of the image and reduce the effect of local shadows and illumination changes; it can also suppress noise;

(3) calculate the gradient, mainly to capture contour information and further weaken the effect of illumination;

(4) accumulate the gradients of each cell into a histogram over the gradient direction bins. The purpose is to provide an encoding of the local image region;

(5) normalize the cells within each block. Normalization further compresses the effects of illumination, shadows, and edge contrast. Each cell is generally shared by several blocks, but it is normalized separately within each block, so the results differ. The features of a cell therefore appear in the final vector several times with different values. We call the normalized block descriptor the HOG descriptor;

(6) collect the HOG features of all blocks in the detection window. This step gathers the HOG features of all overlapping blocks in the detection window and combines them into the final feature vector for classification.
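The six steps above can be sketched in plain Python as follows. This is a simplified illustration under stated assumptions, not the reference implementation: it uses hard binning instead of interpolated voting, plain L2 block normalization, border clamping for the gradient, standard luma weights for grayscale conversion, and a gamma of 0.5. The names `to_gray` and `hog` and all parameter names are hypothetical.

```python
import math

def to_gray(rgb):
    """(1) Convert rows of (r, g, b) tuples to grayscale with luma weights."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for r, g, b in row] for row in rgb]

def hog(gray, cell=8, block_cells=2, bins=9, gamma=0.5, eps=1e-6):
    """Steps (2)-(6) on a grayscale image given as rows of values in 0..255."""
    h, w = len(gray), len(gray[0])
    # (2) Gamma correction; gamma = 0.5 is square-root compression (assumed).
    img = [[(v / 255.0) ** gamma for v in row] for row in gray]
    cells_y, cells_x = h // cell, w // cell
    hist = [[[0.0] * bins for _ in range(cells_x)] for _ in range(cells_y)]
    bin_width = 180.0 / bins
    for y in range(cells_y * cell):
        for x in range(cells_x * cell):
            # (3) Centered-difference gradient, clamped at the image border.
            gx = img[y][min(x + 1, w - 1)] - img[y][max(x - 1, 0)]
            gy = img[min(y + 1, h - 1)][x] - img[max(y - 1, 0)][x]
            mag = math.hypot(gx, gy)
            ang = math.degrees(math.atan2(gy, gx)) % 180.0
            # (4) Magnitude-weighted vote into the cell's direction histogram.
            b = min(int(ang / bin_width), bins - 1)
            hist[y // cell][x // cell][b] += mag
    features = []
    # (5)-(6) Slide a block one cell at a time, normalize it, and concatenate.
    for by in range(cells_y - block_cells + 1):
        for bx in range(cells_x - block_cells + 1):
            block = []
            for cy in range(by, by + block_cells):
                for cx in range(bx, bx + block_cells):
                    block.extend(hist[cy][cx])
            norm = math.sqrt(sum(v * v for v in block) + eps * eps)
            features.extend(v / norm for v in block)
    return features

# Synthetic 64*128 test pattern (64 columns, 128 rows).
image = [[(3 * x + 5 * y) % 256 for x in range(64)] for y in range(128)]
print(len(hog(image)))  # 3780
```

Each interior cell contributes to up to four blocks here, normalized differently in each, which matches the remark in step (5) that a cell's features appear several times in the final vector.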

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
