For a course assignment, I am summarizing recent domestic (Chinese) survey papers on pedestrian detection. Although they were written in 2013 and 2014, their content is still classic. This post is a guided tour of those reviews.
Xu Teng, Huang, Tian Yong. Survey of pedestrian detection technology in vehicle vision system [J]. Chinese Journal of Image Graphics, 2013, 18(4): 359-367.
This paper reviews the research status, since 2005, of the two most important stages of this technology: region-of-interest segmentation and target recognition.
1 ROI (region of interest) segmentation
There are five main types of ROI segmentation methods: 1. motion-based; 2. stereo-vision-based; 3. image-feature-based; 4. radar-based; 5. rule-based. The survey summarizes them in a table.
2 Target Recognition
The paper summarizes the target recognition methods published from 2005 to 2012.
2.1 Feature extraction
2.1.1 Improvements to existing features
Improvements to existing features mainly concern HOG. For an introduction to HOG, see the link below.
http://blog.csdn.net/liulina603/article/details/8291093
Some researchers make the block size variable, some use the integral image (http://blog.csdn.net/bea_tree/article/details/51106359#t19), some drop the cells, some build an image pyramid, and others use GPUs to accelerate the computation.
2.1.2 New features
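To make the integral image idea concrete, here is a minimal sketch in Python with NumPy. The function names `integral_image` and `box_sum` are my own and do not come from the survey. The point is that, once the table is built, the sum over any rectangle costs four lookups regardless of its size.

```python
import numpy as np

def integral_image(img):
    """Build a summed-area table with an extra zero row and column."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)
    return ii

def box_sum(ii, top, left, height, width):
    """Sum of img[top:top+height, left:left+width] using four lookups."""
    bottom, right = top + height, left + width
    return ii[bottom, right] - ii[top, right] - ii[bottom, left] + ii[top, left]

img = np.arange(16).reshape(4, 4)
ii = integral_image(img)
# The table lookup agrees with a direct sum over the same window.
assert box_sum(ii, 1, 1, 2, 2) == img[1:3, 1:3].sum()
```

The extra zero row and column avoid special cases for windows that touch the top or left image border.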
The main focus is on local gradients, contours, texture information, and various combinations of low-level features.
Examples include the co-occurrence histograms of oriented gradients (CoHOG), the two-step directional histogram, Edgelet, the adaptive contour feature (ACF), integral channel features, color self-similarity (CSS), and the center-symmetric pyramid LBP.
2.1.3 Use of non-visible spectral data
The main idea is to apply the existing features to infrared, stereo-vision, or other non-visible-spectrum data.
2.2 Classifier Construction
Work here mainly consists of modifications to SVM and boosting.
Maji et al. proposed an approximate algorithm for the histogram intersection kernel (HIK). Felzenszwalb et al. use the deformable part model (DPM) to detect people, cars, and other objects.
Kim et al. proposed a multi-classifier boosting algorithm. Lin et al. proposed a boosting framework based on multiple-instance learning (MIL). Babenko et al. use multiple pose learning (MPL) to automatically group training samples by pose.
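As a reference point for the histogram intersection kernel mentioned above, here is a small sketch of the exact kernel in Python. It is not the approximation proposed by Maji et al.; it only shows what is being approximated. The function names are hypothetical.

```python
import numpy as np

def histogram_intersection(h1, h2):
    """Exact histogram intersection kernel: sum of bin-wise minima."""
    return float(np.minimum(h1, h2).sum())

def hik_gram(X, Y):
    """Kernel matrix between rows of X (n, d) and rows of Y (m, d)."""
    return np.minimum(X[:, None, :], Y[None, :, :]).sum(axis=2)

a = np.array([0.2, 0.5, 0.3])
b = np.array([0.4, 0.1, 0.5])
k_ab = histogram_intersection(a, b)   # 0.2 + 0.1 + 0.3
gram = hik_gram(np.stack([a, b]), np.stack([a, b]))
```

A precomputed matrix such as `gram` is the form in which a kernel is usually handed to an SVM solver. Evaluating the exact kernel against every support vector is the cost that approximate methods try to avoid.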
2.3 Search Framework
Sliding windows are the prevailing search framework; the final results are obtained by combining them with non-maximum suppression (NMS) or mean shift. Researchers have also used the bag-of-words (BoW) model, which is very popular in object classification, to perform a globally optimal search over the image. At present, such algorithms mainly include the implicit shape model (ISM) and efficient subwindow search (ESS).
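A sliding-window detector usually fires many overlapping windows around each pedestrian, which is why a merging step such as NMS is needed. Below is a minimal sketch of the common greedy variant in Python. The box format `(x1, y1, x2, y2)`, the function name `nms`, and the 0.5 IoU threshold are assumptions for illustration, not details from the survey.

```python
import numpy as np

def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: keep the best box, drop boxes that overlap it too much."""
    boxes = np.asarray(boxes, dtype=float)
    scores = np.asarray(scores, dtype=float)
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    order = scores.argsort()[::-1]          # indices, highest score first
    keep = []
    while order.size > 0:
        i, rest = order[0], order[1:]
        keep.append(int(i))
        # Intersection of box i with every remaining box.
        x1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        y1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        x2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        y2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        iou = inter / (areas[i] + areas[rest] - inter)
        order = rest[iou <= iou_threshold]  # survivors go to the next round
    return keep

boxes = [[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30]]
kept = nms(boxes, scores=[0.9, 0.8, 0.7])
```

In this toy input the first two boxes overlap heavily, so only the higher-scoring one survives, while the distant third box is kept.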
3 Author Outlook
- Stereoscopic Vision
- Multi-sensor
- New Data Set
- Automatic use of context
Zhang Chunfeng, Songgatao, Wang Wanliang. A survey of pedestrian detection technology [J]. Television Technology, 2014, 38(3).
This article mainly introduces pedestrian detection methods and divides them into methods based on global features, methods based on local features, and methods based on stereo vision. In addition, it summarizes several current databases.
1 Pedestrian detection methods
1.1 Methods based on global features
First, here is a good article on the three features HOG, Haar, and LBP: http://www.open-open.com/lib/view/open1440832074794.html
1.1.1 Haar
Papageorgiou and Poggio first proposed the concept of Haar wavelets. Viola et al. introduced the integral image to speed up Haar feature extraction, applied it to pedestrian detection, and built a pedestrian detection system that combines human motion and appearance patterns. The system achieved good detection results and laid a foundation for the development of pedestrian detection technology.
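To illustrate why the integral image speeds up Haar feature extraction, here is a hedged sketch of a single two-rectangle Haar-like feature in Python. The function name and the "left half minus right half" sign convention are my own assumptions. Real detectors evaluate thousands of such features at many positions and scales.

```python
import numpy as np

def haar_two_rect_horizontal(img, top, left, height, width):
    """Sum of the left half minus sum of the right half of a window.

    `width` is assumed to be even so the two halves have equal size.
    """
    # Summed-area table padded with a zero row and column.
    ii = np.pad(img.astype(np.int64).cumsum(axis=0).cumsum(axis=1),
                ((1, 0), (1, 0)))

    def rect(t, l, h, w):
        # Rectangle sum from four table lookups.
        return ii[t + h, l + w] - ii[t, l + w] - ii[t + h, l] + ii[t, l]

    half = width // 2
    return rect(top, left, height, half) - rect(top, left + half, height, half)

# A vertical edge: dark on the left, bright on the right.
img = np.zeros((4, 4))
img[:, 2:] = 1
response = haar_two_rect_horizontal(img, 0, 0, 4, 4)
```

In practice the table would be built once per image and reused for every feature; it is rebuilt inside the function here only to keep the sketch short.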
1.1.2 HOG
Dalal and Triggs introduced the histogram of oriented gradients (HOG) in 2005 and used it for pedestrian detection, achieving a detection success rate of approximately 90% on the INRIA pedestrian database, which includes variations in viewpoint, lighting, and background. HOG is currently the most widely used pedestrian feature descriptor. Zhu et al. proposed the integral histogram to speed up the computation of HOG features. Qu et al. proposed background-free HOG features, which not only remove the usual influence of background on the target's HOG features but also speed up HOG extraction. Wang et al. combined HOG with the local binary pattern (LBP) for pedestrian detection under partial occlusion, using a linear support vector machine (SVM) as the classifier. They achieved a 97% detection rate on INRIA, but the high computational complexity limits real-time applications.
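The core of HOG is a gradient-orientation histogram computed per cell. The sketch below, in Python, shows only that step under simplifying assumptions: unsigned orientations in 9 bins over 0 to 180 degrees, hard bin assignment, and no block normalization. It is a teaching sketch with a hypothetical function name, not the full descriptor of Dalal and Triggs.

```python
import numpy as np

def cell_orientation_histogram(cell, n_bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations."""
    cell = cell.astype(float)
    gy, gx = np.gradient(cell)                    # derivatives along rows, cols
    magnitude = np.hypot(gx, gy)
    angle = np.degrees(np.arctan2(gy, gx)) % 180.0
    bin_width = 180.0 / n_bins
    bins = np.minimum((angle / bin_width).astype(int), n_bins - 1)
    return np.bincount(bins.ravel(), weights=magnitude.ravel(),
                       minlength=n_bins)

# An 8x8 cell whose intensity increases from left to right:
# every gradient points along the x axis, so all votes land in bin 0.
ramp = np.tile(np.arange(8), (8, 1))
hist = cell_orientation_histogram(ramp)
```

A full HOG descriptor would additionally interpolate votes between neighboring bins, group cells into overlapping blocks, and normalize each block before concatenation.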
1.1.3 Edgelet
B. Wu et al. proposed the "small edge" (edgelet) feature, that is, a short line or curve segment, and applied it to single-image pedestrian detection in complex scenes, obtaining a detection rate of about 92% on the CAVIAR database. The disadvantage is that each edgelet feature must be calibrated manually, which is time-consuming and laborious. For more complex curves, it is also difficult to obtain by manual calibration an edgelet that fully matches the human contour.
1.1.4 Shapelet
To address these shortcomings of the Edgelet feature, Sabzmeydani proposed in 2007 a feature that can be obtained automatically through machine learning, the Shapelet feature. The algorithm first extracts gradient information in different directions from the training samples and then trains with AdaBoost to obtain the Shapelet features. Yao et al. used Shapelet features to train a whole-body detector that outperforms a part detector based on Haar-like features. They further combined the two detectors into a pedestrian detection system, which reached a 95% pedestrian detection rate on the INRIA data set, better than either detector alone.
1.1.5 Methods based on contour templates
This approach was covered in the first article in this series. It requires a large number of templates, and labeling them is tedious.
1.1.6 Methods based on motion features
Some of the more representative algorithms include:
1) Viola et al., for the stationary-camera case, computed Haar-like features on different images and then combined motion information with image intensity information to build a pedestrian detection system. It is suitable for pedestrian detection in low-resolution scenes under harsh weather such as rain and snow, but it performs poorly on occluded pedestrians.
2) Dalal et al., for the moving-camera case, proposed building a pedestrian detector by combining appearance-based gradient descriptors with motion-based differential optical flow descriptors. However, the method is effective only for classifying single windows and performs very poorly when detecting over the whole image.
1.2 Methods based on local features
The basic idea of this kind of method is to divide the human body into several parts, detect each part in the image separately, integrate the detection results according to certain constraints, and finally decide whether a pedestrian is present. Some of the more effective algorithms are as follows. Mohan et al. divided the human body into four parts (head and shoulders, legs, left arm, and right arm) and then trained SVM detectors with Haar wavelet features. Mikolajczyk et al. divided the human body into frontal face/head, profile face/head, frontal and rear head and shoulders, profile head and shoulders, and legs, and then described each part with SIFT (scale-invariant feature transform) features. Vinay D. Shet et al. proposed a pedestrian detection method based on bilattice logical reasoning, which divides the human body into three parts (head, upper torso, and legs). It obtained a detection success rate of about 92% on the USC database and a pedestrian detection rate above 90% under varying degrees of occlusion. The advantages of this kind of method are: 1) the impact on the detection result is reduced when part of the body is occluded; 2) the divide-and-conquer treatment of body parts reduces the difficulty of the overall detection, and the geometric constraints between parts also greatly help the accuracy of the final detection.
1.3 Methods based on stereo vision
This kind of method acquires images with two or more cameras and then analyzes the three-dimensional information of the targets in the images to identify pedestrians. The three-dimensional information can be used to estimate road surface parameters and filter out regions of interest (ROIs), which are then classified to build a pedestrian detection system with a high detection rate. ROIs can also be extracted from the left and right views and used for pattern classification, which reduces the false alarm rate. It is also possible to build an upright-pedestrian detection system by combining image luminance information with dense 3D information from a vehicle-mounted stereo camera. The advantage of this kind of method is that it makes full use of the depth information of targets in the scene to segment pedestrian regions, which makes it faster.
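The survey does not spell out how depth is recovered, but for a rectified two-camera rig the standard relation is Z = f * B / d, where f is the focal length in pixels, B is the baseline between the cameras, and d is the disparity in pixels. The sketch below assumes that setup; the function name and the example numbers are hypothetical.

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Depth in meters for a rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_length_px * baseline_m / disparity_px

# Hypothetical rig: 800 px focal length, 0.5 m baseline.
near = depth_from_disparity(40.0, 800.0, 0.5)   # larger disparity, closer
far = depth_from_disparity(20.0, 800.0, 0.5)    # smaller disparity, farther
```

Because depth is inversely proportional to disparity, nearby pedestrians stand out clearly from the road surface, which is what makes depth-based ROI filtering effective.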
2 Database Summary
1) The MIT pedestrian database is an early public pedestrian database; it is too simple.
2) The INRIA pedestrian database is currently a widely used static pedestrian database with fairly realistic scenes.
3) The images in the Daimler pedestrian database come from a vehicle-mounted camera and are all grayscale. The test set is a video of approximately min, which contains both fully visible and partially occluded pedestrians.
4) The Caltech pedestrian database is currently a large pedestrian database. Its images come from a vehicle-mounted camera, the occlusion frequency matches that of real-life scenes, and it includes poor-quality images.
5) The TUD pedestrian database provides image pairs for computing optical flow. It is mainly used to evaluate the role of motion information in pedestrian detection and is often used in pedestrian detection and tracking research.
6) The NICTA pedestrian database is currently a large still-image pedestrian database that contains 25 551 single-person images and 5 207 high-resolution non-pedestrian images. It contains no motion information and is already split into training and test sets.
7) The ETH pedestrian database is based on binocular vision. It was captured with a pair of vehicle-mounted cameras, it provides calibration information and pedestrian annotations, and its depth information was obtained by belief propagation.
8) The CVC pedestrian database currently contains 3 datasets. It is mainly used for pedestrian detection research in driver assistance.
9) Most images in the USC pedestrian database come from surveillance video. It is a relatively small database, mainly used for pedestrian detection under occlusion and multi-view conditions.