The history and classification of object detection algorithms

Last Update:2018-05-24 Source: Internet

Author: User

Tags svm


With the rise of artificial intelligence, object detection algorithms play an increasingly important role across industries, and how to put them into production is a serious question. Today I came across a write-up shared by an expert and studied it.

This post sorts out the algorithms and history of the field to make follow-up study easier.

Chronologically, the algorithms fall into two groups: traditional algorithms and CNN algorithms.

Traditional algorithms:

Cascade classifier framework: Haar/LBP/integral HOG/ACF features + AdaBoost

The cascade classifier was first proposed by Paul Viola and Michael J. Jones at CVPR 2001.

In essence, this is boosting: simple weak classifiers are assembled into a strong classifier. It looks crude today, but this algorithm was the first to make object detection practical!

OpenCV has a classic implementation of cascade classifiers: https://docs.opencv.org/2.4.11/modules/objdetect/doc/cascade_classification.html?highlight=haar
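The boosting idea described above can be sketched in a few lines. This is only a minimal illustration, assuming a single 1-D feature and threshold "stumps" as the weak classifiers; it is not the Viola-Jones training procedure, which boosts over Haar features and chains the results into a cascade. The names `stump`, `train_adaboost`, and `predict` are hypothetical helpers, not OpenCV functions.

```python
import math

def stump(x, threshold, polarity):
    """Weak classifier: returns polarity if x > threshold, else -polarity."""
    return polarity if x > threshold else -polarity

def train_adaboost(xs, ys, rounds):
    """Train AdaBoost on 1-D samples xs with labels ys in {-1, +1}.

    Returns a list of (alpha, threshold, polarity) weak classifiers.
    """
    n = len(xs)
    weights = [1.0 / n] * n
    values = sorted(set(xs))
    # Candidate thresholds: midpoints between neighbouring sample values.
    thresholds = [(a + b) / 2 for a, b in zip(values, values[1:])]
    model = []
    for _ in range(rounds):
        # Pick the stump with the lowest weighted error.
        best = None
        for t in thresholds:
            for polarity in (1, -1):
                err = sum(w for x, y, w in zip(xs, ys, weights)
                          if stump(x, t, polarity) != y)
                if best is None or err < best[0]:
                    best = (err, t, polarity)
        err, t, polarity = best
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the log finite
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, t, polarity))
        # Raise the weight of misclassified samples, lower the rest.
        weights = [w * math.exp(-alpha * y * stump(x, t, polarity))
                   for x, y, w in zip(xs, ys, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return model

def predict(model, x):
    """Strong classifier: sign of the alpha-weighted vote of all stumps."""
    score = sum(alpha * stump(x, t, p) for alpha, t, p in model)
    return 1 if score >= 0 else -1

# Toy data (made up for illustration): no single threshold separates it.
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
model = train_adaboost(xs, ys, rounds=3)
predictions = [predict(model, x) for x in xs]
```

On this toy data a single stump always misclassifies some samples, while the weighted vote of three stumps fits the training set, which is the sense in which weak classifiers are assembled into a strong one.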

As for the features, Haar is simple enough, and LBP needs no deep dive here.

HOG and ACF are covered next.

HOG+SVM

Histograms of Oriented Gradients for Human Detection, CVPR, 2005

Because the original Haar feature is too simple, it suits only rigid objects and cannot detect non-rigid targets such as pedestrians, so the HOG+SVM structure was proposed.

It is also implemented in OpenCV: https://docs.opencv.org/2.4.11/modules/gpu/doc/object_detection.html?highlight=hog
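To make the HOG feature more concrete, here is a rough sketch of its core step: the gradient-orientation histogram of one cell. It is a simplification under stated assumptions (centred differences on interior pixels only, unsigned orientations in 9 bins, hard bin assignment); the full descriptor also interpolates votes between bins and normalises over blocks before the vector is passed to a linear SVM. `hog_cell_histogram` is a hypothetical name, not an OpenCV function.

```python
import math

def hog_cell_histogram(cell, bins=9):
    """Orientation histogram of one cell, given as a list of rows of grey values.

    Each interior pixel votes with its gradient magnitude into the bin that
    contains its unsigned gradient orientation (0 to 180 degrees).
    """
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    height, width = len(cell), len(cell[0])
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]   # horizontal gradient
            gy = cell[y + 1][x] - cell[y - 1][x]   # vertical gradient
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[int(angle // bin_width) % bins] += magnitude
    return hist

# A vertical edge: intensity changes only along x, so all votes land in bin 0.
vertical_edge = [[0, 0, 255, 255] for _ in range(4)]
# The same edge rotated by 90 degrees: votes land in the bin around 90 degrees.
horizontal_edge = [[0] * 4, [0] * 4, [255] * 4, [255] * 4]

hist_v = hog_cell_histogram(vertical_edge)
hist_h = hog_cell_histogram(horizontal_edge)
```

The two toy patches show why the feature copes with shape better than a raw Haar response: the histogram records which edge directions dominate a cell, not just a brightness difference.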

Later came a series of heavily modified features such as LoG/DoG/RoG, which are not worth discussing further.

It is worth mentioning that some people replaced the HOG used with SVM by an integral HOG for use in cascade classifiers. This is the prototype of the integral HOG in the current OpenCV cascade classifier:

Integral Histogram: A Fast Way to Extract Histograms in Cartesian Spaces
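The integral histogram idea can be sketched as follows. This is a minimal illustration, assuming each pixel has already been assigned a single bin index (for HOG, its quantised gradient orientation) and votes with weight 1; it is not OpenCV's implementation. `integral_histogram` and `region_histogram` are hypothetical names.

```python
def integral_histogram(bin_map, bins):
    """Build ih where ih[y][x][b] counts pixels of bin b in rows < y, cols < x."""
    height, width = len(bin_map), len(bin_map[0])
    ih = [[[0] * bins for _ in range(width + 1)] for _ in range(height + 1)]
    for y in range(height):
        for x in range(width):
            for b in range(bins):
                # Standard integral-image recurrence, applied per bin.
                ih[y + 1][x + 1][b] = (ih[y][x + 1][b] + ih[y + 1][x][b]
                                       - ih[y][x][b])
            ih[y + 1][x + 1][bin_map[y][x]] += 1
    return ih

def region_histogram(ih, top, left, bottom, right):
    """Histogram of rows top..bottom-1 and cols left..right-1.

    Needs only four table lookups per bin, whatever the region size.
    """
    bins = len(ih[0][0])
    return [ih[bottom][right][b] - ih[top][right][b]
            - ih[bottom][left][b] + ih[top][left][b]
            for b in range(bins)]

# Toy 3x3 map of bin indices (made up for illustration).
bin_map = [[0, 1, 2],
           [1, 1, 0],
           [2, 0, 1]]
ih = integral_histogram(bin_map, bins=3)
whole = region_histogram(ih, 0, 0, 3, 3)
corner = region_histogram(ih, 1, 1, 3, 3)
```

Because every sliding window reuses the same table, the per-window cost no longer depends on the window area, which is what makes the feature cheap enough for a cascade.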

Later work developed Aggregate Channel Features (ACF) and related features. The two main papers are:

Aggregate Channel Features for Multi-View Face Detection, IJCB, 2014

Fast Feature Pyramids for Object Detection, PAMI, 2014

The highlight is speed: ACF accelerates the integral HOG computation, is both accurate and fast, and is still in active use in the embedded field.

Discriminatively trained deformable part models (DPM)

Project homepage: http://www.rossgirshick.info/latent/

DPM uses a spring model for object detection, that is, multi-scale, multi-part detection, with FHOG as the underlying image feature. In any case, it caused a sensation.

Follow-ups such as DPM+/DPM++ also exist but are not significant enough to cover.

Template matching: a technique that searches an image for the part that most closely matches (i.e. is most similar to) a given template image. For a reference implementation, see: https://www.cnblogs.com/skyfsm/p/6884253.html
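As a rough sketch of the template matching idea, the brute-force search below slides the template over the image and keeps the position with the smallest sum of squared differences. The similarity measure is an assumption chosen for simplicity, and `match_template` is a hypothetical helper, not the code from the linked article. OpenCV exposes comparable functionality as `cv2.matchTemplate`.

```python
def match_template(image, template):
    """Return (row, col, score) of the best match.

    image and template are lists of rows of grey values. The score is the
    sum of squared differences, so lower means more similar.
    """
    image_h, image_w = len(image), len(image[0])
    tmpl_h, tmpl_w = len(template), len(template[0])
    best = None
    for r in range(image_h - tmpl_h + 1):
        for c in range(image_w - tmpl_w + 1):
            ssd = sum((image[r + i][c + j] - template[i][j]) ** 2
                      for i in range(tmpl_h) for j in range(tmpl_w))
            if best is None or ssd < best[2]:
                best = (r, c, ssd)
    return best

# Toy image (made up for illustration) with the template embedded at (1, 2).
image = [[0, 0, 0, 0, 0],
         [0, 0, 9, 8, 0],
         [0, 0, 7, 6, 0],
         [0, 0, 0, 0, 0]]
template = [[9, 8],
            [7, 6]]
row, col, score = match_template(image, template)
```

The search has no tolerance for scale change, rotation, or deformation, which is one reason the learned detectors in this article replaced it for general object detection.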

User-aware
Links: https://www.zhihu.com/question/53438706/answer/148973444
Source: Zhihu
Copyright belongs to the author. For commercial reprints, contact the author for authorization; for non-commercial reprints, cite the source.

CNN methods:

Based on region proposals (two-stage): the R-CNN family, including Faster R-CNN/Mask R-CNN/R-FCN

However, DPM had been popular for less than two years when the R-CNN family appeared, and detection finally stopped relying on endlessly modified HOG features!

The most representative member of the R-CNN family is Faster R-CNN. Faster R-CNN first generates region proposals with an RPN (Region Proposal Network) and then runs classification and regression on those proposals, which is why it is called two-stage.

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks

Mask R-CNN

In fact, to follow the R-CNN series it is enough to keep an eye on one person: Kaiming He (FAIR).

Based on regression (one-shot): YOLO/YOLOv2/SSD/DSSD

YOLO and SSD produce proposals and perform classification + regression at the same time, in a single pass, which is the so-called one-shot approach. Compared with two-stage methods they have a speed advantage, but precision/recall is slightly lower.

SSD: Single Shot MultiBox Detector

As for YOLO, there are currently YOLO v1, YOLO9000 (v2), and YOLO v3.

YOLO project homepage (including papers)

In addition, I think the later DSSD and YOLO v2/v3 are not really different from one another; they feel much the same.

This also suggests that detection is approaching a bottleneck: without an algorithmic breakthrough, it is hard to gain dozens of points in one step as before.

Special case, text sequence detection: CTPN (LSTM + R-CNN)/SegLink (a modified SSD)

Beyond detection in the general sense, there is also text detection, used to locate text before OCR. It differs a little from general detection. Two methods currently work well: CTPN and SegLink.

Derived from Faster R-CNN: CTPN, for horizontal or vertical text detection

Detecting Text in Natural Image with Connectionist Text Proposal Network, ECCV, 2016.

Code: tianzhi0549/CTPN

Derived from SSD: SegLink, for oriented (tilted) text detection

Detecting Oriented Text in Natural Images by Linking Segments, CVPR, 2017

Code: https://github.com/dengdan/seglink

Of course, there are also traditional text detection algorithms, such as this one that ships with OpenCV:

Real-Time Scene Text Localization and Recognition, CVPR 2012

But it is not worth tinkering with; there is no need.

Summary:

The advantage of traditional methods is speed: they reach high-speed real-time performance even on embedded platforms. The disadvantage is that precision/recall is not ideal; put simply, the results are poor.

The advantage of CNN methods is much better precision/recall; the corresponding disadvantage is slower speed.

At present, traditional algorithms still have some room on embedded devices, though they are being squeezed by MobileNet and similar networks; on the server side, deep networks dominate completely.
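For readers unfamiliar with the precision/recall figures used in this comparison, the arithmetic is sketched below. The counts are hypothetical numbers chosen for illustration, not results from any of the detectors above, and the rule for deciding whether a detection counts as correct is assumed to be given.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Precision: share of reported detections that are correct.
    Recall: share of real objects that were found.
    """
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall

# Hypothetical detector: 100 boxes reported, 80 correct, 40 objects missed.
precision, recall = precision_recall(true_positives=80,
                                     false_positives=20,
                                     false_negatives=40)
```

A detector can trade one figure for the other by moving its score threshold, which is why the two are usually quoted together.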

Reference Documentation:

1 https://www.zhihu.com/question/53438706

2 https://zhuanlan.zhihu.com/ML-Algorithm


This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only.
