First, let us look at the recent progress in object detection at CVPR 2016. The object detection methods presented at CVPR 2016 are mainly based on convolutional neural network (CNN) frameworks. Representative work includes ResNet (replacing VGG with ResNet in Faster R-CNN), YOLO (a regression-based detection framework), LocNet (more accurate localization), HyperNet (high-level network features help recognition and low-level features help localization, so features from different layers are fused), ION (adds context information on top of Fast R-CNN), and G-CNN (reduces the number of candidate boxes).
Most of these works improve on the Faster R-CNN framework, whereas YOLO innovates at the level of the framework itself. Faster R-CNN is an important method for applying deep learning to object detection, and the 5 papers listed below broadly reflect how deep learning algorithms for object detection have developed since 2013, when R-CNN was introduced by Ross Girshick (RBG).
R-CNN --> SPPnet --> Fast R-CNN --> Faster R-CNN --> YOLO
The following briefly sorts out the background of each algorithm and the problem it solves:
1. Problem: progress in object detection was slow, while CNNs had achieved great success in image classification.
Proposed: R-CNN, which turns the detection problem into a classification problem so that a CNN can be applied.
Selective Search (SS) extracts region proposals, a CNN extracts features for each region, SVMs classify the regions, and bounding box (BB) regression refines the box locations.
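As a rough illustration of the BB regression step, the sketch below encodes a ground-truth box relative to a proposal and decodes it again, using the center/size parameterization described in the R-CNN paper. The function names `encode_box` and `decode_box` and the (cx, cy, w, h) tuple layout are assumptions made for this example, not names taken from any released code.

```python
import math

def encode_box(proposal, ground_truth):
    """Regression targets (tx, ty, tw, th) for one proposal.

    Both boxes are (center_x, center_y, width, height); this layout is
    an assumption of the sketch.
    """
    px, py, pw, ph = proposal
    gx, gy, gw, gh = ground_truth
    return (
        (gx - px) / pw,        # horizontal shift, in units of proposal width
        (gy - py) / ph,        # vertical shift, in units of proposal height
        math.log(gw / pw),     # log-scale change of the width
        math.log(gh / ph),     # log-scale change of the height
    )

def decode_box(proposal, deltas):
    """Apply predicted deltas to a proposal to obtain the refined box."""
    px, py, pw, ph = proposal
    tx, ty, tw, th = deltas
    return (px + pw * tx, py + ph * ty, pw * math.exp(tw), ph * math.exp(th))
```

The regressor is trained to predict the output of `encode_box`, and at test time `decode_box` turns its prediction back into a refined box.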
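2. Problem: the CNN requires a fixed input image size, and the feature map computation is not shared across proposals.
Proposed: SPPnet, which introduces a spatial pyramid pooling (SPP) layer to remove the fixed-size constraint and computes the convolutional feature map only once per image.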
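To make the idea of the SPP layer concrete, here is a minimal sketch that max-pools a single-channel feature map of any size into a fixed-length vector. The function name `spp`, the pyramid levels (1, 2, 4), and the floor/ceil bin boundaries are assumptions chosen for illustration; a real layer works per channel on tensors.

```python
import math

def spp(feature_map, levels=(1, 2, 4)):
    """Max-pool a 2D feature map (list of rows) into a fixed-length list.

    For each pyramid level n the map is divided into n x n bins, and the
    maximum of each bin is kept, so the output length is sum(n * n)
    regardless of the input height and width.
    """
    h, w = len(feature_map), len(feature_map[0])
    pooled = []
    for n in levels:
        for i in range(n):
            r0 = (i * h) // n                # first row of the bin (floor)
            r1 = math.ceil((i + 1) * h / n)  # one past the last row (ceil)
            for j in range(n):
                c0 = (j * w) // n
                c1 = math.ceil((j + 1) * w / n)
                pooled.append(max(feature_map[r][c]
                                  for r in range(r0, r1)
                                  for c in range(c0, c1)))
    return pooled
```

With levels (1, 2, 4) the output always has 1 + 4 + 16 = 21 values per channel, which is what allows the fully connected layers that follow to accept inputs of different sizes.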
3. Problem: too many candidate object locations have to be processed, and these candidates provide only rough localization.
Proposed: Fast R-CNN, a single-stage training algorithm that jointly learns to classify object proposals and refine their spatial locations. (An RoI pooling layer is added, BB regression is moved into the network and operates directly on the features extracted by the CNN, and the two tasks are combined into one multi-task model.)
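The multi-task model can be illustrated by its loss for a single RoI: a log loss on the class plus a smooth L1 loss on the box deltas, where the box term is skipped for background RoIs. This is a plain-Python sketch of the loss as formulated in the Fast R-CNN paper; the function names, the argument layout, and the convention that class 0 is background are assumptions of the example.

```python
import math

def smooth_l1(x):
    """Quadratic near zero, linear elsewhere (less sensitive to outliers than L2)."""
    ax = abs(x)
    return 0.5 * x * x if ax < 1.0 else ax - 0.5

def multitask_loss(class_probs, true_class, pred_deltas, target_deltas, lam=1.0):
    """Classification loss plus lam times localization loss for one RoI.

    class_probs   : softmax output over classes (index 0 = background, assumed)
    true_class    : ground-truth class index of the RoI
    pred_deltas   : predicted (tx, ty, tw, th) for the true class
    target_deltas : regression targets (tx, ty, tw, th)
    """
    cls_loss = -math.log(class_probs[true_class])
    if true_class == 0:
        # Background RoIs have no ground-truth box, so only classification counts.
        return cls_loss
    loc_loss = sum(smooth_l1(p - t) for p, t in zip(pred_deltas, target_deltas))
    return cls_loss + lam * loc_loss
```

Because both terms are computed from the same shared features, a single backward pass trains the classifier and the box regressor together.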
4. Problem: as detection time drops, region proposal computation becomes the bottleneck.
Proposed: Faster R-CNN, which uses a Region Proposal Network (RPN) that shares full-image convolutional features with the detection network, making region proposals nearly cost-free.
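The RPN scores a fixed set of reference boxes (anchors) at every position of the shared feature map. The sketch below generates such anchors with the 3 scales and 3 aspect ratios used in the Faster R-CNN paper. The function name `make_anchors`, the stride of 16, and the choice to center anchors in the middle of each cell are assumptions of this example.

```python
import math

def make_anchors(feat_h, feat_w, stride=16,
                 scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Return anchors as (x1, y1, x2, y2) in image pixels.

    One set of len(scales) * len(ratios) anchors is created per feature-map
    cell. Each anchor keeps an area of scale * scale; ratio is height / width.
    """
    anchors = []
    for row in range(feat_h):
        for col in range(feat_w):
            # Center of this cell in image coordinates (assumed convention).
            cx = (col + 0.5) * stride
            cy = (row + 0.5) * stride
            for scale in scales:
                for ratio in ratios:
                    w = scale / math.sqrt(ratio)
                    h = scale * math.sqrt(ratio)
                    anchors.append((cx - w / 2, cy - h / 2,
                                    cx + w / 2, cy + h / 2))
    return anchors
```

For each anchor the RPN predicts an objectness score and four box deltas, so proposals are produced by a few extra convolutional layers on top of features that the detector needs anyway.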
5. Problem: all of the previous algorithms turn the detection problem into a classification problem.
Proposed: YOLO, which treats detection as a regression problem in order to improve real-time performance.
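To show what regressing the detection directly means, the sketch below decodes one box predicted by one grid cell into pixel coordinates, following the YOLO paper's convention that the box center is an offset inside the cell and the width and height are fractions of the whole image. The function names and the defaults (S=7, B=2, C=20, 448x448 input) are the paper's PASCAL VOC settings, used here as illustrative assumptions.

```python
def decode_cell(row, col, pred, S=7, img_w=448, img_h=448):
    """Convert one cell prediction into (x1, y1, x2, y2, confidence).

    pred = (x, y, w, h, confidence), where x and y are offsets in [0, 1]
    inside grid cell (row, col), and w and h are fractions of the image size.
    """
    x, y, w, h, conf = pred
    cx = (col + x) / S * img_w   # box center in pixels
    cy = (row + y) / S * img_h
    bw, bh = w * img_w, h * img_h
    return (cx - bw / 2, cy - bh / 2, cx + bw / 2, cy + bh / 2, conf)

def output_size(S=7, B=2, C=20):
    """Length of the network output: S*S cells, each with B boxes and C class scores."""
    return S * S * (B * 5 + C)
```

With the default settings the whole image is mapped to a 7 x 7 x 30 output in a single forward pass, with no separate proposal stage.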
The above is a brief walkthrough of how the 5 papers developed: each paper basically targets a problem left open by the previous one and proposes a solution that improves part of the pipeline, thereby improving performance. Follow-up posts will summarize each of these 5 papers in detail.
Paper links:
R-CNN: https://arxiv.org/pdf/1311.2524v5.pdf
SPPnet: https://arxiv.org/pdf/1406.4729v4.pdf
Fast R-CNN: https://arxiv.org/abs/1504.08083
Faster R-CNN: https://arxiv.org/abs/1506.01497
YOLO: https://arxiv.org/abs/1506.02640
Reference: http://sanwen8.cn/p/291742Q.html