SSD Paper Reading (Wei Liu et al., ECCV 2016: "SSD: Single Shot MultiBox Detector")
Directory
- Author
- Reasons for selecting the article
- Method Summary
- Method details
- Related Background supplement
- Experimental results
- Comparison with related articles
- Summary
Author
Reasons for selecting the article
- Good performance, single stage
Method Summary
- Introduction to the method of the article
- SSD addresses object detection (localization + classification): given an input test image, it outputs the location and category of multiple boxes
- At test time, an image is fed into SSD and the network outputs a tensor (a multidimensional array, shown rightmost in the figure); non-maximum suppression (NMS) is then applied to it to obtain the location and label of each object
On the right-hand side of Figure 2, channels 1 through 20 hold the category scores, and each map on a channel corresponds spatially to the original image; each map in the last 4 channels holds the x, y, w, h offsets. The last 4 channels determine the location of a box, and the first 20 channels determine its category.
- Pipeline and key points of the method
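The NMS step mentioned above can be sketched in a few lines. This is a minimal, pure-Python illustration of greedy per-class NMS, not the authors' implementation; the function names `iou` and `nms`, the corner box format, and the default threshold of 0.45 are assumptions made for this example.

```python
from typing import List, Tuple

# Assumed box format for this sketch: (x_min, y_min, x_max, y_max)
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union (Jaccard overlap) of two boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: List[Box], scores: List[float],
        iou_threshold: float = 0.45) -> List[int]:
    """Greedy NMS for one class; returns kept indices, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)  # the second box overlaps the first and is dropped
```

In a full detector this would run once per category on the decoded boxes, after low-confidence predictions have been filtered out.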
Method Details
- Convolution Filter for prediction
- Ground-truth labeling (matching), loss function
- Default boxes and scale selection
- SSD training: hard negative mining
- SSD training: data augmentation
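Two of the items above, default box scale selection and hard negative mining, lend themselves to a short sketch. The code below follows the scale rule as described in the SSD paper (scales linearly spaced between `s_min` and `s_max`, one box per aspect ratio, plus one extra square box) and the roughly 3:1 negative-to-positive ratio used during training. The function names and the flat-list data layout are hypothetical, chosen only for illustration.

```python
import math
from typing import List, Sequence, Tuple


def default_box_scales(m: int, s_min: float = 0.2, s_max: float = 0.9) -> List[float]:
    """Scale s_k for k = 1..m, linearly spaced from s_min to s_max."""
    if m == 1:
        return [s_min]
    return [s_min + (s_max - s_min) * (k - 1) / (m - 1) for k in range(1, m + 1)]


def default_box_shapes(s_k: float, s_next: float,
                       aspect_ratios: Sequence[float] = (1.0, 2.0, 3.0, 0.5, 1.0 / 3.0)
                       ) -> List[Tuple[float, float]]:
    """(width, height) of each default box at one feature-map location."""
    shapes = [(s_k * math.sqrt(a), s_k / math.sqrt(a)) for a in aspect_ratios]
    extra = math.sqrt(s_k * s_next)  # additional square box for aspect ratio 1
    shapes.append((extra, extra))
    return shapes


def hard_negative_indices(conf_losses: Sequence[float],
                          is_positive: Sequence[bool],
                          neg_pos_ratio: int = 3) -> List[int]:
    """Pick the negatives with the highest confidence loss, at most ratio * #positives."""
    num_pos = sum(1 for p in is_positive if p)
    negatives = [i for i, p in enumerate(is_positive) if not p]
    negatives.sort(key=lambda i: conf_losses[i], reverse=True)
    return negatives[: neg_pos_ratio * num_pos]


scales = default_box_scales(6)
shapes = default_box_shapes(scales[0], scales[1])
hard = hard_negative_indices([0.1, 0.9, 0.5, 0.7, 0.2, 0.3],
                             [True, False, False, False, False, False])
```

Widths and heights here are fractions of the input image size; a real implementation would also place box centers on each feature-map cell and clip to the image.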
Related Background supplement
- Atrous algorithm (Hole algorithm)
- Common evaluation criteria for binary classification/detection (recall, precision, F-measure, accuracy, error, PR curve and ROC curve, AP, AUC)
- Evaluation criteria for ImageNet multi-class classification
- Evaluation criteria for ImageNet single-object detection
- Evaluation criteria for ImageNet (multi-)object detection
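As a quick reference for the binary evaluation criteria listed above, the following sketch computes precision, recall, F-measure, accuracy, and error from the four confusion-matrix counts. The function name `binary_metrics` and the returned dictionary layout are assumptions for this example; curve-based measures (PR, ROC, AP, AUC) are not covered here.

```python
from typing import Dict


def binary_metrics(tp: int, fp: int, fn: int, tn: int,
                   beta: float = 1.0) -> Dict[str, float]:
    """Basic binary classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    # F-measure: weighted harmonic mean of precision and recall (beta=1 gives F1).
    b2 = beta * beta
    denom = b2 * precision + recall
    f_measure = (1 + b2) * precision * recall / denom if denom else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "accuracy": accuracy,
        "error": 1.0 - accuracy,
    }


metrics = binary_metrics(tp=8, fp=2, fn=2, tn=88)
```

For detection, a prediction is typically counted as a true positive only if its overlap with a ground-truth box exceeds a chosen threshold, so these counts depend on that threshold.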
Experimental Results
- PASCAL VOC2007 test detection results
- Effect of data augmentation, multi-scale default boxes, and the atrous algorithm (comparison)
- Visualization of SSD512 detection performance on one class (animals)
- Sensitivity of SSD to object size
- Effect of the number of feature maps used by SSD on the results
Comparison with related articles
- Variants of the original R-CNN method
- Faster R-CNN and SSD comparison
Summary
- Article contribution
- SSD, a single-shot detector for multiple categories (faster than YOLO, as accurate as Faster R-CNN)
- The core of SSD is predicting category scores and box offsets for a fixed set of default bounding boxes using small convolutional filters applied to multiple feature maps from different layers
- Experimental evidence: high accuracy, high speed, simple end-to-end training (single shot)
- Key improvements of SSD over other methods
- Using a small convolutional filter to predict object categories and offsets in bounding box locations
- Using separate predictors (filters) for different aspect ratio detections
- Using multiple layers for prediction at different scales (applying these filters to multiple feature maps to perform detection at multiple scales)
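To make the multi-layer prediction idea concrete, the sketch below counts the outputs of the prediction filters. With k default boxes per location and c categories, each feature-map location needs k * (c + 4) filter outputs: c scores and 4 offsets per box. The layer sizes and boxes-per-location listed are the SSD300 configuration as described in the paper; the names `SSD300_LAYERS`, `prediction_channels`, and `total_default_boxes` are hypothetical, and 21 categories (20 PASCAL VOC classes plus background) is an assumption for the example.

```python
from typing import List, Tuple

# (feature map side length, default boxes per location), SSD300 as described in the paper
SSD300_LAYERS: List[Tuple[int, int]] = [
    (38, 4), (19, 6), (10, 6), (5, 6), (3, 4), (1, 4),
]


def prediction_channels(boxes_per_location: int, num_classes: int) -> int:
    """Output channels of the small conv predictor on one feature map."""
    return boxes_per_location * (num_classes + 4)


def total_default_boxes(layers: List[Tuple[int, int]]) -> int:
    """Total default boxes over all square feature maps."""
    return sum(side * side * k for side, k in layers)


num_classes = 21  # assumption: 20 VOC classes + background
channels = [prediction_channels(k, num_classes) for _, k in SSD300_LAYERS]
total = total_default_boxes(SSD300_LAYERS)  # 8732 boxes per image
```

Most of the boxes come from the largest (38 x 38) feature map, which is the layer responsible for the smallest objects.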
Object detection method: SSD