Bounding-Box Regression in Detail (Object Detection Algorithms)

Source: Internet
Author: User
Bounding-box regression

I have recently been reading detection papers: R-CNN, Fast R-CNN, Faster R-CNN, YOLO, R-FCN, SSD, up to YOLO9000 from this year's CVPR. The loss functions in all of these papers include a bounding-box regression term, but only the R-CNN paper describes it in detail. The others either mention it in passing or simply write out a loss function that follows R-CNN. This article tries to answer five questions: (1) Why is bounding-box regression needed? (2) What is bounding-box regression? (3) How is bounding-box regression done? (4) Why are the width, height, and coordinate offsets designed in this particular form? (5) Why can bounding-box regression only fine-tune a window, taking effect only when the proposal is already close to the ground truth? The first three are widely explained online; my answers to the last two are conclusions I reached after reading many papers.

Why is bounding-box regression needed?

Here I quote the explanation of my senior labmate Wang Bin, illustrated in the following figure:

In the figure above, the green box is the ground truth and the red box is a region proposal extracted by Selective Search. Even if the classifier identifies the red box as an airplane, the box is poorly localized (IoU < 0.5), so the image effectively counts as not having detected the airplane correctly. If we can fine-tune the red box so that the adjusted window is closer to the ground truth, the localization becomes more accurate. Bounding-box regression is exactly the step that fine-tunes this window.

What is bounding-box regression?

Continuing with his explanation: a window is generally represented by a four-dimensional vector $(x, y, w, h)$, giving the coordinates of the window's center and its width and height. In Figure 2, the red box $P$ is the original proposal and the green box $G$ is the target's ground truth. Our goal is to find a mapping that takes the original window $P$ as input and produces a regressed window $\hat{G}$ that is closer to the real window $G$.
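The IoU < 0.5 criterion mentioned earlier can be made concrete for windows in this center-based $(x, y, w, h)$ representation. The following is a minimal sketch, not taken from any of the papers' code; the helper names `to_corners` and `iou` and the box values are hypothetical and chosen only for illustration.

```python
def to_corners(box):
    """Convert a center-format box (x, y, w, h) to corners (x1, y1, x2, y2)."""
    x, y, w, h = box
    return (x - w / 2.0, y - h / 2.0, x + w / 2.0, y + h / 2.0)


def iou(box_a, box_b):
    """Intersection over union of two center-format boxes."""
    ax1, ay1, ax2, ay2 = to_corners(box_a)
    bx1, by1, bx2, by2 = to_corners(box_b)
    # Overlap extent along each axis; clamp at zero when the boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - inter
    return inter / union if union > 0 else 0.0


# Hypothetical boxes: P plays the role of the red proposal, G the green ground truth.
P = (130.0, 120.0, 100.0, 80.0)
G = (100.0, 100.0, 120.0, 100.0)
print(iou(P, G))  # below 0.5, so this proposal would count as poorly localized
```

With these numbers the intersection is 80 x 70 = 5600 and the union is 8000 + 12000 - 5600 = 14400, so the IoU is about 0.39.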


The purpose of bounding-box regression is therefore: given $(P_x, P_y, P_w, P_h)$, find a mapping $f$ such that $f(P_x, P_y, P_w, P_h) = (\hat{G}_x, \hat{G}_y, \hat{G}_w, \hat{G}_h)$ and $(\hat{G}_x, \hat{G}_y, \hat{G}_w, \hat{G}_h) \approx (G_x, G_y, G_w, G_h)$.

How is bounding-box regression done?

So what transformation can take the window $P$ in Figure 2 to a window
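One well-known answer is the parameterization from the R-CNN paper: translate the center by an offset scaled by the proposal's size, then scale the width and height in log space, i.e. $\hat{G}_x = P_w d_x + P_x$, $\hat{G}_y = P_h d_y + P_y$, $\hat{G}_w = P_w \exp(d_w)$, $\hat{G}_h = P_h \exp(d_h)$. The sketch below only illustrates that parameterization and is not necessarily the exact form this article goes on to derive. The function names `encode` and `apply_deltas` and the box values are hypothetical, and the offsets here are computed directly from the ground truth rather than predicted by a learned regressor.

```python
import math


def encode(P, G):
    """Regression targets (t_x, t_y, t_w, t_h) that map proposal P onto ground truth G."""
    px, py, pw, ph = P
    gx, gy, gw, gh = G
    return ((gx - px) / pw, (gy - py) / ph, math.log(gw / pw), math.log(gh / ph))


def apply_deltas(P, d):
    """Apply offsets d = (d_x, d_y, d_w, d_h) to proposal P and return the regressed box."""
    px, py, pw, ph = P
    dx, dy, dw, dh = d
    # Translation of the center, then scaling of width and height.
    return (pw * dx + px, ph * dy + py, pw * math.exp(dw), ph * math.exp(dh))


# Hypothetical center-format boxes (x, y, w, h), for illustration only.
P = (130.0, 120.0, 100.0, 80.0)
G = (100.0, 100.0, 120.0, 100.0)

t = encode(P, G)            # what a perfect regressor would output for P
G_hat = apply_deltas(P, t)  # recovers G up to floating-point error
print(t)
print(G_hat)
```

Dividing the center offsets by the proposal's width and height makes the targets independent of box scale, and all-zero offsets leave the proposal unchanged.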
