Classical theory of feature detection and tracking


The classic reference for feature detection and tracking is the paper by Shi and Tomasi, 1994 (see References). It presents a feature extraction method, a tracking algorithm based on an affine image motion model (a pure translation component is used for frame-to-frame tracking, the deformation component for monitoring), and a monitoring technique that distinguishes good features from bad ones. Feature extraction is designed so that the selected features are optimal for tracking; during tracking, each feature is monitored through the dissimilarity between the window at its predicted position and the window in the first frame, and features whose dissimilarity grows too large are deleted. Although monitoring by dissimilarity cannot catch every bad feature point, pairing the tracker with this tracking-specific detection criterion gives noticeably better results.

This article reviews the paper in the following sections:

1. The two image motion models

2. Computing the image motion

3. Texture and feature selection

4. Frame-to-frame dissimilarity

5. Monitoring of feature points (judging good vs. bad features)

/********************************* Two image motion models -- translation & affine deformation *********************************/

Image motion between two frames:

Assume that between times t and t + τ a point x = (x, y) moves by a displacement δ = (ξ, η). In general δ is a function of position: δ = Dx + d, an affine motion model, where D is the 2×2 deformation matrix and d is the translation of the centre of the feature window; the image coordinate x is measured relative to the window centre. A point x in image I thus moves to Ax + d in image J, with A = 1 + D (1 being the 2×2 identity matrix). Given two images I and J and a feature window in I, tracking means determining the six parameters in D and d such that

J(Ax + d) = I(x).

The quality of this estimate depends on the size of the feature window, the texture of the image within the window, and the amount of camera motion between frames. When the window is small, the deformation matrix D is hard to estimate, because the motion varies little across the window; on the other hand, small windows are preferable for tracking, since they are less likely to straddle a depth discontinuity. Consequently, using only the translation model (that is, δ = d) for frame-to-frame tracking gives better experimental results, with higher reliability and accuracy.
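To make the model concrete, here is a minimal NumPy sketch (function and variable names are my own, not from the paper) of the mapping x → Ax + d with A = 1 + D:

```python
import numpy as np

def affine_motion(x, D, d):
    """Affine image motion model: a point x in image I moves to
    A x + d in image J, where A = 1 + D (2x2 identity plus the
    deformation matrix D) and d is the translation of the window centre."""
    A = np.eye(2) + D
    return A @ x + d

# Pure translation (D = 0), the model actually used for frame-to-frame
# tracking: the point simply shifts by d.
x = np.array([3.0, 4.0])
print(affine_motion(x, np.zeros((2, 2)), np.array([1.0, -2.0])))  # [4. 2.]
```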

/********************************* Computing the image motion *********************************/

Because the estimated deformation D and translation d are never exact, the warped window in J will not match the window in I exactly; the aim is to choose them so as to minimize the residual

ξ = ∫∫_W [J(Ax + d) − I(x)]² w(x) dx,

where W is the feature window and w(x) is a weight function. In the simplest case w(x) = 1; alternatively, w can be a Gaussian, giving the greatest weight to the centre of the window. (Under pure translation, A = 1.) To minimize ξ, differentiate it with respect to the deformation matrix D and the translation vector d, set the result to zero, and linearize with a truncated Taylor expansion J(Ax + d) ≈ J(x) + gᵀu, where u = Dx + d and g = (∂J/∂x, ∂J/∂y)ᵀ is the image gradient.

That is, we solve the 6×6 linear system Tz = a, where the vector z combines the entries of the deformation matrix D and the translation vector d, and a is the error vector:


a depends on the difference between the two images:

a = ∫∫_W [I(x) − J(x)] v(x) w(x) dx,

where the vector v(x) stacks the products of the coordinates with the gradient g (for the deformation part) on top of g itself (for the translation part). The 6×6 matrix T can be computed from the image gradients alone; it has the block structure

T = ∫∫_W [ U V ; Vᵀ Z ] w(x) dx,

where the 4×4 block U corresponds to the deformation, the 2×2 block Z = g gᵀ (summed over the window) corresponds to the translation, and the 4×2 block V couples the two.

The paper notes that during frame-to-frame tracking D is simply set to the zero matrix, since errors in D and d interact with each other through the matrix V; the system then simplifies to the pure-translation equations Zd = e.

Here,

Z = ∫∫_W g gᵀ w(x) dx,   e = ∫∫_W [I(x) − J(x)] g w(x) dx.

The 2×2 matrix Z is also known as the Hessian matrix. The translation d is then obtained from Z and e by solving Zd = e.
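A minimal NumPy sketch of one pure-translation tracking step, solving Zd = e (window handling and the gradient approximation are simplifications of mine, not the paper's implementation):

```python
import numpy as np

def klt_translation_step(I, J, cx, cy, half=7):
    """One Newton step of pure-translation tracking: solve Z d = e for
    the displacement d of the feature window centred at (cx, cy).
    I, J: two grayscale frames as 2-D float arrays; the window is
    (2*half+1) x (2*half+1) and the weight is w(x) = 1."""
    ys = slice(cy - half, cy + half + 1)
    xs = slice(cx - half, cx + half + 1)
    # Spatial gradient g = (gx, gy), taken on the average of the two frames.
    gy, gx = np.gradient(0.5 * (I + J))
    gx_w, gy_w = gx[ys, xs].ravel(), gy[ys, xs].ravel()
    # Z = sum over the window of g g^T (the 2x2 "Hessian" matrix).
    Z = np.array([[np.sum(gx_w * gx_w), np.sum(gx_w * gy_w)],
                  [np.sum(gx_w * gy_w), np.sum(gy_w * gy_w)]])
    # e = sum over the window of (I - J) g.
    diff = (I - J)[ys, xs].ravel()
    e = np.array([np.sum(diff * gx_w), np.sum(diff * gy_w)])
    return np.linalg.solve(Z, e)
```

On a smooth, textured patch shifted by one pixel, the recovered d comes out close to that shift; in practice the step is iterated, warping J by the current estimate each time.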

/********************************* Texture *********************************/

Most window-based feature detection methods pick their criteria in an ad hoc fashion, with no guarantee of good results. The paper argues that although a simple translation d does not describe an affine deformation well, what makes a feature good is precisely that it can be tracked well, so detection should be tied to the tracker itself.

Let λ1 and λ2 be the two eigenvalues of the matrix Z at a point:

1° λ1 and λ2 both large: the point is a trackable feature (a corner or textured pattern);

2° one large, one small: the point lies on an edge (a unidirectional texture pattern);

3° λ1 and λ2 both small: the intensity around the point shows no significant change.

So fix a threshold λ: if min(λ1, λ2) > λ, the point is accepted as a feature.
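The criterion can be sketched as follows (a NumPy toy of mine, not the paper's implementation; it uses the closed-form smaller eigenvalue of the 2×2 matrix Z):

```python
import numpy as np

def shi_tomasi_score(img, cx, cy, half=3):
    """Return min(lambda1, lambda2) of the 2x2 gradient matrix Z summed
    over the window centred at (cx, cy); a point is accepted as a
    feature when this score exceeds a chosen threshold lambda."""
    gy, gx = np.gradient(img.astype(float))
    ys = slice(cy - half, cy + half + 1)
    xs = slice(cx - half, cx + half + 1)
    gxx = np.sum(gx[ys, xs] ** 2)
    gyy = np.sum(gy[ys, xs] ** 2)
    gxy = np.sum(gx[ys, xs] * gy[ys, xs])
    # Smaller eigenvalue of [[gxx, gxy], [gxy, gyy]] in closed form.
    tr = gxx + gyy
    det = gxx * gyy - gxy ** 2
    return tr / 2 - np.sqrt(max(tr * tr / 4 - det, 0.0))

# A step corner scores high; a straight edge scores ~0; a flat patch scores 0.
corner = np.zeros((21, 21)); corner[10:, 10:] = 1.0
edge = np.zeros((21, 21)); edge[:, 10:] = 1.0
print(shi_tomasi_score(corner, 10, 10), shi_tomasi_score(edge, 10, 10))
```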

/********************************* Frame-to-frame difference (dissimilarity) *********************************/

Although the difference between consecutive frames is small, the error accumulated over 25 frames is very large. In fact, as the number of frames increases, the dissimilarity under pure translation grows substantially, as in the figure:

[Figure: dissimilarity vs. frame number. Dashed line (- -): affine transformation model; solid line: pure translation; ° and + mark two sets of samples (two frame sequences).]

Some of the jumps in the graph occur where a feature point in one of the sequences suddenly becomes occluded, so the position found by the pure-translation tracker is highly inaccurate.
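The dissimilarity itself is the residue of the feature window between the first frame and the (warped-back) current frame; here is a sketch assuming a weighted rms form (names my own):

```python
import numpy as np

def dissimilarity(first_win, current_win, w=None):
    """RMS residue between the feature window in the first frame and the
    warped-back window around the feature in the current frame; a
    rising value flags a feature the motion model no longer explains."""
    first_win = np.asarray(first_win, dtype=float)
    current_win = np.asarray(current_win, dtype=float)
    if w is None:
        w = np.ones_like(first_win)  # the simplest weighting, w(x) = 1
    return np.sqrt(np.sum((current_win - first_win) ** 2 * w) / np.sum(w))
```

With identical windows the residue is zero; as the window's appearance drifts away from the first frame, the value climbs, which is exactly the signal plotted against frame number above.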

/********************************* Feature monitoring (monitoring features) *********************************/

To limit the number of feature points and ensure that each part of the image is used at most once, the feature windows in the first frame are chosen so that they do not overlap (their spacing is thus set by the window size).

Figure 12 shows the tracking of different feature points, with dissimilarity on the vertical axis; features 58 and 89 show surprisingly high dissimilarity, so monitoring classifies both as bad feature points. Now look at these two feature windows across frames in Fig. 14. For feature 58 the change is not just a translation: the vertical bar gradually thickens (the gap between the vertical edge in the foreground and the letters in the background widens), so it becomes ever harder to warp the current window onto the window in the first frame, which is the key reason the dissimilarity rises. The change is not even an affine transformation, so the point is monitored as a bad feature and should be deleted. Feature 89 behaves similarly.

However, some features show no dramatic rise in dissimilarity and are nevertheless bad features. As shown, features 24 and 60 both have dissimilarity that keeps fluctuating during tracking; why is that?

The original paper explains: from the fourth row of Fig. 14 we see that feature 24 (and similarly feature 60) contains very small lettering, of size comparable to the image's pixel size (the feature window is 25 × 25 pixels). The matching between one frame and the next is haphazard, because the characters in the lettering are badly aliased. In other words, because the structures in these features are too small to match reliably, wrong feature positions are detected frequently, so these too are bad features.
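The monitoring step then reduces to thresholding each feature's current dissimilarity (a trivial sketch; the threshold value and names are my own):

```python
def flag_bad_features(dissimilarities, threshold):
    """Monitoring step: return the indices of features whose current
    dissimilarity exceeds the threshold; these are declared bad
    features and removed from the tracked set."""
    return [i for i, d in enumerate(dissimilarities) if d > threshold]

# Features 1 and 3 would be dropped here.
print(flag_bad_features([0.1, 5.0, 0.2, 7.5], threshold=1.0))  # [1, 3]
```

As the text notes, a fixed threshold catches gross failures such as features 58 and 89, but slowly drifting or aliased features like 24 and 60 may stay under it.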

References:

Shi, J. and Tomasi, C. (1994). Good features to track. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '94), pages 593–600, IEEE Computer Society, Seattle.

From: http://blog.csdn.net/abcjennifer/article/details/7688710
