Original URL:
http://www.tuicool.com/articles/NbIJ73
http://blog.csdn.net/songzitea/article/details/16986423
Introduction
This post summarizes two feature extraction algorithms: SIFT, as described by David Lowe in "Distinctive Image Features from Scale-Invariant Keypoints", and SURF, as described by Herbert Bay, Andreas Ess, Tinne Tuytelaars, and Luc Van Gool.
Summary of SIFT feature extraction
This summary follows Lowe's paper (for a more detailed analysis of SIFT feature extraction, see the post "Scale-Invariant Feature Transform (SIFT) feature extraction analysis").
First, a brief account of the main principles and flow of the SIFT algorithm. The core of the algorithm is the extraction of "scale-invariant" feature points, and the candidate feature points are the local extrema of the difference-of-Gaussians (DoG) space built from the Gaussian pyramid. My understanding of the Gaussian pyramid and the Laplacian operator is as follows. First, the scale-invariant feature points of SIFT mostly lie on edges (more precisely, at corners on edges, such as the four vertices of a rectangle; the removal of non-corner edge points is described below). A point on an edge is a high-frequency point (its brightness differs sharply from that of its neighbors), so the rate of change of brightness reaches a maximum there, which means that the rate of change of that rate (the second derivative of brightness) is zero, that is, the Laplacian satisfies ∇²I = 0. Second, once the image is convolved with a Gaussian kernel, I(x, y) becomes L(x, y, σ) = G(x, y, σ) * I(x, y), and the second spatial derivative can be expressed through the first derivative with respect to the scale σ (the Gaussian kernel is a two-dimensional normal distribution and a solution of the heat equation). A derivative with respect to σ is easy to approximate by a difference between neighboring scales, which is why the DoG space with σ as a variable is introduced.
The DoG space can be pictured as follows: when the human eye observes an object from far away and from close up, the details change but the outline does not, so the outline can be found by comparing the view from a distance with the view from nearby. In fact, an image in the DoG space looks like a relief in which the contours stand out clearly, so the feature candidates can be found as extrema among adjacent images in the DoG space.
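The link between the scale derivative and the Laplacian mentioned above can be written out explicitly. The lines below are a short sketch in the notation of Lowe's paper, where L is the Gaussian-smoothed image, D is the difference of Gaussians, and k is the constant ratio between adjacent scales; the symbols L, D, and k do not appear in the text above and are introduced only for this derivation.

```latex
L(x, y, \sigma) = G(x, y, \sigma) * I(x, y), \qquad
G(x, y, \sigma) = \frac{1}{2\pi\sigma^{2}} \, e^{-(x^{2} + y^{2}) / (2\sigma^{2})}

\frac{\partial G}{\partial \sigma} = \sigma \nabla^{2} G
\quad\Longrightarrow\quad
\sigma \nabla^{2} G \approx \frac{G(x, y, k\sigma) - G(x, y, \sigma)}{k\sigma - \sigma}

D(x, y, \sigma) = L(x, y, k\sigma) - L(x, y, \sigma)
\approx (k - 1) \, \sigma^{2} \, \nabla^{2} G * I(x, y)
```

The first identity is the heat equation in terms of σ; replacing the derivative by a finite difference between two neighboring scales shows that a DoG image approximates the scale-normalized Laplacian of Gaussian up to the constant factor (k - 1).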
Next, the extrema are filtered in two steps to select the feature points. In the first step, a quadratic surface is fitted to the image function near each extremum to find the theoretical (sub-pixel) extremum; a threshold is set, and points whose offset between the sampled location and the fitted location exceeds it are discarded. In the second step, points that lie only on an edge rather than at a corner are removed. Such points share one characteristic: the image function is flat (small curvature) along the direction of the edge and steep (large curvature) in the direction perpendicular to it. Since the two eigenvalues of the Hessian matrix are proportional to the principal curvatures, the ratio of the two eigenvalues is computed at each extremum, and points whose ratio exceeds a threshold are rejected.
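The edge test described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names are hypothetical, the image is a plain list of lists, and the test uses the trace and determinant of the 2x2 Hessian so that the eigenvalues never have to be computed explicitly. The default ratio r = 10 is the value suggested in Lowe's paper.

```python
def hessian_2x2(img, y, x):
    """Finite-difference second derivatives of img at (y, x).

    img is a list of rows; (y, x) must not lie on the border.
    Returns (dxx, dyy, dxy).
    """
    dxx = img[y][x + 1] - 2.0 * img[y][x] + img[y][x - 1]
    dyy = img[y + 1][x] - 2.0 * img[y][x] + img[y - 1][x]
    dxy = (img[y + 1][x + 1] - img[y + 1][x - 1]
           - img[y - 1][x + 1] + img[y - 1][x - 1]) / 4.0
    return dxx, dyy, dxy


def passes_edge_test(dxx, dyy, dxy, r=10.0):
    """Return True if the point is NOT edge-like.

    With eigenvalues a >= b of the Hessian, trace^2 / det = (a + b)^2 / (a * b),
    which grows with the ratio a / b. The point is kept only if that ratio
    is below r.
    """
    trace = dxx + dyy
    det = dxx * dyy - dxy * dxy
    if det <= 0.0:
        # Curvatures of opposite sign (or one of them zero): reject.
        return False
    return trace * trace / det < (r + 1.0) ** 2 / r


# A symmetric blob has equal curvature in both directions and is kept;
# a ridge is flat along one direction and is rejected.
blob = [[-((x - 2) ** 2 + (y - 2) ** 2) for x in range(5)] for y in range(5)]
ridge = [[-((x - 2) ** 2) for x in range(5)] for y in range(5)]
blob_ok = passes_edge_test(*hessian_2x2(blob, 2, 2))
ridge_ok = passes_edge_test(*hessian_2x2(ridge, 2, 2))
```

Comparing trace squared over determinant against (r + 1)^2 / r is equivalent to bounding the eigenvalue ratio by r, which is why no eigen-decomposition is needed.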
Finally, once the feature points are selected, a descriptor is built for each of them (modeled on the retina of the human eye), because raw feature points are too sensitive to changes in the image, whereas descriptors are not. Since SIFT is rotation invariant, an orientation is first computed for each feature point in preparation for building the descriptor. The implementation selects a region around the feature point, computes the gradient magnitude and direction at every point in the region, and accumulates the directions into a 36-bin histogram covering 0° to 360°, with each sample weighted by its gradient magnitude and by a Gaussian function. A quadratic fit around the histogram maximum gives the dominant orientation of the feature point. If another peak is close in height to the highest peak, the feature point is split into two feature points at the same coordinates with two different orientations.
After the orientation is determined, the gradient directions of all points in the region are rotated by the angle of the dominant orientation. The region is then divided into 4*4=16 square sub-regions, and in each sub-region an 8-direction histogram is accumulated, again weighted by gradient magnitude and a Gaussian function. Each feature point therefore yields a 16*8=128-dimensional descriptor vector, which is normalized so that the descriptor is insensitive to contrast changes.
These descriptors can be used for matching, which means finding, in another image, the feature point whose 128-dimensional vector has the smallest Euclidean distance to the given one. The method Lowe refers to is a k-d tree search; because exact search is slow in high dimensions, it is combined with a greedy approximate best-first pruning strategy and a cutoff on search time, which yields an approximate solution. This completes the SIFT algorithm.
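The orientation assignment step described above (36-bin histogram, quadratic fit around each peak, extra orientations for peaks close to the maximum) can be sketched as follows. The function names and the input format are hypothetical: each sample is assumed to arrive as a (magnitude, angle in degrees, Gaussian weight) triple that has already been computed from the image. The 0.8 peak ratio is the value given in Lowe's paper.

```python
def orientation_histogram(samples, num_bins=36):
    """Accumulate weighted gradient magnitudes into a circular histogram.

    samples: iterable of (magnitude, angle_deg, weight) triples.
    """
    hist = [0.0] * num_bins
    bin_width = 360.0 / num_bins
    for magnitude, angle_deg, weight in samples:
        b = int((angle_deg % 360.0) / bin_width) % num_bins
        hist[b] += magnitude * weight
    return hist


def dominant_orientations(hist, peak_ratio=0.8):
    """Return the interpolated angle (degrees) of every qualifying peak.

    A bin qualifies if it is a strict local maximum and reaches at least
    peak_ratio times the highest bin. A parabola through the bin and its
    two neighbors refines the peak position.
    """
    n = len(hist)
    bin_width = 360.0 / n
    top = max(hist)
    if top <= 0.0:
        return []
    angles = []
    for i in range(n):
        left, centre, right = hist[i - 1], hist[i], hist[(i + 1) % n]
        if centre > left and centre > right and centre >= peak_ratio * top:
            # Vertex of the parabola, as an offset in bins within (-0.5, 0.5).
            offset = 0.5 * (left - right) / (left - 2.0 * centre + right)
            angles.append(((i + 0.5 + offset) * bin_width) % 360.0)
    return angles


# Ten samples pointing at 45 degrees and nine at 200 degrees: the second
# peak reaches 90% of the first, so the keypoint gets two orientations.
samples = [(1.0, 45.0, 1.0)] * 10 + [(1.0, 200.0, 1.0)] * 9
orientations = dominant_orientations(orientation_histogram(samples))
```

Each returned angle would become a separate keypoint at the same coordinates, which matches the splitting rule described in the text.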
Beyond the theoretical soundness of the SIFT algorithm discussed above, another important aspect of implementing it is the choice of parameters, such as the number of images per octave, the number of sub-regions in the descriptor, and the various thresholds. These affect both recognition accuracy and the efficiency of the algorithm. Lowe gives recommended values in the paper, but they may need fine-tuning in specific cases (for example, for recognition in different kinds of scenes).
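To make the tunable parameters concrete, they can be gathered in one place, as in the sketch below. The class and field names are hypothetical and do not come from any library; the default values are the ones recommended in Lowe's paper (with pixel values assumed to be scaled to the range 0 to 1 for the contrast threshold).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SiftParams:
    """Hypothetical container for the main SIFT parameters."""
    scales_per_octave: int = 3           # sampled scales per octave
    base_sigma: float = 1.6              # smoothing of the first level
    contrast_threshold: float = 0.03     # minimum |DoG| at the fitted extremum
    edge_ratio: float = 10.0             # maximum principal-curvature ratio
    orientation_bins: int = 36           # bins of the orientation histogram
    orientation_peak_ratio: float = 0.8  # extra orientation above this fraction
    descriptor_grid: int = 4             # sub-regions per side (4*4=16)
    descriptor_bins: int = 8             # directions per sub-region histogram
    descriptor_clip: float = 0.2         # cap on normalized descriptor entries
    match_distance_ratio: float = 0.8    # nearest / second-nearest distance

    @property
    def descriptor_length(self):
        return self.descriptor_grid * self.descriptor_grid * self.descriptor_bins

    @property
    def images_per_octave(self):
        # s sampled scales require s + 3 blurred images per octave.
        return self.scales_per_octave + 3


default_params = SiftParams()
```

Fine-tuning for a specific scene then amounts to constructing the object with different values, for example `SiftParams(contrast_threshold=0.02)`.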
The time-complexity bottleneck of the SIFT algorithm is the construction and matching of descriptors, so optimizing how feature points are described is the key to making SIFT more efficient. This is the part that the PCA-SIFT algorithm optimizes.
Summary of SURF feature extraction
The SURF algorithm (for a more detailed analysis, see the post "SURF feature extraction analysis") is an improvement on SIFT. Its basic structure and steps are similar to those of SIFT, but the implementation of each step differs. The advantage of SURF is that it is much faster than SIFT while remaining stable.
First, feature point detection. In SURF, a pixel is judged to be a feature point when the determinant of the Hessian matrix of the pixel brightness (DXX*DYY-DXY*DXY) is an extremum. Computing the Hessian matrix requires partial derivatives, which are normally obtained by convolving the pixel brightness values with the partial derivative of a Gaussian kernel in the appropriate direction. To speed up the algorithm at very little cost in precision, SURF replaces the Gaussian kernel with approximate box filters. Because each box filter is piecewise constant over a few rectangular regions, the convolution can be computed with an integral image, so that each rectangle sum costs O(1) time, which greatly improves efficiency. Each point needs the three values DXX, DYY, and DXY, so three filters are required. Filtering the image with them gives a response image in which the value of each pixel is DXX*DYY-DXY*DXY at that position of the original image. Filtering the image with filters of different sizes gives a series of response images of the same image at different scales, which form a pyramid. Unlike the Gaussian pyramid in SIFT, this pyramid needs no downsampling: every layer in every octave has the same resolution as the image. Feature points are then detected as in SIFT: if a point's DXX*DYY-DXY*DXY is greater than that of the 26 points in its neighborhood, the point is a feature point. The sub-pixel localization of feature points is also the same as in SIFT.
Second, the construction of the descriptor. To guarantee the rotation invariance of the descriptor, a dominant orientation is computed for each feature point.
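The integral-image trick used in the detection step above can be sketched as follows. This is a minimal pure-Python illustration with hypothetical function names; the three-box filter at the end is a toy second-difference filter chosen to show the mechanism, not the exact layout of the SURF filters.

```python
def integral_image(img):
    """Build a summed-area table with one extra row and column of zeros.

    ii[y][x] holds the sum of img over rows [0, y) and columns [0, x).
    """
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def box_sum(ii, top, left, height, width):
    """Sum of the rectangle in O(1): four lookups, whatever its size."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def box_filter_response(ii, boxes):
    """Response of a piecewise-constant filter.

    boxes: list of (weight, top, left, height, width) tuples.
    """
    return sum(wt * box_sum(ii, t, l, h, w) for wt, t, l, h, w in boxes)


img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
ii = integral_image(img)
whole = box_sum(ii, 0, 0, 3, 3)      # sum of all nine pixels
corner = box_sum(ii, 1, 1, 2, 2)     # 5 + 6 + 8 + 9

# Toy vertical second difference: three one-row boxes weighted 1, -2, 1.
toy_dyy = box_filter_response(ii, [(1, 0, 0, 1, 3), (-2, 1, 0, 1, 3), (1, 2, 0, 1, 3)])
```

Because the cost of `box_sum` does not depend on the rectangle size, enlarging the filter to reach a coarser scale costs no more than the smallest filter, which is why the image itself never has to be downsampled.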
The dominant orientation is computed as follows. Consider a circular neighborhood centered on the feature point, with a radius proportional to the scale of the feature point. For all pixels falling in a sector with a 60° opening angle, accumulate sumx = Σ (x-direction Haar wavelet response) * (Gaussian weight) and sumy = Σ (y-direction Haar wavelet response) * (Gaussian weight), then compute the angle of the resulting vector, θ = arctan(sumy/sumx), and its length, sqrt(sumy*sumy+sumx*sumx). Rotate the sector counterclockwise (typically in steps of 0.1 radians) and compute the resulting vector in the same way. The direction in which the sector's vector is longest is taken as the dominant orientation of the feature point.
The descriptor is built as follows. Select a square region centered on the feature point and rotate it to align with the dominant orientation. Divide the square into 4x4=16 sub-regions, and in each sub-region compute Haar wavelet responses (again accelerated with the integral image) to obtain 4 coefficients (the sums of the horizontal and vertical responses and the sums of their absolute values). Together these give a 4x4x4=64-dimensional vector, the descriptor, which can then be used for matching.
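The sliding-sector search for the dominant orientation can be sketched as follows. This is an illustration under stated assumptions: the function name is hypothetical, and the input is assumed to be a list of (dx, dy) Haar responses that have already been multiplied by their Gaussian weights. Each response is treated as a point in the (dx, dy) plane, and a 60° sector is rotated in steps of 0.1 radians. The sketch uses `math.atan2` instead of a plain arctan(sumy/sumx) so that the angle falls in the correct quadrant.

```python
import math

TWO_PI = 2.0 * math.pi


def surf_dominant_orientation(responses, window=math.pi / 3.0, step=0.1):
    """Return (angle, length) of the longest summed vector over all sectors.

    responses: list of (dx, dy) Gaussian-weighted Haar wavelet responses.
    The angle is in radians in [0, 2*pi).
    """
    angles = [math.atan2(dy, dx) % TWO_PI for dx, dy in responses]
    best_len_sq, best_angle = 0.0, 0.0
    start = 0.0
    while start < TWO_PI:
        sumx = sumy = 0.0
        for (dx, dy), a in zip(responses, angles):
            # The response belongs to the sector [start, start + window).
            if (a - start) % TWO_PI < window:
                sumx += dx
                sumy += dy
        len_sq = sumx * sumx + sumy * sumy
        if len_sq > best_len_sq:
            best_len_sq = len_sq
            best_angle = math.atan2(sumy, sumx) % TWO_PI
        start += step
    return best_angle, math.sqrt(best_len_sq)


# Five strong responses pointing at 45 degrees and one weak outlier.
responses = [(1.0, 1.0)] * 5 + [(-0.2, 0.1)]
angle, length = surf_dominant_orientation(responses)
```

The outlier never falls in the same sector as the main group, so the longest vector is the sum of the five aligned responses and the reported orientation is 45°.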
The advantage of this algorithm is that it makes extensive and sensible use of the integral image to reduce the amount of computation, without losing accuracy in the process (the wavelet transform and detection by the Hessian determinant are both mature and effective techniques). In terms of speed, SURF runs about 3 times as fast as SIFT. In terms of quality, SURF is very robust: its feature point recognition rate is higher than that of SIFT, and under changes of viewpoint, illumination, and scale it generally performs better than SIFT.