Introduction to Descriptor Design Methods: Local Descriptors

Source: Internet
Author: User
Summary

The previous article introduced how interest points are detected, along with the latest ideas for improving detection. This article introduces how local descriptors are designed. A local descriptor is built by extracting features from the area around an interest point and then expressing those features as a vector (the descriptor). At present, feature extraction around an interest point generally works by sampling local points (or regions) and then combining the features of the samples over the whole neighborhood of the interest point. SIFT, for example, first divides the neighborhood of the interest point into 4×4 small blocks, then accumulates gradient values in 8 directions for each of these 16 blocks, which together form a 128-dimensional vector (SIFT's descriptor design is covered extensively online and is not described in detail here). A local descriptor can fully reflect the structure near the interest point and is very robust to occlusion of the target, so this article focuses on the design of local descriptors (global descriptors, by contrast, mainly include color, texture, and so on).
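The 4×4 blocks × 8 directions layout described above can be sketched in a few lines. This is only a minimal illustration under stated assumptions: the function name `sift_like_descriptor` and the 16×16 patch size are hypothetical, and the sketch leaves out the Gaussian weighting, interpolation, and rotation normalization that a full SIFT implementation performs.

```python
import math

def sift_like_descriptor(patch):
    """Build a 4x4x8 = 128-value gradient-orientation histogram.

    `patch` is a 16x16 list of lists of grayscale values centred on the
    interest point (an assumed size). Simplified: no Gaussian weighting,
    no interpolation, no rotation normalisation.
    """
    size, cells, bins = 16, 4, 8
    cell = size // cells                      # each block is 4x4 pixels
    hist = [0.0] * (cells * cells * bins)
    for y in range(1, size - 1):              # skip the border pixels
        for x in range(1, size - 1):
            dx = patch[y][x + 1] - patch[y][x - 1]
            dy = patch[y + 1][x] - patch[y - 1][x]
            mag = math.hypot(dx, dy)
            if mag == 0:
                continue
            angle = math.atan2(dy, dx) % (2 * math.pi)
            b = int(angle / (2 * math.pi) * bins) % bins
            idx = ((y // cell) * cells + (x // cell)) * bins + b
            hist[idx] += mag                  # vote with gradient magnitude
    norm = math.sqrt(sum(v * v for v in hist)) or 1.0
    return [v / norm for v in hist]           # unit-length descriptor

# A horizontal intensity ramp: every gradient points along +x.
patch = [[float(x) for x in range(16)] for _ in range(16)]
desc = sift_like_descriptor(patch)
```

For the ramp patch, all of the weight falls into the first direction bin of each of the 16 blocks, which is what one would expect for a patch whose intensity only changes along x.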

Accordingly, the design of local descriptors can be divided into three aspects (or three directions for improvement). First, where to place the sampling points in the area around the interest point. Second, which features to extract at the sampling points. Third, how to select among the extracted features, or how to reduce the dimensionality of a high-dimensional feature vector.

Characteristics of sampling points

SIFT uses the values of an 8-direction gradient histogram in each block. SURF instead extracts, for each region, the sums of the gradients in the x and y directions and the sums of their absolute values, 4 quantities in total, so its feature vector has only 64 dimensions (SURF-128 additionally splits the sums according to the sign of the directional gradient, so each region yields 8 quantities). GLOH is another improvement on SIFT. It takes distance into account as well as direction: instead of SIFT's square grid of 16 blocks, it uses a log-polar layout with three rings in the radial direction, the two outer rings each split into 8 angular sectors, which gives 17 regions. The gradient orientations in each region are quantized into 16 bins, and the resulting vector is then reduced with PCA. The regions of the log-polar layout are divided as follows:
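The four SURF-style sums per region described above (sum of dx, sum of dy, sum of |dx|, sum of |dy|) can be sketched as follows. This is a simplified illustration, not SURF itself: the function name `surf_like_descriptor` is hypothetical, central differences stand in for Haar wavelet responses, the patch side is assumed to be divisible by 4, and there is no Gaussian weighting or orientation alignment.

```python
def surf_like_descriptor(patch, cells=4):
    """Return cells*cells*4 values: (sum dx, sum dy, sum |dx|, sum |dy|)
    for each sub-region of a square patch (side divisible by `cells`)."""
    size = len(patch)
    cell = size // cells
    desc = [0.0] * (cells * cells * 4)
    for y in range(1, size - 1):              # skip the border pixels
        for x in range(1, size - 1):
            dx = (patch[y][x + 1] - patch[y][x - 1]) / 2.0
            dy = (patch[y + 1][x] - patch[y - 1][x]) / 2.0
            base = ((y // cell) * cells + (x // cell)) * 4
            desc[base + 0] += dx
            desc[base + 1] += dy
            desc[base + 2] += abs(dx)
            desc[base + 3] += abs(dy)
    return desc

# A horizontal wave 0, 1, 0, -1, ...: gradients alternate in sign.
wave = [0.0, 1.0, 0.0, -1.0]
patch = [[wave[x % 4] for x in range(16)] for _ in range(16)]
desc = surf_like_descriptor(patch)
inner = desc[20:24]   # the sub-region in row 1, column 1 of the 4x4 grid
```

On the wave patch, the signed sum of dx in an interior sub-region cancels to zero while the sum of |dx| does not, which shows why the absolute-value sums carry information that the signed sums alone would lose.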


One improvement on SURF addresses the gradient distortion introduced by Gaussian smoothing, which blurs edges. The G-SURF (Gauge-SURF) method does not extract gradients along the x and y directions. Instead, it extracts gradients along the gradient direction and along the direction perpendicular to it (that is, across the edge and along the edge). This prevents edge information from being lost to Gaussian smoothing and so retains more information, as shown in the following illustration. The PM (Perona-Malik) method and other methods based on diffusion functions have also been proposed to address the edge-blurring problem.

Besides these methods, binary descriptors (such as ORB, BRISK, FREAK, BRIEF, LDB, and KAZE) have attracted much attention in recent years. They rest on the principle that an image block can be represented by the relative intensity contrast between a number of points (or blocks). In other words, intensity comparisons over a large number of pixel (or block) pairs are enough to characterize the image block. Each comparison between two positions yields one bit (1 for larger, 0 for smaller), and the bits together form a binary string. Matching binary strings is very fast, which greatly reduces the amount of computation.

Comparison result for one pair of pixels:

Several pixel pairs are selected to form the binary descriptor:
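The idea of building a binary string from pairwise intensity comparisons can be sketched as below. This is a minimal BRIEF-style illustration under assumptions: the helper names `make_pairs`, `binary_descriptor`, and `hamming` are hypothetical, the pairs are drawn uniformly at random, and the patch is not smoothed first. Matching uses the Hamming distance, which is the usual way to compare binary strings.

```python
import random

def make_pairs(n_bits, patch_size, seed=0):
    """Draw n_bits random pairs of (row, col) positions inside the patch."""
    rng = random.Random(seed)

    def pick():
        return (rng.randrange(patch_size), rng.randrange(patch_size))

    return [(pick(), pick()) for _ in range(n_bits)]

def binary_descriptor(patch, pairs):
    """One bit per pair: 1 if the first pixel is brighter, otherwise 0."""
    bits = 0
    for (y1, x1), (y2, x2) in pairs:
        bits = (bits << 1) | (1 if patch[y1][x1] > patch[y2][x2] else 0)
    return bits

def hamming(a, b):
    """Number of differing bits between two descriptors stored as ints."""
    return bin(a ^ b).count("1")

pairs = make_pairs(n_bits=256, patch_size=16)
patch = [[(7 * x + 13 * y) % 32 for x in range(16)] for y in range(16)]
brighter = [[v + 50 for v in row] for row in patch]   # uniform brightness shift

d1 = binary_descriptor(patch, pairs)
d2 = binary_descriptor(brighter, pairs)
distance = hamming(d1, d2)
```

Because only the order of the two intensities matters, adding the same brightness offset to every pixel leaves every bit unchanged, so the distance between `d1` and `d2` is zero.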

LDB, however, holds that intensity comparison alone may not be accurate enough, so it also compares the gradients in the x and y directions. Each pair of region blocks therefore yields a three-bit comparison. Because blocks at different scales may capture different aspects of the image, LDB also uses blocks at several scales.
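The three-bit comparison per block pair, repeated over several grid sizes, might look like the sketch below. It is an illustration under assumptions rather than the LDB implementation: the names `cell_features`, `grid_bits`, and `ldb_like_descriptor` are hypothetical, the gradients are approximated as the difference between the two halves of a block, every block pair is compared, and the grid sizes (2 and 4) are arbitrary choices.

```python
def cell_features(patch, y0, x0, size):
    """Return (mean intensity, x-gradient, y-gradient) of one grid block.

    The gradients are approximated as (right half - left half) and
    (bottom half - top half), divided by the block area.
    """
    half = size // 2
    rows = range(y0, y0 + size)
    cols = range(x0, x0 + size)
    total = sum(patch[y][x] for y in rows for x in cols)
    right = sum(patch[y][x] for y in rows for x in range(x0 + half, x0 + size))
    bottom = sum(patch[y][x] for y in range(y0 + half, y0 + size) for x in cols)
    area = size * size
    return (total / area, (2 * right - total) / area, (2 * bottom - total) / area)

def grid_bits(patch, grid):
    """Compare every pair of blocks in a grid x grid layout: 3 bits per pair."""
    size = len(patch) // grid
    feats = [cell_features(patch, gy * size, gx * size, size)
             for gy in range(grid) for gx in range(grid)]
    bits = []
    for i in range(len(feats)):
        for j in range(i + 1, len(feats)):
            bits.extend(1 if a > b else 0 for a, b in zip(feats[i], feats[j]))
    return bits

def ldb_like_descriptor(patch, grids=(2, 4)):
    """Concatenate the bits from several grid sizes (several block scales)."""
    bits = []
    for grid in grids:
        bits.extend(grid_bits(patch, grid))
    return bits

patch = [[float(x) for x in range(16)] for _ in range(16)]  # horizontal ramp
bits = ldb_like_descriptor(patch)
```

With a 2×2 grid there are 6 block pairs (18 bits) and with a 4×4 grid there are 120 pairs (360 bits), so this sketch produces 378 bits before any bit selection is applied.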

Another approach uses genetic algorithms or other learning methods. Rather than deciding in advance which quantities should describe the features, it first defines a set of candidate feature operations (including addition, subtraction, and various gradient operators). It then uses a training set, together with a genetic algorithm or another learning method, to find the most descriptive features that also satisfy repeatability, which can serve as the fitness function. The MO-GP method is an example.

Feature selection set for the MO-GP method:

Location of sampling points

SIFT and SURF both sample uniformly within the region, but weight each sampling point with a Gaussian according to its distance from the central interest point. DAISY uses a sampling layout that resembles the petals of a flower. It divides the area around the interest point into 8 directions along rays, places several layers of sampling points along each direction, and computes gradients in 8 orientations over the subregion of each sampling point. The main benefit lies in building the scale space: it only requires successively convolving the whole image with new Gaussian kernels.

Sampling point layout for DAISY:

DAISY scale space construction process:

The DAISY description vector:
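The petal-like layout described in this section can be generated with a few lines of trigonometry. The sketch below is only illustrative: the function name `daisy_sample_points`, the radius of 15 pixels, the choice of 3 layers, and the extra sampling point at the center are all assumptions, not values taken from the text above.

```python
import math

def daisy_sample_points(radius=15.0, rings=3, directions=8):
    """Return (x, y) offsets of the sampling points around the interest point.

    One centre point (an assumption) plus `rings` layers of `directions`
    points each, placed on concentric circles up to `radius`.
    """
    points = [(0.0, 0.0)]
    for r in range(1, rings + 1):
        rho = radius * r / rings              # radius of this layer
        for k in range(directions):
            theta = 2 * math.pi * k / directions
            points.append((rho * math.cos(theta), rho * math.sin(theta)))
    return points

points = daisy_sample_points()
orientations = 8                              # gradient orientations per point
descriptor_length = len(points) * orientations
```

Under these assumed settings there are 25 sampling points, and with 8 gradient orientations per point the description vector has 200 values.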

In recent years, binary descriptors have prompted new methods for position selection, because they depend on choosing pairs of pixels (or regions) to compare. BRIEF, the earliest, proposed five methods for selecting the positions of the pixel pairs (such as choosing both points at random, or drawing them from a Gaussian distribution).

ORB improves on BRIEF by using a greedy algorithm to find the 256 most representative pixel pairs (that is, pairs with low correlation to one another and high variance). BRISK selects its pixel positions in a way somewhat similar to DAISY, in that it uses the pixel intensity after Gaussian smoothing. Unlike DAISY, however, the Gaussian scales of its sampling points do not overlap.
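A greedy selection of high-variance, low-correlation pairs, in the spirit of what is described above, can be sketched as follows. This is a simplified illustration and not ORB's actual code: the names `mean`, `correlation`, and `greedy_select` and the threshold `max_corr` are hypothetical, and the sketch does not relax the threshold when too few tests are found. Each candidate test is represented by the column of bits it produced over a set of training patches.

```python
import math

def mean(values):
    return sum(values) / len(values)

def correlation(a, b):
    """Pearson correlation of two bit columns. A constant column is
    reported as fully correlated (1.0) so that it counts as redundant."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0 or vb == 0:
        return 1.0
    return cov / math.sqrt(va * vb)

def greedy_select(bit_columns, n_select, max_corr=0.2):
    """Pick up to n_select tests, preferring a mean near 0.5 (high variance)
    and rejecting tests correlated with an already chosen one."""
    order = sorted(range(len(bit_columns)),
                   key=lambda i: abs(mean(bit_columns[i]) - 0.5))
    chosen = []
    for i in order:
        if all(abs(correlation(bit_columns[i], bit_columns[j])) < max_corr
               for j in chosen):
            chosen.append(i)
            if len(chosen) == n_select:
                break
    return chosen

# Four candidate tests evaluated on eight training patches.
columns = [
    [0, 1, 0, 1, 0, 1, 0, 1],   # balanced
    [0, 1, 0, 1, 0, 1, 0, 1],   # duplicate of test 0, so redundant
    [0, 0, 1, 1, 0, 0, 1, 1],   # balanced and uncorrelated with test 0
    [1, 1, 1, 1, 1, 1, 1, 0],   # almost constant, low variance
]
chosen = greedy_select(columns, n_select=2)
```

In this toy example the duplicate test is rejected because it is perfectly correlated with the first one, and the almost-constant test is ranked last because its mean is far from 0.5.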

FREAK draws on the structure of the human retina. It divides the area around the interest point into receptive fields (circular regions) of different sizes and centers, applies Gaussian smoothing at the corresponding scale around each center, and then uses a greedy algorithm similar to ORB's to find the most representative pairs. The selected pairs are organized into four groups of 128, ordered from coarse scale to fine scale, for 512 comparisons in total. This imitates the way the eye focuses: the 128 coarse-scale bits locate the approximate position, and the finer scales then pin it down.
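The coarse-to-fine use of the 512 bits can be sketched as a two-stage match: compare the 128 coarse bits first, and only compute the full distance for candidates that pass. The sketch below rests on assumptions: the function name `cascade_match` and the rejection threshold `coarse_threshold` are hypothetical, and descriptors are stored as Python integers whose most significant 128 bits hold the coarse group.

```python
import random

def hamming(a, b):
    """Number of differing bits between two descriptors stored as ints."""
    return bin(a ^ b).count("1")

def cascade_match(query, candidates, coarse_threshold=20,
                  coarse_bits=128, total_bits=512):
    """Return (index, distance) of the best candidate, or (None, None).

    Stage 1 compares only the coarse bits and discards poor candidates.
    Stage 2 compares all bits for the candidates that remain.
    """
    shift = total_bits - coarse_bits
    best_index, best_dist = None, None
    for i, cand in enumerate(candidates):
        if hamming(query >> shift, cand >> shift) > coarse_threshold:
            continue                          # rejected at the coarse stage
        dist = hamming(query, cand)
        if best_dist is None or dist < best_dist:
            best_index, best_dist = i, dist
    return best_index, best_dist

query = random.Random(1).getrandbits(512)
near = query ^ 0b111                              # differs in 3 fine bits
far = query ^ (((1 << 128) - 1) << 384)           # all coarse bits flipped
result = cascade_match(query, [far, near])
```

Most non-matching candidates are discarded after looking at only a quarter of the bits, which is where the speed-up of this kind of cascade would come from.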

With the development of machine learning, finding the locations of the sampling points through genetic algorithms, decision trees, and training on data sets has also become a trend. One such method divides feature position selection into four parts and then determines the best combination of choices.

For example, the S block refers to the spatial distribution of the sampled feature points. There are four main cases, distributed as follows:

Dimensionality reduction of the descriptor

The earliest method is PCA-SIFT, which uses PCA to greatly reduce the dimension of the SIFT descriptor and so speeds up matching. Similar methods use LDA, AdaBoost, and so on. These methods reduce the dimensionality of the vector itself. Another kind of method selects from a large number of candidate feature points. For example, ORB uses a greedy algorithm to find the 256 most representative pixel pairs (that is, pairs with low correlation to one another and high variance, or equivalently the highest entropy). Methods of this kind look for the descriptor with the highest information entropy.

Formula for the maximum variance (highest entropy) of the descriptor:
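The PCA-based reduction mentioned in this section can be sketched with the standard library alone. This is a toy illustration, not the PCA-SIFT pipeline: the names `pca_fit` and `project` are hypothetical, the principal directions are found by power iteration with deflation (chosen here only to avoid external dependencies), and the data are 3-dimensional toy vectors rather than real descriptors.

```python
import math
import random

def pca_fit(data, k, iters=100, seed=0):
    """Return (mean, components): the top-k principal directions of `data`,
    found by power iteration on the covariance matrix with deflation."""
    n, d = len(data), len(data[0])
    mean = [sum(row[j] for row in data) / n for j in range(d)]
    centred = [[row[j] - mean[j] for j in range(d)] for row in data]
    cov = [[sum(r[a] * r[b] for r in centred) / (n - 1) for b in range(d)]
           for a in range(d)]
    rng = random.Random(seed)
    components = []
    for _ in range(k):
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        for _ in range(iters):
            w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
            for c in components:              # remove directions already found
                dot = sum(wi * ci for wi, ci in zip(w, c))
                w = [wi - dot * ci for wi, ci in zip(w, c)]
            norm = math.sqrt(sum(wi * wi for wi in w))
            v = [wi / norm for wi in w]
        components.append(v)
    return mean, components

def project(vec, mean, components):
    """Reduce one vector to len(components) coordinates."""
    return [sum((vec[j] - mean[j]) * c[j] for j in range(len(vec)))
            for c in components]

# Toy data: the second coordinate is twice the first; the third is small.
sign = [1, -1, -1, 1, 1, -1, -1, 1]
data = [[float(t), 2.0 * t, 0.1 * sign[t]] for t in range(8)]
mean, components = pca_fit(data, k=2)
reduced = project(data[0], mean, components)
```

In the toy data almost all of the variance lies along the direction (1, 2, 0), so the first component recovers that direction and the 3-dimensional vectors can be reduced to 2 coordinates with very little loss. In practice a numerical library would normally be used for the decomposition.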

Invariance of the Descriptor

For matching to be accurate, the descriptor must be invariant: it should still match its corresponding point regardless of changes in image scale, rotation, viewpoint, noise, or light intensity. Noise invariance and light-intensity invariance are comparatively easy to achieve. The influence of noise can be removed by smoothing the region or by how the feature points are sampled, and light-intensity invariance can be achieved through gradient features and similar means. All of the methods above satisfy these requirements to some extent.
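The claim that gradient features give light-intensity invariance can be checked with a small experiment. The sketch below is an illustration under assumptions: the function name `normalised_gradients` is hypothetical, only horizontal differences are used, and the lighting change is modelled as a gain and an offset applied to every pixel.

```python
import math

def normalised_gradients(patch):
    """Horizontal pixel differences of a patch, scaled to unit length."""
    grads = []
    for row in patch:
        for x in range(len(row) - 1):
            grads.append(row[x + 1] - row[x])
    norm = math.sqrt(sum(g * g for g in grads)) or 1.0
    return [g / norm for g in grads]

patch = [[float((3 * x + 5 * y * y) % 17) for x in range(8)] for y in range(8)]
# Modelled lighting change: gain of 2 and offset of 30 on every pixel.
relit = [[2.0 * v + 30.0 for v in row] for row in patch]

a = normalised_gradients(patch)
b = normalised_gradients(relit)
max_diff = max(abs(p - q) for p, q in zip(a, b))
```

Taking differences removes the offset, and normalizing the vector removes the gain, so the two feature vectors agree up to rounding error. Lighting changes that are not uniform across the patch are not covered by this simple model.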
