Introduction to Feature Point Matching: the FREAK Algorithm


FREAK is a feature point detection and matching algorithm proposed in the CVPR 2012 paper "FREAK: Fast Retina Keypoint". As the title suggests, the algorithm is fast. Its other distinguishing feature is that it is inspired by the way the human eye recognizes objects.
In earlier posts I introduced the SIFT, ORB, BRIEF and BRISK algorithms. BRIEF, ORB and BRISK all compare pairs of pixels in the neighborhood of a feature point and use the resulting binary string as the descriptor. This approach is fast and needs little memory, which is a great advantage for mobile computing, but it leaves some questions open: for example, how to decide which pixel pairs in the neighborhood to compare, and how to match the descriptors. The authors observe that this trend toward simple binary comparisons is in line with the way nature solves complex problems with simple rules. The FREAK algorithm they propose imitates the human visual system, and is introduced below.

1. Feature Point Detection

Feature point detection is the first stage of feature point matching. FAST is an algorithm that detects feature points quickly, and AGAST is an accelerated version of FAST. The detection method used in this paper is the same as in BRISK: build a multi-scale set of images and run FAST on each scale to detect feature points. For the details, see my post "Feature point detection: an introduction to the BRISK algorithm"; they are not repeated here.

2. Binary String Feature Descriptor

Since the FREAK algorithm is inspired by the human visual system, we first look at how that system works.

2.1 Human Retina

The authors point out that the density of light-sensitive cells is not uniform across the retina. According to the density of photoreceptor cells, the retina is divided into four regions: the foveola, fovea, parafoveal and perifoveal regions, as shown in the following illustration:

The fovea is responsible for receiving high-resolution images, while low-resolution images are formed in the perifoveal region.

2.2 Sampling Pattern

As described in my earlier posts, BRIEF and ORB generate the sampling point pairs in the neighborhood of a feature point randomly, while BRISK places its sampling points evenly. The sampling pattern in FREAK is similar to the one in BRISK, but it is closer to the pattern found in the human visual system, as shown in the following figure:



Each black dot in the figure above is a sample point, and each circle around it is a receptive field. The image is smoothed with a Gaussian kernel over each receptive field to reduce the effect of noise, and the radius of each circle represents the standard deviation of that kernel. The pattern differs from BRISK in that neighboring receptive fields overlap. It differs from ORB and BRIEF in that those algorithms use the same Gaussian kernel for every sample point, whereas FREAK uses kernels of different sizes. The authors suggest that these differences lead to better results: overlapping receptive fields capture more information, which makes the final descriptor more discriminative. The receptive fields of the human retina have a similar structure, with different sizes and overlap.
The final FREAK sampling structure is 6, 6, 6, 6, 6, 6, 6, 1: each 6 stands for the 6 sample points on one concentric circle, there are 7 concentric circles in total, and the final 1 is the feature point itself, giving 7*6 + 1 = 43 sample points.
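As a rough illustration of the 6, 6, 6, 6, 6, 6, 6, 1 layout, the sketch below generates 43 receptive fields on 7 concentric circles plus the center. The function name `freak_like_pattern`, the ring radii, the shrink factor, the staggering of alternate rings and the sigma-to-radius ratio are all assumptions chosen for illustration; they are not the values used in the paper.

```python
import math

def freak_like_pattern(n_rings=7, points_per_ring=6, outer_radius=1.0, shrink=0.75):
    """Return a list of (x, y, sigma) receptive fields: rings first, center last.

    All numeric choices here (radii, shrink factor, ring staggering, sigma ratio)
    are illustrative assumptions, not the paper's values.
    """
    fields = []
    for ring in range(n_rings):
        radius = outer_radius * shrink ** ring   # outermost ring first
        # Adjacent points on a 6-point ring are `radius` apart, so a blur
        # radius of 0.6 * radius makes neighboring fields overlap.
        sigma = 0.6 * radius
        offset = (math.pi / points_per_ring) * (ring % 2)  # stagger alternate rings
        for k in range(points_per_ring):
            angle = 2.0 * math.pi * k / points_per_ring + offset
            fields.append((radius * math.cos(angle), radius * math.sin(angle), sigma))
    # The feature point itself, with the smallest receptive field.
    fields.append((0.0, 0.0, 0.6 * outer_radius * shrink ** n_rings))
    return fields

pattern = freak_like_pattern()
print(len(pattern))  # 7 * 6 + 1 = 43
```

The receptive fields shrink toward the center, so the outer samples are blurred more heavily than the inner ones, which mirrors the coarse periphery and fine center described above.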

2.3 Coarse-to-fine Descriptor

As mentioned above, FREAK also describes a feature point with a binary string. Denoting the descriptor by $F$, we have $F = \sum_{0 \leq a < N} 2^a T(P_a)$, where $N$ is the length of the descriptor, $P_a$ is a pair of sample points, and $T(P_a)$ is a binary test: $T(P_a) = 1$ if $I(P_a^{r_1}) > I(P_a^{r_2})$, and $T(P_a) = 0$ otherwise. Here $I(P_a^{r_1})$ is the gray value of the first sample point of the pair after Gaussian smoothing.
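The bit-packing rule can be sketched in a few lines. This is only a minimal illustration: the function name `binary_descriptor` and the toy intensities and pairs are hypothetical, and the smoothed intensities are assumed to have been computed already.

```python
def binary_descriptor(smoothed, pairs):
    """Build F = sum over a of 2^a * T(P_a) as a Python int.

    smoothed: smoothed intensity of each sample point (list of numbers).
    pairs:    list of (r1, r2) index pairs into `smoothed`.
    """
    descriptor = 0
    for a, (r1, r2) in enumerate(pairs):
        if smoothed[r1] > smoothed[r2]:   # T(P_a) = 1
            descriptor |= 1 << a          # set bit a, i.e. add 2^a
    return descriptor

# Hypothetical toy data: four sample points and three pairs.
intensities = [10.0, 40.0, 25.0, 25.0]
pairs = [(1, 0), (0, 2), (2, 3)]
F = binary_descriptor(intensities, pairs)
print(format(F, "03b"))  # prints 001: only the first test passes
```

Note that a tie (25.0 against 25.0) yields a 0 bit, because the test is a strict inequality.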
Because a feature point has dozens of sample points, there are on the order of a thousand candidate sample point pairs, and some of those pairs are not useful for describing the feature. The pairs therefore need to be filtered, and the authors use a procedure similar to the one in ORB:
(1) The authors build a matrix $D$ from about 50,000 feature points. Each row corresponds to one feature point and holds the results of the binary comparisons for all possible sample point pairs of that feature point. With 43 sample points per feature point there are 43*42/2 = 903 possible pairs, so each row of $D$ has 903 columns;
(2) Compute the mean of each column of $D$. A discriminative feature needs a high variance, which for a binary column means a mean close to 0.5;
(3) Sort the columns by variance, from largest to smallest;
(4) Keep the first k columns of the sorted matrix. The paper uses k = 512, because the authors found that the first 512 columns are the most relevant and that adding more columns brings little further improvement.
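The four steps can be sketched as follows. This is a simplified illustration under stated assumptions: `select_pairs` is a hypothetical helper, the toy matrix is made up, and the sketch follows only the steps listed here (the paper additionally checks the correlation between selected columns, which is omitted).

```python
from math import comb

def select_pairs(D, k):
    """Return the indices of the k columns of the 0/1 matrix D with the highest variance."""
    n_rows = len(D)
    n_cols = len(D[0])
    means = [sum(row[c] for row in D) / n_rows for c in range(n_cols)]
    # For a 0/1 column with mean m the variance is m * (1 - m), largest at m = 0.5.
    variances = [m * (1.0 - m) for m in means]
    order = sorted(range(n_cols), key=lambda c: variances[c], reverse=True)
    return order[:k]

print(comb(43, 2))  # 903 candidate pairs for 43 sample points

# Hypothetical toy matrix: 4 feature points (rows) x 3 candidate pairs (columns).
D = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [1, 0, 0],
]
print(select_pairs(D, 2))  # [1, 2]
```

Column 0 is always 1, so it carries no information and is ranked last; column 1 has mean 0.5 and is ranked first.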
The authors divide the 512 selected sample point pairs into 4 groups of 128 pairs each. Connecting the two points of each pair with a line gives the following result:




The authors found that the pairs in the first group lie mainly in the periphery, that the lines of each following group contract gradually inward, and that the pairs of the last group are concentrated in the center. This resembles the human visual system, which first uses the perifoveal region to estimate the position of an object of interest and then verifies it with the fovea, where the photoreceptor cells are denser, to determine the final position of the object.

2.4 Saccadic Search

Because its photoreceptor cells are relatively dense, the fovea can capture higher-resolution images, so it plays a key role in recognition and matching. The photoreceptor cells in the perifoveal region are less dense and capture only blurred images, so this region is used first to estimate the position of an object. This is roughly how the human eye recognizes and matches objects, and the authors mimic this process to match feature points.
Matching starts with the first 128 bits of the descriptor, which carry the coarse information. Only if the distance on these bits is below a threshold are the remaining bits, which carry the finer information, compared as well. This cascade speeds up matching considerably, because approximately 90% of the candidates are rejected by the first 128 bits alone. A diagram of the cascade is shown in the following illustration:
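A minimal sketch of such a cascade is given below. It assumes descriptors are stored as Python ints, that the "first" 128 bits are the lowest 128 bits (a = 0..127 in the descriptor sum), and that the distance is the Hamming distance; `cascade_match` and the threshold value are hypothetical and chosen only for illustration.

```python
COARSE_BITS = 128
COARSE_MASK = (1 << COARSE_BITS) - 1

def hamming(a, b):
    """Number of differing bits between two descriptors stored as ints."""
    return bin(a ^ b).count("1")

def cascade_match(query, candidates, coarse_threshold):
    """Return (index, distance) of the best candidate, or None if all are rejected.

    A candidate is compared in full only if the Hamming distance on its
    first 128 bits is below `coarse_threshold`.
    """
    best = None
    for i, cand in enumerate(candidates):
        coarse = hamming(query & COARSE_MASK, cand & COARSE_MASK)
        if coarse >= coarse_threshold:
            continue                      # rejected by the coarse stage
        full = hamming(query, cand)       # compare the whole descriptor
        if best is None or full < best[1]:
            best = (i, full)
    return best

# Hypothetical toy descriptors.
query = 0
candidates = [COARSE_MASK, 1 << 200, 0b111]
print(cascade_match(query, candidates, coarse_threshold=10))  # (1, 1)
```

The first candidate differs in all 128 coarse bits and is discarded without touching the remaining bits, which is where the speed-up comes from.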

2.5 Orientation

To make the descriptor rotation invariant, each feature point needs an orientation. Because BRISK and FREAK use similar sampling patterns in the neighborhood of a feature point, FREAK computes the orientation in much the same way as BRISK. BRISK estimates the orientation from the gradients of long-distance sample point pairs (described in detail in my earlier post on BRISK), whereas FREAK uses 45 long-distance pairs that are symmetric with respect to the center, as shown in the following figure:

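The formula, as given in the FREAK paper, is $O = \frac{1}{M} \sum_{P_o \in G} \left( I(P_o^{r_1}) - I(P_o^{r_2}) \right) \frac{P_o^{r_1} - P_o^{r_2}}{\lVert P_o^{r_1} - P_o^{r_2} \rVert}$, where $G$ is the set of pairs used to compute the orientation, $M$ is the number of pairs in $G$, and $P_o^{r_i}$ is the position of the center of a sample point in the pair.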
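A minimal sketch of this orientation estimate is shown below. It assumes the local-gradient form from the paper, in which each pair contributes its intensity difference along the unit vector between its two points and the contributions are averaged. The function name `estimate_orientation`, the toy data, and the use of `atan2` to turn the averaged vector into an angle are assumptions made for illustration.

```python
import math

def estimate_orientation(points, smoothed, pairs):
    """Return the feature point angle in radians from a set of sample point pairs.

    points:   (x, y) position of each sample point.
    smoothed: smoothed intensity of each sample point.
    pairs:    (r1, r2) index pairs used for the orientation.
    """
    gx = gy = 0.0
    for r1, r2 in pairs:
        dx = points[r1][0] - points[r2][0]
        dy = points[r1][1] - points[r2][1]
        norm = math.hypot(dx, dy)
        diff = smoothed[r1] - smoothed[r2]
        gx += diff * dx / norm            # intensity difference along the unit vector
        gy += diff * dy / norm
    m = len(pairs)
    return math.atan2(gy / m, gx / m)     # assumed conversion of the vector to an angle

# Hypothetical toy data: intensity increases along +x only.
points = [(1.0, 0.0), (-1.0, 0.0), (0.0, 1.0), (0.0, -1.0)]
smoothed = [30.0, 10.0, 20.0, 20.0]
pairs = [(0, 1), (2, 3)]
print(estimate_orientation(points, smoothed, pairs))  # 0.0
```

Once the angle is known, the sampling pattern can be rotated by it before the binary tests are evaluated, which is what makes the descriptor rotation invariant.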
