SIFT Analysis (1): Building the Gaussian Pyramid

Reposted from: Honpey, http://blog.csdn.net/wendy709468104/article/details/8639617

SIFT (Scale-Invariant Feature Transform) is widely used in object recognition and image registration. This series gives a brief introduction to SIFT features, following the flow of the algorithm.

Building the Gaussian pyramid is the first step of SIFT feature extraction, and the later search for extreme points in scale space is based on it, so the first thing to learn about SIFT features is how to build the Gaussian pyramid.

First, a few definitions:

Gaussian pyramid: it is tempting to picture the Gaussian pyramid as a set of images of the same size, each smoothed by a different amount of Gaussian blur. That is not correct; the image set so described is called an octave. A pyramid has to taper toward the top, so a real Gaussian pyramid involves both smoothing and downsampling: the image is smoothed, downsampled, and smoothed again, and all of the resulting images together make up the Gaussian pyramid.

Octave: simply put, an octave is a set of images of one particular size (width and height) that have been blurred with different Gaussian kernels. The set of all octaves is the Gaussian pyramid.

Why build a Gaussian pyramid:

The Gaussian pyramid, or rather the difference-of-Gaussians (DoG) pyramid derived from it, is the basis on which SIFT features are determined, so let us start by asking what the Gaussian pyramid does and what it imitates. The answer is that it imitates the different scales of an image. How should scale be understood? Look at an image up close and then again from a meter away, and what you see differs: up close it is sharper and larger and you can make out details; from a distance it is blurrier and smaller and you see only outlines. This is the scale of the image. Scale occurs naturally; it is not artificially created.

Seen this way, processing a single image is rather limited, because it considers only the two spatial dimensions and ignores the "depth" of the image. Taking depth into account yields information that the two-dimensional image alone does not provide. Hence the Gaussian pyramid: on top of the two-dimensional image, it extracts another dimension that is naturally present in the image, namely scale. Because the Gaussian kernel is the only linear kernel for scale transformation, blurring an image with a Gaussian kernel introduces no additional noise, so the Gaussian kernel is used to construct the image scales.

The two images here are typical images from a Gaussian pyramid. They mimic what forms on your retina as an object moves away from you, and are shown as animations.

Steps to build the Gaussian pyramid:

According to Lowe's paper, building the Gaussian pyramid is fairly simple. The Gaussian kernel is the only linear kernel that implements scale transformation.

To build the Gaussian pyramid, the image is generally first doubled in size, and the pyramid is built on the enlarged image. Images of that size are then Gaussian-blurred, and the several blurred images form one octave. Next, an image from that octave is downsampled (the third image from the end, as discussed in the last section): its width and height are each halved, so its area becomes one quarter. This image is the initial image of the next octave, and the Gaussian blurring for that octave is carried out starting from it. Repeating this builds all the octaves the algorithm needs, and the Gaussian pyramid is complete. The pyramid is built as shown:
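The size bookkeeping of the construction described above can be sketched in a few lines of standard C++. The function name `octaveSizes` is hypothetical and not part of OpenCV; the sketch assumes the input is doubled once and that each later octave halves both dimensions with integer division.

```cpp
#include <utility>
#include <vector>

// Hypothetical helper (not from the original post or OpenCV): returns the
// (width, height) of the images in each octave, assuming the input image
// is first doubled in size and every later octave halves both dimensions.
std::vector<std::pair<int, int>> octaveSizes(int width, int height, int nOctaves)
{
    std::vector<std::pair<int, int>> sizes;
    int w = width * 2;   // initial upsampling by a factor of two
    int h = height * 2;
    for (int o = 0; o < nOctaves; ++o)
    {
        sizes.emplace_back(w, h);
        w /= 2;          // next octave: half the width ...
        h /= 2;          // ... and half the height (one quarter of the area)
    }
    return sizes;
}
```

For a 640x480 input and four octaves, this gives 1280x960, 640x480, 320x240 and 160x120.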

What is a scale space:

Scale has been described above in terms of human visual perception, and it was mentioned that the Gaussian kernel implements the scale transformation. In the actual implementation, where does scale show up, and how is it quantified? In the Gaussian pyramid two variables matter: the octave index (o) and the layer index within an octave (s). Together, (o, s) make up the scale space of the Gaussian pyramid. This is not hard to understand. All images in one octave have the same width and height, so o controls the scale in terms of image size within the pyramid; to distinguish images of the same size, s is needed, and s controls the degree of blur within an octave. Thus (o, s) identifies exactly one image in the Gaussian pyramid. The result is a three-dimensional space: two dimensions of image coordinates and one dimension of scale.

According to Lowe's paper, (o, s) acts on an image through the formula

$$\sigma(o, s) = \sigma_0 \, 2^{o + s/S}$$

where $\sigma_0$ is the base scale and S is the number of layers per octave. As the formula shows, scale space is continuous. The two variables together control the value of σ: for the layers s = 1, ..., S of octave o, the exponent satisfies o < o + s/S <= o + 1, so counting octaves from o = 0 it lies in (0, 1] for the first octave, in (1, 2] for the second, and so on. The key part of σ, the exponent o + s/S, increases step by step (in some implementations the value increases but not uniformly, so it can only be said to be continuous).

The scales of the images in the first octave are $\sigma_0, k\sigma_0, k^2\sigma_0, \dots$; in the second octave, $2\sigma_0, 2k\sigma_0, 2k^2\sigma_0, \dots$; in the third, $4\sigma_0, 4k\sigma_0, 4k^2\sigma_0, \dots$; and so on, where $\sigma_0$ is the base scale. The sequence is given by the formula

$$\sigma(o, s) = 2^{o} k^{s} \sigma_0, \qquad k = 2^{1/S}$$

So with each additional octave σ doubles, and within an octave the exponent s of k distinguishes the different Gaussian kernels.
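As a quick numerical check of the scale sequence, here is a minimal sketch in standard C++. The name `layerScale` is hypothetical, octaves are assumed to be counted from o = 0, and the base scale is passed in as a parameter.

```cpp
#include <cmath>

// Hypothetical helper: absolute scale of layer s in octave o,
// sigma(o, s) = sigma0 * 2^(o + s/S), with octaves counted from o = 0.
// sigma0 is the base scale and S the number of layers per octave.
double layerScale(double sigma0, int o, int s, int S)
{
    return sigma0 * std::pow(2.0, o + static_cast<double>(s) / S);
}
```

With S = 3 and a base scale of 1.6 (the value Lowe suggests, used here only as an example), `layerScale(1.6, 0, 3, 3)` and `layerScale(1.6, 1, 0, 3)` both evaluate to 3.2: layer S of one octave has the same scale as layer 0 of the next.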

That covers most of what there is to say about scale space in the Gaussian pyramid: what scale is and how it appears in the pyramid. The continuity of scale space is discussed in detail later. The figure illustrates what a scale space is:

Constructing the difference-of-Gaussians pyramid

The Gaussian pyramid is built in order to construct the difference-of-Gaussians (DoG) pyramid. Subtracting each pair of adjacent images within an octave gives a difference image, and the collection of these difference images over all octaves is the DoG pyramid. The process is shown in the figure. The advantage of the DoG pyramid is that it makes the subsequent extraction of feature points convenient.
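The subtraction of adjacent images can be illustrated without any image library. The sketch below is an assumption-laden toy: `Image` is a hypothetical flat array of grayscale pixels, and all images passed in are assumed to have the same number of pixels, as they do within one octave.

```cpp
#include <cstddef>
#include <vector>

using Image = std::vector<float>;  // hypothetical flat grayscale image

// Subtract each pair of adjacent blurred images of one octave.
// n Gaussian images yield n - 1 difference images.
// Assumes every image in `gauss` has the same number of pixels.
std::vector<Image> differenceOfGaussians(const std::vector<Image>& gauss)
{
    std::vector<Image> dog;
    for (std::size_t i = 0; i + 1 < gauss.size(); ++i)
    {
        Image diff(gauss[i].size());
        for (std::size_t p = 0; p < diff.size(); ++p)
            diff[p] = gauss[i + 1][p] - gauss[i][p];  // more blurred minus less blurred
        dog.push_back(diff);
    }
    return dog;
}
```

Passing S + 3 Gaussian images of one octave returns S + 2 difference images, which matches the image counts discussed in this article.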

This completes the main part of the Gaussian pyramid construction. With the key points and some important concepts in place, the next section explains the continuity of scale space, which is the crux of the difference-of-Gaussians (DoG) pyramid.

Continuity of scale space

Note that the subject of continuity here is neither the Gaussian pyramid nor the difference-of-Gaussians (DoG) pyramid, but scale space. Before addressing it, one question has to be answered: why are there S+3 images in every octave of the Gaussian pyramid? S means that when we later search for extreme points in the DoG pyramid, we want S layers of points in each octave. From Lowe's paper we know that the extreme points of each layer are found in three-dimensional space (two image dimensions plus one scale dimension), so each such layer needs a layer above it and a layer below it. To obtain S layers of points, the DoG pyramid must therefore have S+2 images per octave. And if the DoG pyramid has S+2 images, the Gaussian pyramid must have S+3, because each DoG image is the difference of two adjacent Gaussian layers. This sounds reasonable, but the derivation has a serious gap: it starts from the assumption that we want S layers of points in every octave. Why S layers? That is the subject of this section: to maintain the continuity of scale. A detailed analysis follows.

Take one octave as an example (it is best to read this alongside the pyramid construction source code in OpenCV, which is listed below for reference).

Some formulas for the Gaussian pyramid and the difference-of-Gaussians (DoG) pyramid are repeated here:

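Blurring of image I by the Gaussian function G:

$$L(x, y, \sigma) = G(x, y, \sigma) * I(x, y)$$

Difference-of-Gaussians function:

$$D(x, y, \sigma) = \bigl(G(x, y, k\sigma) - G(x, y, \sigma)\bigr) * I(x, y) = L(x, y, k\sigma) - L(x, y, \sigma)$$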

From these two formulas, the scales of the Gaussian images and difference-of-Gaussians (DoG) images in one octave (taking the first octave as an example) can be determined as follows. In Lowe's paper S = 3, so each octave has 3 + 3 = 6 Gaussian images; the scale of each image is also marked in the figure.

With S = 3 as in Lowe's paper, we have

$$k = 2^{1/S} = 2^{1/3}$$

Therefore, the scales of the Gaussian images in the current octave are:

σ, 2^(1/3)σ, 2^(2/3)σ, 2^(3/3)σ, 2^(4/3)σ, 2^(5/3)σ;

The scales of the difference-of-Gaussians (DoG) images in the current octave are:

σ, 2^(1/3)σ, 2^(2/3)σ, 2^(3/3)σ, 2^(4/3)σ.

In the same way, the scales of the Gaussian images in the next octave are:

2σ, 2·2^(1/3)σ, 2·2^(2/3)σ, 2·2^(3/3)σ, 2·2^(4/3)σ, 2·2^(5/3)σ;

The scales of the DoG images in the next octave are:

2σ, 2·2^(1/3)σ, 2·2^(2/3)σ, 2·2^(3/3)σ, 2·2^(4/3)σ.

Notice that the layers marked in red in the original post, the middle three difference-of-Gaussians (DoG) layers 2^(1/3)σ, 2^(2/3)σ, 2^(3/3)σ of each octave, are the layers on which extreme points are found in the DoG pyramid; that is, the comparison with the layers above and below takes place only on these layers. Stringing the red values together gives: 2^(1/3)σ, 2^(2/3)σ, 2^(3/3)σ, 2·2^(1/3)σ, 2·2^(2/3)σ, 2·2^(3/3)σ, ... What do you notice? Exactly: the values are continuous, each one 2^(1/3) times the previous. By constructing three extra Gaussian images in each octave we make scale space continuous. The direct benefit is that no scale is skipped when searching for extreme points in scale space: every scale determined by the quantized scale factor is taken into account.
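The continuity claim can be checked numerically. The following sketch, with the hypothetical name `extremumScales`, lists the scales (in units of σ) of the layers s = 1..S on which extrema are searched, concatenated over octaves counted from 0; it is an illustration of the argument above, not code from OpenCV.

```cpp
#include <cmath>
#include <vector>

// Hypothetical helper: scales (in units of sigma) of the DoG layers on which
// extrema are searched, i.e. layers s = 1..S of each octave, concatenated
// over all octaves (octaves counted from 0).
std::vector<double> extremumScales(int nOctaves, int S)
{
    std::vector<double> scales;
    for (int o = 0; o < nOctaves; ++o)
        for (int s = 1; s <= S; ++s)
            scales.push_back(std::pow(2.0, o + static_cast<double>(s) / S));
    return scales;
}
```

For S = 3, every entry is 2^(1/3) times the one before it, including across octave boundaries (for example from 2^(3/3) in the first octave to 2·2^(1/3) in the second), so no scale step is skipped.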

How to determine the first image of the next octave

This question is an extension of the previous one (continuity of scale space), and it can be understood by reading the corresponding part of the OpenCV source code.

The first image of the current octave is obtained by downsampling the third image from the end of the previous octave. The OpenCV source raises an important question here: shouldn't the scales of different octaves differ by a factor of 2? Why does the source code not reflect this, and instead use the same array sig[] for every octave? First, be clear that sig[] does not store absolute blur kernels but relative ones, that is, the extra blur applied to one image to obtain the next. This is essential. Since the kernels are relative, the scale of the first image of each octave is what matters, so for continuity of scale we look at the first image of each octave.
In the Gaussian pyramid construction listed below, no explicit doubling of scale is applied to the first image of each octave. The doubling is nevertheless implicit in the way the whole pyramid is built.
Now look at the third image from the end. Its scale is 2^(3/3)σ, and since 3/3 = 1, within this octave the first image has scale σ and the third image from the end has scale 2σ: exactly a factor-of-2 jump. That is why this image is taken as the basis for downsampling, so that the initial scale of the first image of the next octave is 2σ.

This is the reason the third image from the end is chosen for downsampling.
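The relative-blur idea can be verified in isolation. The sketch below mirrors the sigma precomputation in the OpenCV listing but uses only the standard library; the function names `incrementalSigmas` and `accumulatedSigma` are hypothetical. It relies on the fact that blurring with Gaussians of standard deviations a and b in sequence is equivalent to one blur with sqrt(a^2 + b^2).

```cpp
#include <cmath>
#include <vector>

// sig[i] is the extra blur applied to image i-1 to obtain image i, chosen so
// that the accumulated blur of image i is k^i * sigma, with k = 2^(1/S).
// sig[0] is the blur of the first image itself.
std::vector<double> incrementalSigmas(double sigma, int S)
{
    std::vector<double> sig(S + 3);
    sig[0] = sigma;
    const double k = std::pow(2.0, 1.0 / S);
    for (int i = 1; i < S + 3; ++i)
    {
        const double prev = std::pow(k, i - 1) * sigma;  // blur of image i-1
        const double total = prev * k;                   // target blur of image i
        sig[i] = std::sqrt(total * total - prev * prev);
    }
    return sig;
}

// Accumulated blur of image i: square root of the summed variances.
double accumulatedSigma(const std::vector<double>& sig, int i)
{
    double var = 0.0;
    for (int j = 0; j <= i; ++j)
        var += sig[j] * sig[j];
    return std::sqrt(var);
}
```

With S = 3, `accumulatedSigma(incrementalSigmas(sigma, 3), 3)` equals 2·sigma, which is the third image from the end. Halving that image halves its blur measured in pixels, so in the new octave's pixel grid it again has blur sigma, and the same sig[] array can be reused for every octave.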

    // Excerpt from class SIFT in the OpenCV nonfree module.
    // nOctaveLayers (S) and sigma are members of the class.
    void SIFT::buildGaussianPyramid( const Mat& base, vector<Mat>& pyr, int nOctaves ) const
    {
        vector<double> sig(nOctaveLayers + 3);
        pyr.resize(nOctaves*(nOctaveLayers + 3));

        // precompute Gaussian sigmas using the following formula:
        //  \sigma_{total}^2 = \sigma_{i}^2 + \sigma_{i-1}^2
        // sig[i] is the relative blur that takes image i-1 to image i
        sig[0] = sigma;
        double k = pow( 2., 1. / nOctaveLayers );
        for( int i = 1; i < nOctaveLayers + 3; i++ )
        {
            double sig_prev = pow(k, (double)(i-1))*sigma;
            double sig_total = sig_prev*k;
            sig[i] = std::sqrt(sig_total*sig_total - sig_prev*sig_prev);
        }

        for( int o = 0; o < nOctaves; o++ )
        {
            for( int i = 0; i < nOctaveLayers + 3; i++ )
            {
                Mat& dst = pyr[o*(nOctaveLayers + 3) + i];
                if( o == 0 && i == 0 )
                    dst = base;
                // base of new octave is halved image from end of previous octave
                else if( i == 0 ) // determines the first image of each octave
                {
                    // third image from the end of the previous octave
                    const Mat& src = pyr[(o-1)*(nOctaveLayers + 3) + nOctaveLayers];
                    resize(src, dst, Size(src.cols/2, src.rows/2), 0, 0, INTER_NEAREST);
                }
                else
                {
                    const Mat& src = pyr[o*(nOctaveLayers + 3) + i-1];
                    GaussianBlur(src, dst, Size(), sig[i], sig[i]);
                }
            }
        }
    }

    void SIFT::buildDoGPyramid( const vector<Mat>& gpyr, vector<Mat>& dogpyr ) const
    {
        int nOctaves = (int)gpyr.size()/(nOctaveLayers + 3);
        dogpyr.resize( nOctaves*(nOctaveLayers + 2) );

        for( int o = 0; o < nOctaves; o++ )
        {
            for( int i = 0; i < nOctaveLayers + 2; i++ )
            {
                // each DoG image is the difference of two adjacent Gaussian images
                const Mat& src1 = gpyr[o*(nOctaveLayers + 3) + i];
                const Mat& src2 = gpyr[o*(nOctaveLayers + 3) + i + 1];
                Mat& dst = dogpyr[o*(nOctaveLayers + 2) + i];
                subtract(src2, src1, dst, noArray(), DataType<sift_wt>::type);
            }
        }
    }

The SIFT source code above is excerpted from the OpenCV nonfree module; Lowe holds the rights to SIFT.

With the SIFT pyramid built, the next task is to find feature points in it. See the next article in this blog's SIFT series: SIFT Analysis (2): Locating Feature Points
