1. Scale space of images
The scale-space representation of an image is the description of that image at all scales. Many scale-space operators closely resemble the receptive field profiles recorded from the mammalian retina and the first stages of the visual cortex, so scale-space theory is often associated with biological vision.
To decide what is meaningful in an image, first recognize a basic fact: an object in an image is meaningful only over a certain range of scales. For example, the concept of a tree branch makes sense only when it is observed at distances from a few centimeters to a few meters. At the micron or kilometer level we no longer perceive branches; instead, we perceive cells or forests.
Therefore, multi-scale representation is crucial when describing the structure of the real world or mapping a 3D object to a two-dimensional image. The idea is easy to grasp: maps, for example, are always drawn at some scale. A map of the world can show only continents, oceans, and large regions and countries, whereas a map of a city can show every street in detail.
The scale-space representation is another effective method for multi-scale representation. Its scale parameter is continuous, and the number of spatial sample points is the same at every scale. (Each scale is itself an image, and the sample points in scale space are the pixels of that image, so every scale image has the same resolution.) The main idea of the scale-space representation is to generate a family of signals from the original signal (such as an image) and use them to represent the original. In this process, fine detail is progressively smoothed out, which amounts to discarding it.
Variable-scale Gaussian Functions:
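In the usual formulation (the notation here is an assumption, not taken from this text), the variable-scale Gaussian and the scale-space image it produces are commonly written as follows, where I is the input image, σ is the scale parameter, and * denotes convolution over x and y:

```latex
G(x, y, \sigma) = \frac{1}{2\pi\sigma^{2}} \exp\!\left(-\frac{x^{2} + y^{2}}{2\sigma^{2}}\right),
\qquad
L(x, y, \sigma) = G(x, y, \sigma) * I(x, y)
```

A larger σ gives a wider Gaussian and therefore a more strongly smoothed image L.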
Therefore, scale space differs from the traditional image pyramid. Scale space can be understood as convolving the image with a Gaussian: the resolution stays the same and the image keeps the same number of pixels, but detail is averaged away (smoothed). Because the Gaussian weights the central pixel strongly and the surrounding pixels weakly, the weighted average at a peak is smaller than the strongest signal value, which is what produces the smoothing effect. The key step in the traditional image pyramid, by contrast, is downsampling: for example, every four pixels are averaged into one pixel, so the resolution clearly drops.
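The point that Gaussian smoothing keeps the pixel count while flattening peaks can be illustrated with a minimal sketch. It uses only the Python standard library; the function names, the truncation of the kernel at 3 sigma, and the edge-replication border rule are illustrative assumptions, not something specified in this text.

```python
import math

def gaussian_kernel(sigma):
    """Sampled 1D Gaussian, truncated at 3 sigma (assumed) and normalized to sum to 1."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def convolve_rows(image, kernel):
    """Filter every row with the kernel, replicating edge pixels at the border."""
    radius = len(kernel) // 2
    out = []
    for row in image:
        n = len(row)
        new_row = []
        for x in range(n):
            acc = 0.0
            for k, w in enumerate(kernel):
                xx = min(max(x + k - radius, 0), n - 1)  # clamp to the image
                acc += w * row[xx]
            new_row.append(acc)
        out.append(new_row)
    return out

def transpose(image):
    return [list(col) for col in zip(*image)]

def gaussian_blur(image, sigma):
    """Separable 2D Gaussian blur: filter the rows, then the columns."""
    kernel = gaussian_kernel(sigma)
    rows_done = convolve_rows(image, kernel)
    return transpose(convolve_rows(transpose(rows_done), kernel))

def scale_space(image, sigmas):
    """One blurred copy of the image per sigma, all at the original resolution."""
    return [gaussian_blur(image, s) for s in sigmas]

# A 9x9 test image with a single bright pixel in the middle.
image = [[0.0] * 9 for _ in range(9)]
image[4][4] = 1.0

sigmas = [1.0, 2.0, 4.0]
levels = scale_space(image, sigmas)
for sigma, level in zip(sigmas, levels):
    # Size stays 9x9; the value at the former peak shrinks as sigma grows.
    print(sigma, len(level), len(level[0]), round(level[4][4], 4))
```

Every level has the same 9x9 size as the input, and the central value decreases as sigma increases, which is the "same resolution, less detail" behaviour described above.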
2. Image pyramid
When it comes to multi-scale signal description, the pyramid is the first thing that comes to mind, and it is indeed the main form of multi-scale image representation. The image pyramid is an effective yet conceptually simple structure for interpreting an image at multiple resolutions. It was originally used in machine vision and image compression. An image pyramid is a collection of images of progressively lower resolution, arranged in the shape of a pyramid.
Building an image pyramid takes two steps: 1. smooth the image with a low-pass filter; 2. subsample the smoothed image (for example, downsample by a factor of 2). Repeating these steps produces a series of progressively smaller images.
The low-pass filter most often used for smoothing is the Gaussian filter, so a pyramid built with Gaussian smoothing is also called a Gaussian pyramid. Koenderink, Lindeberg, Florack, and others have shown with rigorous mathematical arguments that the Gaussian kernel is the only linear kernel that implements scale transformation. It also satisfies translation invariance, the semi-group structure, non-enhancement of local extrema, scale invariance, rotation invariance, and other properties.
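The two steps can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions: the 5-tap binomial kernel [1, 4, 6, 4, 1] / 16 stands in for the low-pass filter, borders are handled by replicating edge pixels, and all function names are hypothetical.

```python
# 5-tap binomial kernel, a common approximation to a Gaussian (assumed choice).
KERNEL = [1 / 16, 4 / 16, 6 / 16, 4 / 16, 1 / 16]

def smooth_1d(values):
    """Low-pass filter one row or column, replicating edge samples."""
    n = len(values)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(KERNEL):
            j = min(max(i + k - 2, 0), n - 1)  # clamp index to the signal
            acc += w * values[j]
        out.append(acc)
    return out

def smooth_2d(image):
    """Step 1: smooth the rows, then the columns (separable filter)."""
    rows = [smooth_1d(row) for row in image]
    cols = [smooth_1d(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]

def downsample(image, factor=2):
    """Step 2: keep every `factor`-th pixel in both directions."""
    return [row[::factor] for row in image[::factor]]

def gaussian_pyramid(image, levels):
    """Repeat smooth-then-downsample; level 0 is the original image."""
    pyramid = [image]
    for _ in range(levels - 1):
        pyramid.append(downsample(smooth_2d(pyramid[-1])))
    return pyramid

# 16x16 checkerboard as a test image.
image = [[float((x + y) % 2) for x in range(16)] for y in range(16)]
pyramid = gaussian_pyramid(image, levels=4)
sizes = [(len(level), len(level[0])) for level in pyramid]
print(sizes)  # [(16, 16), (8, 8), (4, 4), (2, 2)]
```

Smoothing before subsampling matters: dropping pixels from the unsmoothed checkerboard would alias the pattern, whereas the filtered version is averaged toward mid-gray first.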
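Of the listed properties, the semi-group structure is easy to check numerically: smoothing with a Gaussian of scale sigma1 and then with one of scale sigma2 should match a single Gaussian of scale sqrt(sigma1^2 + sigma2^2). The sketch below is a rough check in one dimension, assuming sampled kernels truncated at about 6 sigma, so the two results agree only approximately; the function names and the chosen sigmas are illustrative.

```python
import math

def gaussian_kernel(sigma, radius):
    """Sampled 1D Gaussian on [-radius, radius], normalized to sum to 1."""
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def convolve_full(a, b):
    """Full discrete convolution of two sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, av in enumerate(a):
        for j, bv in enumerate(b):
            out[i + j] += av * bv
    return out

sigma1, sigma2 = 2.0, 3.0
r1, r2 = 12, 18  # roughly 6 sigma each (assumed truncation)

# Two successive smoothings, expressed as the convolution of their kernels.
two_steps = convolve_full(gaussian_kernel(sigma1, r1), gaussian_kernel(sigma2, r2))

# One smoothing at the combined scale, on a support of the same length.
combined_sigma = math.sqrt(sigma1 ** 2 + sigma2 ** 2)
one_step = gaussian_kernel(combined_sigma, r1 + r2)

max_diff = max(abs(a - b) for a, b in zip(two_steps, one_step))
print(combined_sigma, max_diff)  # max_diff is tiny; not exactly zero due to truncation
```

The variances add, not the sigmas, which is why the combined scale is the square root of the sum of squares.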