First, let's look at the convolution operation. Convolution multiplies each element of a template by the corresponding image pixel and sums the products; the result replaces the value of the pixel under the template center. However, I recently came across the definition of integral images and was immediately confused, so I want to sort the two out and trace them back to their roots.
1. Image Convolution
1.1 Source
First, I found a particularly good blog post on convolution.
The text is pasted below:
---------------------------------------------------------------------------------------------------------------
Convolution is an important operation in signal processing. It is usually first taught in the continuous setting:
the convolution of two functions f(x) and g(x) is (f * g)(x) = ∫ f(u) g(x-u) du.
It is not difficult to prove some of its properties, such as commutativity and associativity. Even so, beginners rarely get an intuitive feel for convolution from this definition.
In fact, convolution may be clearer in the discrete setting.
For two sequences f[n] and g[n], the convolution is generally defined as s[x] = Σ_k f[k] g[x-k].
A typical example of convolution is the polynomial multiplication learned in junior high school,
for example (x^2 + 3x + 2)(2x + 5).
The usual calculation goes as follows:
(x^2 + 3x + 2)(2x + 5)
= (x^2 + 3x + 2) * 2x + (x^2 + 3x + 2) * 5
= 2x^3 + 3*2 x^2 + 2*2 x + 5x^2 + 3*5 x + 10
Then combine the coefficients of like terms:
2              x^3
3*2 + 1*5      x^2
2*2 + 3*5      x
2*5
----------
2x^3 + 11x^2 + 19x + 10
From linear algebra, we know that polynomials form a vector space, and the basis can be chosen as
{1, x, x^2, x^3, ...}
In this way, any polynomial corresponds to a coordinate vector in an infinite-dimensional space.
For example, (x^2 + 3x + 2) corresponds to
(1 3 2),
and (2x + 5) corresponds to
(2 5).
A linear space does not define convolution between two vectors, only addition and scalar multiplication. Polynomial multiplication therefore cannot be described within a linear space alone, which shows how limited the theory of linear spaces is.
However, if we process coordinate vectors according to the definition of vector convolution,
(1 3 2) * (2 5)
Then there is
2 3 1
_ _ 2 5
--------
2

2 3 1
_ 2 5
-----
6 + 5 = 11

2 3 1
2 5
-----
4 + 15 = 19

_ 2 3 1
2 5
-------
10
Or,
(1 3 2) * (2 5) = (2 11 19 10)
Returning to the polynomial representation,
(x^2 + 3x + 2)(2x + 5) = 2x^3 + 11x^2 + 19x + 10
Remarkably, the result is exactly the same as what we get with the traditional method.
In other words, polynomial multiplication is equivalent to convolution of the coefficient vectors.
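The equivalence can be checked with a few lines of code. The sketch below is a minimal illustration, assuming integer coefficients stored in descending order of power as in the text; the function name `convolve` is hypothetical and not taken from the quoted post.

```cpp
#include <cstddef>
#include <vector>

// Full discrete convolution: s[x] = sum over k of f[k] * g[x - k].
// With coefficient vectors as input, the result is the coefficient
// vector of the product polynomial.
std::vector<int> convolve(const std::vector<int>& f, const std::vector<int>& g) {
    if (f.empty() || g.empty()) return {};
    std::vector<int> s(f.size() + g.size() - 1, 0);
    for (std::size_t k = 0; k < f.size(); ++k) {
        for (std::size_t m = 0; m < g.size(); ++m) {
            s[k + m] += f[k] * g[m];  // term f[k] * g[m] lands at index x = k + m
        }
    }
    return s;
}
```

Calling `convolve({1, 3, 2}, {2, 5})` reproduces the coefficient vector (2 11 19 10) obtained by hand above.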
The reason is also very simple.
Convolution computes the coefficients of x^3, x^2, x, and 1 one at a time. That is, it merges the multiplication and the summation into a single step. (The traditional method multiplies first and adds only when like terms are combined.)
Take the coefficient of x^2 as an example: an x^2 term arises either from x^2 times 5 or from 3x times 2x, that is
2 3 1
_ 2 5
-----
6 + 5 = 11
In fact, this is an inner product of vectors. Convolution can therefore be viewed as a series of inner products, and since it is a series of inner products, we can try to represent the process above with a matrix.
A =
[2 3 1 0 0 0]
[0 2 3 1 0 0]
[0 0 2 3 1 0]
[0 0 0 2 3 1]
x = [0 0 2 5 0 0]'
b = Ax = [2 11 19 10]'
In the row view of Ax, each entry of b is an inner product.
Each row of A is a shifted copy of the sequence [2 3 1].
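The matrix view can be sketched in code as well. This is only an illustration under assumptions: the helper names `buildShiftMatrix`, `padVector`, and `multiply` are hypothetical, and the zero padding of x (n-1 zeros on each side, where n is the length of the first sequence) is chosen to match the example above.

```cpp
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<int>>;

// Builds the matrix A whose rows are shifted copies of the reversed
// sequence f; for f = (1 3 2) the rows are shifts of [2 3 1].
Matrix buildShiftMatrix(const std::vector<int>& f, std::size_t gSize) {
    const std::size_t n = f.size();
    if (n == 0 || gSize == 0) return {};
    const std::size_t rows = n + gSize - 1;
    const std::size_t cols = rows + n - 1;
    Matrix A(rows, std::vector<int>(cols, 0));
    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t c = 0; c < n; ++c) {
            A[r][r + c] = f[n - 1 - c];
        }
    }
    return A;
}

// Pads g with n-1 zeros on each side, giving the column vector x.
std::vector<int> padVector(const std::vector<int>& g, std::size_t n) {
    if (n == 0) return g;
    std::vector<int> x(g.size() + 2 * (n - 1), 0);
    for (std::size_t j = 0; j < g.size(); ++j) {
        x[n - 1 + j] = g[j];
    }
    return x;
}

// b = A x, computed as one inner product per row of A.
std::vector<int> multiply(const Matrix& A, const std::vector<int>& x) {
    std::vector<int> b(A.size(), 0);
    for (std::size_t r = 0; r < A.size(); ++r) {
        for (std::size_t c = 0; c < x.size(); ++c) {
            b[r] += A[r][c] * x[c];
        }
    }
    return b;
}
```

With f = (1 3 2) and g = (2 5), `padVector` gives x = [0 0 2 5 0 0]' and `multiply` gives b = [2 11 19 10]'.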
---------
Obviously, in this particular context we know that convolution is commutative and associative because, as we all know, polynomial multiplication is commutative and associative. This also holds in general.
Here we find that, beyond forming a linear space, the polynomials carry some special relations between their basis elements. These relations are what give the polynomial space its special properties.
When learning vectors, we usually start with an example like this: there are three apples and five oranges, then five apples and three oranges; how many apples and oranges are there in total? The teacher repeatedly warns that oranges are oranges and apples are apples, so (3, 5) + (5, 3) = (8, 8). Indeed, adding oranges and apples this way is never a problem. However, once you consider multiplying oranges by oranges, or oranges by apples, things are no longer easy to sort out.
Another example is the complex numbers. If we only define a complex number as a pair (a, b) and view the set purely as a linear space, it is rather dull. But you only need to add the rule (a, b) * (c, d) = (ac - bd, ad + bc)
and the situation changes immediately. It is well known how rich and varied the theory of complex functions is.
In addition, recall a basic theorem of signal processing: a product in the frequency domain is equivalent to a convolution of the time-domain or spatial-domain signals. This exactly matches the situation here. What hidden connection lies behind this? That requires further study.
From this point of view, advanced operations such as convolution are really just abstractions of elementary operations. The mathematics learned in middle school still contains much profound content (such as commutative algebra). Reviewing the old is a good way to learn the new.
In fact, this is not surprising. Humans have been reproducing for a very long time, yet for most of that time people only knew that a man and a woman together could have children. The discovery of sperm, eggs, and the mechanism of reproduction is quite recent.
Confucius said that the Way lies in everyday human relations. It seems we should look more attentively at the things around us, and even at ourselves, so that we know not only that something is so but also why it is so.
---------------------------------------------------------------------------------------------------------------
We have seen where convolution comes from. Now let's look at the formal definition. Convolution is a mathematical operation on two functions defined by an integral over an infinite range. For functions f1(t) and f2(t), the convolution is written f1(t) * f2(t), where "*" is the convolution operator.
In functional analysis, convolution is a mathematical operator that generates a third function from two functions f and g, representing the accumulated overlap between f and a flipped and shifted g. The convolution of f and g is written f(t) * g(t). It is the integral of the product of the two functions after one of them is flipped and translated, and it is a function of the amount of translation.
The convolution of function f and g can be defined as: z (t) = f (t) * g (t) = ∫ f (m) g (t-m) dm.
The convolution theorem states that the convolution of two two-dimensional continuous functions in the spatial domain can be obtained as the inverse Fourier transform of the product of their Fourier transforms. Conversely, convolution in the frequency domain corresponds to the Fourier transform of the product in the spatial domain.
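A discrete, one-dimensional analogue of the theorem is easy to try out. The sketch below is an illustration under assumptions, not the two-dimensional continuous case described above: it uses a naive O(N^2) discrete Fourier transform on zero-padded sequences, and the names `dft` and `convolveViaDft` are hypothetical.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using cd = std::complex<double>;

// Naive O(N^2) discrete Fourier transform; inverse = true applies the
// inverse transform (including the 1/N scaling).
std::vector<cd> dft(const std::vector<cd>& a, bool inverse) {
    const std::size_t n = a.size();
    const double pi = std::acos(-1.0);
    const double sign = inverse ? 1.0 : -1.0;
    std::vector<cd> out(n);
    for (std::size_t k = 0; k < n; ++k) {
        cd sum(0.0, 0.0);
        for (std::size_t t = 0; t < n; ++t) {
            const double angle =
                sign * 2.0 * pi * static_cast<double>(k * t) / static_cast<double>(n);
            sum += a[t] * cd(std::cos(angle), std::sin(angle));
        }
        out[k] = inverse ? sum / static_cast<double>(n) : sum;
    }
    return out;
}

// Linear convolution via the frequency domain: zero-pad both inputs,
// transform, multiply pointwise, and transform back.
std::vector<double> convolveViaDft(const std::vector<double>& f,
                                   const std::vector<double>& g) {
    if (f.empty() || g.empty()) return {};
    const std::size_t n = f.size() + g.size() - 1;
    std::vector<cd> fa(n), ga(n);  // value-initialized to zero
    for (std::size_t i = 0; i < f.size(); ++i) fa[i] = f[i];
    for (std::size_t i = 0; i < g.size(); ++i) ga[i] = g[i];
    std::vector<cd> F = dft(fa, false);
    const std::vector<cd> G = dft(ga, false);
    for (std::size_t k = 0; k < n; ++k) F[k] *= G[k];
    const std::vector<cd> s = dft(F, true);
    std::vector<double> result(n);
    for (std::size_t i = 0; i < n; ++i) result[i] = s[i].real();
    return result;
}
```

Up to floating-point rounding, `convolveViaDft({1.0, 3.0, 2.0}, {2.0, 5.0})` agrees with the coefficient vector (2 11 19 10) from the polynomial example.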
To be honest, I still don't quite understand why images are convolved or how the convolution is carried out.
1.2 Convolution
A convolution operation is inseparable from its convolution kernel, which is simply a fixed-size array of numerical coefficients. The reference point of the array is usually at its center, and the size of the array is called the kernel support. Strictly speaking, the kernel support consists only of the non-zero part of the kernel array. In other words, the convolution kernel is the so-called template.
Convolution is a weighted sum: each pixel in the image region under the kernel is multiplied by the corresponding element of the convolution kernel (the weight matrix), and the sum of all the products becomes the new value of the region's center pixel.
Convolution example:
Convolution of a 3*3 pixel region R with a convolution kernel G:
R5 (center pixel) = R1G1 + R2G2 + R3G3 + R4G4 + R5G5 + R6G6 + R7G7 + R8G8 + R9G9
To convolve an image, take for example a 3x3 array as the convolution kernel, with its center as the reference point. First, the reference point of the kernel is placed on the first pixel of the image, and the remaining kernel elements cover the neighboring pixels. For each kernel element, we take its value and the value of the image pixel beneath it, multiply them, sum all the products, and write the result at the position of the reference point. Repeating this for every pixel by sliding the kernel across the whole image yields the convolved image.
We can also express this process as an equation. Let the image be I(x, y) and the kernel be G(i, j) (where 0 ≤ i ≤ Mi-1 and 0 ≤ j ≤ Mj-1), with the reference point at kernel coordinates (ai, aj). The convolution H(x, y) is then defined as:
H(x, y) = Σ_{i=0..Mi-1} Σ_{j=0..Mj-1} I(x+i-ai, y+j-aj) G(i, j).
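The formula for H(x, y) translates almost directly into nested loops. The following is a minimal sketch, assuming a grayscale image stored as rows of doubles and treating pixels outside the image as zero (the text does not say how borders are handled); the name `convolveImage` is hypothetical.

```cpp
#include <cstddef>
#include <vector>

using Image = std::vector<std::vector<double>>;  // indexed as img[y][x]

// H(x, y) = sum over (i, j) of I(x + i - ai, y + j - aj) * G(i, j).
// As in the formula, the kernel is not flipped, so for a non-symmetric
// kernel this is a correlation. Out-of-range pixels count as zero,
// which is an assumption made for this sketch.
Image convolveImage(const Image& I, const Image& G, int ai, int aj) {
    const int h = static_cast<int>(I.size());
    const int w = h > 0 ? static_cast<int>(I[0].size()) : 0;
    const int mj = static_cast<int>(G.size());
    const int mi = mj > 0 ? static_cast<int>(G[0].size()) : 0;
    Image H(static_cast<std::size_t>(h),
            std::vector<double>(static_cast<std::size_t>(w), 0.0));
    for (int y = 0; y < h; ++y) {
        for (int x = 0; x < w; ++x) {
            double sum = 0.0;
            for (int j = 0; j < mj; ++j) {
                for (int i = 0; i < mi; ++i) {
                    const int sx = x + i - ai;
                    const int sy = y + j - aj;
                    if (sx < 0 || sx >= w || sy < 0 || sy >= h) continue;
                    sum += I[sy][sx] * G[j][i];
                }
            }
            H[y][x] = sum;
        }
    }
    return H;
}
```

For a 3x3 kernel with reference point (1, 1), the value written at the center of a 3x3 image is exactly the sum R1G1 + ... + R9G9 from the example above.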
Common templates (convolution kernels)
In continuous space, the convolution of f(x) and g(x) is defined as the integral of f(t-x) g(x) over x from negative infinity to positive infinity, which gives a function of t. Because t-x must lie in the domain of f, the integral that looks infinite actually runs over a limited range.
In practice, f(x) is first flipped about the y-axis and then translated by t along the x-axis to give f(t-x). It is then multiplied by g(x) and the product is integrated; in other words, the result measures the overlap of f(t-x) and g(x). That is the convolution.
Replacing the integral sign with a sum gives the definition of convolution in discrete space.
1.3 Significance
Convolution is the basis of many image transformations. What a particular convolution does is determined by the form of its convolution kernel (template). Gaussian filtering, for example, is the convolution of an image with a Gaussian function.
Smoothing. Types of smoothing: simple blur (sum over the neighborhood, then scale), simple blur without scaling (sum over the neighborhood), median blur (median filter), Gaussian blur (Gaussian convolution), and bilateral blur (bilateral filtering).
Dilation and erosion.
Image pyramids.
The Laplacian and the Canny operator (derivatives).
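As a small illustration of how the kernel determines the effect, the Gaussian blur mentioned above uses a kernel sampled from a Gaussian function. The sketch below builds a normalized one-dimensional kernel; the function name `gaussianKernel` and its parameters `radius` and `sigma` are assumptions made for this example.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Samples exp(-i^2 / (2 sigma^2)) at i = -radius..radius and scales the
// samples so that they sum to 1, so that blurring keeps overall brightness.
std::vector<double> gaussianKernel(int radius, double sigma) {
    if (radius < 0 || sigma <= 0.0) return {};
    std::vector<double> k(static_cast<std::size_t>(2 * radius + 1));
    double sum = 0.0;
    for (int i = -radius; i <= radius; ++i) {
        const double v = std::exp(-(i * i) / (2.0 * sigma * sigma));
        k[static_cast<std::size_t>(i + radius)] = v;
        sum += v;
    }
    for (double& v : k) v /= sum;
    return k;
}
```

Convolving each row and then each column of an image with such a kernel is one common way to apply a two-dimensional Gaussian blur, because the Gaussian is separable.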
Properties of convolution:
1. Algebraic properties. The notation suggests that convolution is a special kind of multiplication, and several laws of multiplication carry over to convolution.
(1) commutative law: f1 (t) * f2 (t) = f2 (t) * f1 (t)
(2) distributive law: f1 (t) * [f2 (t) + f3 (t)] = f1 (t) * f2 (t) + f1 (t) * f3 (t)
(3) associative law: [f1 (t) * f2 (t)] * f3 (t) = f1 (t) * [f2 (t) * f3 (t)]
(4) shift invariance: if f1(t) * f2(t) = f3(t), then f1(t-t0) * f2(t) = f1(t) * f2(t-t0) = f3(t-t0)
This property says that whichever function is translated by t0, the convolution result is simply translated by the same amount, while its size and shape remain unchanged.
2. Differentiation and integration of convolution
(1) The derivative of the convolution of two functions equals the convolution of the derivative of one function with the other function.
(2) The integral of the convolution of two functions equals the convolution of the integral of one function with the other function.
(3) Generalization
If s(t) = f1(t) * f2(t), then the convolution of the m-th derivative of one function with the n-th derivative of the other equals the (m + n)-th derivative of their convolution: f1^(m)(t) * f2^(n)(t) = s^(m+n)(t).
3. Convolution with singular signals:
(1) f(t) * δ(t) = f(t); f(t) * δ(t-t0) = f(t-t0); f(t-t1) * δ(t-t0) = f(t-t0-t1)
(2) δ(t) * δ(t) = δ(t)
(3) f(t) * δ'(t) = f'(t)
(4) Generalization: f(t) * δ^(k)(t) = f^(k)(t); f(t) * δ^(k)(t-t0) = f^(k)(t-t0)
(5) f(t) = f(t) * δ'(t) * u(t) = f'(t) * u(t) = f(t) * δ''(t) * t u(t) = f''(t) * t u(t), where u(t) is the unit step function
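Several of the properties listed above have simple discrete counterparts that can be checked numerically. The sketch below assumes finite integer sequences, with a single 1 at index t0 standing in for δ(t-t0) and the sequence (1, -1) standing in for δ'(t) as a first difference; these stand-ins and the names `convolve` and `impulse` are assumptions for illustration.

```cpp
#include <cstddef>
#include <vector>

using Seq = std::vector<int>;

// Full discrete convolution of two finite sequences.
Seq convolve(const Seq& f, const Seq& g) {
    if (f.empty() || g.empty()) return {};
    Seq s(f.size() + g.size() - 1, 0);
    for (std::size_t k = 0; k < f.size(); ++k) {
        for (std::size_t m = 0; m < g.size(); ++m) {
            s[k + m] += f[k] * g[m];
        }
    }
    return s;
}

// Discrete stand-in for delta(t - t0): zeros followed by a single 1 at index t0.
Seq impulse(std::size_t t0) {
    Seq d(t0 + 1, 0);
    d[t0] = 1;
    return d;
}
```

With f = (3 1 4): `convolve(f, impulse(0))` returns f unchanged, `convolve(f, impulse(2))` returns f delayed by two samples, and `convolve(f, {1, -1})` returns the first differences of f (with f taken as zero outside its range).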
So what is the relationship between image convolution and integral images? If you know, please tell me.
2. Integral Images
The concept of the integral image is used in the SURF algorithm. With an integral image, filtering the image with Gaussian second-order derivative templates is converted into additions and subtractions on the integral image. The concept of the integral image was proposed by Viola and Jones, while the use of integral images for box filtering was proposed by Simard et al.
The value ii(i, j) at any point (i, j) of the integral image is the sum of the gray values in the rectangle of the original image that spans from the top-left corner to the point (i, j):
ii(i, j) = Σ_{i' ≤ i, j' ≤ j} I(i', j')
In the formula, I(i', j') is the gray value of the original image at point (i', j'). ii(i, j) can be computed iteratively with the following two formulas:
S(i, j) = S(i, j-1) + I(i, j)
ii(i, j) = ii(i-1, j) + S(i, j)
In the formulas, S(i, j) is the cumulative sum of a column, with S(i, -1) = 0 and ii(-1, j) = 0. Computing the integral image requires only a single scan over the pixels of the original image. The following code is a C++ implementation.
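// pOutImage[x][y] holds the sum of pInImage over columns 0..x and rows 0..y.
// First row (y = 0): running sum along x.
pOutImage[0][0] = pInImage[0][0];
for (int x = 1; x < nWidth; x++)
{
    pOutImage[x][0] = pOutImage[x-1][0] + pInImage[x][0];
}
// Remaining rows: nSum is the running sum of the current row up to x.
for (int y = 1; y < nHeight; y++)
{
    int nSum = 0;
    for (int x = 0; x < nWidth; x++)
    {
        nSum += pInImage[x][y];
        pOutImage[x][y] = pOutImage[x][y-1] + nSum;
    }
}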
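For comparison, here is a self-contained sketch that follows the two recurrence formulas literally, keeping the column sums S(i, j) in a separate array. It assumes the image is stored as rows of integer gray values; the names `Grid` and `integralImage` are hypothetical.

```cpp
#include <cstddef>
#include <vector>

using Grid = std::vector<std::vector<long long>>;  // indexed as grid[j][i]: row j, column i

// One pass over the image using
//   S(i, j)  = S(i, j-1)  + I(i, j)   (cumulative column sum)
//   ii(i, j) = ii(i-1, j) + S(i, j)
// with S(i, -1) = 0 and ii(-1, j) = 0.
Grid integralImage(const Grid& img) {
    const std::size_t rows = img.size();
    const std::size_t cols = rows > 0 ? img[0].size() : 0;
    Grid ii(rows, std::vector<long long>(cols, 0));
    std::vector<long long> S(cols, 0);  // holds S(i, j-1) before the update
    for (std::size_t j = 0; j < rows; ++j) {
        for (std::size_t i = 0; i < cols; ++i) {
            S[i] += img[j][i];
            ii[j][i] = (i > 0 ? ii[j][i - 1] : 0) + S[i];
        }
    }
    return ii;
}
```

For the 2x3 image with rows (1 2 3) and (4 5 6), the result has rows (1 3 6) and (5 12 21); the last entry is the sum of all pixels.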
The sum of the pixels in a window W can be computed from the values of the integral image at four corresponding points (i1, j1), (i2, j2), (i3, j3), (i4, j4), regardless of the size of the window. That is, the cost of computing the gray-level sum in window W is independent of the window size. The gray-level sum of the pixels in window W is
Sum(W) = ii(i4, j4) - ii(i2, j2) - ii(i3, j3) + ii(i1, j1)
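The four-point lookup can be sketched as follows. The code assumes the window is given by inclusive pixel coordinates, that (i1, j1) lies just outside the window's top-left corner and (i4, j4) is its bottom-right pixel, and that integral image values at index -1 are zero; the names `at` and `windowSum` are hypothetical.

```cpp
#include <cstddef>
#include <vector>

using Grid = std::vector<std::vector<long long>>;  // indexed as ii[j][i]: row j, column i

// Integral image lookup with ii(-1, j) = ii(i, -1) = 0.
long long at(const Grid& ii, int i, int j) {
    if (i < 0 || j < 0) return 0;
    return ii[static_cast<std::size_t>(j)][static_cast<std::size_t>(i)];
}

// Sum of the pixels with left <= i <= right and top <= j <= bottom,
// computed from four lookups regardless of the window size.
long long windowSum(const Grid& ii, int left, int top, int right, int bottom) {
    return at(ii, right, bottom)       // ii(i4, j4)
         - at(ii, right, top - 1)      // ii(i2, j2)
         - at(ii, left - 1, bottom)    // ii(i3, j3)
         + at(ii, left - 1, top - 1);  // ii(i1, j1)
}
```

For the 3x3 image with rows (1 2 3), (4 5 6), (7 8 9), whose integral image has rows (1 3 6), (5 12 21), (12 27 45), the bottom-right 2x2 window sums to 45 - 6 - 12 + 1 = 28.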
As the figure below shows, this should be easy to understand.
Summing the pixels in a rectangular region is a simple, highly repetitive operation, and the integral image makes it much more efficient. Why? Suppose an image has n pixels in total. Computing the integral image at all n positions takes on the order of n additions (but only if the recursion is fully exploited), and the results are stored in a matrix M of the same size as the source image. When we need the sum of all pixels in a rectangular region of the image, we look up the integral image values at the four points A, B, C, and D as in a look-up table and combine them with simple additions and subtractions (only three are required) to get the result. If you instead summed each rectangle naively in the original image, how many possible rectangles would there be? Since n is quite large for an image, the number of possible rectangles, which grows on the order of n^2, is enormous, and most of the rectangles overlap. What does overlap mean? It means repeated work when computing the sums. We can instead reuse what has already been computed. This is the idea behind the integral image: compute once the pixel sums of n distinct rectangles (each spanning from the top-left corner to one pixel), then make full use of these existing values to compute the unknown ones, somewhat in the spirit of recursion. This completely avoids repeated summation.
In this way, two kinds of computation become possible:
(1) Pixel sum over any rectangular area. The integral image makes it possible to quickly compute the gray-level sum of all pixels in any rectangle of the image. As shown in Figure 2.3, the value ii1 of the integral image at point 1 is (Sum denotes summation):
ii1 = Sum(A)
Similarly, the integral image values at points 2, 3, and 4 are:
ii2 = Sum(A) + Sum(B); ii3 = Sum(A) + Sum(C); ii4 = Sum(A) + Sum(B) + Sum(C) + Sum(D);
The gray-level sum of all pixels in rectangle D can be obtained from the integral image values at its corner points:
Sum(D) = ii1 + ii4 - (ii2 + ii3)    (1)
(2) Feature value calculation
The feature value of a rectangle feature is the difference between the pixel sums of two different rectangular regions. The feature value of any rectangle feature can be computed using formula (1). The following uses feature prototype A in Figure 2.1 as an example.
As shown in Figure 2.4, the feature value of this feature prototype is defined as:
Sum(A) - Sum(B)
According to formula (1), Sum(A) = ii4 + ii1 - (ii2 + ii3); Sum(B) = ii6 + ii3 - (ii4 + ii5);
Therefore, the feature value of this type of feature prototype is:
(ii4 - ii3) - (ii2 - ii1) + (ii4 - ii3) - (ii6 - ii5)
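The six-point expression can be sketched in code. Since Figure 2.4 is not reproduced here, the layout is an assumption read off the formulas: rectangle A sits directly above rectangle B, both w pixels wide and h pixels high, with points 1 and 2 on the top edge of A, points 3 and 4 on the shared edge, and points 5 and 6 on the bottom edge of B. The names `at` and `twoRectFeature` are hypothetical.

```cpp
#include <cstddef>
#include <vector>

using Grid = std::vector<std::vector<long long>>;  // indexed as ii[j][i]: row j, column i

// Integral image lookup with ii(-1, j) = ii(i, -1) = 0.
long long at(const Grid& ii, int i, int j) {
    if (i < 0 || j < 0) return 0;
    return ii[static_cast<std::size_t>(j)][static_cast<std::size_t>(i)];
}

// Sum(A) - Sum(B) for A = rows top..top+h-1 and B = rows top+h..top+2h-1,
// both covering columns left..left+w-1, using six integral image lookups.
long long twoRectFeature(const Grid& ii, int left, int top, int w, int h) {
    const int right = left + w - 1;
    const int mid = top + h - 1;         // last row of A
    const int bottom = top + 2 * h - 1;  // last row of B
    const long long ii1 = at(ii, left - 1, top - 1);
    const long long ii2 = at(ii, right, top - 1);
    const long long ii3 = at(ii, left - 1, mid);
    const long long ii4 = at(ii, right, mid);
    const long long ii5 = at(ii, left - 1, bottom);
    const long long ii6 = at(ii, right, bottom);
    return (ii4 - ii3) - (ii2 - ii1) + (ii4 - ii3) - (ii6 - ii5);
}
```

For the 4x2 image with rows (1 2), (3 4), (5 6), (7 8), whose integral image has rows (1 3), (4 10), (9 21), (16 36), taking A as the top two rows and B as the bottom two gives 10 - 26 = -16.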
In addition, the integral image can be used to quickly compute the sum of all pixel values in a given rectangle, Sum(r). Assuming r = (x, y, w, h), the sum of all elements in the rectangle is given by the following expression on the integral image:
Sum(r) = ii(x+w, y+h) + ii(x-1, y-1) - ii(x+w, y-1) - ii(x-1, y+h)
It can be seen that the feature value of a rectangle feature depends only on the integral image values at the feature's corner points and is independent of the image coordinates. For rectangle features of the same type, regardless of scale and location, computing the feature value takes constant time and involves only simple additions and subtractions. Feature values of other types are computed similarly.