I first met convolution as an image operation: multiply the template with the corresponding image points, add up the products, and write the result in place of the pixel under the template's center. Recently, though, I came across the definition of the integral image and was immediately confused, so I tidied up my notes and traced both ideas back to their sources.
1 Image convolution
1.1 Source
I first found a very good blog post on convolution.
The original text reads:
---------------------------------------------------------------------------------------------------------------
One of the important operations in signal processing is convolution. Beginners usually meet convolution first in the continuous case:
the convolution of two functions f(x) and g(x) is ∫f(u)g(x-u)du
Of course, it is not hard to prove some properties of convolution, such as commutativity and associativity, but what the operation actually does stays unclear to the beginner.
In fact, convolution may be easier to understand in the discrete case.
For two sequences f[n] and g[n], the convolution is generally defined as s[x]=∑f[k]g[x-k]
A typical example of convolution is in fact the polynomial multiplication taught in middle school.
For example, (x*x+3*x+2)(2*x+5)
The usual calculation goes like this:
(x*x+3*x+2)(2*x+5)
= (x*x+3*x+2)*2*x + (x*x+3*x+2)*5
= 2*x*x*x+3*2*x*x+2*2*x + 5*x*x+3*5*x+10
Then merge the coefficients of like terms:
2 x*x*x
3*2+1*5 x*x
2*2+3*5 x
2*5
----------
2*x*x*x+11*x*x+19*x+10
In fact, we know from linear algebra that polynomials form a vector space, and one possible basis is
{1,x,x*x,x*x*x,...}
Thus any polynomial corresponds to a coordinate vector in an infinite-dimensional space.
For example, listing the coefficients from the highest power down, (x*x+3*x+2) corresponds to
(1 3 2),
and (2*x+5) corresponds to
(2 5).
A linear space has only two operations, addition and scalar multiplication. There is no convolution between two vectors, so the multiplication of polynomials cannot actually be described inside the linear space. This shows how limited the theory of linear spaces is.
But if we apply the vector convolution defined above to these coordinate vectors,
(1 3 2) * (2 5)
then we get the following (the first sequence is written reversed and slid across the second; _ marks an empty position):
2 3 1
_ _ 2 5
-------
1*2 = 2
2 3 1
_ 2 5
-----
3*2+1*5 = 11
2 3 1
2 5
-----
2*2+3*5 = 19
_ 2 3 1
2 5
-------
2*5 = 10
In other words,
(1 3 2) * (2 5) = (2 11 19 10)
Going back to polynomial notation,
(x*x+3*x+2)(2*x+5) = 2*x*x*x+11*x*x+19*x+10
It seems magical: the result is exactly what the traditional method gives.
That is, multiplying polynomials is equivalent to convolving their coefficient vectors.
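The equivalence between polynomial multiplication and convolution of coefficient vectors can be sketched in a few lines of C++. This is only a minimal illustration: the function name convolve and the use of std::vector<int> are my own choices, not something from the quoted post.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Discrete convolution s[x] = sum over k of f[k] * g[x - k].
// The coefficients may be listed from the highest power down or from the
// lowest power up, as long as both inputs use the same order.
std::vector<int> convolve(const std::vector<int>& f, const std::vector<int>& g)
{
    assert(!f.empty() && !g.empty());
    std::vector<int> s(f.size() + g.size() - 1, 0);
    for (std::size_t k = 0; k < f.size(); ++k)
        for (std::size_t j = 0; j < g.size(); ++j)
            s[k + j] += f[k] * g[j];  // the product f[k]*g[j] lands at index k + j
    return s;
}
```

Calling convolve({1, 3, 2}, {2, 5}) reproduces the coefficient vector (2 11 19 10) worked out by hand above.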
On reflection, the reason is very simple.
Convolution computes the coefficients of x*x*x, x*x, x, and 1 one at a time; that is, it interleaves the multiplications and the additions. (The traditional approach does all the multiplications first and adds only when merging like terms.)
Take the coefficient of x*x as an example: an x*x term comes either from x*x times 5 or from 3*x times 2*x, that is
2 3 1
_ 2 5
-----
6+5=11
In fact, this is an inner product of vectors. So convolution can be regarded as a series of inner products, and since it is a series of inner products, we can try to write the whole process as a matrix product.
[2 3 1 0 0 0]
[0 2 3 1 0 0] == A
[0 0 2 3 1 0]
[0 0 0 2 3 1]
[0 0 2 5 0 0]' == x
b = Ax = [2 11 19 10]'
Reading Ax row by row, each entry of b is an inner product.
Each row of A is the sequence [2 3 1] shifted to a new position.
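The matrix view can also be written out directly. The sketch below builds A and the zero-padded x explicitly and then takes one inner product per row. The name convolveAsMatrix and the padding scheme (m - 1 zeros on each side of g) are assumptions I chose to reproduce the 4-by-6 example above; a real implementation would never materialize A.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Convolution written as a matrix-vector product b = A x.
// Each row of A is the reversed sequence f shifted one step to the right;
// x is g padded with m - 1 zeros on both sides.
std::vector<int> convolveAsMatrix(const std::vector<int>& f, const std::vector<int>& g)
{
    assert(!f.empty() && !g.empty());
    const std::size_t m = f.size(), n = g.size();
    const std::size_t rows = m + n - 1;        // length of the result
    const std::size_t cols = n + 2 * (m - 1);  // padded length of x

    std::vector<int> x(cols, 0);
    for (std::size_t j = 0; j < n; ++j) x[m - 1 + j] = g[j];

    // Row r of A holds the reversed f starting at column r.
    std::vector<std::vector<int>> A(rows, std::vector<int>(cols, 0));
    for (std::size_t r = 0; r < rows; ++r)
        for (std::size_t k = 0; k < m; ++k)
            A[r][r + k] = f[m - 1 - k];

    // b[r] is the inner product of row r of A with x.
    std::vector<int> b(rows, 0);
    for (std::size_t r = 0; r < rows; ++r)
        for (std::size_t c = 0; c < cols; ++c)
            b[r] += A[r][c] * x[c];
    return b;
}
```

With f = (1 3 2) and g = (2 5) the rows of A are exactly the four shifted copies of [2 3 1] shown above, and the result is again (2 11 19 10).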
---------
Clearly, in this particular setting we know that convolution is commutative and associative, because polynomial multiplication is well known to be commutative and associative. The same holds in the general case.
Here we see that polynomials not only form a particular linear space; there are also special relations among the basis elements, and it is these relations that give the polynomial space its special character.
When vectors are first taught, the usual example goes like this: A has 3 apples and 5 oranges, B has 5 apples and 3 oranges; how many apples and oranges are there altogether? The teacher warns again and again that oranges are oranges and apples are apples, and the two must not be mixed. So (3,5) + (5,3) = (8,8). Adding oranges to oranges and apples to apples is no problem, but it is much harder to say what oranges times oranges, or oranges times apples, should mean.
Likewise, if a complex number is defined as nothing more than a pair (a, b), then as a linear space it is far too simple to be interesting. All it takes is adding one rule: (a, b) * (c, d) = (ac-bd, ad+bc)
and, as is well known, the resulting theory of complex functions is extremely rich.
Also recall a basic theorem of signal processing: a product in the frequency domain corresponds to a convolution of the signals in the time or spatial domain. That is exactly the situation here. What hidden relationship lies behind this deserves further study.
Seen this way, the seemingly advanced convolution operation is nothing more than an abstraction of an elementary one. Middle-school mathematics actually contains a good deal of advanced material (commutative algebra, for example). Reviewing the old to learn the new is not an empty saying.
The idea is not complicated. Humans have reproduced for countless years, yet for most of that time people knew only that the union of a man and a woman produces offspring. The discovery of sperm and eggs and the study of the mechanisms of reproduction are a matter of recent times.
Confucius said that the Way lies in everyday human relations. It seems we should look carefully at what is around us, and even at ourselves, so that we know not only that something is so, but also why it is so.
---------------------------------------------------------------------------------------------------------------
The above shows where convolution comes from. Now for the formal definition: in mathematics, convolution is an operation on two functions defined by an integral over an infinite range. For functions f1(t) and f2(t), their convolution is written f1(t)*f2(t), where "*" is the convolution operator.
In functional analysis, convolution is a mathematical operator that produces a third function from two functions f and g, representing the accumulated overlap between f and a flipped and shifted copy of g. The convolution of f and g is written f(t)*g(t); it is the integral of the product of one function with the flipped and shifted copy of the other, taken as a function of the amount of shift.
The convolution of the functions f and g can be defined as: z(t)=f(t)*g(t)=∫f(m)g(t-m)dm.
The convolution theorem states that the convolution of two two-dimensional continuous functions in the spatial domain can be obtained as the inverse Fourier transform of the product of their Fourier transforms. Conversely, convolution in the frequency domain corresponds to multiplication in the spatial domain.
Even after all that, I still did not really understand why images are convolved or how it is done.
1.2 Convolution operation
A convolution operation starts with a convolution kernel, which is simply a fixed-size array of numerical parameters. The reference point (anchor) of the array is usually at its center, and the size of the array is called the support of the kernel. Strictly speaking, the support consists only of the nonzero part of the kernel array. Convolution kernels are also commonly called templates.
Convolution can be viewed as a weighted sum: each pixel in an image region is multiplied by the corresponding element of the convolution kernel (the weight matrix), and the sum of all the products becomes the new value of the region's center pixel.
Convolution example:
Convolving a 3*3 pixel region R with a convolution kernel G:
R5 (center pixel) = R1G1 + R2G2 + R3G3 + R4G4 + R5G5 + R6G6 + R7G7 + R8G8 + R9G9
To convolve a whole image, take a 3*3 kernel whose reference point is its center. First, place the reference point of the kernel on the first pixel of the image, so that the remaining kernel elements cover the corresponding neighboring pixels. For each kernel element we then have its own value and the value of the image pixel beneath it. Multiply these pairs, sum the products, and store the result at the position corresponding to the reference point. Repeating this for every pixel, by sliding the kernel across the entire image, yields the convolved image.
We can also express the process as an equation. Let the image be I(x, y) and the kernel G(i, j) (where 0 <= i <= Mi-1 and 0 <= j <= Mj-1), with the reference point at coordinates (ai, aj) in the kernel. The convolution H(x, y) is then defined as:
H(x, y) = sum over i = 0..Mi-1 and j = 0..Mj-1 of I(x+i-ai, y+j-aj) G(i, j).
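The formula for H(x, y) translates almost literally into nested loops. The sketch below is only an illustration under assumptions of my own: the name applyKernel, the Image type indexed [y][x], and the choice to treat pixels outside the image as 0 (the text does not say how borders are handled).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Image = std::vector<std::vector<double>>;  // indexed [y][x]

// H(x, y) = sum over i, j of I(x + i - ai, y + j - aj) * G(i, j),
// where (ai, aj) is the reference point of the kernel G.
// Pixels outside the image are treated as 0 (an assumption).
Image applyKernel(const Image& I, const Image& G, int ai, int aj)
{
    assert(!I.empty() && !I[0].empty() && !G.empty() && !G[0].empty());
    const int height = static_cast<int>(I.size());
    const int width  = static_cast<int>(I[0].size());
    const int Mj = static_cast<int>(G.size());     // kernel rows
    const int Mi = static_cast<int>(G[0].size());  // kernel columns

    Image H(height, std::vector<double>(width, 0.0));
    for (int y = 0; y < height; ++y)
        for (int x = 0; x < width; ++x) {
            double sum = 0.0;
            for (int j = 0; j < Mj; ++j)
                for (int i = 0; i < Mi; ++i) {
                    const int sx = x + i - ai;
                    const int sy = y + j - aj;
                    if (sx < 0 || sx >= width || sy < 0 || sy >= height)
                        continue;  // outside the image: contributes 0
                    sum += I[sy][sx] * G[j][i];
                }
            H[y][x] = sum;
        }
    return H;
}
```

With a 3*3 kernel of all ones and reference point (1, 1), each output pixel is the unscaled sum of its neighborhood; a kernel that is 1 at the center and 0 elsewhere returns the image unchanged.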
Common templates (convolution kernels)
In continuous space, the convolution of f(x) and g(x) is the integral of f(t-x)g(x) over x from negative infinity to positive infinity. Because t-x has to lie in the domain of f, the seemingly infinite integral actually runs over a bounded range.
In practice: reflect f(x) about the y-axis, shift it along the x-axis by t to obtain f(t-x), multiply by g(x), and integrate the product. If g(x) or f(x) is a unit step function, the result is simply the area under the other function over the region where the two overlap. That is convolution.
Replacing the integral with a sum gives the definition of convolution in discrete space.
1.3 Significance
Convolution is the basis of many image transformations, and what a particular convolution does is determined by the form of its kernel (template). Gaussian filtering, for instance, convolves the image with a Gaussian function.
Smoothing. Types of smoothing: simple blur (sum over the neighborhood, then scale), simple blur without scaling (sum over the neighborhood), median blur (median filter), Gaussian blur (Gaussian convolution), bilateral blur (bilateral filter).
Dilation and erosion.
Image pyramids.
Laplacian operator, Canny operator (derivatives).
Properties of convolution:
1. The notation itself suggests that convolution is a special kind of multiplication, and several algebraic properties of multiplication carry over to it.
(1) Commutative law: f1(t)*f2(t) = f2(t)*f1(t)
(2) Distributive law: f1(t)*[f2(t)+f3(t)] = f1(t)*f2(t) + f1(t)*f3(t)
(3) Associative law: [f1(t)*f2(t)]*f3(t) = f1(t)*[f2(t)*f3(t)]
(4) Shift invariance: if f1(t)*f2(t) = f3(t), then f1(t-t0)*f2(t) = f1(t)*f2(t-t0) = f3(t-t0)
This property says that whichever function is shifted by a distance t0, the resulting convolution is simply shifted by the same distance, while its size and shape stay unchanged.
2. Differentiation and integration of convolution
(1) The derivative of the convolution of two functions equals the convolution of the derivative of one of them with the other.
(2) The integral of the convolution of two functions equals the convolution of the integral of one of them with the other.
(3) Generalization
If f1(t)*f2(t) = s(t), then the convolution of the m-th derivative of f1 with the n-th derivative of f2 equals the (m+n)-th derivative of s(t).
3. Convolution with singularity signals:
(1) f(t)*δ(t) = f(t); f(t)*δ(t-t0) = f(t-t0); f(t-t1)*δ(t-t0) = f(t-t0-t1)
(2) δ(t)*δ(t) = δ(t)
(3) f(t)*δ'(t) = f'(t)
(4)
Generalization: f(t)*δ^(k)(t) = f^(k)(t); f(t)*δ^(k)(t-t0) = f^(k)(t-t0), where the superscript (k) denotes the k-th derivative
(5) f(t)*δ'(t)*u(t) = f'(t)*u(t) = f(t); f(t)*δ''(t)*tu(t) = f''(t)*tu(t) = f(t)
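Several of these properties can be spot-checked on discrete sequences, where the sum replaces the integral and a single 1 plays the role of δ. The sketch below is only a sanity check on finite integer sequences, not a proof; the names Seq, convolve, add, and delayedImpulse are my own.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Seq = std::vector<int>;

// Discrete convolution of two finite sequences.
Seq convolve(const Seq& a, const Seq& b)
{
    assert(!a.empty() && !b.empty());
    Seq s(a.size() + b.size() - 1, 0);
    for (std::size_t k = 0; k < a.size(); ++k)
        for (std::size_t j = 0; j < b.size(); ++j)
            s[k + j] += a[k] * b[j];
    return s;
}

// Elementwise sum of two sequences of equal length.
Seq add(const Seq& a, const Seq& b)
{
    assert(a.size() == b.size());
    Seq s(a.size());
    for (std::size_t k = 0; k < a.size(); ++k) s[k] = a[k] + b[k];
    return s;
}

// Discrete counterpart of δ(t - t0): t0 zeros followed by a single 1.
Seq delayedImpulse(std::size_t t0)
{
    Seq d(t0 + 1, 0);
    d[t0] = 1;
    return d;
}
```

For example, convolve(f, delayedImpulse(0)) returns f itself, and convolve(f, delayedImpulse(2)) returns f preceded by two zeros, the discrete version of f(t)*δ(t-t0) = f(t-t0).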
So what does image convolution have to do with integral images? If you know, please tell me...
2 Integral image
The concept of the integral image is used in the SURF algorithm. With the help of the integral image, filtering an image with Gaussian derivative templates is reduced to additions and subtractions on the integral image. The integral image was introduced by Viola and Jones, and the use of similar integral images for box filtering was proposed by Simard et al.
The value ii(i, j) of any point (i, j) in the integral image is the sum of the gray values in the rectangle of the original image that extends from the upper-left corner to the point (i, j):
ii(i, j) = ∑ I(i', j'), summed over all i' <= i and j' <= j
Here I(i', j') is the gray value of the point (i', j') in the original image. ii(i, j) can be computed by iterating the following two formulas:
s(i, j) = s(i, j-1) + I(i, j)
ii(i, j) = ii(i-1, j) + s(i, j)
Here s(i, j) is the cumulative sum along one column, with s(i, -1) = 0 and ii(-1, j) = 0. Computing the integral image therefore takes only a single pass over the pixels of the original image. The following is a C++ implementation:
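// pInImage is the source image and pOutImage the integral image, both indexed [x][y].
pOutImage[0][0] = pInImage[0][0];
// First row: running sum along x.
for (int x = 1; x < nWidth; x++)
{
    pOutImage[x][0] = pOutImage[x-1][0] + pInImage[x][0];
}
for (int y = 1; y < nHeight; y++)
{
    int nSum = 0; // running sum of row y up to column x
    for (int x = 0; x < nWidth; x++)
    {
        nSum += pInImage[x][y];
        // everything above (rows 0..y-1) plus the current row so far
        pOutImage[x][y] = pOutImage[x][y-1] + nSum;
    }
}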
Once the integral image is available, the sum of the gray values of the pixels inside any window W can be computed from the values of the integral image at 4 corresponding points (i1,j1), (i2,j2), (i3,j3), (i4,j4), whatever the size of W. In other words, the cost of summing the gray values inside W does not depend on the size of the window. The sum of the gray values inside window W is
Sum(W) = ii(i4,j4) - ii(i2,j2) - ii(i3,j3) + ii(i1,j1)
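Putting the recurrence and the four-point lookup together gives a small self-contained sketch. The names Gray, integralImage, and windowSum are hypothetical, and I assume here that (i1,j1) lies just outside the upper-left corner of the window and (i4,j4) is its lower-right pixel, with any lookup that falls outside the image counted as 0.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Gray = std::vector<std::vector<long long>>;  // indexed [x][y]

// Builds ii(x, y) = sum of img over all x' <= x and y' <= y, using
// s(x, y) = s(x, y-1) + I(x, y) and ii(x, y) = ii(x-1, y) + s(x, y).
Gray integralImage(const Gray& img)
{
    assert(!img.empty() && !img[0].empty());
    const std::size_t width = img.size(), height = img[0].size();
    Gray ii(width, std::vector<long long>(height, 0));
    for (std::size_t x = 0; x < width; ++x) {
        long long s = 0;  // s(x, -1) = 0
        for (std::size_t y = 0; y < height; ++y) {
            s += img[x][y];
            ii[x][y] = (x > 0 ? ii[x - 1][y] : 0) + s;  // ii(-1, y) = 0
        }
    }
    return ii;
}

// Sum of img over the inclusive window [x0, x1] x [y0, y1] from four lookups.
long long windowSum(const Gray& ii, std::size_t x0, std::size_t y0,
                    std::size_t x1, std::size_t y1)
{
    assert(x0 <= x1 && y0 <= y1 && x1 < ii.size() && y1 < ii[0].size());
    const long long a = (x0 > 0 && y0 > 0) ? ii[x0 - 1][y0 - 1] : 0;  // (i1, j1)
    const long long b = (y0 > 0) ? ii[x1][y0 - 1] : 0;                // (i2, j2)
    const long long c = (x0 > 0) ? ii[x0 - 1][y1] : 0;                // (i3, j3)
    const long long d = ii[x1][y1];                                   // (i4, j4)
    return d - b - c + a;
}
```

However large the window is, windowSum touches exactly four entries of the integral image.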
The explanation below should make this clear to everyone.
Summing the pixels of a rectangular region is a simple but highly repetitive operation, and this is where the integral image improves efficiency. Why? Suppose the image has n pixels. Computing the integral image value at all n positions takes only on the order of n additions (note: not n squared, because the recursion reuses earlier results), and the results are stored in a matrix M with the same layout as the original image. When the sum of all pixels in some rectangle of the image is needed, it becomes a table lookup: fetch the integral image values at the four points A, B, C, D and combine them with simple additions and subtractions (only three of them) to get the result. By contrast, if you naively sum directly over rectangles of the original image, think about how many possible rectangles there are: the count grows on the order of n squared, and n is already large for an image.
That is a huge number, and most of those rectangles overlap. What does overlap mean? It means the summations repeat work that has already been done, when the information already computed could be reused. This is the core idea of the integral image method: it computes the pixel sums of just n rectangles, one ending at each pixel, and then uses these stored values to derive every other sum, somewhat in the spirit of recursion. This avoids repeating the summation work altogether.
The integral image supports two kinds of operations:
(1) The pixel sum over any rectangular region. The integral image makes it quick and easy to compute the sum of the gray values of all pixels in any rectangle of the image. As shown in Figure 2.3, the value ii1 of the integral image at point 1 is (where Sum denotes summation):
ii1 = Sum(A)
Similarly, the values of the integral image at points 2, 3, and 4 are:
ii2 = Sum(A)+Sum(B); ii3 = Sum(A)+Sum(C); ii4 = Sum(A)+Sum(B)+Sum(C)+Sum(D);
The sum of all pixels in the rectangular region D can then be obtained from the integral image values at the corners of the rectangle:
Sum(D) = ii1+ii4-(ii2+ii3) (1)
(2) Feature value calculation
The feature value of a rectangle feature is the difference between the pixel sums of two different rectangular regions, so formula (1) can be used to compute the feature value of any rectangle feature. Below, feature prototype A in Figure 2.1 is used as an example.
As shown in Figure 2.4, the feature value of this prototype is defined as:
Sum(A)-Sum(B)
According to formula (1): Sum(A) = ii4+ii1-(ii2+ii3); Sum(B) = ii6+ii3-(ii4+ii5);
Therefore, the feature value of this type of feature prototype is:
(ii4-ii3)-(ii2-ii1)+(ii4-ii3)-(ii6-ii5)
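The six-point expression can be sketched as follows. Since Figure 2.4 is not reproduced here, I assume that rectangle B has the same size as A and sits directly below it, so that points 3 and 4 lie on the shared edge. The names Grid, paddedIntegral, and twoRectFeature are hypothetical, and the integral image is padded with one extra zero row and column so that corner lookups need no border checks.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Grid = std::vector<std::vector<long long>>;  // indexed [x][y]

// Padded integral image: P[x][y] = sum of img over all x' < x and y' < y.
Grid paddedIntegral(const Grid& img)
{
    assert(!img.empty() && !img[0].empty());
    const std::size_t w = img.size(), h = img[0].size();
    Grid P(w + 1, std::vector<long long>(h + 1, 0));
    for (std::size_t x = 0; x < w; ++x)
        for (std::size_t y = 0; y < h; ++y)
            P[x + 1][y + 1] = img[x][y] + P[x][y + 1] + P[x + 1][y] - P[x][y];
    return P;
}

// Feature value Sum(A) - Sum(B), where A is the w-by-h rectangle whose
// upper-left pixel is (x, y) and B is the same-sized rectangle directly
// below it (an assumed layout).
long long twoRectFeature(const Grid& P, std::size_t x, std::size_t y,
                         std::size_t w, std::size_t h)
{
    assert(x + w < P.size() && y + 2 * h < P[0].size());
    const long long ii1 = P[x][y],         ii2 = P[x + w][y];          // top edge of A
    const long long ii3 = P[x][y + h],     ii4 = P[x + w][y + h];      // shared edge
    const long long ii5 = P[x][y + 2 * h], ii6 = P[x + w][y + 2 * h];  // bottom edge of B
    return (ii4 - ii3) - (ii2 - ii1) + (ii4 - ii3) - (ii6 - ii5);
}
```

Whatever the values of w and h, the function reads six entries of P, which is why the cost of evaluating the feature does not depend on its scale.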
Another example: the integral image can quickly give the sum of all pixels in a given rectangle r. Let r = (x, y, w, h), covering columns x through x+w and rows y through y+h. Then the sum of all elements inside the rectangle is given by the following expression on the integral image:
Sum(r) = ii(x+w, y+h) + ii(x-1, y-1) - ii(x+w, y-1) - ii(x-1, y+h)
Thus the feature value of a rectangle feature depends only on the integral image values at the corner points of the feature, and not on the image coordinates. For rectangle features of the same type, the time needed to compute the feature value is constant regardless of the feature's scale and position, and it takes only a few additions and subtractions. Feature values of other types are computed in a similar way.