Some knowledge points of image convolution and filtering




http://blog.csdn.net/zouxy09

Before studying CNNs, I did some reading and note-taking on convolution. The notes were left unfinished for a long time; I have now tidied them up a little and am posting them here as a reminder to myself and for discussion.

1. Basic concepts of linear filtering and convolution

Linear filtering is arguably the most basic method of image processing, and it lets us produce many different effects. The procedure is simple. We take a two-dimensional filter matrix (grandly called a convolution kernel) and a two-dimensional image to be processed. Then, for each pixel of the image, we compute the products of its neighborhood pixels with the corresponding elements of the filter matrix, and sum them up as the new value at that pixel position. That completes the filtering.

Multiplying the image element-wise with the filter matrix and summing is equivalent to sliding one two-dimensional function across all positions of another two-dimensional function; this operation is called convolution or correlation. The difference between the two is that convolution requires flipping the filter matrix by 180 degrees; if the matrix is symmetric, there is no difference between them.
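
A minimal MATLAB sketch of that relationship (the test matrix and kernel here are arbitrary examples): correlating with a kernel K is the same as convolving with K rotated by 180 degrees.

    % Correlation vs. convolution: conv2 implements true convolution,
    % so correlation is obtained by pre-flipping the kernel.
    I = magic(5);                       % a small test "image"
    K = [1 2 3; 4 5 6; 7 8 9];          % an asymmetric kernel
    convResult = conv2(I, K, 'same');            % convolution
    corrResult = conv2(I, rot90(K, 2), 'same');  % correlation
    % For a symmetric K the two results would be identical.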

Correlation and convolution may be the most basic operations in image processing, but they are extremely useful. They have two key properties: they are linear and shift-invariant. Shift-invariance means we perform the same operation at every point of the image; linearity means each pixel is replaced by a linear combination of its neighborhood. Together these properties make the operation very simple to reason about: linear operations are the simplest kind, and doing the same thing everywhere keeps the implementation uniform.

In the field of signal processing, convolution actually has a much broader meaning and a strict mathematical definition, but that is not our concern here.

The direct implementation requires 4 nested loops, so it is not fast unless we use very small convolution kernels; 3x3 or 5x5 kernels are the most common. Filters also conventionally follow a few rules:

1) The filter size should be odd so that the filter has a center, e.g. 3x3, 5x5, or 7x7. Having a center also gives it a radius; for example, a 5x5 kernel has a radius of 2.

2) The elements of the filter matrix should sum to 1. This ensures that the brightness of the image is unchanged by the filtering. It is not a mandatory requirement, of course.

3) If the elements of the filter matrix sum to more than 1, the filtered image will be brighter than the original; conversely, if they sum to less than 1, the result will be dimmer. If they sum to 0, the image will not become completely black, but it will be very dark.

4) The filtered result may contain values that are negative or greater than 255. In that case we simply truncate them to the range 0 to 255. Alternatively, negative values can be replaced by their absolute value.
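
The sketch below illustrates rules 2) and 4) in MATLAB (the 3x3 box kernel is just an example): dividing by the element sum normalizes a kernel, and the uint8 cast saturates, i.e. truncates into [0, 255].

    % Normalize a kernel so its elements sum to 1 (preserves brightness).
    K = ones(3);              % 3x3 box filter before normalization
    K = K / sum(K(:));        % now sum(K(:)) == 1
    % uint8() saturates: negatives become 0, values above 255 become 255.
    clamped = uint8([-10, 100, 300]);   % -> [0, 100, 255]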

2. The magic of convolution kernels

As mentioned above, filtering an image means applying a small convolution kernel to it. So what magic does that little kernel hold that can turn an image from appalling to appealing? Let us take a look at some simple, but far from trivial, kernel tricks.

2.1. Do nothing

Ha, what do we see? This filter does nothing at all, and the resulting image is identical to the original: only the center value is 1, and the weights of all neighboring points are 0, so the neighbors contribute nothing to the filtered value.
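
For reference, this is the identity kernel (it also appears as the Identity Filter in the experiment code at the end of this post):

    % Identity kernel: the output image equals the input image.
    K = [0 0 0
         0 1 0
         0 0 0];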

Let's move on to the next point.

2.2. Sharpening filter

Image sharpening is very similar to edge detection: first find the edges, then add the edges on top of the original image, which strengthens the edges and makes the image look sharper. A sharpening filter is exactly this combination: take an edge detection filter and add 1 at the center position. The filtered image then has the same brightness as the original, but appears sharper.

If we enlarge the kernel, we can obtain a finer sharpening effect.

In addition, there are filters that emphasize the edges, and with them the details of the image, even more strongly. The simplest 3x3 sharpening filter is the following:
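
(This matrix is the Sharpen Filter kernel from the experiment code at the end; the original inline figure was lost in this repost.)

    % Simplest 3x3 sharpening kernel: an all-direction edge detector
    % plus 1 at the center, so the elements sum to 1.
    K = [-1 -1 -1
         -1  9 -1
         -1 -1 -1];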

What it actually computes is the difference between the current point and its surrounding points, and this difference is then added back at the original position. Note that the center weight is greater than the sum of all the other weights by exactly 1, which means the pixel keeps its original value, with the edge response added on top.

2.3. Edge detection

Suppose we want to find horizontal edges. Note that the elements of the matrix here sum to 0, so the filtered image will be very dark, and only the edges will be bright.

Why can this filter find horizontal edges? Because convolving with it amounts to a discrete version of the derivative: subtracting the current pixel value from the previous pixel value gives the difference, or slope, of the image function between the two positions. The following filter finds vertical edges in the same way, using the pixel values above and below each pixel:

The next filter finds edges at 45 degrees. The value -2 is not there for any deep reason; it simply makes the matrix elements sum to 0.

The filter below detects edges in all directions. To detect edges, we need to compute the gradient of the image in the corresponding direction, and convolving the image with this kernel does exactly that. In practice, however, such a simple method amplifies noise. Note once more that all the values of the matrix sum to 0.
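
Of the kernels in this subsection, only the all-direction one survives in the experiment code at the end; the directional kernels below are standard examples matching the descriptions above, not necessarily the author's exact matrices.

    % All-direction edge detector (the Edges Detection kernel in the
    % experiment code below); its entries sum to 0, so flat regions
    % become black and only edges stay bright.
    K_all = [-1 -1 -1
             -1  8 -1
             -1 -1 -1];
    % Discrete-derivative kernels (assumed standard examples):
    K_dx = [-1 1];      % horizontal difference
    K_dy = [-1; 1];     % vertical difference
    % A 45-degree edge kernel; the -2 entries make the sum equal 0.
    K_45 = [-1  0  0  0  0
             0 -2  0  0  0
             0  0  6  0  0
             0  0  0 -2  0
             0  0  0  0 -1];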

2.4. Emboss filter

An emboss filter gives the image a 3D shadow effect. The idea is simply to subtract the pixels on one side of the center from the pixels on the other side. The resulting pixel values can be negative; we treat negative values as shadow and positive values as light, and then offset the result by 128. Most of the resulting image then comes out gray.

Here is a 45-degree emboss filter:
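
(This is the Emboss Filter kernel from the experiment code at the end; remember to add the 128 offset to the filtered result.)

    % 45-degree emboss kernel: subtracts the upper-left neighbors
    % from the lower-right ones; add 128 to the result afterwards.
    K = [-1 -1  0
         -1  0  1
          0  1  1];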

We only need to enlarge the filter to get a more exaggerated effect.

The effect is quite pretty, as if the image were carved into a stone and lit from one direction. Unlike the previous filters, this one is asymmetric. It also produces negative values, so we must offset the result to bring it back into the grayscale range of the image.

Figure: A: original image. B: sharpening. C: edge detection. D: emboss.

2.5. Box filter (mean blur, averaging)

We can average the current pixel with the pixels in its neighborhood: put 1 at the five positions of the filter and divide by 5, or equivalently put the value 0.2 directly at those five positions, like this:
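
(This is the Average Blur kernel from the experiment code at the end.)

    % Mean-blur (box) kernel: averages a pixel with its 4 neighbors.
    K = [0 1 0
         1 1 1
         0 1 0] / 5;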

As you can see, this blur is still fairly gentle. We can make the filter bigger, and the blur becomes coarser; note that the sum must then be divided by 13.

So if you want a blurrier effect, increase the size of the filter. Alternatively, apply the blur to the image several times.

2.6. Gaussian Blur

Mean blur is simple, but the result is not very smooth. Gaussian blur has this smoothness, so it is widely used for image denoising, especially before edge detection, where it removes fine detail. A Gaussian filter is a low-pass filter.
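
A minimal sketch matching the Gaussian kernel used in the experiment code at the end (fspecial and rgb2gray come from the Image Processing Toolbox; 'test.jpg' stands in for any test image):

    % 5x5 Gaussian kernel with sigma = 0.8.
    K = fspecial('gaussian', 5, 0.8);
    img = double(rgb2gray(imread('test.jpg')));
    blurred = conv2(img, K, 'same');    % smoother than a box blur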

2.7. Motion blur

Motion blur can be achieved by blurring in only one direction, for example with the 9x9 motion-blur filter below. Note that the summed result is divided by 9.
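
A sketch of such a kernel (the experiment code at the end uses the same pattern at size 5x5):

    % 9x9 diagonal motion-blur kernel: ones on the main diagonal,
    % divided by 9 so the overall brightness is preserved.
    K = eye(9) / 9;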

The effect is as if the camera is moving from the upper-left corner to the lower-right corner.

3. Computing the convolution

For image processing there are two main classes of methods: spatial-domain processing and frequency-domain processing. Spatial-domain processing computes directly on the original pixels, while frequency-domain processing first transforms the image into the frequency domain and then filters it there.

3.1. Spatial-domain computation: direct 2D convolution

3.1.1. 2D convolution

Direct 2D convolution is exactly what was described at the beginning: for each pixel of the image, compute the products of its neighborhood pixels with the corresponding elements of the filter matrix, and sum them up as the value at that pixel position.

The direct implementation is also known as the brute-force implementation, because it follows the definition strictly, without any optimization; on the other hand, it is the most flexible to parallelize. There is also an optimized version: if our kernel is separable, we can obtain a convolution that is roughly 5 times faster.
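
A brute-force sketch in MATLAB (assuming a square kernel of odd size, and zero padding at the border; see 3.1.2 for the other boundary choices):

    % Brute-force 2D filtering (correlation). For true convolution,
    % flip the kernel first: K = rot90(K, 2).
    function J = filter2d(I, K)
        [h, w] = size(I);
        r = (size(K, 1) - 1) / 2;        % kernel radius, odd square K
        Ipad = zeros(h + 2*r, w + 2*r);  % zero-padded copy of I
        Ipad(r+1:r+h, r+1:r+w) = I;
        J = zeros(h, w);
        for y = 1:h                      % 4 nested loops in total
            for x = 1:w
                acc = 0;
                for j = 1:size(K, 1)
                    for i = 1:size(K, 2)
                        acc = acc + Ipad(y+j-1, x+i-1) * K(j, i);
                    end
                end
                J(y, x) = acc;
            end
        end
    end

A separable kernel is one that can be written as an outer product K = u * v' of a column vector and a row vector (a Gaussian, for instance); the single 2D pass can then be replaced by two much cheaper 1D passes.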

3.1.2. Boundary handling

What happens when the convolution kernel reaches the edge of the image? Take, for example, a pixel in the top row: it has no pixels above it, so how is its value computed? There are currently four mainstream ways of handling this; let us illustrate them with a one-dimensional convolution using a mean filter.

In a 1D image, we replace each pixel's value with the average of the pixel and its two neighbors. Suppose we have a 1D image I of length 10 (the figure showing its values was lost in this repost; from the computations below, I(1)=5, I(2)=4, I(3)=2, I(4)=3, I(5)=7, I(9)=3, and I(10)=6):

Operating on pixels away from the image boundary is straightforward. Suppose we take the local average at the fourth pixel of I, whose value is 3: we average 2, 3, and 7 and use the result in place of the value at that position. Averaging produces a new image J with, at the same position, J(4) = (I(3)+I(4)+I(5))/3 = (2+3+7)/3 = 4. Similarly, J(3) = (I(2)+I(3)+I(4))/3 = (4+2+3)/3 = 3. It is important to note that every pixel of the new image depends only on the old image: when computing J(4) we use I(3), I(4), and I(5), not the already-computed J(3). So each output pixel is the average of the corresponding input pixel and its two neighbors, and the average is a linear operation, because each new pixel is a linear combination of old pixels.

For convolution, we must also decide what to do when we reach the image boundary. What should the value of J(1) be? It depends on I(0), I(1), and I(2), but we have no I(0): there are no values to the left of the image. There are four ways to handle this problem:

1) The first is to imagine that I is part of an infinitely long image in which every pixel outside the given portion has the value 0. In that case I(0) = 0, so J(1) = (I(0)+I(1)+I(2))/3 = (0+5+4)/3 = 3. Similarly, J(10) = (I(9)+I(10)+I(11))/3 = (3+6+0)/3 = 3.

2) The second method also imagines I as part of an infinite image, but the unspecified part is extended with the values at the image boundary. In our example, the leftmost value of the image is I(1) = 5, so we take every value to its left to be 5 as well; likewise, we take every value beyond the right edge to be the right boundary value I(10) = 6. Then J(1) = (I(0)+I(1)+I(2))/3 = (5+5+4)/3 = 14/3, and J(10) = (I(9)+I(10)+I(11))/3 = (3+6+6)/3 = 5.

3) The third option treats the image as periodic: I simply keeps repeating, with period equal to the length of I. In our case, I(0) has the same value as I(10), and I(11) the same as I(1). So J(1) = (I(0)+I(1)+I(2))/3 = (I(10)+I(1)+I(2))/3 = (6+5+4)/3 = 5.

4) The last option is to not go outside I at all: we regard everything outside I as undefined, so any pixel whose computation would use undefined values simply cannot be computed. Here J(1) and J(10) cannot be computed, and the output J comes out smaller than the original image I.

Each of these four methods has its pros and cons. If we think of our image as a small window onto the world, then the values we would see just outside the window are usually close to the values at its boundary, so the second method is generally the most plausible.
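
The four choices, sketched in MATLAB (imfilter and its boundary options are part of the Image Processing Toolbox; 'test.jpg' is the test image from the experiment code below):

    % Mean filtering with each of the four boundary treatments.
    I = imread('test.jpg');
    K = ones(3) / 9;
    J1 = imfilter(I, K, 0);             % 1) pad with zeros (default)
    J2 = imfilter(I, K, 'replicate');   % 2) extend boundary values
    J3 = imfilter(I, K, 'circular');    % 3) treat image as periodic
    J4 = conv2(double(rgb2gray(I)), K, 'valid');  % 4) shrink output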

3.2. Frequency-domain computation: convolution via the Fast Fourier Transform (FFT)

This fast implementation rests on the convolution theorem: convolution in the time domain equals element-wise multiplication in the frequency domain. So we transform the image and the filter into the frequency domain, multiply them directly, and then transform the result back into the time domain (that is, back into the spatial image).

In symbols, Conv(I, K) = IFFT(FFT(I) ∘ FFT(K)), where ∘ indicates element-wise multiplication of the matrices. And how do we transform the spatial image and the filter into the frequency domain? With the famous Fast Fourier Transform, the FFT (CUDA, for instance, ships with an FFT implementation, cuFFT).

To filter an image in the frequency domain, the filter must have the same size as the image so that the two can be multiplied element by element. Since a filter is generally smaller than the image, we need to expand our kernel to match the image size.

Because the FFT implementation (in CUDA, for example) is periodic, the kernel values must also be arranged in a way that supports this periodicity.

We likewise need to expand the input image, to ensure that the pixels at the image boundary also get a proper response in the output. The expansion, too, must be done in a way that supports the periodic interpretation.

If we apply the convolution theorem without making any changes to the input, what we get is a circular convolution. That may not be what we want, because circular convolution wraps the input data around and introduces artifacts.

Given I and K of length N, to obtain a linear convolution we must zero-pad both I and K. The zeros are needed because the DFT assumes its input is infinite and periodic with period N.
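
A minimal 1D sketch of the difference (the signal and kernel values here are arbitrary):

    % Linear vs. circular convolution via the FFT.
    I = [5 4 2 3 7];          % a short signal
    K = [1 2 1] / 4;          % a small smoothing kernel
    L = numel(I) + numel(K) - 1;       % length after zero padding
    linConv  = real(ifft(fft(I, L) .* fft(K, L)));  % == conv(I, K)
    circConv = real(ifft(fft(I, numel(I)) .* fft(K, numel(I))));
    % circConv wraps around: the kernel's tail folds back onto the
    % start of the signal, which is the padding artifact in question.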

For example, take I and K without padding: the implicit assumption is that I and K are periodic, with their length N as the period. Beyond the N given samples, the DFT behaves as if countless copies of I were appended, one per period, and likewise for K. Convolving I with K under that assumption produces wrap-around artifacts where neighboring copies interfere. With zero padding, everything outside the given samples is 0, and the result is the desired linear convolution. (The figure illustrating this, with the original signals drawn in black, the periodic copies in red, and the zero padding in blue, was lost in this repost.)

4. Experimental code

Here is the MATLAB code for the experiments in Part 2:

    clear, close all, clc
    %% Read image
    img = imread('test.jpg');
    %% Define filter: each assignment below overwrites the previous
    %% one, so comment out all but the kernel you want to try.
    % ----- Identity filter -----
    kernel = [0, 0, 0
              0, 1, 0
              0, 0, 0];
    % ----- Average blur -----
    kernel = [0, 1, 0
              1, 1, 1
              0, 1, 0] / 5;
    % ----- Gaussian blur -----
    kernel = fspecial('gaussian', 5, 0.8);
    % ----- Motion blur -----
    kernel = [1, 0, 0, 0, 0
              0, 1, 0, 0, 0
              0, 0, 1, 0, 0
              0, 0, 0, 1, 0
              0, 0, 0, 0, 1] / 5;
    % ----- Edge detection -----
    kernel = [-1, -1, -1
              -1,  8, -1
              -1, -1, -1];
    % ----- Sharpen filter -----
    kernel = [-1, -1, -1
              -1,  9, -1
              -1, -1, -1];
    % ----- Emboss filter -----
    kernel = [-1, -1, 0
              -1,  0, 1
               0,  1, 1];
    %% Convolve each channel of the image with the chosen kernel
    result = zeros(size(img));
    result(:,:,1) = conv2(double(img(:,:,1)), kernel, 'same');
    result(:,:,2) = conv2(double(img(:,:,2)), kernel, 'same');
    result(:,:,3) = conv2(double(img(:,:,3)), kernel, 'same');
    %% Show the result
    imshow(img);
    figure
    imshow(uint8(result))

