2008 in a PS discussion group, there are netizens puzzled Photoshop's Gaussian blur in the radius of what meaning, so I wrote this article:
A summary of the algorithm for the Photoshop Gaussian blur filter;
In that article, we mainly explain the meaning of the radius in Gaussian blur, the square root of the variance of the two-dimensional normal distribution, and give the theoretical description of the algorithm. Now I'm going to implement this algorithm in C + +, so I have the following DEMO.
At first I was based on the algorithm theory directly, that is, using the Ivigos template, the results found that the processing time is very long, to a picture can reach about a few minutes long. This is certainly not right, so I Baidu a bit, found that the problem should be used two times a Gaussian fuzzy can be [1], so that the algorithm's time complexity of a factor, from O (σ^2) reduced to O (Σ). This algorithm then increases the speed to the millisecond level. The following table shows the cost comparison of the two-dimensional fuzzy primitive method and the two-time one-dimensional fuzzy summation method:
Algorithm |
Complexity of Time |
Complexity of space |
(1) Ivigos Blur |
O (σ^ 2) * O (n) (slow) |
O (σ^ 2) (smaller) |
(2) two times a Gaussian fuzzy accumulation |
O (Σ) * O (n) (FAST) |
O (n) + O (σ) ≈o (n) (larger) |
Where: Σ: Variance square root (Gaussian blur radius in Photoshop) n = w * H (number of pixels in the picture). The exact time and size of the image is related to the magnitude of the Gaussian radius, and a rough approximate scenario is that the algorithm (1) takes a minute, and the algorithm (2) takes a millisecond to a second time. The Visible Algorithm (2) is faster than the algorithm (1), but the algorithm (2) has higher space requirements than the algorithm (1).
Visible, the algorithm (2) relative to the algorithm (1), the Gaussian radius is constant, both are relative to the image size of the linear algorithm, the difference is the size of the constant coefficient is different, the former is the Gaussian radius (template size) of the square level, the latter is the Gaussian radius (template size) of the linear level. This improvement, very similar to what I have previously given in a blog, is an improved algorithm for an oil painting effect filter, which also improves the algorithm speed by reducing the constant coefficient from the square level of the template size to the linear level.
Theoretically, the Gaussian template is a two-dimensional surface with unbounded and infinitely expanded, and when implemented, it is necessary to truncate a finite two-dimensional template for this surface. In order to improve the speed of the algorithm, we adopt a Gaussian fuzzy, which is a one-dimensional template. So according to the normal distribution as shown:
Figure 1. The contribution ratio of normal distribution
This figure is from reference [1], which, according to the information article, is actually derived from:
Http://zh.wikipedia.org/wiki/File:Standard_deviation_diagram.svg.
As you can see, the contribution outside the 3σ is very small, at 0.1, so when we truncate the template, the template boundary is defined as 3 *σ;
The Ivigos template is calculated as a formula:
The visualization of Ivigos is given, and the visualization method is to generate a Ivigos template based on the above formula and template boundary, then take a scaling factor F = 255/template center point data, with this scaling factor to scale the template data, and then draw a grayscale image, The brightness of the center point is then raised to the brightest. In a visualization, each cell corresponds to a template data with a cell size of 8 * 8 or 16 * 16 pixels.
The left side is a common 3 x 3 blur template (σ= 0.849), and the floating-point data of 3 * 3 around its center point is:
Sigma = 0.849:0.055 0.110 0.0550.110 0.221 0.1100.055 0.110 0.055
To complete the Gaussian blur, a Gaussian blur of two directions is required for the image. For example, the image is blurred horizontally, the intermediate result is obtained, and then the middle result is blurred vertically, which results in the final result. is a demo diagram that gives the result of a single Gaussian blur in two directions, and the final result:
Only in this example of the image, I put the results of the algorithm I wrote, opened in Photoshop and the Gaussian blur processing results from Photoshop to do a difference in comparison, found that the two are the same.
The DEMO interface I implemented is as follows:
By clicking the menu-visualization-Ivigos template you can generate a grayscale image on the right, which is the visual result of the Ivigos template.
At the bottom there is a control panel, the above can choose Gaussian fuzzy algorithm parameters, the meaning of the Gaussian radius and the radius of the same meaning in Photoshop, are σ in the algorithm.
In the algorithm parameters:
(1) Support multithreading, according to my observation, the number of threads set to the same number of CPU cores is more appropriate. The number of threads is more than the number of CPU cores, which is meaningless because the CPU is running full load when the algorithm executes. Turn on more threads, and you can no longer increase the speed.
Assuming that the CPU core number is p, the number of threads opened >= p, then the algorithm speed is approximately p times of single-threaded processing. (When the CPU is full, the number of threads gets bigger, and it doesn't make sense to increase speed.)
(2) floating-point type: Supports float and double. It is the type of data of the Gaussian template, and also the data type when the pixel weighting is added, according to my observation, the speed of float and double is not very different. Basically the same.
(3) Gaussian radius: Σ. The constant coefficient of the algorithm is O (σ). Obviously, the greater the value of σ, the longer the algorithm will take.
I also tried to cache and table-check the results of the 255 grayscale values * Template data when I implemented the algorithm, but found that it didn't improve the speed effectively, so I finally gave up the method. This may be because the algorithm's calculation is just a floating-point multiplication, and the read action on the data cannot be done faster than the floating-point multiplication. So it's not necessary to use the cache here.
In this DEMO, the filter processing is placed in the UI thread, which makes the filter processing time longer (such as the Gaussian radius value is very large, the picture is very large), the interface will have some cards, you can put the filter processing action in a new background thread to execute. This is relatively easy to achieve.
In a C + + program, using the algorithm I wrote is very simple, for example:
#include <GaussBlurFilter.hpp>Cgaussblurfilter<Double>_filter;_filter. Setsigma (3.5);//set the Gaussian radius_filter. Setmultithreads (true,4);//turn on multi-threading, user recommended number of threads is 4;//Lpsrcbits/lpdestbits: The starting address of the pixel data, must be aligned with 4 bytes,
Note: Lpbits must be the address of the pixel with the lowest address value in all pixels, regardless of whether the height is positive or negative. //bmwidth, bmheight: Image width and height (pixels), height allowed negative;//bpp: Bit depth, support 8 (grayscale), 24 (True color),_filter. Filter (Lpsrcbits, Lpdestbits, Bmwidth, Bmheight, BPP);
It is important to note that in multithreading, I use the Windows API (such as CreateThread), which makes GAUSSBLURFILTER.HPP currently available only on the Windows platform, and if it is to be used on other platforms, it should be modified with multithreading-related APIs Function call.
The height value can be either positive or negative, but the address lpbits of the pixel data must be the address of the pixel with the lowest address value in all pixels. That is, assuming the coordinates of the upper-left corner of the picture are the origin, and if the picture height is positive (bottom-up), then Lpbits is the address of the lower-left pixel (col = 0,row = height-1). If the picture height is negative (top-down), the lpbits is the address of the upper-left pixel (col = 0,row = 0). The scan line width of the image data must be aligned with 4 Bytes, that is, the scan line width is computed by the following formula:
int stride = (Bmwidth * BPP + 31)/32 * 4; //Scan line width, align to 4 Bytes
(The above formula is expressed in programming language, non-mathematical expression, that is, the truncation of fractional parts by integer division.) )
"Related downloads":
(1) Demo algorithm code files and executables (including GAUSSBLURFILTER.HPP and Windows System executable program): Gaussblurdemo_bin.zip
(2) Full source of Demo (including executable files and GAUSSBLURFILTER.HPP): Gaussblurdemo_src.zip
Resources
[1]. The implementation and optimization of Gaussian fuzzy algorithm;
[note] The formula in this article is generated using the following URL:http://www.codecogs.com/latex/eqneditor.php.
C + + implementation of Gaussian fuzzy algorithm