Fast Gaussian filter algorithm

Source: Internet
Author: User

Gaussian filtering is the most crucial intermediate step in many image processing algorithms, so implementing a fast Gaussian filter algorithm is of great significance.

After reading the earlier literature on fast Gaussian filtering, we implemented our own fast Gaussian filter algorithm, which runs nearly six times faster when accelerated with NEON instructions.

1. Approximating Gaussian filtering with mean filtering

The advantage of this algorithm is its simplicity. Generally, Gaussian filtering can be approximated by three passes of mean (box) filtering; if you need higher accuracy, you need more passes. Simple mean filtering chooses the length of the mean filter from the Gaussian's delta parameter (its standard deviation, usually written sigma). Delta is continuous, while the length of a mean filter is an integer, so the approximation carries a certain error that varies with delta. Implementing mean filtering in O(1) time per pixel requires an integral image, which occupies a buffer as large as the original image. Approximating Gaussian filtering this way therefore means computing three integral images and traversing each of them to compute the means. Moreover, this algorithm cannot be accelerated with the CPU's SIMD instructions.

For the mean-filter approximation of Gaussian filtering and ways to improve its accuracy, see:

Peter Kovesi, "Arbitrary Gaussian filtering with 25 additions and 5 multiplications per pixel", 2009
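
The mean-filter approach described in this section can be sketched in one dimension as follows. This is only a minimal illustration, not the code from the reference: the function names are hypothetical, the prefix-sum array plays the role of the integral image, and the box width is chosen by matching variances (a box of odd width w has variance (w^2 - 1) / 12, and variances add across passes).

```python
import math

def box_width_for_sigma(sigma, passes=3):
    """Odd box width whose repeated application best matches sigma (assumed rule)."""
    ideal = math.sqrt(12.0 * sigma * sigma / passes + 1.0)
    # Snap the continuous ideal width to the nearest odd integer.
    return max(1, 2 * int(round((ideal - 1.0) / 2.0)) + 1)

def effective_sigma(width, passes=3):
    """Standard deviation actually produced by `passes` boxes of this width."""
    return math.sqrt(passes * (width * width - 1) / 12.0)

def box_mean_1d(signal, width):
    """Mean filter with O(1) work per sample, using a prefix sum (1-D integral image)."""
    n = len(signal)
    radius = width // 2
    prefix = [0.0] * (n + 1)
    for i, value in enumerate(signal):
        prefix[i + 1] = prefix[i] + value
    out = []
    for i in range(n):
        lo = max(0, i - radius)        # window is truncated at the borders
        hi = min(n, i + radius + 1)
        out.append((prefix[hi] - prefix[lo]) / (hi - lo))
    return out

def approx_gaussian_1d(signal, sigma, passes=3):
    """Approximate a Gaussian blur by repeated mean filtering."""
    width = box_width_for_sigma(sigma, passes)
    out = [float(v) for v in signal]
    for _ in range(passes):
        out = box_mean_1d(out, width)   # each pass builds a new prefix sum
    return out
```

With sigma = 5 and three passes this rule picks a width of 11, whose effective standard deviation is sqrt(30), about 5.48 rather than 5. That gap is the integer-length error mentioned above. For an image, the same pass would be applied along rows and then along columns.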

2. Approximating Gaussian filtering with an extended binomial filter

References for this algorithm are:

Extended binomial filter for fast Gaussian blur

The idea of this algorithm is still to approximate the Gaussian with means, but each mean filter has a different length and the mean is weighted. The implementation does not use an integral image; instead, the mean is computed recursively. The reference provides source code, written in a plug-in language for Photoshop. Perhaps I am too dull, but I have not fully understood the details of the algorithm: the meaning of several important initial parameters in its code is unclear to me.

In my experience, this algorithm is not actually faster than recursive (IIR) filtering: when its code recursively computes the mean in the vertical direction, it accesses image data across rows. This algorithm also cannot be accelerated with the CPU's SIMD instructions.
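
The reference's own code is not reproduced here. As a rough illustration of the weighted-mean idea only, the sketch below builds a plain binomial kernel by repeatedly averaging neighbouring taps. It is not the extended variant from the reference, and the function names are hypothetical.

```python
import math

def binomial_kernel(order):
    """Weights C(order, k) / 2**order, built by repeated two-tap averaging."""
    kernel = [1.0]
    for _ in range(order):
        padded = [0.0] + kernel + [0.0]
        # Convolving with [1/2, 1/2] is a mean of each pair of neighbours.
        kernel = [(padded[i] + padded[i + 1]) / 2.0 for i in range(len(padded) - 1)]
    return kernel

def binomial_sigma(order):
    """Standard deviation of the binomial kernel: its variance is order / 4."""
    return math.sqrt(order) / 2.0
```

For example, order 4 gives the weights 1, 4, 6, 4, 1 divided by 16, with a standard deviation of 1. Because the variance grows only as order / 4, a plain binomial filter needs many taps for a large blur, which is presumably part of what the extended variant addresses.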

3. Approximating Gaussian filtering with an IIR filter

References for this algorithm include:

Rachid Deriche, "Recursively implementing the Gaussian and its derivatives", 1993

Lucas J. van Vliet, Ian T. Young and Piet W. Verbeek, "Recursive Gaussian derivative filters", 1998

Dave Hale, "Recursive Gaussian filters", CWP-546

This algorithm is implemented by cascading two IIR filters, one of which is a non-causal (anti-causal) IIR filter that runs over the data in the reverse direction.

In addition, the Intel website has an article with code that uses SSE instructions to optimize the recursive IIR approximation of the Gaussian filter. However, that code is very complex, and non-specialists may find it hard to follow.

Implementing a Gaussian filter with IIR filters involves many programming techniques, especially when using NEON instructions to accelerate the recursive IIR filter.
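
The cascade of a causal and an anti-causal pass can be sketched as follows. To keep the example short and checkable, it uses a first-order recursive stage repeated a few times as a stand-in; the cited papers use higher-order filters with fitted coefficients, which are not reproduced here. The function names and the rule for choosing the pole are assumptions made for this illustration (one causal plus one anti-causal first-order pass with pole a adds a variance of 2a / (1 - a)^2).

```python
import math

def pole_for_sigma(sigma, cascades=3):
    """Pole a such that `cascades` forward/backward stages have variance sigma**2."""
    s2 = sigma * sigma
    k = float(cascades)
    # Smaller root of s2 * (1 - a)**2 = 2 * k * a, which lies in (0, 1).
    return ((s2 + k) - math.sqrt(k * k + 2.0 * k * s2)) / s2

def causal_pass(x, a):
    """y[n] = (1 - a) * x[n] + a * y[n - 1], running left to right."""
    if not x:
        return []
    out = []
    prev = x[0]                      # assume the signal is constant before x[0]
    for value in x:
        prev = (1.0 - a) * value + a * prev
        out.append(prev)
    return out

def anticausal_pass(x, a):
    """The same recursion running right to left (the non-causal half)."""
    return causal_pass(x[::-1], a)[::-1]

def recursive_gaussian_1d(signal, sigma, cascades=3):
    """Approximate Gaussian smoothing with cascaded forward/backward IIR passes."""
    a = pole_for_sigma(sigma, cascades)
    out = [float(v) for v in signal]
    for _ in range(cascades):
        out = anticausal_pass(causal_pass(out, a), a)
    return out
```

Only the pole depends on sigma, so the work per sample is the same for every radius. The recursion is serial along the filtering direction, so any SIMD speedup has to come from processing several independent rows or columns at once.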

 

4. Performance tests of the IIR recursive Gaussian filter

4.1 Comparison with Photoshop CS5

Test                                                            Time          Processor
Photoshop CS5 Gaussian filter, radius 250 pixels, color photo   1.5 seconds   Intel i3 CPU, 2.3 GHz, 2 GB memory
My Gaussian filter, radius 250 pixels, color photo              1 second      Intel i3 CPU, 2.3 GHz, 2 GB memory

Note: Photoshop CS5's Gaussian filter applies different optimizations for different radii. It is extremely fast when the radius is very small, and multi-core optimization cannot be ruled out. The IIR recursive filtering algorithm takes the same computation time for any radius.

4.2 Performance test data in the RVDS environment

Test                                           Instruction count   Cycle count   Processor
C code, 1024x768 grayscale image               110 M               194 M         ARM Cortex-A8
NEON assembly code, 1024x768 grayscale image   34 M                35 M          ARM Cortex-A8
C code, 1024x768 color image                   338 M               558 M         ARM Cortex-A8
NEON assembly code, 1024x768 color image       84 M                85 M          ARM Cortex-A8
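
The speedups implied by the table above can be worked out with a few lines of arithmetic. The numbers are copied from the table, reading "m" as millions (an assumption); the dictionary layout and function names are hypothetical.

```python
# (instruction count, cycle count) in millions, copied from the table above.
MEASUREMENTS = {
    "grayscale": {"c": (110, 194), "neon": (34, 35)},
    "color": {"c": (338, 558), "neon": (84, 85)},
}
PIXELS = 1024 * 768  # image size used in every row of the table

def cycle_speedup(kind):
    """How many times fewer cycles the NEON code needs than the C code."""
    return MEASUREMENTS[kind]["c"][1] / MEASUREMENTS[kind]["neon"][1]

def instruction_ratio(kind):
    """How many times fewer instructions the NEON code executes."""
    return MEASUREMENTS[kind]["c"][0] / MEASUREMENTS[kind]["neon"][0]

def cycles_per_pixel(kind, impl):
    """Average cycles spent per pixel for one implementation."""
    return MEASUREMENTS[kind][impl][1] * 1e6 / PIXELS

for kind in MEASUREMENTS:
    print(kind,
          round(cycle_speedup(kind), 2),
          round(instruction_ratio(kind), 2),
          round(cycles_per_pixel(kind, "neon"), 1))
```

The cycle counts give roughly 5.5x for the grayscale image and 6.6x for the color image, which is consistent with the "nearly six times faster" figure quoted at the start. The instruction count drops by a smaller factor than the cycle count, so the NEON version also retires more instructions per cycle.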

 

4.3 Test data on a real iPod device

Test                                          Time    Device
NEON assembly code, 720x576 grayscale image   25 ms   iPod 4

4.4 Memory consumption

Implementation       RAM consumption
C code               2 * max(width, height)
NEON assembly code   8 * max(width, height)

 

 
