When reprinting, please credit the KlayGE game engine as the source. The address of this article is http://www.klayge.org/2012/04/12/%e5%9f%ba%e4%ba%8epixel-shader%e7%9a%84fft%e5%b7%b2%e7%bb%8f%e5%ae%8c%e6%88%90/
Last weekend I implemented the GPU FFT from GPU Gems 2 in KlayGE. After some optimization and tuning, it went into the KlayGE development version last night. A complete FFT lens effect will be integrated soon.
The implementation uses method 1 from the article, because in my tests method 2 is slower than method 1 on modern GPUs. My improvements were to merge the three original lookup tables into one, and to store the input and output data as 16f instead of 32f. On a GTX 580, with 512x512 data, the PS version of the FFT takes about [...] ms, which is as much as 75 times the speed of FFTW on the CPU.
Even so, that is still a little slow for applications such as lens effects, so the next step is to consider implementing the FFT with a compute shader, which would cut the number of passes to 1/3. The PS version processes 2 data elements at a time, so 512x512 needs log2(512) + log2(512) = 18 passes. A CS can process 8 data elements at a time, so only log8(512) + log8(512) = 6 passes are needed. The Ocean sample already contains a CS4 FFT, but it needs two extra passes to transfer data between textures and buffers. A CS5 FFT will be cleaner and more efficient.
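The pass counts quoted above can be checked with a little arithmetic. The sketch below assumes that "processing 2 (or 8) data elements at a time" means a radix-2 (or radix-8) FFT, and that the 2D transform runs as row FFTs followed by column FFTs. The function name `fft_pass_count` is made up for illustration and is not part of KlayGE.

```python
def fft_pass_count(width: int, height: int, radix: int) -> int:
    """Count the passes of a 2D FFT done as row FFTs followed by column FFTs.

    Assumption: each pass is one radix-`radix` stage, so a 1D transform of
    length n takes log_radix(n) passes.
    """
    def stages(n: int) -> int:
        count = 0
        while n > 1:
            if n % radix != 0:
                raise ValueError(f"{n} is not a power of {radix}")
            n //= radix
            count += 1
        return count

    return stages(width) + stages(height)


ps_passes = fft_pass_count(512, 512, radix=2)  # pixel shader: 2 elements per pass
cs_passes = fft_pass_count(512, 512, radix=8)  # compute shader: 8 elements per pass
print(ps_passes, cs_passes)  # 18 6
```

With 512 = 2^9 = 8^3, each dimension takes 9 radix-2 stages or 3 radix-8 stages, which is where the 18 and 6 come from.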