A C++11/OpenGL library for the Fast Fourier Transform
MIT License
GLFFT is a C++11/OpenGL library for doing the Fast Fourier Transform (FFT) on a GPU in one or two dimensions. The target APIs are OpenGL 4.3 core profile and OpenGL ES 3.1. GLFFT is implemented entirely with compute shaders.
The FFT has several uses in graphics. The two main ones are Tessendorf's FFT water simulation technique as well as applying massive convolution filters to images more efficiently.
GLFFT is a well featured single-precision and half-precision (FP16) FFT library designed for graphics use cases.
Desktop Linux and Windows have been tested and are supported. Supports both OpenGL 4.3 core and OpenGL ES 3.1.
The GLSL code has been tested and verified on:
Compilers tested:
GLFFT is a performance oriented FFT library, and aims to reach optimal efficiency on both desktop and mobile GPUs.
GLFFT is an OpenGL oriented implementation, but all API specifics of GLFFT have been abstracted away to make it suitable for integration into engines which might have abstracted interfaces to the underlying graphics APIs. It is possible to implement GLFFT without OpenGL, as long as GLSL is supported as a shading language, which is assumed to be feasible once SPIR-V becomes mainstream. The abstracted interface is designed to match the spirit of next-generation APIs like Vulkan and D3D12.
As using GLFFT in a GL context is by far the most common use case, a default, ready-to-go OpenGL interface
is supplied in the API. Due to the nature of OpenGL, it is not always feasible to build GLFFT as a standalone library, as applications
might have their own ways of getting to OpenGL symbols and headers which is not easy for GLFFT to work with.
Instead, the GLFFT implementation will include a user-supplied header, glfft_api_headers.hpp
which is responsible for including
the appropriate OpenGL or OpenGL ES 3.1 headers (or special headers like GLEW) the calling application uses,
as well as defining various constants, such as the GLSL language strings to use.
This header can also override various functions such as logging functions and time functions.
For a reference, see test/glfw/glfft_api_headers.hpp
as to how the internal test/bench suite does it.
To compile GLSL shader sources into header files, run:
make -C glsl
This only needs to be run when the internal GLSL shaders are modified.
To build source files, a $(wildcard *.cpp)
in top level directory of GLFFT is sufficient to find the necessary files.
When compiling, C++11 must be enabled, and glfft_api_headers.hpp
must be found in an include path.
#include "glfft.hpp"
#include "glfft_gl_interface.hpp"
#include <memory>
using namespace GLFFT;
using namespace std;
FFTOptions options; // Default initializes to something conservative in terms of performance.
options.type.fp16 = true; // Use mediump float (if GLES) in shaders.
options.type.input_fp16 = false; // Use FP32 input.
options.type.output_fp16 = true; // Use FP16 output.
options.type.normalize = true; // Normalized FFT.
GLContext context;
FFT fft(&context, 1024, 256, ComplexToComplex, Inverse, SSBO, SSBO, make_shared<ProgramCache>(), options);
GLuint output_ssbo, input_ssbo;
// Create GL_SHADER_STORAGE_BUFFERs and put some data in them.
// Adapt raw GL types to types which GLContext uses internally.
GLBuffer adaptor_output(output_ssbo);
GLBuffer adaptor_input(input_ssbo);
// Do the FFT
CommandBuffer *cmd = context.request_command_buffer();
fft.process(cmd, &adaptor_output, &adaptor_input);
context.submit_command_buffer(cmd);
glMemoryBarrier(GL_SHADER_STORAGE_BARRIER_BIT);
#include "glfft.hpp"
#include "glfft_wisdom.hpp"
#include "glfft_gl_interface.hpp"
#include <memory>
using namespace GLFFT;
using namespace std;
FFTOptions options; // Default initializes to something conservative in terms of performance.
options.type.fp16 = true; // Use mediump float (if GLES) in shaders.
options.type.input_fp16 = false; // Use FP32 input.
options.type.output_fp16 = true; // Use FP16 output.
options.type.normalize = true; // Normalized FFT.
GLContext context;
FFTWisdom wisdom;
// Use some static wisdom to make the learning step faster.
// Avoids searching for options which are known to be bogus for a particular vendor.
wisdom.set_static_wisdom(FFTWisdom::get_static_wisdom_from_renderer(reinterpret_cast<const char*>(glGetString(GL_RENDERER))));
// Learn how to do 1024x256 much faster!
wisdom.learn_optimal_options_exhaustive(&context, 1024, 256, ComplexToComplex, SSBO, SSBO, options.type);
GLContext context;
FFT fft(&context, 1024, 256, ComplexToComplex, Inverse, SSBO, SSBO, make_shared<ProgramCache>(), options);
GLuint output_ssbo, input_ssbo;
// Create GL_SHADER_STORAGE_BUFFERs and put some data in them.
// Adapt raw GL types to types which GLContext uses internally.
GLBuffer adaptor_output(output_ssbo);
GLBuffer adaptor_input(input_ssbo);
// Do the FFT
CommandBuffer *cmd = context.request_command_buffer();
fft.process(cmd, &adaptor_output, &adaptor_input);
context.submit_command_buffer(cmd);
glMemoryBarrier(GL_SHADER_STORAGE_BARRIER_BIT);
GLContext context;
FFTWisdom wisdom;
// Use some static wisdom to make the learning step faster.
// Avoids searching for options which are known to be bogus for a particular vendor.
wisdom.set_static_wisdom(FFTWisdom::get_static_wisdom_from_renderer(reinterpret_cast<const char*>(glGetString(GL_RENDERER))));
// Learn how to do 1024x256 much faster!
wisdom.learn_optimal_options_exhaustive(&context, 1024, 256, ComplexToComplex, SSBO, SSBO, options.type);
// Serialize to string.
string wisdom_json = wisdom.archive();
// Unserialize wisdom.
wisdom.extract(wisdom_json.c_str());
Proper documentation is still TODO. However, test/glfft_test.cpp
and test/glfft_cli.cpp
should give a good idea for how to use the API.
For verification and benchmarking, GLFFT has a standalone executable which uses the GLFW3 library to create a context.
Building requires GLFW3 to be installed, visible from pkg-config, although for Windows 64-bit a precompiled library with headers is included for convenience.
make
./glfft_cli help
make PLATFORM=win TOOLCHAIN_PREFIX=x86_64-w64-mingw32-
wine64 ./glfft_cli help
Other than GLFW, muFFT and rapidjson are needed, which are included as git submodules.
git submodule init
git submodule update
# Or git clone --recursive when checking out GLFFT.
To verify GLFFT correctness, GLFFT has a large amount of tests which exhaustively tests the GLFFT API with almost any thinkable input and output parameters.
./glfft_cli test --test-all # Exhaustively tests everything (over 3000 tests currently), so will take some time.
Verification is based on SNR compared to muFFT as a reference and maximum allowed delta from reference value. The default precision requirements are fairly stringent, so particular GPUs might not support high enough precision. Precision requirements can be overridden in such scenarios on the command line (see help).
To evaluate GLFFT performance, GLFFT can be benchmarked with various parameters. The benchmarking interface will run a full wisdom search first to find optimal parameters before running the actual benchmark.
./glfft_cli bench --width 1024 --height 1024 --type ComplexToComplex --input-texture # Benchmark a 1024x1024 C2C FFT with texture as input and SSBO as output. See ./glfft_cli bench help for more.
GLFFT implements radix-4, radix-8, radix-16 (radix-4 two times in single pass) and radix-64 (radix-8 two times in single pass) FFT kernels. GLFFT will automatically find the optimal subdivision of a larger FFT problem based on either wisdom knowledge or estimations. Radix-16 and Radix-64 kernels are implemented by using shared memory to perform multiple passes without going to global memory between the two passes.
In order to support a vast number of options, GLFFT will compile shaders on-demand during initialization and store them in a user-provided cache which can be shared between GLFFT instantiations.
GLFFT is out-of-place using the Stockham auto-sort algorithm to avoid explicit reorder passes which is common for in-place algorithms.
GLFFT's implementation is overall very similar to muFFT, and more information about the algorithms can be found here.
The GLFFT library is licensed under the permissive MIT license, see COPYING and preambles in source files for more detail. The verification and benchmarking library links against GLFW, whose license is here.