More benchmarks of various FFT implementations
GPL-3.0 License
C++ image processing and machine learning library using SIMD: SSE, AVX, AVX-512, AMX for...
Thrust, CUB, TBB, AVX2, CUDA, OpenCL, OpenMP, SYCL - all it takes to sum a lot of numbers fast!
Benchmarking Deep Learning operations on different hardware
Micro-benchmarks of hyperthreading
Some CUDA design patterns and a bit of template magic for CUDA
The fast Continuous Wavelet Transform (fCWT) is a library for fast calculation of CWT.
Vulkan/CUDA/HIP/OpenCL/Level Zero/Metal Fast Fourier Transform library
row-major matmul optimization
benchmark tooling that loves you ❤️
A small OpenCL benchmark program to measure peak GPU/CPU performance.
Fast Fourier Transform Frontend