The PennyLane-Lightning plugin provides a fast state-vector simulator written in C++ for use with PennyLane
A collection of GICP-based fast point cloud registration algorithms
CURandRTC is a GPU random number generation module based on ThrustRTC
AutoDock for GPUs and other accelerators
Matrix multiplication example performed with OpenMP, OpenACC, BLAS, cuBLAS, and CUDA
VUDA is a header-only library based on Vulkan that provides a CUDA Runtime API interface for writing GPU-accelerated applications
The fastest way to compute matrix profiles on CPU and GPU!
CPU and CUDA implementation of Full Exhaustive Block Matching Algorithm using Integral Images
RAND library for the HIP programming language
ThunderGBM: Fast GBDTs and Random Forests on GPUs