Lightweight, simple profiling for Python/PyTorch
GPU PyTorch TOP in TouchDesigner with CUDA-enabled OpenCV
A simple profiler to count Nvidia PTX assembly instructions of OpenCL/SYCL/CUDA kernels for roofl...
CLTune: An automatic OpenCL & CUDA kernel tuner
Some CUDA design patterns and a bit of template magic for CUDA
CUDA C++ Core Libraries
A library to create a kaleidoscope effect on images with CUDA. You can build on all platforms using...
A tool for examining GPU scheduling behavior.
Kernel Tuner
A small utility for getting some info post-hoc about a program's run.
Abstraction Library for Parallel Kernel Acceleration
A curated list of awesome GPGPU (CUDA/OpenCL/Vulkan) resources
Python interface to GPU-powered libraries
A k-means implementation in C++ with Python bindings and GPU acceleration
Python library for fast time-series analysis on CUDA GPUs
A PyTorch-like dynamic computation graph and neural network framework implemented in NumPy (MLP, CNN, RNN, Transformer)