Autotuning NVCC Compiler Parameters, published in the CCPE journal
LGPL-3.0 License
Command to execute the experiments, if the CUDA headers and libraries are in the same locations as shown here:
./scripts/run_all.py -cp "-I /usr/local/cuda/include -L /usr/local/cuda/lib64 "
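If CUDA is installed under a different prefix, the include and library paths passed to -cp need to change accordingly. The following is a minimal sketch, not part of the repository: the helper build_command and the use of a CUDA_HOME environment variable are assumptions made for illustration. It only assembles and prints the command. It does not run the experiments.

```python
import os
import posixpath
import shlex


def build_command(cuda_home="/usr/local/cuda"):
    """Assemble the run_all.py command for a given CUDA install prefix.

    build_command is a hypothetical helper, not something the repository ships.
    """
    include_dir = posixpath.join(cuda_home, "include")
    lib_dir = posixpath.join(cuda_home, "lib64")
    # The trailing space mirrors the original command line.
    compiler_params = f"-I {include_dir} -L {lib_dir} "
    return ["./scripts/run_all.py", "-cp", compiler_params]


if __name__ == "__main__":
    # CUDA_HOME is a common convention, assumed here; the scripts may not read it.
    prefix = os.environ.get("CUDA_HOME", "/usr/local/cuda")
    print(shlex.join(build_command(prefix)))
```

With the default prefix, the assembled arguments match the command above. Passing another prefix, for example build_command("/opt/cuda"), swaps both paths at once.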
CLTune: An automatic OpenCL & CUDA kernel tuner
Par4All is an automatic parallelizing and optimizing compiler (workbench) for C and Fortran seque...
HPC solver for nonlinear optimization problems
SDK for GPU accelerated genome assembly and analysis
A CUDA Extension of Neural Network Libraries
Playing with CUDA and GPUs in Google Colab
GPU PyTorch TOP in TouchDesigner with CUDA-enabled OpenCV
Performance-optimized wheels for TensorFlow (SSE, AVX, FMA, XLA, MPI)
Introduction to learning CUDA programming
Templated C++/CUDA implementation of Model Predictive Path Integral Control (MPPI)
Provides an environment for compiling TensorFlow or PyTorch with CUDA for aarch64 on an x86 machi...
Python library for fast time-series analysis on CUDA GPUs
Some CUDA design patterns and a bit of template magic for CUDA
Classes enabling finmath-lib to run its Monte-Carlo models on CUDA GPUs
Kernel Tuner