Par4All is an automatic parallelizing and optimizing compiler (workbench) for C and Fortran sequential programs
Python library for fast time-series analysis on CUDA GPUs
A curated list of awesome GPGPU (CUDA/OpenCL/Vulkan) resources
Simplex mesh adaptivity for HPC
My experiments with MPI and OpenMP
A simple profiler to count NVIDIA PTX assembly instructions of OpenCL/SYCL/CUDA kernels for roofl...
Execute a subset of Python on HPC platforms
Abstraction Library for Parallel Kernel Acceleration
(2024/2025) A library and environment for parallel processing in a power-limited CPU+GPU cluster ...
Some CUDA design patterns and a bit of template magic for CUDA
Autotuning NVCC Compiler Parameters, published @ CCPE Journal
CUDA C++ Core Libraries
Getting started with CUDA programming (introductory learning material)
HPC solver for nonlinear optimization problems
Templated C++/CUDA implementation of Model Predictive Path Integral Control (MPPI)
BQN virtual machine