My experiments with MPI and OpenMP
(2024/2025) A library and environment for parallel processing in a power-limited CPU+GPU cluster ...
CUDA C++ Core Libraries
Abstraction Library for Parallel Kernel Acceleration
Getting started with learning CUDA programming
Some CUDA design patterns and a bit of template magic for CUDA
Templated C++/CUDA implementation of Model Predictive Path Integral Control (MPPI)
Playing with CUDA and GPUs in Google Colab
Par4All is an automatic parallelizing and optimizing compiler (workbench) for C and Fortran seque...
HPC solver for nonlinear optimization problems
Compares the performance of matrix multiplication across GPU shared memory, GPU global memory, and CPU
A simple profiler to count Nvidia PTX assembly instructions of OpenCL/SYCL/CUDA kernels for roofl...
Python library for fast time-series analysis on CUDA GPUs
Simplex mesh adaptivity for HPC
Arch Linux PKGBUILDs for Data Science, Machine Learning, Deep Learning, NLP and Computer Vision
A curated list of awesome GPGPU (CUDA/OpenCL/Vulkan) resources