blake3

SYCL-accelerated BLAKE3 hash implementation

MIT License

Ecosystems: C++, CUDA

Related Projects

alpaka

Abstraction Library for Parallel Kernel Acceleration

05 Nov 2014 303

https://github.com/src-d/minhashcuda

Weighted MinHash implementation on CUDA (multi-gpu).

25 Oct 2016 114

cxbqn

BQN virtual machine

22 Oct 2021 29

LuisaCompute

High-Performance Rendering Framework on Stream Architectures

20 Nov 2020 636

peakperf

Achieve peak performance on x86 CPUs and NVIDIA GPUs

10 Mar 2018 58

Apriori-and-Eclat-Frequent-Itemset-Mining

Implementation of the Apriori and Eclat algorithms, two of the best-known basic algorithms for mi...

21 Oct 2018 40

cuda-design-patterns

Some CUDA design patterns and a bit of template magic for CUDA

16 Nov 2018 145

https://github.com/src-d/kmcuda

Large scale K-means and K-nn implementation on NVIDIA GPU / CUDA

23 Jun 2016 797

https://github.com/romnn/nvbit-rs

Rust bindings to the NVIDIA NVBIT binary instrumentation API

08 Nov 2022 2

spbla

Sparse Boolean linear algebra for Nvidia Cuda, OpenCL and CPU computations

18 Feb 2021 14

cccl

CUDA C++ Core Libraries

17 Sep 2020 743

jpeggpu

Low-latency CUDA JPEG decoder by parallelizing Huffman decoding

20 Nov 2023 5

PTXprofiler

A simple profiler to count Nvidia PTX assembly instructions of OpenCL/SYCL/CUDA kernels for roofl...

11 Jan 2023 37

https://github.com/Bruce-Lee-LY/decoding_attention

Decoding Attention is specially optimized for multi head attention (MHA) using CUDA core for the ...

14 Aug 2024 14

nsimd

Agenium Scale vectorization library for CPUs and GPUs

10 Apr 2019 324