Vision Transformer (ViT) inference in plain C/C++ with ggml
C++ implementation of ChatGLM-6B & ChatGLM2-6B & ChatGLM3 & GLM4(V)
ONNX-TensorRT: TensorRT backend for ONNX
Fast inference engine for Transformer models
INT4/INT5/INT8 and FP16 inference on CPU for the RWKV language model
[WIP] NCNN with Vulkan implementation of GFPGAN aims at developing Practical Algorithms for Real-...
Streaming TTS based on Piper with optional RK3588 NPU support