KenLM: Faster and Smaller Language Model Queries
Scalable Probabilistic Programming Library
INT4/INT5/INT8 and FP16 inference on CPU for the RWKV language model
Inference slice of Marian for Bergamot's tiny11 models. Faster to compile and wield. Fewer model...
C++ implementation of ChatGLM-6B & ChatGLM2-6B & ChatGLM3 & GLM4(V)