Torch-based tool for quantizing high-dimensional vectors using additive codebooks
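The additive-codebook idea above can be sketched in a few lines: a vector is approximated as the elementwise sum of one codeword drawn from each of M codebooks, and a greedy encoder picks, per codebook, the codeword that most reduces the residual. This is a minimal illustration of the general technique, not the tool's actual API; all function names here are hypothetical.

```python
def decode(codes, codebooks):
    """Reconstruct a vector as the elementwise sum of one codeword per codebook."""
    dim = len(codebooks[0][0])
    out = [0.0] * dim
    for m, c in enumerate(codes):
        for d in range(dim):
            out[d] += codebooks[m][c][d]
    return out

def encode_greedy(vec, codebooks):
    """Greedily pick, for each codebook in turn, the codeword closest to the residual."""
    residual = list(vec)
    codes = []
    for book in codebooks:
        best, best_err = 0, float("inf")
        for i, word in enumerate(book):
            err = sum((r - w) ** 2 for r, w in zip(residual, word))
            if err < best_err:
                best, best_err = i, err
        codes.append(best)
        residual = [r - w for r, w in zip(residual, book[best])]
    return codes
```

Storage cost is one small integer index per codebook instead of the full float vector, which is where the compression comes from.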
AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference. D...
Implementation of NWT, audio-to-video generation, in PyTorch
Quantization-aware training with spiking neural networks
Official PyTorch repository for Extreme Compression of Large Language Models via Additive Quantization
End-to-end recipes for optimizing diffusion models with torchao and diffusers (inference and FP8 ...
4-bit quantization of LLaMA using GPTQ
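For context on what 4-bit weight quantization means at its simplest, here is a hedged sketch of plain round-to-nearest uniform quantization over a 16-level grid. GPTQ itself goes further, compensating each rounding error using second-order (Hessian) information, which this sketch does not attempt; the function name is hypothetical.

```python
def quantize_4bit(weights):
    """Asymmetric per-tensor round-to-nearest 4-bit quantization.

    Maps floats onto a 16-level grid between min and max, then
    dequantizes back; the max error of any weight is half a step.
    """
    lo, hi = min(weights), max(weights)
    levels = 15  # 2**4 - 1 steps between min and max
    scale = (hi - lo) / levels if hi > lo else 1.0
    q = [max(0, min(15, round((w - lo) / scale))) for w in weights]
    deq = [lo + qi * scale for qi in q]
    return q, scale, lo, deq
```

Each weight is stored as a 4-bit index plus a shared scale and offset, roughly a 4x reduction versus FP16.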
[ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models
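The core SmoothQuant idea can be shown numerically: per input channel j, a migration scale s_j = max|X[:,j]|^alpha / max|W[j,:]|^(1-alpha) divides the activations and multiplies the weights, leaving the matrix product unchanged while flattening activation outliers so INT8 quantization hurts less. This is a pure-Python sketch of that equivalence, not the paper's released code; function names are hypothetical.

```python
def smooth(X, W, alpha=0.5):
    """SmoothQuant-style scale migration: X' = X / s, W' = s * W (per channel),
    so X' @ W' == X @ W exactly, but X' has smaller per-channel outliers."""
    rows, n, cols = len(X), len(W), len(W[0])
    s = []
    for j in range(n):
        a = max(abs(X[i][j]) for i in range(rows))
        w = max(abs(W[j][k]) for k in range(cols))
        s.append((a ** alpha) / (w ** (1 - alpha)))
    Xs = [[X[i][j] / s[j] for j in range(n)] for i in range(rows)]
    Ws = [[W[j][k] * s[j] for k in range(cols)] for j in range(n)]
    return Xs, Ws, s

def matmul(A, B):
    """Plain nested-loop matrix multiply for checking the equivalence."""
    return [[sum(A[i][j] * B[j][k] for j in range(len(B)))
             for k in range(len(B[0]))] for i in range(len(A))]
```

With alpha = 0.5 the "quantization difficulty" is split evenly between activations and weights; the paper treats alpha as a tunable migration-strength knob.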
Accessible large language models via k-bit quantization for PyTorch.
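A building block behind k-bit schemes like the one above is blockwise absmax quantization: each small block of values gets its own scale, so a single outlier only distorts its own block. The sketch below shows the idea at INT8 in plain Python; it is an illustration of the general scheme under assumed names, not the library's actual kernels.

```python
def quantize_blockwise(xs, block=4):
    """Blockwise absmax INT8 quantization: one scale per block of values,
    limiting the blast radius of outliers to a single block."""
    qs, scales = [], []
    for i in range(0, len(xs), block):
        blk = xs[i:i + block]
        s = max(abs(v) for v in blk) / 127 or 1.0  # avoid divide-by-zero on all-zero blocks
        scales.append(s)
        qs.extend(max(-127, min(127, round(v / s))) for v in blk)
    return qs, scales

def dequantize_blockwise(qs, scales, block=4):
    """Invert the mapping: each int8 value times its block's scale."""
    return [q * scales[i // block] for i, q in enumerate(qs)]
```

Smaller blocks mean more scales to store but tighter per-block ranges, the basic accuracy/overhead trade-off of blockwise formats.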
Explore training for quantized models
Implementation of Discrete Key / Value Bottleneck, in PyTorch
State-of-the-art deep-learning-based audio codec supporting both mono 24 kHz audio and stereo 48 kHz audio
SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques
Repository for Unsupervised Sentence Compression using Denoising Auto-Encoders