Torch-based tool for quantizing high-dimensional vectors using additive codebooks
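The additive-codebook idea above can be sketched in a few lines: a vector is approximated as the elementwise sum of one codeword drawn from each of M codebooks, and a greedy encoder picks, per codebook, the codeword that most reduces the residual. This is a minimal illustration of the general technique, not the tool's actual API; all function names here are hypothetical.

```python
def decode(codes, codebooks):
    """Reconstruct a vector as the elementwise sum of one codeword per codebook."""
    dim = len(codebooks[0][0])
    out = [0.0] * dim
    for m, c in enumerate(codes):
        for d in range(dim):
            out[d] += codebooks[m][c][d]
    return out

def encode_greedy(vec, codebooks):
    """Greedily pick, for each codebook in turn, the codeword closest to the residual."""
    residual = list(vec)
    codes = []
    for book in codebooks:
        best, best_err = 0, float("inf")
        for i, word in enumerate(book):
            err = sum((r - w) ** 2 for r, w in zip(residual, word))
            if err < best_err:
                best, best_err = i, err
        codes.append(best)
        residual = [r - w for r, w in zip(residual, book[best])]
    return codes
```

Storage cost is one small integer index per codebook instead of the full float vector, which is where the compression comes from.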
AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference. D...
Implementation of NWT, audio-to-video generation, in PyTorch
Quantization-aware training with spiking neural networks
Official PyTorch repository for Extreme Compression of Large Language Models via Additive Quantization
End-to-end recipes for optimizing diffusion models with torchao and diffusers (inference and FP8 ...
4-bit quantization of LLaMA using GPTQ
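For context on what 4-bit weight quantization means at its simplest, here is a hedged sketch of plain round-to-nearest uniform quantization over a 16-level grid. GPTQ itself goes further, compensating each rounding error using second-order (Hessian) information, which this sketch does not attempt; the function name is hypothetical.

```python
def quantize_4bit(weights):
    """Asymmetric per-tensor round-to-nearest 4-bit quantization.

    Maps floats onto a 16-level grid between min and max, then
    dequantizes back; the max error of any weight is half a step.
    """
    lo, hi = min(weights), max(weights)
    levels = 15  # 2**4 - 1 steps between min and max
    scale = (hi - lo) / levels if hi > lo else 1.0
    q = [max(0, min(15, round((w - lo) / scale))) for w in weights]
    deq = [lo + qi * scale for qi in q]
    return q, scale, lo, deq
```

Each weight is stored as a 4-bit index plus a shared scale and offset, roughly a 4x reduction versus FP16.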
[ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models
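The core SmoothQuant idea can be shown numerically: per input channel j, a migration scale s_j = max|X[:,j]|^alpha / max|W[j,:]|^(1-alpha) divides the activations and multiplies the weights, leaving the matrix product unchanged while flattening activation outliers so INT8 quantization hurts less. This is a pure-Python sketch of that equivalence, not the paper's released code; function names are hypothetical.

```python
def smooth(X, W, alpha=0.5):
    """SmoothQuant-style scale migration: X' = X / s, W' = s * W (per channel),
    so X' @ W' == X @ W exactly, but X' has smaller per-channel outliers."""
    rows, n, cols = len(X), len(W), len(W[0])
    s = []
    for j in range(n):
        a = max(abs(X[i][j]) for i in range(rows))
        w = max(abs(W[j][k]) for k in range(cols))
        s.append((a ** alpha) / (w ** (1 - alpha)))
    Xs = [[X[i][j] / s[j] for j in range(n)] for i in range(rows)]
    Ws = [[W[j][k] * s[j] for k in range(cols)] for j in range(n)]
    return Xs, Ws, s

def matmul(A, B):
    """Plain nested-loop matrix multiply for checking the equivalence."""
    return [[sum(A[i][j] * B[j][k] for j in range(len(B)))
             for k in range(len(B[0]))] for i in range(len(A))]
```

With alpha = 0.5 the "quantization difficulty" is split evenly between activations and weights; the paper treats alpha as a tunable migration-strength knob.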
Accessible large language models via k-bit quantization for PyTorch.
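A building block behind k-bit schemes like the one above is blockwise absmax quantization: each small block of values gets its own scale, so a single outlier only distorts its own block. The sketch below shows the idea at INT8 in plain Python; it is an illustration of the general scheme under assumed names, not the library's actual kernels.

```python
def quantize_blockwise(xs, block=4):
    """Blockwise absmax INT8 quantization: one scale per block of values,
    limiting the blast radius of outliers to a single block."""
    qs, scales = [], []
    for i in range(0, len(xs), block):
        blk = xs[i:i + block]
        s = max(abs(v) for v in blk) / 127 or 1.0  # avoid divide-by-zero on all-zero blocks
        scales.append(s)
        qs.extend(max(-127, min(127, round(v / s))) for v in blk)
    return qs, scales

def dequantize_blockwise(qs, scales, block=4):
    """Invert the mapping: each int8 value times its block's scale."""
    return [q * scales[i // block] for i, q in enumerate(qs)]
```

Smaller blocks mean more scales to store but tighter per-block ranges, the basic accuracy/overhead trade-off of blockwise formats.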
Explore training for quantized models
Implementation of Discrete Key / Value Bottleneck, in PyTorch
State-of-the-art deep-learning-based audio codec supporting both mono 24 kHz audio and stereo 48 kHz audio
SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques
Repository for Unsupervised Sentence Compression using Denoising Auto-Encoders