VPTQ

VPTQ: a flexible, extreme low-bit quantization algorithm

MIT License

Stars: 397

Ecosystems: VS Code Extension

Related Projects

table-transformer

Table Transformer (TATR) is a deep learning model for extracting tables from unstructured documen...

17 May 2021 2,228

aici

AICI: Prompts as (Wasm) Programs

26 Sep 2023 1,916

zero-shot-scfoundation

04 Oct 2023 46

VRL3

06 Nov 2022 32

Swin-Transformer

This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using S...

25 Mar 2021 13,692

BiDR

Repo for WWW 2022 paper: Progressively Optimized Bi-Granular Document Representation for Scalable...

28 Feb 2022 15

MInference

[NeurIPS'24 Spotlight] To speed up Long-context LLMs' inference, approximate and dynamic sparse c...

22 May 2024 740

VisTalk

A JavaScript toolkit for Natural Language-based Visualization Authoring

26 Jun 2022 38

DeepSpeed-MII

MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.

23 Mar 2022 1,856

tutel

Tutel MoE: An Optimized Mixture-of-Experts Implementation

06 Aug 2021 716

promptbench

A unified evaluation framework for large language models

13 Jun 2023 2,407

BioGPT

15 Aug 2022 4,292

subseasonal_toolkit

Subseasonal forecasting models

27 Jul 2021 42

goodpoints

A Python package for generating concise, high-quality summaries of a probability distribution

03 Nov 2021 39

torchscale

Foundation Architecture for (M)LLMs

17 Nov 2022 3,006