VPTQ

VPTQ: a flexible, extreme low-bit quantization algorithm

MIT License

Stars: 456

Related Projects

tutel

Tutel MoE: An Optimized Mixture-of-Experts Implementation

06 Aug 2021 716

zero-shot-scfoundation

04 Oct 2023 46

aici

AICI: Prompts as (Wasm) Programs

26 Sep 2023 1,916

goodpoints

A Python package for generating concise, high-quality summaries of a probability distribution

03 Nov 2021 39

MInference

[NeurIPS'24 Spotlight] To speed up Long-context LLMs' inference, approximate and dynamic sparse c...

22 May 2024 758

Swin-Transformer

This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using S...

25 Mar 2021 13,692

subseasonal_toolkit

Subseasonal forecasting models

27 Jul 2021 42

BioGPT

15 Aug 2022 4,292

BiDR

Repo for WWW 2022 paper: Progressively Optimized Bi-Granular Document Representation for Scalable...

28 Feb 2022 15

table-transformer

Table Transformer (TATR) is a deep learning model for extracting tables from unstructured documen...

17 May 2021 2,228

DeepSpeed-MII

MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.

23 Mar 2022 1,856

torchscale

Foundation Architecture for (M)LLMs

17 Nov 2022 3,006

VisTalk

A JavaScript toolkit for Natural Language-based Visualization Authoring

26 Jun 2022 38

promptbench

A unified evaluation framework for large language models

13 Jun 2023 2,407

VRL3

06 Nov 2022 32