VPTQ: a flexible, extreme low-bit quantization algorithm
Table Transformer (TATR) is a deep learning model for extracting tables from unstructured documen...
AICI: Prompts as (Wasm) Programs
This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using S...
Repo for WWW 2022 paper: Progressively Optimized Bi-Granular Document Representation for Scalable...
[NeurIPS'24 Spotlight] To speed up Long-context LLMs' inference, approximate and dynamic sparse c...
A JavaScript toolkit for Natural Language-based Visualization Authoring
MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.
Tutel MoE: An Optimized Mixture-of-Experts Implementation
A unified evaluation framework for large language models
Subseasonal forecasting models
A Python package for generating concise, high-quality summaries of a probability distribution
Foundation Architecture for (M)LLMs