VPTQ, a flexible and extreme low-bit quantization algorithm
MIT License
A JavaScript toolkit for Natural Language-based Visualization Authoring
[NeurIPS'24 Spotlight] To speed up Long-context LLMs' inference, approximate and dynamic sparse c...
Table Transformer (TATR) is a deep learning model for extracting tables from unstructured documen...
A Python package for generating concise, high-quality summaries of a probability distribution
A unified evaluation framework for large language models
This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using S...
MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.
Foundation Architecture for (M)LLMs
Tutel MoE: An Optimized Mixture-of-Experts Implementation
AICI: Prompts as (Wasm) Programs
Subseasonal forecasting models
Repo for WWW 2022 paper: Progressively Optimized Bi-Granular Document Representation for Scalable...