Tutel MoE: An Optimized Mixture-of-Experts Implementation
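For context, a mixture-of-experts layer routes each token to a small subset of expert FFNs via a learned gate. The sketch below is a minimal top-2 gating layer in plain PyTorch, purely for illustration; it is not Tutel's optimized kernels or API, and all names in it are made up:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyMoE(nn.Module):
    """Illustrative top-k gated mixture-of-experts layer (dense dispatch)."""
    def __init__(self, d_model=64, n_experts=4, k=2):
        super().__init__()
        self.gate = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        )
        self.k = k

    def forward(self, x):                      # x: (tokens, d_model)
        scores = self.gate(x)                  # (tokens, n_experts)
        weights, idx = scores.topk(self.k, dim=-1)
        weights = F.softmax(weights, dim=-1)   # normalize over the k chosen experts
        out = torch.zeros_like(x)
        for j, expert in enumerate(self.experts):
            for slot in range(self.k):
                mask = idx[:, slot] == j       # tokens whose slot-th choice is expert j
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

x = torch.randn(8, 64)
print(TinyMoE()(x).shape)  # torch.Size([8, 64])
```

Real MoE implementations like Tutel replace this per-expert loop with fused dispatch/combine kernels and all-to-all communication across devices.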
To speed up inference for long-context LLMs, attention is computed via approximate, dynamic sparse methods,...
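As a rough illustration of the idea (not this repo's actual kernels or selection heuristics), sparse attention scores only a per-query subset of keys instead of the full n-by-n matrix; the toy version below restricts each query to a trailing local window:

```python
import torch
import torch.nn.functional as F

def local_sparse_attention(q, k, v, window=64):
    """Each query attends only to keys in a trailing window of size `window`.

    A toy stand-in for dynamic sparse attention: instead of the full
    (n x n) score matrix, only O(n * window) scores are computed.
    """
    n, d = q.shape
    out = torch.empty_like(v)
    for i in range(n):                  # per-query loop for clarity, not speed
        lo = max(0, i - window + 1)
        scores = q[i] @ k[lo:i + 1].T / d ** 0.5
        probs = F.softmax(scores, dim=-1)
        out[i] = probs @ v[lo:i + 1]
    return out

q = k = v = torch.randn(256, 32)
print(local_sparse_attention(q, k, v).shape)  # torch.Size([256, 32])
```

Dynamic variants pick the sparse pattern per head and per input (e.g., a window plus a few globally important columns) rather than a fixed window.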
Subseasonal forecasting models
Repo for WWW 2022 paper: Progressively Optimized Bi-Granular Document Representation for Scalable...
Common PyTorch Modules
Building modular LMs with parameter-efficient fine-tuning.
This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using S...
Community for applying LLMs to robotics and a robot simulator with ChatGPT integration
Table Transformer (TATR) is a deep learning model for extracting tables from unstructured documen...
MSCCL++: A GPU-driven communication stack for scalable AI applications
MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.
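Usage follows a simple pipeline pattern; the snippet below is from memory of MII's README, so treat the exact names and signature as an assumption that may differ across versions:

```python
import mii  # pip install deepspeed-mii; check the repo for the current interface

# Load a model into a DeepSpeed-optimized inference pipeline
pipe = mii.pipeline("mistralai/Mistral-7B-v0.1")

# Batched generation
responses = pipe(["DeepSpeed is", "Seattle is"], max_new_tokens=64)
for r in responses:
    print(r)
```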
Foundation Architecture for (M)LLMs
Generation of protein sequences and evolutionary alignments via discrete diffusion models
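For intuition only: in absorbing-state discrete diffusion (a common formulation; whether this repo uses exactly this schedule is an assumption), the forward process masks tokens at random and a network is trained to predict the originals. A minimal sketch of the forward corruption step:

```python
import torch

MASK = 0  # hypothetical mask token id

def corrupt(tokens, t, T):
    """Absorbing-state discrete diffusion forward step: mask each token
    independently with probability t / T (one common noise schedule)."""
    drop = torch.rand(tokens.shape) < t / T
    return torch.where(drop, torch.full_like(tokens, MASK), tokens)

seq = torch.randint(1, 21, (1, 12))   # toy sequence over 20 amino-acid ids
print(corrupt(seq, t=5, T=10))        # roughly half the positions masked
```

Generation then runs the learned denoiser in reverse, iteratively unmasking positions.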
AICI: Prompts as (Wasm) Programs