MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.
APACHE-2.0 License
Tutel MoE: An Optimized Mixture-of-Experts Implementation
Community for applying LLMs to robotics and a robot simulator with ChatGPT integration
MSCCL++: A GPU-driven communication stack for scalable AI applications
AI-Sentry: A lightweight, pluggable facade layer for Azure OpenAI, addressing common cross-cutti...
A lightning fast Finite State machine and REgular expression manipulation library.
Repo for WWW 2022 paper: Progressively Optimized Bi-Granular Document Representation for Scalable...
To speed up inference for long-context LLMs, computes attention approximately using dynamic sparsity,...
Official codebase for MEGAVERSE (published in ACL: NAACL 2024)
Generation of protein sequences and evolutionary alignments via discrete diffusion models
AICI: Prompts as (Wasm) Programs
In Greek mythology, Chiron is a wise centaur known for his knowledge of medicine and healing.
Subseasonal forecasting models
Foundation Architecture for (M)LLMs
Building modular LMs with parameter-efficient fine-tuning.