Medusa

Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads

Apache-2.0 License

Stars: 2.2K

Ecosystems: Jupyter Notebook

Medusa - Medusa-v0.1 Latest Release

Published by harveyp123 about 1 year ago

Medusa is an easy-to-use framework that democratizes acceleration techniques for LLM generation. Medusa-v0.1 adds several extra lightweight decoding heads to the base model, which removes the need for a separate draft model.
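The idea of extra decoding heads replacing a draft model can be illustrated with a toy sketch. This is not Medusa's actual code or API: `base_next`, `make_head`, `medusa_step`, and `generate` are hypothetical names, the "model" is a simple arithmetic rule, and verification is assumed to be a plain greedy match against the base model's own prediction. In a real system the heads share the base model's hidden state and the guesses are checked in one batched forward pass.

```python
from typing import Callable, List, Tuple

Token = int
PredictFn = Callable[[List[Token]], Token]


def base_next(seq: List[Token]) -> Token:
    """Toy stand-in for the base LLM: a deterministic next-token rule."""
    return (seq[-1] * 3 + 1) % 10


def make_head(offset: int, wrong_on: set) -> PredictFn:
    """Hypothetical extra head guessing the token `offset` positions
    after the base model's next token. It is deliberately wrong for
    some contexts so that rejection is exercised."""
    def head(seq: List[Token]) -> Token:
        tmp = list(seq)
        for _ in range(offset + 1):
            tmp.append(base_next(tmp))
        guess = tmp[-1]
        if seq[-1] in wrong_on:
            guess = (guess + 1) % 10  # simulate a bad guess
        return guess
    return head


def medusa_step(seq: List[Token], base: PredictFn,
                heads: List[PredictFn]) -> List[Token]:
    """One decoding step: the base token plus every head guess that the
    base model confirms, stopping at the first mismatch."""
    accepted = [base(seq)]
    for guess in [h(seq) for h in heads]:
        # Assumed greedy verification: keep the guess only if the base
        # model would have produced the same token at this position.
        if base(seq + accepted) == guess:
            accepted.append(guess)
        else:
            break
    return accepted


def generate(prompt: List[Token], n_tokens: int, base: PredictFn,
             heads: List[PredictFn]) -> Tuple[List[Token], int]:
    """Generate n_tokens and report how many decoding steps were used."""
    seq, steps = list(prompt), 0
    while len(seq) - len(prompt) < n_tokens:
        seq.extend(medusa_step(seq, base, heads))
        steps += 1
    return seq[:len(prompt) + n_tokens], steps


heads = [make_head(1, {3}), make_head(2, {3})]
with_heads, steps_with_heads = generate([1], 8, base_next, heads)
plain, steps_plain = generate([1], 8, base_next, [])
```

Because every accepted guess is confirmed by the base model, the output matches ordinary greedy decoding; the heads only reduce the number of decoding steps when their guesses are right.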

Package Rankings

Top 38.26% on PyPI.org

Related Projects

llm-course

Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.

17 Jun 2023 37,645