@anyscale @ray-project
A high-throughput and memory-efficient inference and serving engine for LLMs
Python - Released: 09 Feb 2023 - 28,039