Robert Shaw

LLM Inference @neuralmagic

Ecosystems: PyTorch, Llama, CUDA

Projects

vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Python - Released: 09 Feb 2023 - 28,039 stars
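A minimal sketch of offline batch generation with vLLM's Python API, assuming the `vllm` package is installed and a GPU is available; the model name is only illustrative.

```python
# Minimal vLLM offline inference sketch (model choice is an assumption).
from vllm import LLM, SamplingParams

# Load a small model for illustration; any Hugging Face causal LM works here.
llm = LLM(model="facebook/opt-125m")

# Sampling parameters control decoding behavior.
sampling_params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)

prompts = [
    "Hello, my name is",
    "The key idea behind paged attention is",
]

# generate() batches the prompts and returns one RequestOutput per prompt.
outputs = llm.generate(prompts, sampling_params)

for output in outputs:
    print(f"Prompt: {output.prompt!r}")
    print(f"Completion: {output.outputs[0].text!r}")
```

For online serving, the same engine can be exposed via the project's OpenAI-compatible server (`vllm serve <model>`), which is the typical deployment path.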