Fast inference engine for Transformer models
MIT License
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs)...
In this repository, I will share some useful notes and references about deploying deep learning-b...