A Python wrapper around Hugging Face's TGI (text-generation-inference) and TEI (text-embedding-inference) servers.
APACHE-2.0 License
Py-TXI is a Python wrapper around Text-Generation-Inference and Text-Embedding-Inference that lets you create and run TGI/TEI instances through the awesome docker-py, in a style similar to the Transformers API.
pip install py-txi
Py-TXI is designed to be used in a similar way to the Transformers API. We use docker-py (instead of a dirty subprocess solution) so that the containers you run are linked to the main process and are stopped automatically when your code finishes or fails.
Here's an example of how to use it:
from py_txi import TGI, TGIConfig
llm = TGI(config=TGIConfig(model_id="bigscience/bloom-560m", gpus="0"))
output = llm.generate(["Hi, I'm a language model", "I'm fine, how are you?"])
print("LLM:", output)
llm.close()
Output: LLM: [' student. I have a problem with the following code. I have a class that has a method that', '"\n\n"I\'m fine," said the girl, "but I don\'t want to be alone.']
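The example above calls `llm.close()` explicitly, so an exception raised by `generate` would skip that line. As a minimal sketch, the standard library's `contextlib.closing` can make sure `close()` runs either way. `FakeServer` below is a hypothetical stand-in (not part of py-txi) so the snippet runs without Docker; with py-txi you would presumably wrap `TGI(config=...)` the same way, since it exposes a `close()` method.

```python
from contextlib import closing


class FakeServer:
    """Hypothetical stand-in for a TGI/TEI wrapper; not part of py-txi."""

    def __init__(self):
        self.closed = False

    def generate(self, prompts):
        # Echo the prompts back so the sketch runs without a container.
        return [f"echo: {p}" for p in prompts]

    def close(self):
        # A real wrapper would stop its container here.
        self.closed = True


server = FakeServer()

# closing() calls server.close() when the block exits, even if the body raises.
with closing(server) as llm:
    output = llm.generate(["Hi, I'm a language model"])

print("LLM:", output)
print("closed:", server.closed)
```

This only tightens cleanup inside your own script; the automatic container shutdown described earlier still applies when the process exits.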
from py_txi import TEI, TEIConfig
embed = TEI(config=TEIConfig(model_id="BAAI/bge-base-en-v1.5"))
output = embed.encode(["Hi, I'm an embedding model", "I'm fine, how are you?"])
print("Embed:", output)
embed.close()
Output: Embed: [array([[ 0.01058742, -0.01588806, -0.03487622, ..., -0.01613717, 0.01772875, -0.02237891]], dtype=float32), array([[ 0.02815401, -0.02892136, -0.0536355 , ..., 0.01225784, -0.00241452, -0.02836569]], dtype=float32)]
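A common next step with embeddings like these is to compare them. The sketch below computes the cosine similarity of two vectors in plain Python. The `cosine_similarity` helper and the toy values are illustrative assumptions, not part of py-txi; the nested lists only mimic the shape shown above, where each input text yields one row of shape `(1, dim)`.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length 1-D sequences of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy stand-ins shaped like the output above: one (1, dim) row per input text.
output = [[[1.0, 0.0, 1.0]], [[1.0, 1.0, 0.0]]]

# Index [0] picks the single row out of each (1, dim) entry.
score = cosine_similarity(output[0][0], output[1][0])
print(round(score, 3))
```

With real results from `embed.encode`, the same indexing (`output[0][0]`, `output[1][0]`) should select the two embedding rows, assuming the return shape matches the printed output.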
That's it! Now you can write your Python scripts using the power of TGI and TEI without having to worry about the underlying Docker containers.