Maximize your usage of OpenAI models without hitting rate limits
GPL-3.0 License
A utility library for dealing with token counting for messages sent to an LLM (currently OpenAI m...
Code for my PyCon talk "Writing RESTful Web Services with Flask"
Easy rate-limiting for Python requests
An extension that provides rate limiting for Flask routes.
⚡️ Python client for the unofficial ChatGPT API with auto token regeneration, conversation tracki...
Tuning and evaluation of a RAG pipeline. (Automated optimization to be added soon.)
⏲️ Easy rate limiting for Python using a token bucket algorithm, with async and thread-safe decor...
LangChain chat model abstractions for dynamic failover, load balancing, chaos engineering, and more!