Count and truncate text based on tokens
Apache-2.0 License
A utility library for dealing with token counting for messages sent to an LLM (currently OpenAI m...
Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)
Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.
Ongoing research on training transformer language models at scale, including BERT and GPT-2
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
Plugin for LLM adding support for the GPT4All collection of models
CLI tool that wraps the ChatGPT Python API
Home of StarCoder2!
A wrapper around the stdlib `tokenize` that round-trips.
SQL functions for calling OpenAI APIs
A lightweight but powerful library to build token indices for NLP tasks, compatible with major De...