openai-messages-token-helper

A helper library for estimating the tokens used by messages and for building message lists that fit within a model's token limit. Currently designed to work with the OpenAI GPT models (including GPT-4 turbo with vision). Uses the tiktoken library for tokenizing text and the Pillow library for image-related calculations.

Installation

Install the package:

python3 -m pip install openai-messages-token-helper

Usage

The library provides the following functions:

`build_messages`

Build a list of messages for a chat conversation, given the system prompt, new user message, and past messages. The function will truncate the history of past messages if necessary to stay within the token limit.

Arguments:

model (str): The model name to use for token calculation, like gpt-3.5-turbo.
system_prompt (str): The initial system prompt message.
tools (list[openai.types.chat.ChatCompletionToolParam]): (Optional) The tools that will be used in the conversation. These won't be part of the final returned messages, but they will be used to calculate the token count.
tool_choice (openai.types.chat.ChatCompletionToolChoiceOptionParam): (Optional) The tool choice that will be used in the conversation. This won't be part of the final returned messages, but it will be used to calculate the token count.
new_user_content (str | list[openai.types.chat.ChatCompletionContentPartParam]): (Optional) The content of the new user message to append.
past_messages (list[openai.types.chat.ChatCompletionMessageParam]): (Optional) The list of past messages in the conversation.
few_shots (list[openai.types.chat.ChatCompletionMessageParam]): (Optional) A few-shot list of messages to insert after the system prompt.
max_tokens (int): (Optional) The maximum number of tokens allowed for the conversation.
fallback_to_default (bool): (Optional) Whether to fall back to the default model/token limits if the model is not found. Defaults to False.

Returns:

list[openai.types.chat.ChatCompletionMessageParam]

Example:

from openai_messages_token_helper import build_messages

messages = build_messages(
    model="gpt-35-turbo",
    system_prompt="You are a bot.",
    new_user_content="That wasn't a good poem.",
    past_messages=[
        {
            "role": "user",
            "content": "Write me a poem",
        },
        {
            "role": "assistant",
            "content": "Tuna tuna I love tuna",
        },
    ],
    few_shots=[
        {
            "role": "user",
            "content": "Write me a poem",
        },
        {
            "role": "assistant",
            "content": "Tuna tuna is the best",
        },
    ]
)
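
The truncation behavior described above can be illustrated with a minimal sketch. It is not the library's implementation: `truncate_history` and `word_count` are hypothetical names, the token counter is a toy that counts words, and the newest-first strategy is an assumption about how a history might reasonably be trimmed.

```python
from typing import Callable

Message = dict[str, str]


def truncate_history(
    system_prompt: str,
    new_user_content: str,
    past_messages: list[Message],
    max_tokens: int,
    count_tokens: Callable[[Message], int],
) -> list[Message]:
    """Keep the newest past messages that fit in the budget (hypothetical helper)."""
    system = {"role": "system", "content": system_prompt}
    user = {"role": "user", "content": new_user_content}
    # The system prompt and the new user message are always kept.
    used = count_tokens(system) + count_tokens(user)
    kept: list[Message] = []
    # Walk the history newest-first and stop at the first message that no longer fits.
    for message in reversed(past_messages):
        cost = count_tokens(message)
        if used + cost > max_tokens:
            break
        kept.append(message)
        used += cost
    # Restore chronological order before assembling the final list.
    return [system, *reversed(kept), user]


def word_count(message: Message) -> int:
    """Toy counter: one token per whitespace-separated word (not a real tokenizer)."""
    return len(message["content"].split())


history = [
    {"role": "user", "content": "Write me a poem"},
    {"role": "assistant", "content": "Tuna tuna I love tuna"},
]
trimmed = truncate_history(
    "You are a bot.", "That wasn't a good poem.", history, 15, word_count
)
```

With a budget of 15 toy tokens, only the most recent past message fits, so the oldest one is dropped while the system prompt and new user message are preserved.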

`count_tokens_for_message`

Counts the number of tokens in a message.

Arguments:

model (str): The model name to use for token calculation, like gpt-3.5-turbo.
message (openai.types.chat.ChatCompletionMessageParam): The message to count tokens for.
default_to_cl100k (bool): Whether to default to the CL100k encoding if the model is not found.

Returns:

int: The number of tokens in the message.

Example:

from openai_messages_token_helper import count_tokens_for_message

message = {
    "role": "user",
    "content": "Hello, how are you?",
}
model = "gpt-4"
num_tokens = count_tokens_for_message(model, message)
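
As a rough illustration of what a per-message count involves, the sketch below follows the convention from the OpenAI cookbook for recent chat models: a fixed overhead of 3 tokens per message, plus the encoded length of each field, plus 1 token when a `name` field is present. Whether the library uses exactly these constants is an assumption; `estimate_message_tokens` and `fake_encode` are hypothetical names, and the sketch only handles text content.

```python
from typing import Callable

TOKENS_PER_MESSAGE = 3  # framing overhead per message (cookbook convention, assumed here)
TOKENS_PER_NAME = 1  # extra token when a message carries a "name" field


def estimate_message_tokens(
    message: dict[str, str], encode: Callable[[str], list[int]]
) -> int:
    """Estimate tokens for one text-only chat message (hypothetical helper)."""
    total = TOKENS_PER_MESSAGE
    for key, value in message.items():
        # Every field value (role, content, name) is tokenized and counted.
        total += len(encode(value))
        if key == "name":
            total += TOKENS_PER_NAME
    return total


def fake_encode(text: str) -> list[int]:
    """Stand-in tokenizer: one token per whitespace-separated word."""
    return list(range(len(text.split())))


estimate = estimate_message_tokens(
    {"role": "user", "content": "Hello, how are you?"}, fake_encode
)
```

For realistic numbers, the stand-in encoder could be swapped for a tiktoken encoding, for example `tiktoken.encoding_for_model("gpt-4").encode`.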

`count_tokens_for_image`

Counts the number of tokens for a base64-encoded image sent to GPT-4-vision.

Arguments:

image (str): The base64-encoded image.

Returns:

int: The number of tokens used up for the image.
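
For context, OpenAI's published rules for GPT-4 Turbo with vision price an image by 512 px tiles after two downscaling steps. The sketch below applies those rules to a width and height directly; it is an approximation under stated assumptions, not the library's code. `estimate_image_tokens` is a hypothetical name, and the choice to only ever scale down (never up) is an assumption.

```python
import math

LOW_DETAIL_COST = 85  # flat cost for detail="low"
COST_PER_TILE = 170  # cost of each 512 x 512 tile in high detail
BASE_COST = 85  # fixed cost added to every high-detail image


def estimate_image_tokens(width: int, height: int, detail: str = "high") -> int:
    """Estimate the token cost of an image from its pixel dimensions (hypothetical helper)."""
    if detail == "low":
        return LOW_DETAIL_COST
    # Step 1: shrink to fit within a 2048 x 2048 square, keeping the aspect ratio.
    if max(width, height) > 2048:
        scale = 2048 / max(width, height)
        width, height = int(width * scale), int(height * scale)
    # Step 2: shrink so the shortest side is at most 768 px (assumption: never upscale).
    if min(width, height) > 768:
        scale = 768 / min(width, height)
        width, height = int(width * scale), int(height * scale)
    # Step 3: count the 512 px tiles needed to cover the resized image.
    tiles = math.ceil(width / 512) * math.ceil(height / 512)
    return tiles * COST_PER_TILE + BASE_COST


square = estimate_image_tokens(1024, 1024)  # resized to 768 x 768, 4 tiles
tall = estimate_image_tokens(2048, 4096)  # resized to 768 x 1536, 6 tiles
```

Here `square` is 765 and `tall` is 1105, matching the worked examples in OpenAI's vision guide. Reading the dimensions out of the base64 data is the part the library delegates to Pillow.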

Example:

```python
from openai_messages_token_helper import count_tokens_for_image

image = "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAEA..."
num_tokens = count_tokens_for_image(image)
```

`get_token_limit`

Gets the token limit for a given GPT model name. Both OpenAI.com and Azure OpenAI model names are supported.

Arguments:

model (str): The model name to use for token calculation, like gpt-3.5-turbo (OpenAI.com) or gpt-35-turbo (Azure).
default_to_minimum (bool): Whether to default to the minimum token limit if the model is not found.

Returns:

int: The token limit for the model.

Example:

from openai_messages_token_helper import get_token_limit

model = "gpt-4"
max_tokens = get_token_limit(model)
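
The `default_to_minimum` flag suggests a table lookup with a conservative fallback. A minimal sketch of that idea follows, assuming an unknown model raises an error unless the flag is set. The table and the name `lookup_token_limit` are hypothetical, and the values are the originally published context windows for these models, not necessarily the numbers the library returns.

```python
# Illustrative values only: originally published context windows, not the library's table.
ILLUSTRATIVE_LIMITS = {
    "gpt-3.5-turbo": 4096,  # OpenAI.com naming
    "gpt-35-turbo": 4096,  # Azure naming for the same model
    "gpt-4": 8192,
    "gpt-4-32k": 32768,
}


def lookup_token_limit(model: str, default_to_minimum: bool = False) -> int:
    """Return the limit for a known model, or fall back to the smallest known limit."""
    if model in ILLUSTRATIVE_LIMITS:
        return ILLUSTRATIVE_LIMITS[model]
    if default_to_minimum:
        # The smallest limit is the safest guess for a model we know nothing about.
        return min(ILLUSTRATIVE_LIMITS.values())
    raise ValueError(f"Unknown model: {model}")


known = lookup_token_limit("gpt-4")
fallback = lookup_token_limit("my-custom-deployment", default_to_minimum=True)
```

Falling back to the minimum keeps a message list within budget for any model in the table, at the cost of truncating more history than strictly necessary for larger models.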
