cappr | Llama Ecosystem Directory


cappr - v0.6.1 - fix openai.token_logprobs

Published by kddubey about 1 year ago

Breaking changes

cappr.openai.token_logprobs now prepends a space to each text by default. Set end_of_prompt="" if you don't want that.

New features

None

Bug fixes

cappr.openai's (still highly experimental) discount feature works for a wider range of completions

cappr - v0.6.0 - HF no-batching module

Published by kddubey about 1 year ago

Breaking changes

None

New features

To minimize memory usage, use cappr.huggingface.classify_no_batch. See this section of the docs. I ended up needing this feature to demo Mistral 7B on a T4 GPU

Bug fixes

show_progress_bar=False now works, my b

cappr - v0.5.1 - allow installation with no dependencies

Published by kddubey about 1 year ago

Breaking changes

None

New features

cappr can be installed with no dependencies. See this section of the docs

Bug fixes

None

cappr - v0.5.0 - support GGUF models using llama-cpp-python

Published by kddubey about 1 year ago

Breaking changes

completions is not allowed to be an empty sequence

New features

Run GGUF models with the cappr.llama_cpp.classify module. Install using:
```
pip install "cappr[llama-cpp]"
```
See this section of the docs. See this demo for an example.

Bug fixes

None

cappr - v0.4.7 - breaking little things

Published by kddubey about 1 year ago

Breaking changes

end_of_prompt is restricted to be either a single space, " ", or the empty string, "". After much thought and experimentation, I realized that anything else is unnecessarily complicated
Support for the OpenAI API model gpt-3.5-turbo-instruct has been deprecated b/c their API won't allow setting echo=True, logprobs=1 starting tomorrow
The keyword argument for the (still highly experimental) discount feature, log_marginal_probs_completions, has been renamed to log_marg_probs_completions
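
To make the end_of_prompt restriction concrete, here's a minimal sketch of the kind of check and string join involved. The helper name join_prompt_and_completion is made up for illustration and is not part of cappr; the only thing taken from the release note is that end_of_prompt must be " " or "".

```python
ALLOWED_END_OF_PROMPT = (" ", "")


def join_prompt_and_completion(
    prompt: str, completion: str, end_of_prompt: str = " "
) -> str:
    """Hypothetical helper (not cappr's API): validate end_of_prompt, then
    build the full text whose completion tokens would be scored."""
    if end_of_prompt not in ALLOWED_END_OF_PROMPT:
        raise ValueError(
            f'end_of_prompt must be " " or "", got {end_of_prompt!r}.'
        )
    return prompt + end_of_prompt + completion


# Default: a single space separates the prompt from the completion.
print(join_prompt_and_completion("The sky is", "blue"))
# If the prompt already ends with the separator you want, pass "".
print(join_prompt_and_completion("The sky is ", "blue", end_of_prompt=""))
```

Both calls build the same text, "The sky is blue". Anything else, e.g. end_of_prompt="\n", raises a ValueError in this sketch.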

New features

You can input your OpenAI API key dynamically: api_key=
The User Guide is much better

Bug fixes

None

cappr - v0.4.6 - support more types of sequence inputs

Published by kddubey about 1 year ago

Breaking changes

None

New features

Input checks on prompts and completions are more accurate. You can now input, e.g., a polars or pandas Series of strings

Bug fixes

None

cappr - v0.4.5 - niceties

Published by kddubey about 1 year ago

Breaking changes

There are stronger input checks to avoid silent failures. prompts cannot be empty. completions cannot be empty or a pure string (it has to be a sequence of strings)
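
Here's a rough sketch of why a pure string is a silent failure for completions, and what checks like these could look like. The function names are hypothetical and the real checks in cappr may differ.

```python
def check_prompts(prompts) -> None:
    """Hypothetical check (not cappr's code): prompts must be non-empty."""
    if len(prompts) == 0:
        raise ValueError("prompts must not be empty.")


def check_completions(completions) -> None:
    """Hypothetical check (not cappr's code): completions must be a
    non-empty sequence of strings, not one string."""
    if isinstance(completions, str):
        # Iterating over "yes" yields "y", "e", "s", so a single string
        # would be treated as three one-character completions.
        raise TypeError(
            "completions must be a sequence of strings, not a single string."
        )
    if len(completions) == 0:
        raise ValueError("completions must not be empty.")


check_prompts(["Is the sky blue?"])
check_completions(["yes", "no"])  # passes silently
```

Passing completions="yes" raises a TypeError here instead of quietly classifying over "y", "e", and "s".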

New features

Pass normalize=False when you want raw, unnormalized probabilities for, e.g., multi-label classification applications
You can input a single prompt string or Example object. You no longer have to wrap it in a list and then unwrap it
You can disable progress bars using show_progress_bar=False
cappr.huggingface type-hints the model as a PreTrainedModelForCausalLM for greater clarity
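
To see why unnormalized probabilities matter for multi-label classification, here's a small sketch in plain Python. The probabilities, labels, and threshold are made up, and the normalize function is only a stand-in for what normalize=True presumably does (rescale so the probabilities sum to 1).

```python
def normalize(probs: list[float]) -> list[float]:
    """Rescale probabilities so they sum to 1."""
    total = sum(probs)
    return [p / total for p in probs]


labels = ["sports", "politics", "weather"]
raw_probs = [0.6, 0.5, 0.1]  # made-up, unnormalized scores per label
normalized_probs = normalize(raw_probs)

threshold = 0.45  # arbitrary cutoff for this illustration

# Multi-label: each label is judged on its own raw probability.
labels_from_raw = [
    label for label, p in zip(labels, raw_probs) if p >= threshold
]
# After normalizing, labels compete with each other for probability mass.
labels_from_normalized = [
    label for label, p in zip(labels, normalized_probs) if p >= threshold
]

print(labels_from_raw)
print(labels_from_normalized)
```

With the raw values, both "sports" and "politics" clear the threshold. After normalization only "sports" does, because normalizing forces the labels to share a total of 1.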

Bug fixes

cappr.huggingface doesn't modify the model or tokenizer anymore, sorry bout that
The jagged/inhomogeneous numpy array warning from earlier numpy versions (when using _examples functions) is correctly handled

cappr - v0.4.0 - HF single-token speedup, token_logprobs, discount feature

Published by kddubey about 1 year ago

Breaking changes

None

New features

cappr.huggingface is faster when all of the completions are single tokens. Specifically, we just do inference once on the prompts, and don't repeat data unnecessarily
cappr.huggingface implements token_logprobs like cappr.openai did
cappr.huggingface now supports the (highly experimental) discount feature (mentioned at the bottom of this answer) like cappr.openai did

Bug fixes

None

cappr - v0.3.0 - support Llama and Llama 2

Published by kddubey about 1 year ago

Breaking changes

None

New features

cappr.huggingface now supports Llama and Llama 2 (chat, raw, GPTQd)

Bug fixes

None

cappr - v0.2.6 - deprecate model string as input to HF functions

Published by kddubey over 1 year ago

Breaking changes

cappr.huggingface functions only allow model_and_tokenizer input, not the string model input.

New features

None

Bug fixes

Correct type hint for predict_proba_examples functions to reflect that the 2nd dimension is always an array.

cappr - v0.2.5 - add prior kwarg to HF no-cache functions

Published by kddubey over 1 year ago

Breaking changes

None

New features

None

Bug fixes

cappr.huggingface.classify.predict_proba and cappr.huggingface.classify.predict now accept a prior kwarg, as was intended (I just forgot to add it in).
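
For intuition, here's a sketch of what a prior usually does in this kind of classifier: multiply each completion's likelihood by its prior probability, then renormalize. This is Bayes' rule written in plain Python, assuming that's how the prior kwarg is applied. The function name apply_prior and the numbers are made up.

```python
def apply_prior(likelihoods: list[float], prior: list[float]) -> list[float]:
    """Hypothetical helper (not cappr's API): posterior is proportional to
    likelihood times prior, rescaled to sum to 1."""
    if len(likelihoods) != len(prior):
        raise ValueError("likelihoods and prior must have the same length.")
    unnormalized = [lik * pri for lik, pri in zip(likelihoods, prior)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


# The model is indifferent between the two completions...
likelihoods = [0.2, 0.2]
# ...but we believe the first class is far more common.
prior = [0.9, 0.1]
posterior = apply_prior(likelihoods, prior)
print(posterior)
```

When the likelihoods are tied, the posterior just reproduces the prior. When they aren't, the prior shifts the prediction toward the classes you expect more often.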

cappr - v0.2.4 - fix token slicing

Published by kddubey over 1 year ago

Breaking changes

None

New features

None

Bug fixes

For OpenAI models, the completion token probabilities should actually be sliced based on the tokenization of end_of_prompt + completion, not just completion. Based on a few experiments, this change doesn't impact statistical performance. But it should be fixed ofc.
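
Here's a toy sketch of the slicing fix. It uses a character-level tokenizer and fake log-probs purely for illustration, so the token counts are exaggerated compared to a real tokenizer. The function names are made up.

```python
from typing import Callable


def toy_tokenize(text: str) -> list[str]:
    """Toy character-level tokenizer, only for illustration."""
    return list(text)


def slice_completion_log_probs(
    log_probs: list[float],
    completion: str,
    end_of_prompt: str,
    tokenize: Callable[[str], list[str]],
) -> list[float]:
    """Keep the trailing log-probs that belong to end_of_prompt + completion."""
    num_tokens = len(tokenize(end_of_prompt + completion))
    return log_probs[len(log_probs) - num_tokens:]


prompt, end_of_prompt, completion = "Q: 2+2?", " ", "4"
text = prompt + end_of_prompt + completion
# Fake log-probs, one per token of the full text: 0.0, -1.0, ..., -8.0
log_probs = [-float(i) for i in range(len(toy_tokenize(text)))]

# Sliced on end_of_prompt + completion: covers " " and "4".
fixed = slice_completion_log_probs(
    log_probs, completion, end_of_prompt, toy_tokenize
)
# Sliced on the completion alone (the old behavior): covers only "4".
old = slice_completion_log_probs(log_probs, completion, "", toy_tokenize)
print(fixed, old)
```

With a real subword tokenizer, " 4" and "4" often have the same number of tokens, which is consistent with the change not affecting statistical performance in practice.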

cappr - v0.2.3 - allow pre-computed completion log-probs for the discount feature

Published by kddubey over 1 year ago

Breaking changes

None

New features

Allow for pre-computed completion log-probs for the experimental discount feature. Use the newly surfaced function, cappr.openai.token_logprobs, to compute them once and re-use them.

Bug fixes

None

cappr - v0.2.2 - highly experimental discount feature

Published by kddubey over 1 year ago

Breaking changes

Deprecate cappr.utils.classify.agg_log_probs_from_constant_completions. I doubt anyone was using this. If you were, then use cappr.utils.classify.agg_log_probs from now on (it does the exact same thing).

New features

Highly experimental feature which discounts completions by their marginal probability. See my updated answer here. The plan is to evaluate this method more thoroughly and discuss it in the user guide. For now, feel free to mess with it.
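
A rough sketch of the idea, assuming the discount is applied in log space as log Pr(completion | prompt) minus discount times log Pr(completion). That formula is my reading of "discounts completions by their marginal probability", not something confirmed here, and the numbers are made up.

```python
import math


def discounted_score(
    log_prob_given_prompt: float, log_marg_prob: float, discount: float = 1.0
) -> float:
    """Hypothetical scoring rule (an assumption, not cappr's confirmed
    formula): penalize completions that are likely regardless of the prompt."""
    return log_prob_given_prompt - discount * log_marg_prob


# Completion A is likely given the prompt, but it's also likely in general.
a_given_prompt, a_marginal = math.log(0.5), math.log(0.5)
# Completion B is less likely given the prompt, but rare in general.
b_given_prompt, b_marginal = math.log(0.2), math.log(0.05)

# No discount: A wins on conditional probability alone.
no_discount = (
    discounted_score(a_given_prompt, a_marginal, discount=0.0),
    discounted_score(b_given_prompt, b_marginal, discount=0.0),
)
# Full discount: B wins because the prompt raised its probability 4x.
full_discount = (
    discounted_score(a_given_prompt, a_marginal, discount=1.0),
    discounted_score(b_given_prompt, b_marginal, discount=1.0),
)
print(no_discount, full_discount)
```

The marginal log-probs don't depend on the prompt, so they only need to be computed once per completion.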

Bug fixes

Fix type hint for tokenizer: AutoTokenizer to PreTrainedTokenizer.

cappr - v0.2.1 - allow prior to be a numpy array

Published by kddubey over 1 year ago

Breaking changes

None

New features

None

Bug fixes

Allow prior to be a numpy array

cappr - v0.2.0 - add HF no-cache module

Published by kddubey over 1 year ago

Breaking changes

None

New features

Adds cappr.huggingface.classify_no_cache, which appears to be faster for non-batch processing. This may be a bug tho lol. If it is and I fix it, I'm going to hide this module again, which will be a breaking change.

Here's its documentation.

Bug fixes

None

cappr - v0.1.0 - first release

Published by kddubey over 1 year ago

See the documentation

Installation

If you intend to use OpenAI models, sign up for the OpenAI API here, and then set the environment variable OPENAI_API_KEY. For zero-shot classification, OpenAI models are currently far ahead of others. But using them will cost ya 💰!

Install with pip:

```
python -m pip install cappr
python -m pip install "cappr[hf]"
python -m pip install "cappr[demos]"
```

Package Rankings

Top 18.73% on Pypi.org
