Completion After Prompt Probability. Make your LLM make a choice
APACHE-2.0 License
Published by kddubey about 1 year ago
- `cappr.openai.token_logprobs` now prepends a space to each text by default. Set `end_of_prompt=""` if you don't want that.
- `cappr.openai`'s (still highly experimental) discount feature works for a wider range of completions.
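A minimal sketch of the new default (the function name here is illustrative, not cappr's internals): each completion gets `end_of_prompt` prepended before its token log-probabilities are computed.

```python
def texts_to_score(completions, end_of_prompt=" "):
    """Sketch: prepend end_of_prompt (default: a single space) to each
    completion, mirroring the new cappr.openai.token_logprobs default."""
    return [end_of_prompt + completion for completion in completions]

# With the default, a space is prepended; with end_of_prompt="", the
# texts are left untouched.
print(texts_to_score(["cat", "dog"]))                    # [' cat', ' dog']
print(texts_to_score(["cat", "dog"], end_of_prompt=""))  # ['cat', 'dog']
```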
Published by kddubey about 1 year ago
- New module: `cappr.huggingface.classify_no_batch`. See this section of the docs. I ended up needing this feature to demo Mistral 7B on a T4 GPU.
- `show_progress_bar=False` now works, my b
Published by kddubey about 1 year ago
- `completions` is not allowed to be an empty sequence.
- Use GGUF models using the `cappr.llama_cpp.classify` module. Install using `pip install "cappr[llama-cpp]"`. See this section of the docs. See this demo for an example.
Published by kddubey about 1 year ago
- `end_of_prompt` is restricted to be a single whitespace, " ", or the empty string, "". After much thought and experimentation, I realized that anything else is unnecessarily complicated.
- The OpenAI API model `gpt-3.5-turbo-instruct` has been deprecated b/c their API won't allow setting `echo=True, logprobs=1` starting tomorrow.
- The keyword argument for the (still highly experimental) discount feature, `log_marginal_probs_completions`, has been renamed to `log_marg_probs_completions`.
- You can input your OpenAI API key dynamically: `api_key=`
- The User Guide is much better.
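The `end_of_prompt` restriction above can be sketched as a simple validation (illustrative only, not cappr's actual code):

```python
def validate_end_of_prompt(end_of_prompt):
    # Sketch of the new rule: only a single space or the empty string
    # is accepted for end_of_prompt.
    if end_of_prompt not in (" ", ""):
        raise ValueError(
            'end_of_prompt must be " " (a single space) or "" (empty)'
        )
    return end_of_prompt
```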
Published by kddubey about 1 year ago
- The type hints for `prompts` and `completions` are more accurate. You can now input, e.g., a polars or pandas Series of strings.
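An illustrative sketch of why a polars or pandas Series works (the function name is hypothetical, not cappr's API): any ordered iterable of strings can simply be materialized as a list and validated.

```python
def coerce_to_list_of_strings(xs):
    # A pandas/polars Series, tuple, or list of strings all satisfy this:
    # iterate, materialize, and check element types.
    out = list(xs)
    if not all(isinstance(x, str) for x in out):
        raise TypeError("expected an iterable of strings")
    return out

# Lists and tuples work; a pandas.Series of strings would too, since
# iterating it yields Python strings.
```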
Published by kddubey about 1 year ago
- `prompts` cannot be empty. `completions` cannot be empty or a pure string (it has to be a sequence of strings).
- Set `normalize=False` when you want raw, unnormalized probabilities for, e.g., multi-label classification applications.
- A single `Example` object is now accepted. You no longer have to wrap it in a list and then unwrap it.
- `show_progress_bar=False`
- `cappr.huggingface` type-hints the model as a `PreTrainedModelForCausalLM` for greater clarity.
- `cappr.huggingface` doesn't modify the model or tokenizer anymore, sorry bout that.
- … (`_examples` functions) is correctly handled.
Published by kddubey about 1 year ago
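What `normalize` controls, sketched with illustrative math (not cappr's code): normalized probabilities are rescaled to sum to 1 across completions (mutually exclusive classes), while unnormalized ones keep each completion's raw probability, which is what multi-label classification needs.

```python
from math import exp

def probs(log_probs, normalize=True):
    # Each log_prob is the log-probability of one completion given the
    # prompt. normalize=True rescales so the class probabilities sum to 1;
    # normalize=False returns each completion's raw probability.
    raw = [exp(lp) for lp in log_probs]
    if normalize:
        total = sum(raw)
        return [p / total for p in raw]
    return raw
```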
- `cappr.huggingface` is faster when all of the completions are single tokens. Specifically, we just do inference once on the prompts, and don't repeat data unnecessarily.
- `cappr.huggingface` implements `token_logprobs` like `cappr.openai` did.
- `cappr.huggingface` now supports the (highly experimental) discount feature (mentioned at the bottom of this answer) like `cappr.openai` did.
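The single-token fast path above can be illustrated with a toy sketch (a real model's forward pass replaces the dict here): one next-token distribution per prompt is enough to score every single-token completion.

```python
def single_token_scores(next_token_log_probs, completions):
    """next_token_log_probs: mapping token -> log-prob from ONE forward
    pass over the prompt. When every completion is a single token, each
    completion's score is just a lookup -- no extra inference needed."""
    return [next_token_log_probs[c] for c in completions]

# One "forward pass" result serves all completions:
dist = {"yes": -0.1, "no": -2.3, "maybe": -3.0}
print(single_token_scores(dist, ["yes", "no"]))  # [-0.1, -2.3]
```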
Published by kddubey about 1 year ago
- `cappr.huggingface` now supports Llama and Llama 2 (chat, raw, GPTQ'd).
Published by kddubey over 1 year ago
- `cappr.huggingface` functions only allow `model_and_tokenizer` input, not the string `model` input.
- Updated the `predict_proba_examples` functions to reflect that the 2nd dimension is always an array.
Published by kddubey over 1 year ago
- `cappr.huggingface.classify.predict_proba` and `cappr.huggingface.classify.predict` now accept a `prior` kwarg, as was intended (I just forgot to add it in).
Published by kddubey over 1 year ago
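What a `prior` does, sketched (illustrative, not cappr's implementation): each completion's likelihood is weighted by its prior probability and the results are renormalized, i.e., a plain Bayes update.

```python
def apply_prior(likelihoods, prior):
    # posterior = likelihood * prior, renormalized (Bayes rule)
    weighted = [l * p for l, p in zip(likelihoods, prior)]
    total = sum(weighted)
    return [w / total for w in weighted]

# A strong prior shifts probability mass toward the favored class:
apply_prior([0.6, 0.4], [0.2, 0.8])  # second class now dominates
```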
- Fixed: probabilities are now computed over `end_of_prompt + completion`, not just `completion`. Based on a few experiments, this change doesn't impact statistical performance. But it should be fixed ofc.
Published by kddubey over 1 year ago
- `cappr.openai.token_logprobs` lets you compute token log-probabilities once and re-use them.
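The compute-once-and-re-use idea, sketched with a hypothetical cache (not cappr's API): wrap an expensive scoring function so each unique text is only scored once.

```python
def make_cached_scorer(score_fn):
    """Wrap an expensive log-prob function so each unique text is scored
    only once; repeated texts hit the cache instead of the API/model."""
    cache = {}
    def scorer(text):
        if text not in cache:
            cache[text] = score_fn(text)
        return cache[text]
    return scorer

calls = []
def expensive(text):          # stand-in for an API call
    calls.append(text)
    return -float(len(text))  # fake log-prob

scorer = make_cached_scorer(expensive)
scorer("hello"); scorer("hello"); scorer("world")
print(len(calls))  # 2 -- "hello" was only scored once
```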
Published by kddubey over 1 year ago
- Removed `cappr.utils.classify.agg_log_probs_from_constant_completions`. I doubt anyone was using this. If you were, then use `cappr.utils.classify.agg_log_probs` from now on (it does the exact same thing).
- Changed the tokenizer type hint from `AutoTokenizer` to `PreTrainedTokenizer`.
Published by kddubey over 1 year ago
- Allows `prior` to be a numpy array.
Published by kddubey over 1 year ago
- Adds `cappr.huggingface.classify_no_cache`, which appears to be faster for non-batch processing. This may be a bug tho lol. If it is and I fix it, I'm going to hide this module again, which will be a breaking change. Here's its documentation.
Published by kddubey over 1 year ago
See the documentation
If you intend on using OpenAI models, sign up for the OpenAI API here, and then set the environment variable `OPENAI_API_KEY`. For zero-shot classification, OpenAI models are currently far ahead of others. But using them will cost ya 💰!
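One way to set that environment variable from Python (standard `os` module; the variable name comes from the note above, the key value is a placeholder):

```python
import os

# cappr's OpenAI module reads the key from this environment variable.
os.environ.setdefault("OPENAI_API_KEY", "sk-...")  # placeholder, not a real key

assert "OPENAI_API_KEY" in os.environ
```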
Install with pip
:
python -m pip install cappr
python -m pip install cappr[hf]
python -m pip install cappr[demos]