deepeval

The LLM Evaluation Framework

APACHE-2.0 License

Downloads: 172.4K
Stars: 1.9K

deepeval - Synthetic Data, Caching, Benchmarks, and GEval improvement (Latest Release)

Published by penguine-ip 7 months ago

For deepeval's latest release v0.21.15, we release:

deepeval - Async Support for Prod

Published by penguine-ip 7 months ago

In deepeval v0.20.85:

deepeval - Conversational Metrics and Synthetic Data Generation

Published by penguine-ip 8 months ago

In DeepEval's latest release, there is now:

deepeval - Production Stability

Published by penguine-ip 8 months ago

For the newest release, deepeval is now stable for production use:

  • reduced package size
  • separated the functionality of pytest from the deepeval test run command
  • included a coverage score for summarization
  • fixed the contextual precision node error
  • released docs for better transparency into how metrics are calculated
  • allowed users to configure RAGAS metrics for custom embedding models: https://docs.confident-ai.com/docs/metrics-ragas#example
  • fixed bugs with checking for package updates

deepeval - Hugging Face and LlamaIndex integration

Published by penguine-ip 8 months ago

For the latest release, DeepEval:

deepeval - LLM-Evals now support all LangChain chatmodels

Published by penguine-ip 9 months ago

deepeval - ALL RAG Metrics now offer score reasoning, and a lot more.

Published by penguine-ip 10 months ago

In this release:

deepeval - Lots of new features

Published by penguine-ip 10 months ago

Lots of new features this release:

  1. JudgementalGPT now supports different languages - useful for our APAC and European friends
  2. RAGAS metrics now support all OpenAI models - useful for those running into context length issues
  3. LLMEvalMetric now returns a reason for its score
  4. deepeval test run now has hooks that are called on test run completion
  5. evaluate now displays retrieval_context for RAG evaluation
  6. The RAGAS metric now displays a breakdown of all its distinct metrics

deepeval - Continuous Evaluation

Published by penguine-ip 11 months ago

Automatically integrated with Confident AI for continuous evaluation throughout the lifetime of your LLM (app):

  • log evaluation results and analyze metric passes / failures
  • compare and pick the optimal hyperparameters (e.g. prompt templates, chunk size, models used, etc.) based on evaluation results
  • debug evaluation results via LLM traces
  • manage evaluation test cases / datasets in one place
  • track events to identify live LLM responses in production
  • add production events to existing evaluation datasets to strengthen evals over time

deepeval - Continuous Evaluation

Published by penguine-ip 11 months ago

Automatically integrated with Confident AI for continuous evaluation throughout the lifetime of your LLM (app):

  • log evaluation results and analyze metric passes / failures
  • compare and pick the optimal hyperparameters (e.g. prompt templates, chunk size, models used, etc.) based on evaluation results
  • debug evaluation results via LLM traces
  • manage evaluation test cases / datasets in one place
  • track events to identify live LLM responses in production
  • add production events to existing evaluation datasets to strengthen evals over time

deepeval - Evaluate entire datasets

Published by penguine-ip 11 months ago

Mid-week bug fixes release with an extra feature:

deepeval - Judgemental GPT

Published by penguine-ip 11 months ago

In this release, deepeval has added support for:

  • JudgementalGPT, a dedicated LLM app developed by Confident AI to perform evaluations more robustly and accurately. JudgementalGPT provides a score and a reason for the score.
  • Parallel testing: execute test cases in parallel and speed up evaluation by up to 100x.
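
Parallel testing helps because LLM-backed evaluations spend most of their time waiting on network calls, so independent test cases can overlap. The sketch below shows that idea with Python's standard `concurrent.futures` module. It is an illustration, not how deepeval implements parallel testing. `evaluate_case` and `run_parallel` are hypothetical names, and the `time.sleep` call stands in for a real model request.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


def evaluate_case(case: Dict[str, str]) -> Dict[str, object]:
    """Hypothetical stand-in for one evaluation that waits on an LLM call."""
    time.sleep(0.05)  # simulated network latency
    passed = case["expected"] in case["actual"]
    return {"name": case["name"], "passed": passed}


def run_parallel(cases: List[Dict[str, str]], max_workers: int = 8) -> List[Dict[str, object]]:
    """Evaluate independent cases concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(evaluate_case, cases))


cases = [
    {"name": f"case-{i}", "expected": "Paris", "actual": "The capital is Paris."}
    for i in range(8)
]
results = run_parallel(cases)
```

The achievable speedup depends on how many cases can run at once and on any rate limits of the model provider, so the number of workers is something to tune rather than maximize.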

deepeval -

Published by penguine-ip 11 months ago

deepeval -

Published by penguine-ip 12 months ago

deepeval -

Published by penguine-ip 12 months ago

deepeval -

Published by penguine-ip 12 months ago

deepeval -

Published by penguine-ip 12 months ago

deepeval - v0.20.12

Published by penguine-ip 12 months ago

deepeval - v0.20.11

Published by penguine-ip 12 months ago

deepeval - v0.20.10

Published by penguine-ip about 1 year ago