🤗 Evaluate: A library for easily evaluating machine learning models and datasets.
Apache-2.0 license
Easily add metrics to your code that actually help you spot and debug issues in production. Built...
Starter pack for NeurIPS LLM Efficiency Challenge 2023.
Beyond the Imitation Game collaborative benchmark for measuring and extrapolating the capabilitie...
[EACL 2024] ICE-Score: Instructing Large Language Models to Evaluate Code
A framework for few-shot evaluation of language models.
Model zoo for different kinds of uncertainty quantification methods used in Natural Language Proc...
The LLM Evaluation Framework
Metric learning algorithms in Python
Evaluate your biometric verification models in seconds.
Tuning and evaluation of RAG pipelines (automated optimization to be added soon).
OpenMMLab Foundational Library for Training Deep Learning Models
TorchMetrics: Machine learning metrics for distributed, scalable PyTorch applications.
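Distributed metric libraries like the one above typically follow an accumulate-then-compute pattern: each batch updates running state, and the final value is derived once at the end. A minimal pure-Python sketch of that pattern (illustrative only, not the library's actual API):

```python
class Accuracy:
    """Minimal accuracy metric using the accumulate-then-compute pattern.

    Illustrative sketch: in a real distributed setting, each worker keeps
    its own counts and the counts are summed across processes before
    compute() is called.
    """

    def __init__(self):
        self.correct = 0
        self.total = 0

    def update(self, preds, targets):
        # Accumulate running counts batch by batch.
        for p, t in zip(preds, targets):
            self.correct += int(p == t)
            self.total += 1

    def compute(self):
        # Derive the final value from the accumulated state.
        return self.correct / self.total if self.total else 0.0


metric = Accuracy()
metric.update([1, 0, 1], [1, 1, 1])  # batch 1
metric.update([0, 0], [0, 1])        # batch 2
print(metric.compute())              # accuracy over all 5 samples -> 0.6
```

Keeping only sufficient statistics (here, two counters) rather than raw predictions is what makes this pattern cheap to synchronize across workers.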
Complete evaluation of traditional "scikit-learn-like" machine learning models for post-operative com...
Evaluating the state of the art in AI.
Object Detection Metrics. 14 object detection metrics: mean Average Precision (mAP), Average Reca...
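For context on the first metric listed: Average Precision is the area under the precision-recall curve built from confidence-ranked detections. A minimal sketch, assuming detections have already been matched to ground truth and sorted by descending confidence (real mAP tools also handle per-class IoU matching and interpolation variants):

```python
def average_precision(matches, num_gt):
    """Average Precision from ranked binary match flags.

    matches: list of 1/0 flags (1 = detection matched a ground-truth box),
             sorted by descending detection confidence.
    num_gt:  total number of ground-truth boxes.
    Illustrative sketch using step-wise rectangles under the PR curve.
    """
    ap, tp = 0.0, 0
    prev_recall = 0.0
    for i, m in enumerate(matches, start=1):
        tp += m
        precision = tp / i
        recall = tp / num_gt
        if m:  # recall only increases on a true positive
            ap += precision * (recall - prev_recall)
            prev_recall = recall
    return ap


# 3 ground-truth boxes; ranked detections: hit, miss, hit
print(average_precision([1, 0, 1], 3))  # 1/3 + (2/3)*(1/3) = 5/9
```

mAP is then the mean of this value across classes (and, in COCO-style evaluation, across IoU thresholds as well).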