✨ RepoBench: Benchmarking Repository-Level Code Auto-Completion Systems - ICLR 2024
CC-BY-4.0 License
Transfer Learning Library for Domain Adaptation, Task Adaptation, and Domain Generalization
Official github repo for C-Eval, a Chinese evaluation suite for foundation models [NeurIPS 2023]
Repository for Multimodal AutoML Benchmark
High-quality single file implementation of Deep Reinforcement Learning algorithms with research-f...
Code for "A Comprehensive Empirical Evaluation on Online Continual Learning" ICCVW 2023 VCL Workshop
[NeurIPS 2024 Datasets and Benchmarks Track] Closed-Loop E2E-AD Benchmark Enhanced by World Model...
[ICLR 2024] SWE-Bench: Can Language Models Resolve Real-world Github Issues?
An open platform for training, serving, and evaluating large language models. Release repo for Vi...
Code accompanying the paper Pretraining Language Models with Human Preferences
🧪Yet Another ICU Benchmark: a holistic framework for the standardization of clinical prediction m...
Planning, Creation, Usage: Benchmarking LLMs for Comprehensive Tool Utilization in Real-World Com...
Measuring Massive Multitask Language Understanding | ICLR 2021
MTEB: Massive Text Embedding Benchmark
CodeGeeX2: A More Powerful Multilingual Code Generation Model