RLHF implementation details of OAI's 2019 codebase
MIT License
CleanRL's implementation of DeepMind's Podracer Sebulba Architecture for Distributed DRL
Code accompanying the paper Pretraining Language Models with Human Preferences
ComfyUI's ControlNet Auxiliary Preprocessors
Chain-of-Hindsight, A Scalable RLHF Method
Parameterized fit and prediction harnesses for PyTorch
[ECCV 2022] New benchmark for evaluating pre-trained models; New supervised contrastive learning fr...
Benchmarking Generalized Out-of-Distribution Detection
Hybrid Discriminative-Generative Training via Contrastive Learning
MARLToolkit: The Multi-Agent Reinforcement Learning Toolkit. Includes implementations of MAPPO, MAD...
Evaluating text-to-image/video/3D models with VQAScore
[NeurIPS 2023] ImageReward: Learning and Evaluating Human Preferences for Text-to-image Generation
DUSt3R: Geometric 3D Vision Made Easy
A lightweight evaluation suite tailored specifically for assessing Indic LLMs across a diverse ra...