lm-human-preference-details

RLHF implementation details of OAI's 2019 codebase

MIT License

Stars: 146

Ecosystems: Python

Related Projects

cleanba

CleanRL's implementation of DeepMind's Podracer Sebulba Architecture for Distributed DRL

12 Feb 2023 105

pretraining-with-human-feedback

Code accompanying the paper Pretraining Language Models with Human Preferences

20 Feb 2023 175

comfyui_controlnet_aux

ComfyUI's ControlNet Auxiliary Preprocessors

17 Aug 2023 2,127

chain-of-hindsight

Chain-of-Hindsight, A Scalable RLHF Method

20 Feb 2023 211

netharn

Parameterized fit and prediction harnesses for PyTorch

31 Mar 2018 39

OmniBenchmark

[ECCV2022] New benchmark for evaluating pre-trained model; New supervised contrastive learning fr...

12 Jul 2022 105

OpenOOD

Benchmarking Generalized Out-of-Distribution Detection

29 Nov 2021 849

hybrid-discriminative-generative

Hybrid Discriminative-Generative Training via Contrastive Learning

17 Jul 2020 75

summarize_from_feedback_details

08 Jan 2024 105

deep-marl-toolkit

MARLToolkit: The Multi-Agent Reinforcement Learning Toolkit. Include implementation of MAPPO, MAD...

08 Aug 2022 70

t2v_metrics

Evaluating text-to-image/video/3D models with VQAScore

16 Dec 2023 192

ImageReward

[NeurIPS 2023] ImageReward: Learning and Evaluating Human Preferences for Text-to-image Generation

01 Apr 2023 1,118

ProteinDT

05 Feb 2023 41

dust3r

DUSt3R: Geometric 3D Vision Made Easy

21 Feb 2024 5,154

indic_eval

A lightweight evaluation suite tailored specifically for assessing Indic LLMs across a diverse ra...

26 Mar 2024 31