chain-of-hindsight

Chain-of-Hindsight, A Scalable RLHF Method

Apache-2.0 License

Stars: 211

Ecosystems: Python

Related Projects

ProteinDT

05 Feb 2023 41

FastChat

An open platform for training, serving, and evaluating large language models. Release repo for Vi...

19 Mar 2023 36,628

LLM-Blender

[ACL2023] We introduce LLM-Blender, an innovative ensembling framework to attain consistently sup...

31 May 2023 869

hybrid-discriminative-generative

Hybrid Discriminative-Generative Training via Contrastive Learning

17 Jul 2020 75

GLM

GLM (General Language Model)

18 Mar 2021 3,170

minichatgpt

minichatgpt - To Train ChatGPT In 5 Minutes

23 Feb 2023 155

long_llama

LongLLaMA is a large language model capable of handling long contexts. It is based on OpenLLaMA a...

06 Jul 2023 1,448

open-instruct

09 Jun 2023 1,214

indic_eval

A lightweight evaluation suite tailored specifically for assessing Indic LLMs across a diverse ra...

26 Mar 2024 31

sentiment-discovery

Unsupervised Language Modeling at scale for robust sentiment classification

30 Nov 2017 1,061

lm-evaluation-harness

A framework for few-shot evaluation of language models.

28 Aug 2020 6,569

pretraining-with-human-feedback

Code accompanying the paper Pretraining Language Models with Human Preferences

20 Feb 2023 175

GPT4RoI

GPT4RoI: Instruction Tuning Large Language Model on Region-of-Interest

06 Jul 2023 365

starcoder

Home of StarCoder: fine-tuning & inference!

24 Apr 2023 7,267

multi_token

Embed arbitrary modalities (images, audio, documents, etc) into large language models.

11 Oct 2023 175