self-instruct

Aligning pretrained language models with instruction data generated by themselves.

Apache-2.0 License

Stars: 4.1K


Ecosystems: Python

Issue Statistics (Past Year / All Time)

Total Pull Requests

Merged Pull Requests

Total Issues

Time to Close Issues: 3 months / 20 days

Related Projects

SpikeGPT

Implementation of "SpikeGPT: Generative Pre-trained Language Model with Spiking Neural Networks"

24 Feb 2023 729

GPTeacher

A collection of modular datasets generated by GPT-4, General-Instruct - Roleplay-Instruct - Code-...

02 Apr 2023 1,609

EasyInstruct

[ACL 2024] An Easy-to-use Instruction Processing Framework for LLMs.

07 Mar 2023 360

Megatron-LM

Ongoing research training transformer models at scale

21 Mar 2019 8,839

long_llama

LongLLaMA is a large language model capable of handling long contexts. It is based on OpenLLaMA a...

06 Jul 2023 1,448

HuatuoGPT

HuatuoGPT, Towards Taming Language Models To Be a Doctor. (An Open Medical GPT)

13 Apr 2023 1,051

instructor-embedding

[ACL 2023] One Embedder, Any Task: Instruction-Finetuned Text Embeddings

17 Dec 2022 1,844

stanford_alpaca

Code and documentation to train Stanford's Alpaca models, and generate the data.

10 Mar 2023 28,912

prompt2model

prompt2model - Generate Deployable Models from Natural Language Instructions

27 Mar 2023 1,947

instructor

Structured outputs for LLMs

14 Jun 2023 5,518

open-instruct

09 Jun 2023 1,214

LlamaAcademy

A school for camelids

19 Apr 2023 1,206

GLM

GLM (General Language Model)

18 Mar 2021 3,170

codealpaca

22 Mar 2023 1,414

ChatLM-mini-Chinese

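A small 0.2B-parameter Chinese dialogue model (ChatLM-Chinese-0.2B). All code is open-sourced for the full pipeline, including dataset sources, data cleaning, tokenizer training, model pretraining, SFT instruction fine-tuning, and RLHF optimization. Supports downstream task sf...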

27 Aug 2023 1,166