Code-LMs

Guide to using pre-trained large language models of source code

MIT License

Stars

1.8K

Ecosystems: Python

Issue Statistics

Past Year

All Time

Total Pull Requests

Merged Pull Requests

Total Issues

Time to Close Issues: N/A (past year), 7 days (all time)

Related Projects

GLM

GLM (General Language Model)

18 Mar 2021 3,170

Megatron-LM

Ongoing research training transformer models at scale

21 Mar 2019 8,839

sentiment-discovery

Unsupervised Language Modeling at scale for robust sentiment classification

30 Nov 2017 1,061

BambooAI

A lightweight library that leverages Language Models (LLMs) to enable natural language interactio...

07 May 2023 439

TransCoder

Public release of the TransCoder research project https://arxiv.org/pdf/2006.03511.pdf

10 Jul 2020 1,688

ChatLM-mini-Chinese

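A small 0.2B-parameter Chinese dialogue model (ChatLM-Chinese-0.2B), with all code open-sourced for the full pipeline: dataset sources, data cleaning, tokenizer training, model pre-training, SFT instruction fine-tuning, RLHF optimization, and more. Supports downstream task sf...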

27 Aug 2023 1,166

starcoder2

Home of StarCoder2!

08 Dec 2023 1,732

Megatron-DeepSpeed

Ongoing research training transformer language models at scale, including: BERT & GPT-2

02 Jul 2021 1,323

CodeGeeX

CodeGeeX: An Open Multilingual Code Generation Model (KDD 2023)

17 Sep 2022 8,154

lm-evaluation-harness

A framework for few-shot evaluation of language models.

28 Aug 2020 6,569

CodeGeeX2

CodeGeeX2: A More Powerful Multilingual Code Generation Model

23 Jul 2023 7,626

vec2text

Utilities for decoding deep representations (like sentence embeddings) back to text

25 Feb 2023 680

long_llama

LongLLaMA is a large language model capable of handling long contexts. It is based on OpenLLaMA a...

06 Jul 2023 1,448

code2vec

TensorFlow code for the neural network presented in the paper: "code2vec: Learning Distributed Re...

24 Jul 2018 1,096

open-instruct

09 Jun 2023 1,214