The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
Guide to using pre-trained large language models of source code
TensorFlow implementation of contextualized word representations from bidirectional language models
Simple web service providing a word embedding model
Explore large language models in 512MB of RAM
Pipeline for training language models using PyTorch.
A minimal PyTorch re-implementation of OpenAI GPT (Generative Pretrained Transformer) training
Home of StarCoder: fine-tuning & inference!
Code for the paper "Fine-tune BERT for Extractive Summarization"
Applying "Load What You Need: Smaller Versions of Multilingual BERT" to LaBSE
GLM (General Language Model)
Plug and Play Language Model implementation. Allows steering the topic and attributes of GPT-2 models.
A Korean sentence-spacing (deletion/insertion) model. Written so that you can train it yourself after preparing your own data.
Visual Attention based OCR
Word2vec (word to vectors) approach for the Japanese language using Gensim and MeCab.