Train vision models using JAX and 🤗 transformers
DeepFloyd-IF (Imagen Free)
A state-of-the-art open visual language model | Multimodal pre-trained model
An open platform for training, serving, and evaluating large language models. Release repo for Vi...
[ICLR'24 spotlight] Chinese and English Multimodal Large Model Series (Chat and Paint) | Based on CPM foundation mod...
Official implementation for "Multimodal Chain-of-Thought Reasoning in Language Models" (stay tune...
Official PyTorch code for "Enhancing Diffusion Models with Text-Encoder Reinforcement Learning",...
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
The Learnable Typewriter: A Generative Approach to Text Line Analysis
Using LLMs and pre-trained caption models for super-human performance on image captioning.
VILA - a multi-image visual language model with training, inference and evaluation recipe, deploy...
Code Release of F-LMM: Grounding Frozen Large Multimodal Models
LongLLaMA is a large language model capable of handling long contexts. It is based on OpenLLaMA a...