Evaluating text-to-image/video/3D models with VQAScore
APACHE-2.0 License
A task generation and model evaluation system.
Code for EMNLP 2018 paper "Commonsense for Generative Multi-Hop Question Answering Tasks"
Generic image compressor for machine learning. PyTorch code for our paper "Lossy compression for ...
[NeurIPS 2023] ImageReward: Learning and Evaluating Human Preferences for Text-to-image Generation
[GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregress...
Train vision models using JAX and 🤗 transformers
a state-of-the-art-level open visual language model | multimodal pre-trained model
Official repo for VGen: a holistic video generation ecosystem for video generation building on di...
Chain-of-Hindsight, A Scalable RLHF Method
[ECCV2022] New benchmark for evaluating pre-trained models; New supervised contrastive learning fr...
Official implementation code of the paper <AnyText: Multilingual Visual Text Generation And Editing>
Toolkit for Visual7W visual question answering dataset
👁️ 🖼️ 🔥PyTorch Toolbox for Image Quality Assessment, including LPIPS, FID, NIQE, NRQM(Ma), MUSIQ,...
ViLMA: A Zero-Shot Benchmark for Linguistic and Temporal Grounding in Video-Language Models (ICLR...
Official implementation for "Multimodal Chain-of-Thought Reasoning in Language Models" (stay tune...