A new reliable, localizable, and generalizable metric for hallucination detection in image captioning models.
Official implementation for "Multimodal Chain-of-Thought Reasoning in Language Models" (stay tuned...
A state-of-the-art-level open visual language model | multimodal pre-trained model
Less is More: Mitigating Multimodal Hallucination from an EOS Decision Perspective (ACL 2024)
A multi-task model that performs image captioning, sentence paraphrasing, and cross-modal retrieval.
Using LLMs and pre-trained caption models for super-human performance on image captioning.
Together Mixture-Of-Agents (MoA) – 65.1% on AlpacaEval with OSS models
Code accompanying the paper Pretraining Language Models with Human Preferences
Implementation of CoCa, Contrastive Captioners are Image-Text Foundation Models, in PyTorch
Official repository of OFA (ICML 2022). Paper: OFA: Unifying Architectures, Tasks, and Modalities...
👁️ + 💬 + 🎧 = 🤖 Curated list of top foundation and multimodal models! [Paper + Code + Examples + ...
I decided to sync up this repo and self-critical.pytorch. (The old master is in old master branch ...
CLAIR: A (surprisingly) simple semantic text metric with large language models.
X-modaler is a versatile and high-performance codebase for cross-modal analytics (e.g., image capt...
[EMNLP'23] The official GitHub page for ''Evaluating Object Hallucination in Large Vision-Languag...