Less is More: Mitigating Multimodal Hallucination from an EOS Decision Perspective (ACL 2024)
Apache-2.0 License
An open platform for training, serving, and evaluating large language models. Release repo for Vi...
LongLLaMA is a large language model capable of handling long contexts. It is based on OpenLLaMA a...
Cambrian-1 is a family of multimodal LLMs with a vision-centric design.
Project Page for "LISA: Reasoning Segmentation via Large Language Model"
[EMNLP 2024] Video-LLaVA: Learning United Visual Representation by Alignment Before Projection
Code and documents of LongLoRA and LongAlpaca (ICLR 2024 Oral)
Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"
Mixture-of-Experts for Large Vision-Language Models
[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and A...
[ICLR 2024] Fine-tuning LLaMA to follow Instructions within 1 Hour and 1.2M Parameters
a state-of-the-art-level open visual language model | multimodal pre-trained model
Code Release of F-LMM: Grounding Frozen Large Multimodal Models
Open-sourced codes for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2....
VILA - a multi-image visual language model with training, inference and evaluation recipe, deploy...