Align 3D Point Cloud with Multi-modalities for Large Language Models
MIT License
A state-of-the-art open visual language model | multimodal pretrained model
Open-source code for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2....
[ICCV 2023 Oral] Zolly: Zoom Focal Length Correctly for Perspective-Distorted Human Mesh Reconstruction
[EMNLP 2024] Video-LLaVA: Learning United Visual Representation by Alignment Before Projection
[ICLR 2024] Fine-tuning LLaMA to follow Instructions within 1 Hour and 1.2M Parameters
ImageBind One Embedding Space to Bind Them All
[ICLR'24 Spotlight] Chinese and English Multimodal Large Model Series (Chat and Paint) | based on the CPM foundation mo...
RandLA-Net in TensorFlow (CVPR 2020 Oral & IEEE TPAMI 2021)
Cambrian-1 is a family of multimodal LLMs with a vision-centric design.
The Most Faithful Implementation of Segment Anything (SAM) in 3D
Code Release of F-LMM: Grounding Frozen Large Multimodal Models
Mixture-of-Experts for Large Vision-Language Models
VILA - a multi-image visual language model with training, inference and evaluation recipe, deploy...
[CVPR 2024 Highlight] Monkey (LMM): Image Resolution and Text Label Are Important Things for Large...