TCFormer

Code for TCFormer, from the paper "Not All Tokens Are Equal: Human-centric Visual Analysis via Token Clustering Transformer".

Apache-2.0 License

Stars: 199

Ecosystems: Python

Related Projects

MetaTransformer

Meta-Transformer for Unified Multimodal Learning

08 Jul 2023 1,506

SMPLer-X

Official Code for "SMPLer-X: Scaling Up Expressive Human Pose and Shape Estimation"

07 Jun 2023 928

Monkey

【CVPR 2024 Highlight】Monkey (LMM): Image Resolution and Text Label Are Important Things for Large...

09 Nov 2023 1,314

Zolly

[ICCV2023 oral] Zolly: Zoom Focal Length Correctly for Perspective-Distorted Human Mesh Reconstru...

12 Mar 2023 84

SparseR-CNN

[CVPR2021, PAMI2023] End-to-End Object Detection with Learnable Proposal

19 Nov 2020 1,309

AdelaiDet

AdelaiDet is an open source toolbox for multiple instance-level detection and recognition tasks.

23 Jan 2020 3,367

OFA

Official repository of OFA (ICML 2022). Paper: OFA: Unifying Architectures, Tasks, and Modalities...

29 Jan 2022 2,401

F-LMM

Code Release of F-LMM: Grounding Frozen Large Multimodal Models

28 Mar 2024 28

Cream

This is a collection of our NAS and Vision Transformer work.

12 Oct 2020 1,583

GOT-OCR2.0

Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model

02 Sep 2024 5,166

xmodaler

X-modaler is a versatile and high-performance codebase for cross-modal analytics (e.g., image capt...

25 Jun 2021 1,020

PVT

Official implementation of PVT series

24 Feb 2021 1,711

ViPNAS

The official repo for CVPR2021——ViPNAS: Efficient Video Pose Estimation via Neural Architecture S...

30 May 2021 42

InternImage

[CVPR 2023 Highlight] InternImage: Exploring Large-Scale Vision Foundation Models with Deformable...

10 Nov 2022 2,486

DB

A PyTorch implementation of "Real-time Scene Text Detection with Differentiable Binarization".

18 Nov 2019 2,084