Zolly

[ICCV2023 oral] Zolly: Zoom Focal Length Correctly for Perspective-Distorted Human Mesh Reconstruction


The first work aiming to solve the 3D Human Mesh Reconstruction task in perspective-distorted images.

๐Ÿ—“๏ธ News:

🎆 2024.Jul.18, pretrained models are released: https://huggingface.co/WenjiaWang/Zolly_ckpts; most results are better than those in the paper.

🎆 2023.Nov.23, the training code of Zolly is released; pretrained Zolly weights will come soon.

🎆 2023.Aug.12, Zolly is selected as an ICCV2023 oral, project page.

🎆 2023.Aug.7, the dataset link is released. The training code is coming soon.

🎆 2023.Jul.14, Zolly is accepted to ICCV2023; code and data will come soon.

🎆 2023.Mar.27, the arXiv link is released.

🚀 Run the code

๐ŸŒ Environments

You should install the required dependencies, such as ffmpeg, torch, mmcv, and pytorch3d, following their official tutorials.

  • It is recommended that you install the stable version of MMHuman3D:
wget https://github.com/open-mmlab/mmhuman3d/archive/refs/tags/v0.9.0.tar.gz;
tar -xvf v0.9.0.tar.gz;
cd mmhuman3d-0.9.0;
pip install -e .

If you have any difficulty building pytorch3d, you can install it from a prebuilt package, e.g. python3.8 + pytorch-1.13.1 + cuda-11.7 + pytorch3d-0.7.4:

wget https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/pytorch3d/linux-64/pytorch3d-0.7.4-py38_cu117_pyt1131.tar.bz2;
pip install fvcore;
pip install iopath;
conda install --use-local pytorch3d-0.7.4-py38_cu117_pyt1131.tar.bz2;
  • Install this repo:
cd Zolly;
pip install -e .
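
As a quick sanity check, a minimal script like the one below (assuming the stack above) confirms that the core dependencies import and prints their versions:

# Verify that the core dependencies import and report their versions.
import torch
import mmcv
import pytorch3d
import mmhuman3d

print("torch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())
print("mmcv:", mmcv.__version__)
print("pytorch3d:", pytorch3d.__version__)
print("mmhuman3d:", mmhuman3d.__version__)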

๐Ÿ“ Required Data and Files

You can download the files from OneDrive, or from Hugging Face with the command huggingface-cli download WenjiaWang/Zolly_release --local-dir Zolly_release --repo-type dataset.

This link contains:

  • Dataset annotations: all have ground-truth focal length, translation and SMPL parameters (see the loading sketch after this list).

    • HuMMan (train, test_p1(full), test_p2, test_p3)
    • SPEC-MTP (test_p1(full), test_p2, test_p3)
    • PDHuman (train, test_p1(full), test_p2, test_p3, test_p4, test_p5)
    • 3DPW (train(has optimized neutral betas), test_p1(full), test_p2, test_p3)
  • Dataset images.

    • HuMMan
    • SPEC-MTP
    • PDHuman
    • For other open-sourced datasets, please download from their original websites.
  • Pretrained backbone

    • hrnetw48_coco_pose.pth
    • resnet50_coco_pose.pth
  • Others

    • smpl_uv_decomr.npz
    • mesh_downsampling.npz
    • J_regressor_h36m.npy
  • SMPL skinning weights
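
As a rough illustration, the annotation files can be inspected with numpy; the path below follows the layout in the next section, and the exact keys follow the MMHuman3D HumanData format, so check the actual contents:

# List the keys stored in a preprocessed annotation file.
import numpy as np

ann = np.load('mmhuman_data/preprocessed_datasets/pw3d_train.npz', allow_pickle=True)
print(list(ann.keys()))  # expect entries for image paths, SMPL parameters, focal length, translation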

👇 Arrange the files

root
  ├── body_models
  │   └── smpl
  │       ├── J_regressor_extra.npy
  │       ├── J_regressor_h36m.npy
  │       ├── mesh_downsampling.npz
  │       ├── SMPL_FEMALE.pkl
  │       ├── SMPL_MALE.pkl
  │       ├── smpl_mean_params.npz
  │       ├── SMPL_NEUTRAL.pkl
  │       └── smpl_uv_decomr.npz
  ├── cache
  ├── mmhuman_data
  │   ├── datasets
  │   │   ├── coco
  │   │   ├── h36m
  │   │   ├── humman
  │   │   ├── lspet
  │   │   ├── mpii
  │   │   ├── mpi_inf_3dhp
  │   │   ├── pdhuman
  │   │   ├── pw3d
  │   │   └── spec_mtp
  │   └── preprocessed_datasets
  │       ├── humman_test_p1.npz
  │       ├── humman_train.npz
  │       ├── pdhuman_test_p1.npz
  │       ├── pdhuman_train.npz
  │       ├── pw3d_train.npz
  │       ├── pw3d_train_transl.npz
  │       ├── spec_mtp.npz
  │       └── spec_mtp_p1.npz
  └── pretrain
      └── coco_pretrain
          ├── hrnetw48_coco_pose.pth
          └── resnet50_coco_pose.pth

Then change the root path in zolly/configs/base.py.
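
A minimal sketch of that change (the exact layout of zolly/configs/base.py may differ):

# zolly/configs/base.py -- point `root` at the directory arranged above.
root = '/path/to/root'  # the folder containing body_models/, cache/, mmhuman_data/ and pretrain/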

🚅 Train

sh train_bash.sh zolly/configs/zolly_r50.py $num_gpu$ --work-dir=$your_workdir$

E.g., you can use

sh train_bash.sh zolly/configs/zolly_r50.py 8 --work-dir=work_dirs/zolly

To resume training or finetune a model:

sh train_bash.sh zolly/configs/zolly_r50.py 8 --work-dir=work_dirs/zolly --resume-from work_dirs/zolly/latest.pth

🚗 Test

sh test_bash.sh zolly/configs/zolly/zolly_r50.py $num_gpu$ --checkpoint=$your_ckpt$ --data-name pw3d

For convenience, you can evaluate your model on only the first 100 samples:

sh test_bash.sh zolly/configs/zolly/zolly_r50.py $num_gpu$ --checkpoint=$your_ckpt$ --data-name pw3d --num-data 100

🎮 Demo images in a folder

sh demo_bash.sh zolly/configs/zolly/zolly_h48.py $num_gpu$ --checkpoint=$your_ckpt$ --image_folder assets/demo_jpg --ext jpg --demo_root demo/

The output filename follows the pattern {raw_name}_{gt_f}-{gt_z}_{pred_f}-{pred_z}_pred.png, e.g. 56_789-0.00_586-1.91_pred.png.
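
If you want those fields programmatically, a small parser along these lines works on the naming scheme above (a hypothetical helper, not part of the repo):

# Split '{raw_name}_{gt_f}-{gt_z}_{pred_f}-{pred_z}_pred.png' into its fields.
def parse_demo_name(filename):
    stem = filename[:-len('_pred.png')]
    raw_name, gt_pair, pred_pair = stem.rsplit('_', 2)
    gt_f, gt_z = map(float, gt_pair.split('-'))
    pred_f, pred_z = map(float, pred_pair.split('-'))
    return raw_name, gt_f, gt_z, pred_f, pred_z

print(parse_demo_name('56_789-0.00_586-1.91_pred.png'))
# -> ('56', 789.0, 0.0, 586.0, 1.91)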

Pretrained Models:

We have released our R50 and H48 models on Hugging Face: https://huggingface.co/WenjiaWang/Zolly_ckpts

You can use huggingface-cli download WenjiaWang/Zolly_ckpts --local-dir ckpts --repo-type model to download the models. (Remember to log in with your token first.)
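
For example, assuming you already have a Hugging Face access token:

# Log in once with your access token, then download the checkpoints.
huggingface-cli login
huggingface-cli download WenjiaWang/Zolly_ckpts --local-dir ckpts --repo-type model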

  • We re-trained our method and updated the results for 3DPW, HuMMan, PDHuman and SPEC-MTP:

3DPW: most results are better than in the original paper!

| Method | PA-MPJPE (mm) | MPJPE (mm) | PVE (mm) |
| --- | --- | --- | --- |
| Zolly-R50 | 48.92 👍 | 79.18 👍 | 92.82 |
| Zolly-R50 (ft) | 43.70 👍 | 71.33 👍 | 84.41 |
| Zolly-H48 | 47.88 👍 | 78.21 | 90.82 |
| Zolly-H48 (ft) | 39.09 👍 | 64.44 👍 | 75.78 👍 |

SPEC-MTP (p3): comparable to the original paper version.

| Method | PA-MPJPE (mm) | MPJPE (mm) | PVE (mm) |
| --- | --- | --- | --- |
| Zolly-R50 | 75.34 | 126.66 | 140.69 |
| Zolly-H48 | 67.47 | 115.74 | 127.96 |

HuMMan (p3): partially better than the original paper version.

| Method | PA-MPJPE (mm) | MPJPE (mm) | PVE (mm) |
| --- | --- | --- | --- |
| Zolly-R50 | 24.57 | 35.88 👍 | 43.49 👍 |
| Zolly-H48 | 22.94 | 33.39 | 37.93 👍 |

PDHuman (p5): most results are better than in the original paper!

| Method | PA-MPJPE (mm) | MPJPE (mm) | PVE (mm) |
| --- | --- | --- | --- |
| Zolly-R50 | 56.75 | 79.83 👍 | 91.93 👍 |
| Zolly-H48 | 46.53 👍 | 67.86 👍 | 77.77 👍 |

💻 Add Your Algorithm

  • Add your network in zolly/models/heads, and add it to zolly/models/builder.py (see the registration sketch after this list).
  • Add your trainer in zolly/models/architectures, and add it to zolly/models/architectures/builder.py.
  • Add your loss function in zolly/models/losses, and add it to zolly/models/losses/builder.py.
  • Add your config file in zolly/configs/; you can start from zolly/configs/zolly_r50.py. Remember to change the root parameter in zolly/configs/base.py, which points to where your files are placed.
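
A rough sketch of what the registration might look like, assuming the builders follow the usual mmcv-style registry pattern (the registry name HEADS is an assumption; check zolly/models/builder.py for the actual one):

# zolly/models/heads/my_head.py -- illustrative custom head.
import torch.nn as nn
from zolly.models.builder import HEADS  # assumed registry object

@HEADS.register_module()
class MyHead(nn.Module):
    def __init__(self, in_channels=2048, out_dim=144):
        super().__init__()
        self.fc = nn.Linear(in_channels, out_dim)

    def forward(self, x):
        return self.fc(x)

Your config can then reference it by name, e.g. head=dict(type='MyHead', in_channels=2048).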

🎓 Citation

If you find this project useful in your research, please consider citing us:

@inproceedings{wangzolly,
  title={Zolly: Zoom Focal Length Correctly for Perspective-Distorted Human Mesh Reconstruction},
  author={Wang, Wenjia and Ge, Yongtao and Mei, Haiyi and Cai, Zhongang and Sun, Qingping and Wang, Yanjun and Shen, Chunhua and Yang, Lei and Komura, Taku},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
  year={2023}
}

๐Ÿ˜ Acknowledge

Emojis are collected from gist:7360908.

Some of the code is based on MMHuman3D and DecoMR.

📧 Contact

Feel free to contact me with other questions or for cooperation: [email protected]

Related Projects