[CVPR 2024] Official Code for "AiOS: All-in-One-Stage Expressive Human Pose and Shape Estimation"

OTHER License



The file structure should be like:

├── config/
└── data
    ├── body_models
    │   ├── smplx
    │   │   ├── MANO_SMPLX_vertex_ids.pkl
    │   │   ├── SMPL-X__FLAME_vertex_ids.npy
    │   │   ├── SMPLX_NEUTRAL.pkl
    │   │   ├── SMPLX_to_J14.pkl
    │   │   ├── SMPLX_NEUTRAL.npz
    │   │   ├── SMPLX_MALE.npz
    │   │   └── SMPLX_FEMALE.npz
    │   └── smpl
    │       ├── SMPL_FEMALE.pkl
    │       ├── SMPL_MALE.pkl
    │       └── SMPL_NEUTRAL.pkl
    ├── preprocessed_npz
    │   └── cache
    │       ├── agora_train_3840_w_occ_cache_2010.npz
    │       ├── bedlam_train_cache_080824.npz
    │       ├── ...
    │       └── coco_train_cache_080824.npz
    ├── datasets
    │   ├── agora
    │   │   └── 3840x2160
    │   │       ├── train
    │   │       └── test
    │   ├── bedlam
    │   │   ├── train_images
    │   │   └── test_images
    │   ├── ARCTIC
    │   │   ├── s01
    │   │   ├── s02
    │   │   ├── ...
    │   │   └── s10
    │   ├── EgoBody
    │   │   ├── egocentric_color
    │   │   └── kinect_color
    │   └── UBody
    │       └── images
    └── checkpoint
        ├── edpose_r50_coco.pth
        └── aios_checkpoint.pth
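
Before running anything, it can help to confirm that the layout above is in place. The sketch below is not part of the repository: `check_aios_layout` is a hypothetical helper that checks a handful of the paths listed in the tree, relative to a root directory you pass in, and prints the ones that are missing. The selection of paths is illustrative, not an authoritative list of what the code requires.

```shell
# Hypothetical helper (not part of the AiOS repository): report which of a few
# paths from the tree above are missing under the given repository root.
check_aios_layout() {
    root="${1:-.}"
    missing=0
    for rel in \
        config \
        data/body_models/smplx/SMPLX_NEUTRAL.npz \
        data/body_models/smpl/SMPL_NEUTRAL.pkl \
        data/preprocessed_npz/cache \
        data/datasets \
        data/checkpoint/edpose_r50_coco.pth \
        data/checkpoint/aios_checkpoint.pth
    do
        if [ ! -e "$root/$rel" ]; then
            echo "missing: $rel"
            missing=$((missing + 1))
        fi
    done
    # Succeed only when every listed path exists.
    [ "$missing" -eq 0 ]
}
```

For example, `check_aios_layout . || echo "layout incomplete"` run from the repository root lists whatever still needs to be downloaded.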


# Create a conda virtual environment and activate it.
conda create -n aios python=3.8 -y
conda activate aios

# Install PyTorch and torchvision.
conda install pytorch==1.10.1 torchvision==0.11.2 torchaudio==0.10.1 cudatoolkit=11.3 -c pytorch -c conda-forge

# Install PyTorch3D, build from source
git clone -b v0.6.1
cd pytorch3d
pip install -v -e .
cd ..

# Install MMCV, build from source
git clone -b v1.6.1
cd mmcv
export MMCV_WITH_OPS=1
export FORCE_MLU=1
pip install -v -e .
cd ..

# Install other dependencies
conda install -c conda-forge ffmpeg
pip install -r requirements.txt 

# Build the Deformable DETR ops
cd models/aios/ops
python build install
cd ../../..
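
After the steps above, a quick import check can catch a build that failed silently. This is a minimal sketch, not part of the repository: `check_module` and `report_aios_env` are hypothetical helpers, and `torch`, `torchvision`, `pytorch3d`, and `mmcv` are assumed to be the import names of the packages installed above. The compiled deformable-attention extension is not checked, because its module name is not given here.

```shell
# Hypothetical helper: succeed only if the given Python module can be imported
# by the interpreter on PATH (or the one named in $PYTHON).
check_module() {
    "${PYTHON:-python}" -c "import $1" >/dev/null 2>&1
}

# Print one status line per package installed in the steps above.
report_aios_env() {
    for mod in torch torchvision pytorch3d mmcv; do
        if check_module "$mod"; then
            echo "ok:      $mod"
        else
            echo "missing: $mod"
        fi
    done
}
```

Run `report_aios_env` inside the activated `aios` environment; any `missing:` line points at the install step to repeat.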


  • Place the mp4 video for inference under AiOS/demo/
  • Prepare the pretrained models to be used for inference under AiOS/data/checkpoint
  • Inference output will be saved in AiOS/demo/{INPUT_VIDEO}_out
# CHECKPOINT: checkpoint path
# INPUT_VIDEO: input video path
# OUTPUT_DIR: output path
# NUM_PERSON: number of persons. Sets the expected number of persons to be detected in the input (image or video).
#   The default value is 1, meaning the algorithm will try to detect at least one person. If you know the maximum number of persons
#   that can appear simultaneously, set this to that number to optimize the detection process (a lower threshold is recommended as well).
# THRESHOLD: score threshold for person detection. The default value is 0.5.
#   A detection whose confidence score is lower than this threshold is discarded.
#   Adjusting this threshold helps filter out false positives or keep only high-confidence detections.
# GPU_NUM: number of GPUs.

# Run inference on short_video.mp4, with output saved to demo/short_video_out
sh scripts/ data/checkpoint/aios_checkpoint.pth short_video.mp4 demo 2 0.1 8
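
Because the inference command takes six positional arguments, a mistyped value is easy to miss. The sketch below is a hypothetical pre-flight check, not part of the repository: `validate_inference_args` only validates the values described above and does not launch inference. It assumes THRESHOLD is a score in [0, 1] and that NUM_PERSON and GPU_NUM are positive integers; INPUT_VIDEO and OUTPUT_DIR are not checked, since how the script resolves them is not stated here.

```shell
# Hypothetical pre-flight check (not part of the AiOS repository) for the
# arguments described above. It validates values only; it does not run inference.
validate_inference_args() {
    if [ "$#" -ne 6 ]; then
        echo "usage: validate_inference_args CHECKPOINT INPUT_VIDEO OUTPUT_DIR NUM_PERSON THRESHOLD GPU_NUM" >&2
        return 2
    fi
    checkpoint=$1; num_person=$4; threshold=$5; gpu_num=$6

    if [ ! -f "$checkpoint" ]; then
        echo "checkpoint not found: $checkpoint" >&2
        return 1
    fi
    # NUM_PERSON and GPU_NUM must be positive integers.
    for n in "$num_person" "$gpu_num"; do
        case $n in
            ''|*[!0-9]*|0) echo "expected a positive integer, got: $n" >&2; return 1 ;;
        esac
    done
    # THRESHOLD is a score, assumed here to lie in [0, 1].
    if ! awk -v t="$threshold" 'BEGIN { exit !(t ~ /^[0-9]*\.?[0-9]+$/ && t + 0 >= 0 && t + 0 <= 1) }'; then
        echo "threshold must be a number in [0, 1], got: $threshold" >&2
        return 1
    fi
}
```

Calling it with the same arguments as the command above, for example `validate_inference_args data/checkpoint/aios_checkpoint.pth short_video.mp4 demo 2 0.1 8`, returns success when the checkpoint exists and the numeric values are in range.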


a. Make test_result dir

mkdir test_result

b. AGORA Validation

Run the following command to generate a 'predictions/' result folder, which can be evaluated with the AGORA evaluation tool.

sh scripts/ data/checkpoint/aios_checkpoint.pth agora_val

c. AGORA Test Leaderboard

Run the following command to generate a '' which can be submitted to the AGORA Leaderboard.

sh scripts/ data/checkpoint/aios_checkpoint.pth agora_test


Run the following command to generate a '' which can be submitted to the BEDLAM Leaderboard.

sh scripts/ data/checkpoint/aios_checkpoint.pth bedlam_test


Some of the code is based on MMHuman3D, ED-Pose and SMPLer-X.

Related Projects