[CVPR 2023] Executing your Commands via Motion Diffusion in Latent Space, a fast and high-quality motion diffusion model
Motion Latent Diffusion (MLD) is a text-to-motion and action-to-motion diffusion model. Our work achieves state-of-the-art motion quality and is two orders of magnitude faster than previous diffusion models that operate on raw motion data.
conda create python=3.9 --name mld
conda activate mld
Install PyTorch 1.12.1 and the packages in requirements.txt:
pip install -r requirements.txt
We test our code on Python 3.9.12 and PyTorch 1.12.1.
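As a quick sanity check (a minimal sketch, not part of the repo), you can confirm the tested versions are active:

```python
# Quick environment check: confirm the tested Python / PyTorch versions are installed.
import sys
import torch

print(sys.version)                # tested with 3.9.12
print(torch.__version__)          # tested with 1.12.1
print(torch.cuda.is_available())  # True if a CUDA-capable GPU is visible
```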
Run the scripts to download the dependency materials:
bash prepare/download_smpl_model.sh
bash prepare/prepare_clip.sh
For text-to-motion evaluation:
bash prepare/download_t2m_evaluators.sh
Run the script to download the pre-trained models:
bash prepare/download_pretrained_models.sh
Alternatively, visit the Google Drive to download the above dependencies and models.
We support text-file or keyboard input; the generated motions are saved as npy files.
Please check configs/assets.yaml for the path configuration; TEST.FOLDER is the output folder.
Then, run the following script:
python demo.py --cfg ./configs/config_mld_humanml3d.yaml --cfg_assets ./configs/assets.yaml --example ./demo/example.txt
Some parameters:
--example=./demo/example.txt: input file as text prompts
--task=text_motion: generate from the test set of the dataset
--task=random_sampling: random motion sampling from noise
--replication: generate motions for the same input texts multiple times
--allinone: store all generated motions in a single npy file with the shape of [num_samples, num_replication, num_frames, num_joints, xyz] (see the indexing sketch after this list)
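For reference, a minimal sketch (not part of the repo) of indexing an --allinone output; the file name is hypothetical, and all samples are assumed to share the same length:

```python
import numpy as np

# Hypothetical --allinone output file.
motions = np.load("results/all_motions.npy")

# Expected shape: [num_samples, num_replication, num_frames, num_joints, xyz]
# (assuming equal-length samples so the array is dense).
num_samples, num_replication, num_frames, num_joints, _ = motions.shape

# First replication of the first prompt: (num_frames, num_joints, 3) joint positions.
motion = motions[0, 0]
print(motion.shape)
```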
The outputs:
npy file: the generated motions with the shape of (nframe, 22, 3) (see the loading sketch below)
text file: the input text prompt

Please refer to HumanML3D for the text-to-motion dataset setup. We will provide instructions for other datasets soon.
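A minimal sketch of inspecting one of the generated npy files described above (the file name is hypothetical):

```python
import numpy as np

# Hypothetical per-sample output; shape (nframe, 22, 3): 22 joints, xyz per joint.
joints = np.load("results/sample_0.npy")
nframe, njoints, _ = joints.shape
print(f"{nframe} frames, {njoints} joints")

# Assuming HumanML3D's 20 fps, the clip duration would be roughly:
print(f"~{nframe / 20:.1f} s")
```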
Please first check the parameters in configs/config_vae_humanml3d.yaml, e.g. NAME and DEBUG.
Then, run the following command:
python -m train --cfg configs/config_vae_humanml3d.yaml --cfg_assets configs/assets.yaml --batch_size 64 --nodebug
Please update the parameters in configs/config_mld_humanml3d.yaml, e.g. NAME, DEBUG, and PRETRAINED_VAE (change it to the path of the latest checkpoint from the previous step).
Then, run the following command:
python -m train --cfg configs/config_mld_humanml3d.yaml --cfg_assets configs/assets.yaml --batch_size 64 --nodebug
Please first set TEST.CHECKPOINT in configs/config_mld_humanml3d.yaml to the path of your trained model checkpoint.
Then, run the following command:
python -m test --cfg configs/config_mld_humanml3d.yaml --cfg_assets configs/assets.yaml
Refer to TEMOS-Rendering motions for blender setup, then install the following dependencies.
YOUR_BLENDER_PYTHON_PATH/python -m pip install -r prepare/requirements_render.txt
Run the following command using blender:
YOUR_BLENDER_PATH/blender --background --python render.py -- --cfg=./configs/render.yaml --dir=YOUR_NPY_FOLDER --mode=video --joint_type=HumanML3D
Run the following command to fit SMPL meshes to the generated joints:
python -m fit --dir YOUR_NPY_FOLDER --save_folder TEMP_PLY_FOLDER --cuda
This outputs:
mesh npy file: the generated SMPL vertices with the shape of (nframe, 6893, 3)
ply files: the ply mesh files for blender or meshlab

Run the following command to render SMPL using blender:
YOUR_BLENDER_PATH/blender --background --python render.py -- --cfg=./configs/render.yaml --dir=YOUR_NPY_FOLDER --mode=video --joint_type=HumanML3D
Optional parameters:
--mode=video: render an mp4 video
--mode=sequence: render the whole motion in a single png image

If your demo results have severe foot sliding, please take a look at the lines below. It can happen when self.feats2joints (which uses mean and std for de-normalization) is broken:
https://github.com/ChenFengYe/motion-latent-diffusion/blob/af507c479d771f62a058b5b6abb51276b36d6c6d/mld/models/modeltype/mld.py#L264
https://github.com/ChenFengYe/motion-latent-diffusion/blob/5c264c31fbc7ffc047be1ce003622f1865417e8f/mld/data/get_data.py#L26-L41
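For reference, a minimal sketch (with assumed file locations) of the de-normalization those lines rely on; if the mean/std statistics are wrong or missing, the recovered joints will look broken:

```python
import numpy as np

# Assumed locations of the dataset statistics used by the HumanML3D data loader.
mean = np.load("datasets/humanml3d/Mean.npy")
std = np.load("datasets/humanml3d/Std.npy")

def denormalize(feats: np.ndarray) -> np.ndarray:
    # Undo the dataset normalization before converting HumanML3D features to joints.
    return feats * std + mean
```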
If your training is a little slow, check VAL_EVERY_STEPS (the validation frequency) and your data IO speed, and use --nodebug for all your training.

To compute FLOPs:
python -m scripts.flops --cfg configs/your_config.yaml

To run the t-SNE visualization script:
python -m scripts.tsne --cfg configs/your_config.yaml
Note: this only supports action-to-motion models for now.
If you find our code or paper helpful, please consider citing:
@inproceedings{chen2023executing,
title={Executing your Commands via Motion Diffusion in Latent Space},
author={Chen, Xin and Jiang, Biao and Liu, Wen and Huang, Zilong and Fu, Bin and Chen, Tao and Yu, Gang},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
pages={18000--18010},
year={2023}
}
Thanks to TEMOS, ACTOR, HumanML3D, and joints2smpl; our code partially borrows from them.
This code is distributed under an MIT LICENSE.
Note that our code depends on other libraries, including SMPL, SMPL-X, and PyTorch3D, and uses datasets that each have their own licenses, which must also be followed.