[CVPR 2024] DeepCache: Accelerating Diffusion Models for Free
APACHE-2.0 License
DeepCache: Accelerating Diffusion Models for Free Xinyin Ma, Gongfan Fang, Xinchao Wang Learning and Vision Lab, National University of Singapore π₯―[Arxiv]π[Project Page]
December 20, 2023: Release the code for DDPM. See here for the experimental code and instructions.
December 6, 2023: Release the code for Stable Diffusion XL. The results of the stabilityai/stable-diffusion-xl-base-1.0
are shown in the below figure, with the same prompts from the first figure.
We introduce DeepCache, a novel training-free and almost lossless paradigm that accelerates diffusion models from the perspective of model architecture. Utilizing the property of the U-Net, we reuse the high-level features while updating the low-level features in a very cheap way. DeepCache accelerates Stable Diffusion v1.5 by 2.3x with only a 0.05 decline in CLIP Score, and LDM-4-G(ImageNet) by 4.1x with a 0.22 decrease in FID.
pip install DeepCache
import torch
# Loading the original pipeline
from diffusers import StableDiffusionPipeline
pipe = StableDiffusionPipeline.from_pretrained('runwayml/stable-diffusion-v1-5', torch_dtype=torch.float16).to("cuda:0")
# Import the DeepCacheSDHelper
from DeepCache import DeepCacheSDHelper
helper = DeepCacheSDHelper(pipe=pipe)
helper.set_params(
cache_interval=3,
cache_branch_id=0,
)
helper.enable()
# Generate Image
deepcache_image = pipe(
prompt,
output_type='pt'
).images[0]
helper.disable()
We here take the Stable Diffusion pipeline as an example. You can replace pipe with any variants of the Stable Diffusion pipeline, including choices like SDXL, SVD, and more. You can find examples in the script. The argument cache_branch_id
specifies the selected skip branch. For the skip branches that are deeper, the model will engage them only during the caching steps, and exclude them during the retrieval steps. The argument cache_interval
represents the interval for updating the cache.
python main.py --model_type sdxl #Support [sdxl, sd1.5, sd2.1, svd, sd-inpaint, sdxl-inpaint, sd-img2img]
The above implementation does not require changes to the forward
or __call__
functions in the Diffusers pipeline, and is, therefore, more general. The following section is the experimental code that can be used to reproduce the results in the paper. It is implemented one by one for different model structures and pipelines, and thus, may not work properly due to the update of diffusers.
pip install diffusers==0.24.0 transformers
python stable_diffusion_xl.py --model stabilityai/stable-diffusion-xl-base-1.0
Loading pipeline components...: 100%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ| 7/7 [00:01<00:00, 6.62it/s]
2023-12-06 01:44:28,578 - INFO - Running baseline...
100%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ| 50/50 [00:17<00:00, 2.93it/s]
2023-12-06 01:44:46,095 - INFO - Baseline: 17.52 seconds
Loading pipeline components...: 100%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ| 7/7 [00:00<00:00, 8.06it/s]
2023-12-06 01:45:02,865 - INFO - Running DeepCache...
100%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ| 50/50 [00:06<00:00, 8.01it/s]
2023-12-06 01:45:09,573 - INFO - DeepCache: 6.71 seconds
2023-12-06 01:45:10,678 - INFO - Saved to output.png. Done!
You can add --refine
at the end of the command to activate the refiner model for SDXL.
python stable_diffusion.py --model runwayml/stable-diffusion-v1-5
2023-12-03 16:18:13,636 - INFO - Loaded safety_checker as StableDiffusionSafetyChecker from `safety_checker` subfolder of runwayml/stable-diffusion-v1-5.
2023-12-03 16:18:13,699 - INFO - Loaded vae as AutoencoderKL from `vae` subfolder of runwayml/stable-diffusion-v1-5.
Loading pipeline components...: 100%|ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ| 7/7 [00:01<00:00, 5.88it/s]
2023-12-03 16:18:22,837 - INFO - Running baseline...
100%|ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ| 50/50 [00:03<00:00, 15.33it/s]
2023-12-03 16:18:26,174 - INFO - Baseline: 3.34 seconds
2023-12-03 16:18:26,174 - INFO - Running DeepCache...
100%|ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ| 50/50 [00:01<00:00, 34.06it/s]
2023-12-03 16:18:27,718 - INFO - DeepCache: 1.54 seconds
2023-12-03 16:18:27,935 - INFO - Saved to output.png. Done!
python stable_diffusion.py --model stabilityai/stable-diffusion-2-1
2023-12-03 16:21:17,858 - INFO - Loaded feature_extractor as CLIPImageProcessor from `feature_extractor` subfolder of stabilityai/stable-diffusion-2-1.
2023-12-03 16:21:17,864 - INFO - Loaded scheduler as DDIMScheduler from `scheduler` subfolder of stabilityai/stable-diffusion-2-1.
Loading pipeline components...: 100%|ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ| 6/6 [00:01<00:00, 5.35it/s]
2023-12-03 16:21:49,770 - INFO - Running baseline...
100%|ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ| 50/50 [00:14<00:00, 3.42it/s]
2023-12-03 16:22:04,551 - INFO - Baseline: 14.78 seconds
2023-12-03 16:22:04,551 - INFO - Running DeepCache...
100%|ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ| 50/50 [00:08<00:00, 6.10it/s]
2023-12-03 16:22:12,911 - INFO - DeepCache: 8.36 seconds
2023-12-03 16:22:13,417 - INFO - Saved to output.png. Done!
Currently, our code supports the models that can be loaded by StableDiffusionPipeline. You can specify the model name by the argument --model
, which by default, is runwayml/stable-diffusion-v1-5
.
python stable_video_diffusion.py
Loading pipeline components...: 100%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ| 5/5 [00:00<00:00, 8.36it/s]
2023-12-21 04:56:47,329 - INFO - Running baseline...
100%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ| 25/25 [01:27<00:00, 3.49s/it]
2023-12-21 04:58:26,121 - INFO - Origin: 98.66 seconds
Loading pipeline components...: 100%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ| 5/5 [00:00<00:00, 10.59it/s]
2023-12-21 04:58:27,202 - INFO - Running DeepCache...
100%|βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ| 25/25 [00:49<00:00, 1.96s/it]
2023-12-21 04:59:26,607 - INFO - DeepCache: 59.31 seconds
Please check here for the experimental code of DDPM and LDM.
Images in the upper line are the baselines and the images in the lower line are accelerated by DeepCache.
More results can be found in our paper
We sincerely thank the authors listed below who implemented DeepCache in plugins or other contexts.
We warmly welcome contributions from everyone. Please feel free to reach out to us.
@inproceedings{ma2023deepcache,
title={DeepCache: Accelerating Diffusion Models for Free},
author={Ma, Xinyin and Fang, Gongfan and Wang, Xinchao},
booktitle={The IEEE/CVF Conference on Computer Vision and Pattern Recognition},
year={2024}
}