VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models

🔥🔥 Our dedicated high-resolution I2V model is released at: 👉DynamiCrafter!!!

🔥 VideoCrafter2 delivers large improvements over VideoCrafter1 with limited data: better motion and better concept combination!

Please join us and create your own film on Discord/Floor33.

🎥 Exquisite film, produced by VideoCrafter2, directed by Human

🔆 Introduction

🤗🤗🤗 VideoCrafter is an open-source video generation and editing toolbox for crafting video content. It currently includes the Text2Video and Image2Video models:

1. Generic Text-to-video Generation

Click the GIF to access the high-resolution video.

2. Generic Image-to-video Generation

💥 We highly recommend trying our dedicated I2V model, DynamiCrafter: higher resolution, better dynamics, more coherence!

📝 Changelog

[2024.02.05]: 🔥🔥 Release a new VideoCrafter1/DynamiCrafter I2V model at 640x1024 resolution.
[2024.01.26]: Release the 512x320 checkpoint of VideoCrafter2.
[2024.01.18]: Release the VideoCrafter2 and Tech Report!
[2023.10.30]: Release VideoCrafter1 Technical Report!
[2023.10.13]: Release the VideoCrafter1, High Quality Video Generation!
[2023.08.14]: Release a new version of VideoCrafter on Discord/Floor33. Please join us to create your own film!
[2023.04.18]: Release a VideoControl model with most of the watermarks removed!
[2023.04.05]: Release pretrained Text-to-Video models, VideoLora models, and inference code.

⏳ Models

T2V-Models	Resolution	Checkpoints
VideoCrafter2	320x512	Hugging Face
VideoCrafter1	576x1024	Hugging Face
VideoCrafter1	320x512	Hugging Face

I2V-Models	Resolution	Checkpoints
VideoCrafter1	640x1024	Hugging Face
VideoCrafter1	320x512	Hugging Face

⚙️ Setup

1. Install Environment via Anaconda (Recommended)

conda create -n videocrafter python=3.8.5
conda activate videocrafter
pip install -r requirements.txt
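
After activating the environment, a quick sanity check can confirm that the interpreter matches the version requested in the conda command above. This is a minimal sketch: the helper name `matches_required` is hypothetical and not part of the repository, and it only compares the major and minor version (3.8), on the assumption that the patch level is not critical.

```python
import sys

# Major/minor version taken from `python=3.8.5` in the conda command above.
REQUIRED = (3, 8)


def matches_required(version_info, required=REQUIRED):
    """Return True if the interpreter's major.minor equals the required pair."""
    return tuple(version_info[:2]) == tuple(required)


if __name__ == "__main__":
    if matches_required(sys.version_info):
        print("Python version OK")
    else:
        found = sys.version.split()[0]
        print(f"Expected Python {REQUIRED[0]}.{REQUIRED[1]}.x, found {found}")
```

Run it with the `videocrafter` environment active; a mismatch usually means the environment was not activated.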

💫 Inference

1. Text-to-Video

Download the pretrained T2V model checkpoint (model.ckpt) from Hugging Face and place it at checkpoints/base_512_v2/model.ckpt.
Run the following command in a terminal.

  sh scripts/run_text2video.sh

2. Image-to-Video

Download the pretrained I2V model checkpoint (model.ckpt) from Hugging Face and place it at checkpoints/i2v_512_v1/model.ckpt.
Run the following command in a terminal.

  sh scripts/run_image2video.sh

3. Local Gradio demo

Download the pretrained T2V and I2V models and place them in the directories given in the previous steps.
Run the following command in a terminal.

  python gradio_app.py
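
Because the scripts and the demo expect the checkpoints at fixed paths, it can help to verify the layout before launching anything. The sketch below only checks that the two files named in the steps above exist. The helper name `missing_checkpoints` and the task labels are hypothetical and not part of the repository, and the check assumes it is run from the repository root.

```python
from pathlib import Path

# Checkpoint locations named in the inference steps above.
# The task labels are illustrative only.
EXPECTED_CHECKPOINTS = {
    "text-to-video": Path("checkpoints/base_512_v2/model.ckpt"),
    "image-to-video": Path("checkpoints/i2v_512_v1/model.ckpt"),
}


def missing_checkpoints(repo_root, expected=EXPECTED_CHECKPOINTS):
    """Return the task labels whose checkpoint file is absent under repo_root."""
    root = Path(repo_root)
    return [task for task, rel in expected.items() if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = missing_checkpoints(".")
    if missing:
        for task in missing:
            print(f"Missing {task} checkpoint: {EXPECTED_CHECKPOINTS[task]}")
    else:
        print("All checkpoints found.")
```

An empty result means both files are in place; the check does not validate the contents of the checkpoints.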

📋 Technical Report

😉 VideoCrafter2 Tech report: VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models

😉 VideoCrafter1 Tech report: VideoCrafter1: Open Diffusion Models for High-Quality Video Generation

😉 Citation

If you use VideoCrafter in your work, please cite the technical reports listed above, together with the papers for our image-to-video model and the related base model.

@misc{chen2024videocrafter2,
      title={VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models}, 
      author={Haoxin Chen and Yong Zhang and Xiaodong Cun and Menghan Xia and Xintao Wang and Chao Weng and Ying Shan},
      year={2024},
      eprint={2401.09047},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

@misc{chen2023videocrafter1,
      title={VideoCrafter1: Open Diffusion Models for High-Quality Video Generation}, 
      author={Haoxin Chen and Menghan Xia and Yingqing He and Yong Zhang and Xiaodong Cun and Shaoshu Yang and Jinbo Xing and Yaofang Liu and Qifeng Chen and Xintao Wang and Chao Weng and Ying Shan},
      year={2023},
      eprint={2310.19512},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

@article{xing2023dynamicrafter,
      title={DynamiCrafter: Animating Open-domain Images with Video Diffusion Priors}, 
      author={Jinbo Xing and Menghan Xia and Yong Zhang and Haoxin Chen and Xintao Wang and Tien-Tsin Wong and Ying Shan},
      year={2023},
      eprint={2310.12190},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

@article{he2022lvdm,
      title={Latent Video Diffusion Models for High-Fidelity Long Video Generation}, 
      author={Yingqing He and Tianyu Yang and Yong Zhang and Ying Shan and Qifeng Chen},
      year={2022},
      eprint={2211.13221},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

🤗 Acknowledgements

Our codebase builds on Stable Diffusion. Thanks to the authors for sharing their awesome codebase!

📢 Disclaimer

We develop this repository for RESEARCH purposes, so it can only be used for personal/research/non-commercial purposes.

Related Projects

StreamingT2V

StreamingT2V: Consistent, Dynamic, and Extendable Long Video Generation from Text

18 Mar 2024 1,368

VGen

Official repo for VGen: a holistic video generation ecosystem for video generation building on di...

06 Nov 2023 2,650

DynamiCrafter

[ECCV 2024, Oral] DynamiCrafter: Animating Open-domain Images with Video Diffusion Priors

27 Nov 2023 2,449

sygil-webui

Stable Diffusion web UI

24 Aug 2022 7,850

Versatile-Diffusion

Versatile Diffusion: Text, Images and Variations All in One Diffusion Model, arXiv 2022 / ICCV 2023

02 Nov 2022 1,312

stablediffusion

High-Resolution Image Synthesis with Latent Diffusion Models

23 Nov 2022 38,625

TrackDiffusion

Official PyTorch implementation of TrackDiffusion (https://arxiv.org/abs/2312.00651)

01 Dec 2023 53

SVD_Xtend

Stable Video Diffusion Training Code and Extensions.

12 Dec 2023 491

stable-dreamfusion

Text-to-3D & Image-to-3D & Mesh Exportation with NeRF + Diffusion.

06 Oct 2022 8,194

CogVideo

text and image to video generation: CogVideoX (2024) and CogVideo (ICLR 2023)

29 May 2022 7,818

VRT

VRT: A Video Restoration Transformer (official repository)

18 Jan 2022 1,352

minisora

MiniSora: A community aims to explore the implementation path and future development direction of...

21 Feb 2024 1,176

StableSR

[IJCV2024] Exploiting Diffusion Prior for Real-World Image Super-Resolution

02 Apr 2023 2,125

video-diffusion-pytorch

Implementation of Video Diffusion Models, Jonathan Ho's new paper extending DDPMs to Video Genera...

08 Apr 2022 1,235

AniPortrait

AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animation

22 Mar 2024 4,541