This is the code for Deformable Neural Radiance Fields, a.k.a. Nerfies.
APACHE-2.0 License
This codebase is implemented using JAX, building on JaxNeRF.
This repository has been updated to reflect the version used for our ICCV 2021 submission.
We provide an easy-to-get-started demo using Google Colab!
These Colabs will allow you to train a basic version of our method using Cloud TPUs (or GPUs) on Google Colab.
Note that due to limited compute resources available, these are not the fully featured models. If you would like to train a fully featured Nerfie, please refer to the instructions below on how to train on your own machine.
| Description | Link |
|---|---|
| Process a video into a Nerfie dataset | |
| Train a Nerfie | |
| Render a Nerfie video | |
The code can be run under any environment with Python 3.8 or above. (It may run with earlier versions, but we have not tested this.)
We recommend using Miniconda and setting up an environment:
conda create --name nerfies python=3.8
Next, install the required packages:
pip install -r requirements.txt
Install the appropriate JAX distribution for your environment by following the instructions here. For example:
# For CUDA version 11.1
pip install --upgrade "jax[cuda111]" -f https://storage.googleapis.com/jax-releases/jax_releases.html
After preparing a dataset, you can train a Nerfie by running:
export DATASET_PATH=/path/to/dataset
export EXPERIMENT_PATH=/path/to/save/experiment/to
python train.py \
--data_dir $DATASET_PATH \
--base_folder $EXPERIMENT_PATH \
--gin_configs configs/test_vrig.gin
To plot telemetry to TensorBoard and render checkpoints on the fly, also launch an evaluation job by running:
python eval.py \
--data_dir $DATASET_PATH \
--base_folder $EXPERIMENT_PATH \
--gin_configs configs/test_vrig.gin
The two jobs should use a mutually exclusive set of GPUs. This division allows the training job to run without having to stop for evaluation.
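One way to keep the two jobs on disjoint GPUs is to give each process its own `CUDA_VISIBLE_DEVICES` value. The sketch below is only an illustration: `split_gpus` is a hypothetical helper, not part of this codebase, and the choice of four GPUs with one reserved for evaluation is an assumption.

```python
import os


def split_gpus(gpu_ids, num_eval=1):
    """Split GPU ids into disjoint (train, eval) CUDA_VISIBLE_DEVICES strings.

    Hypothetical helper: the last `num_eval` ids go to the evaluation job.
    """
    if not 0 < num_eval < len(gpu_ids):
        raise ValueError("need at least one GPU for each job")
    train_ids = gpu_ids[:-num_eval]
    eval_ids = gpu_ids[-num_eval:]
    return (
        ",".join(str(i) for i in train_ids),
        ",".join(str(i) for i in eval_ids),
    )


train_gpus, eval_gpus = split_gpus([0, 1, 2, 3], num_eval=1)

# One environment per job. Each can be passed as `env=` to subprocess.Popen
# when launching train.py and eval.py with the flags shown above.
train_env = dict(os.environ, CUDA_VISIBLE_DEVICES=train_gpus)
eval_env = dict(os.environ, CUDA_VISIBLE_DEVICES=eval_gpus)
print(train_gpus, "|", eval_gpus)
```

Reserving a single GPU for evaluation is a guess at a sensible default; adjust `num_eval` to match your hardware.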
Refer to `config.py` for documentation on what each configuration does.
- `gpu_vrig_paper.gin`: This is the configuration we used to generate the table in the paper. It requires 8 GPUs for training.
- `gpu_fullhd.gin`: This is a high-resolution model and will take around 3 days to train on 8 GPUs.
- `gpu_quarterhd.gin`: This is a low-resolution model and will take around 14 hours to train on 8 GPUs.
- `test_local.gin`: This is a test configuration to see if the code runs. It probably will not produce a good-looking result.
- `test_vrig.gin`: This is a test configuration to see if the code runs for validation rig captures. It probably will not produce a good-looking result.
- `gpu_quarterhd_4gpu.gin`: This configuration is also provided, but we have not tested it, so please only use it as a reference.
A dataset is a directory with the following structure:
dataset
├── camera
│ └── ${item_id}.json
├── camera-paths
├── rgb
│ └── ${scale}x
│   └── ${item_id}.png
├── metadata.json
├── points.npy
├── dataset.json
└── scene.json
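The layout above maps each `${item_id}` to one camera file and one image per scale. As a rough sketch of that mapping, the hypothetical helper `item_paths` below (not part of this codebase) builds both paths for a given item ID and scale; the dataset root is a placeholder.

```python
from pathlib import Path


def item_paths(dataset_dir, item_id, scale=1):
    """Return the camera and image paths for one item (hypothetical helper)."""
    root = Path(dataset_dir)
    return {
        # camera/${item_id}.json
        "camera": root / "camera" / f"{item_id}.json",
        # rgb/${scale}x/${item_id}.png
        "rgb": root / "rgb" / f"{scale}x" / f"{item_id}.png",
    }


paths = item_paths("/path/to/dataset", "000054", scale=4)
print(paths["camera"])
print(paths["rgb"])
```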
At a high level, a dataset is simply the following:
We have a unique identifier for each image which we call `item_id`, and this is used to match the camera and images. An `item_id` can be any string, but typically it is some alphanumeric string such as `000054`.
camera
Each camera file corresponds to the `Camera` class defined in `camera.py` and looks like this:
{
  // A 3x3 world-to-camera rotation matrix representing the camera orientation.
  "orientation": [
    [0.9839, -0.0968, 0.1499],
    [-0.0350, -0.9284, -0.3699],
    [0.1749, 0.358, -0.9168]
  ],
  // The 3D position of the camera in world-space.
  "position": [-0.3236, -3.26428, 5.4160],
  // The focal length of the camera.
  "focal_length": 2691,
  // The principal point [u_0, v_0] of the camera.
  "principal_point": [1220, 1652],
  // The skew of the camera.
  "skew": 0.0,
  // The aspect ratio for the camera pixels.
  "pixel_aspect_ratio": 1.0,
  // Parameters for the radial distortion of the camera.
  "radial_distortion": [0.1004, -0.2090, 0.0],
  // Parameters for the tangential distortion of the camera.
  "tangential": [0.001109, -2.5733e-05],
  // The image width and height in pixels.
  "image_size": [2448, 3264]
}
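To make the role of these fields concrete, here is a minimal sketch of how they could combine to project a world-space point into pixel coordinates. It assumes a standard pinhole model and ignores both distortion terms; `project` is a hypothetical function written for this illustration, and the authoritative behavior is whatever the `Camera` class in `camera.py` does.

```python
def project(camera, point):
    """Project a 3D world-space point to pixels (pinhole model, no distortion)."""
    # Offset from the camera position, still expressed in world coordinates.
    offset = [p - c for p, c in zip(point, camera["position"])]
    # Rotate into the camera frame using the world-to-camera matrix.
    x, y, z = (
        sum(r * o for r, o in zip(row, offset)) for row in camera["orientation"]
    )
    if z == 0:
        raise ValueError("point lies in the camera plane")
    # Perspective divide, then apply the intrinsics.
    x, y = x / z, y / z
    u0, v0 = camera["principal_point"]
    focal = camera["focal_length"]
    u = focal * x + camera["skew"] * y + u0
    v = focal * camera["pixel_aspect_ratio"] * y + v0
    return u, v


# Values copied from the example camera above.
camera = {
    "orientation": [
        [0.9839, -0.0968, 0.1499],
        [-0.0350, -0.9284, -0.3699],
        [0.1749, 0.358, -0.9168],
    ],
    "position": [-0.3236, -3.26428, 5.4160],
    "focal_length": 2691,
    "principal_point": [1220, 1652],
    "skew": 0.0,
    "pixel_aspect_ratio": 1.0,
}

# A point two units along the camera's viewing direction (third matrix row)
# should land close to the principal point.
axis = camera["orientation"][2]
point = [c + 2.0 * a for c, a in zip(camera["position"], axis)]
print(project(camera, point))
```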
camera-paths
rgb
Each subdirectory is named `${scale}x`, where `${scale}` is an integer scaling factor. For example, `1x` would contain the original images while `4x` would contain images a quarter of the size.
metadata.json
{
  "${item_id}": {
    // The embedding ID used to fetch the deformation latent code
    // passed to the deformation field.
    "warp_id": 0,
    // The embedding ID used to fetch the appearance latent code
    // which is passed to the second branch of the template NeRF.
    "appearance_id": 0,
    // For validation rig datasets, we use the camera ID instead
    // of the appearance ID. For example, this would be '0' for the
    // left camera and '1' for the right camera. This can potentially
    // also be used for multi-view setups.
    "camera_id": 0
  },
  ...
}
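As an illustration of the format, the sketch below generates a `metadata.json`-style dictionary for a short list of item IDs. It assumes every frame gets its own warp and appearance embedding (ID equal to the frame index) and that all frames come from one camera; `build_metadata` is a hypothetical helper, and other ID assignments may suit your capture better.

```python
import json


def build_metadata(item_ids, camera_id=0):
    """Build a metadata dict with one embedding ID per frame (assumption)."""
    metadata = {}
    for index, item_id in enumerate(item_ids):
        metadata[item_id] = {
            "warp_id": index,        # deformation latent code index
            "appearance_id": index,  # appearance latent code index
            "camera_id": camera_id,  # single-camera capture assumed
        }
    return metadata


metadata = build_metadata(["000000", "000001", "000002"])
print(json.dumps(metadata, indent=2))
```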
scene.json
{
  // The scale factor we will apply to the pointcloud and cameras. This is
  // important since it controls what scale is used when computing the positional
  // encoding.
  "scale": 0.0387243672920458,
  // Defines the origin of the scene. The scene will be translated such that
  // this point becomes the origin. Defined in unscaled coordinates.
  "center": [
    1.1770838526103944e-08,
    -2.58235339289195,
    -1.29117656263135
  ],
  // The distance of the near plane from the camera center in scaled coordinates.
  "near": 0.02057418950149491,
  // The distance of the far plane from the camera center in scaled coordinates.
  "far": 0.8261601717667288
}
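The comments above suggest that a point is first translated so that `center` becomes the origin and then multiplied by `scale`. A minimal sketch of that reading follows; the order of operations is an assumption based on `center` being defined in unscaled coordinates, and `to_scene_coordinates` is a hypothetical helper, not a function from this codebase.

```python
def to_scene_coordinates(point, scene):
    """Translate by the scene center, then apply the scene scale (assumed order)."""
    return [(p - c) * scene["scale"] for p, c in zip(point, scene["center"])]


# Values copied from the example scene.json above.
scene = {
    "scale": 0.0387243672920458,
    "center": [1.1770838526103944e-08, -2.58235339289195, -1.29117656263135],
}

# The scene center itself maps to the origin.
print(to_scene_coordinates(scene["center"], scene))
# An arbitrary world-space point, e.g. a camera position.
print(to_scene_coordinates([-0.3236, -3.26428, 5.4160], scene))
```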
dataset.json
{
  // The total number of images in the dataset.
  "count": 114,
  // The total number of training images (exemplars) in the dataset.
  "num_exemplars": 57,
  // A list containing all item IDs in the dataset.
  "ids": [...],
  // A list containing all training item IDs in the dataset.
  "train_ids": [...],
  // A list containing all validation item IDs in the dataset.
  // This should be mutually exclusive with `train_ids`.
  "val_ids": [...]
}
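The constraints described in these comments can be checked mechanically before training. The sketch below is one possible sanity check; `check_dataset_info` is a hypothetical helper, and treating `count` as `len(ids)` and `num_exemplars` as `len(train_ids)` is our reading of the comments rather than something the code is known to enforce.

```python
def check_dataset_info(info):
    """Return a list of human-readable problems found in a dataset.json dict."""
    problems = []
    ids = set(info["ids"])
    train_ids = set(info["train_ids"])
    val_ids = set(info["val_ids"])
    if info["count"] != len(info["ids"]):
        problems.append("count does not match the number of ids")
    if info["num_exemplars"] != len(info["train_ids"]):
        problems.append("num_exemplars does not match the number of train_ids")
    if train_ids & val_ids:
        problems.append("train_ids and val_ids overlap")
    if (train_ids | val_ids) - ids:
        problems.append("train_ids or val_ids contain ids missing from ids")
    return problems


# A tiny made-up example with four images, split evenly.
info = {
    "count": 4,
    "num_exemplars": 2,
    "ids": ["000000", "000001", "000002", "000003"],
    "train_ids": ["000000", "000002"],
    "val_ids": ["000001", "000003"],
}
print(check_dataset_info(info))
```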
points.npy
A NumPy array of shape `(N,3)` containing the background points.
If you find our work useful, please consider citing:
@article{park2021nerfies,
author = {Park, Keunhong
and Sinha, Utkarsh
and Barron, Jonathan T.
and Bouaziz, Sofien
and Goldman, Dan B
and Seitz, Steven M.
and Martin-Brualla, Ricardo},
title = {Nerfies: Deformable Neural Radiance Fields},
journal = {ICCV},
year = {2021},
}