PyTorch codes for "Learning Spatial Attention for Face Super-Resolution", TIP 2020.
Learning Spatial Attention for Face Super-Resolution
Chaofeng Chen, Dihong Gong, Hao Wang, Zhifeng Li, Kwan-Yee K. Wong
Clone this repository:

```
git clone https://github.com/chaofengc/Face-SPARNet.git
cd Face-SPARNet
```
The codes have been tested with the packages listed in `requirements.txt`; install them with:

```
pip3 install -r requirements.txt
```
Download the pretrained models and data from the following link and put them in `./pretrain_models` and `./test_dirs`, respectively (BaiduNetdisk extract code: `2nax`).
We provide example test commands in the script `test.sh` for both SPARNet and SPARNetHD. Two models with different configurations are provided for each of them; refer to the section below to see the differences. Here are some test tips:

- Specify the test input directory with the `--dataroot` option.
- Specify the save path with `--save_as_dir`; otherwise, the results will be saved to the predefined directory `results/exp_name/test_latest`.
We also provide a command to crop and align faces from a single image and then paste them back, the same as PSFRGAN:
```
python test_enhance_single_unalign.py --gpus 1 --model sparnethd --name SPARNetHD_V4_Attn2D \
    --res_depth 10 --att_name spar --Gnorm 'in' \
    --pretrain_model_path [./path/to/model/SPARNetHD_V4_Attn2D_net_H-epoch10.pth] \
    --test_img_path ./test_images/test_hzgg.jpg --results_dir test_hzgg_results
```
The commands used to train the released models are provided in the script `train.sh`. Here are some train tips:

- Set `--dataroot` to the path where your training images are stored.
- Change the `--name` option for different experiments. Tensorboard records with the same name will be moved to `check_points/log_archive`, and the weight directory will only store the weight history of the latest experiment with the same name.
- `--gpus` specifies the number of GPUs used for training. The script will use the GPUs with more available memory first (a rough sketch of this selection follows this list). To pick GPU indices manually, uncomment the `export CUDA_VISIBLE_DEVICES=` line in the script.
- The released models were trained with `batch_size=2`.
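As a rough illustration of that memory-based GPU selection, here is a hypothetical sketch using `torch.cuda.mem_get_info`; it is not the repository's actual logic, just the idea:

```python
import torch

def gpus_by_free_memory(num_gpus):
    """Return indices of the `num_gpus` CUDA devices with the most free memory."""
    free = []
    for i in range(torch.cuda.device_count()):
        free_bytes, _total_bytes = torch.cuda.mem_get_info(i)  # (free, total) in bytes
        free.append((free_bytes, i))
    free.sort(reverse=True)  # devices with the most free memory first
    return [idx for _, idx in free[:num_gpus]]

if torch.cuda.is_available():
    print(gpus_by_free_memory(2))  # e.g. [3, 1]
```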
Since the original codes were messy, we rewrote them and retrained all models. This leads to slightly different results between the released models and those reported in the paper. Besides, we also extended the 2D spatial attention to 3D attention and release some models with 3D attention. We list all of them below.
We found that extending the 2D spatial attention to 3D attention improves the performance a lot. We also trained a light model, denoted SPARNet-Light-Attn3D, with half the number of parameters by reducing the number of FAU blocks; it shows performance similar to SPARNet, and we release it for your reference (a sketch of the two attention variants is shown below the table).
| Model | DICNet | SPARNet (in paper) | SPARNet (Released) | SPARNet-Light-Attn3D (Released) |
| --- | --- | --- | --- | --- |
| #Params (M) | 22.8 | 9.86 | 10.52 | 5.24 |
| PSNR (↑) | 26.73 | 26.97 | 27.43 | 27.39 |
| SSIM (↑) | 0.7955 | 0.8026 | 0.8201 | 0.8189 |
All models are trained with CelebA and tested on the Helen test set provided by DICNet.
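To make the difference concrete, here is a minimal PyTorch sketch of the two attention variants. The layer choices and channel sizes are illustrative assumptions, not the released FAU implementation; see the repository code for the real blocks.

```python
import torch
import torch.nn as nn

class SpatialAttn2D(nn.Module):
    """2D attention: predict a single H x W map, shared by all channels."""
    def __init__(self, channels):
        super().__init__()
        self.attn = nn.Sequential(
            nn.Conv2d(channels, channels // 2, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // 2, 1, 3, padding=1),  # 1 x H x W attention map
            nn.Sigmoid(),
        )

    def forward(self, x):
        return x * self.attn(x)  # map is broadcast across all channels

class SpatialAttn3D(nn.Module):
    """3D attention: predict a full C x H x W map, one weight per feature element."""
    def __init__(self, channels):
        super().__init__()
        self.attn = nn.Sequential(
            nn.Conv2d(channels, channels // 2, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // 2, channels, 3, padding=1),  # C x H x W attention map
            nn.Sigmoid(),
        )

    def forward(self, x):
        return x * self.attn(x)

x = torch.randn(1, 64, 32, 32)
print(SpatialAttn2D(64)(x).shape, SpatialAttn3D(64)(x).shape)
```

The only structural change between the two variants is the number of output channels of the attention branch: one shared map versus one weight per feature element.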
We also provide networks with 2D and 3D attention for SPARNetHD. For the test dataset, we cleaned up non-face images, added some extra test images from the internet, and obtained a new CelebA-TestN dataset with 1117 images. We test the retrained models on the new dataset and recalculate the FID scores.
Similar to StyleGAN, we use the exponential moving average of the weights as the final model, which gives slightly better results (a minimal EMA sketch follows the table below).
| Model | SPARNetHD (in paper) | SPARNetHD-Attn2D (Released) | SPARNetHD-Attn3D (Released) |
| --- | --- | --- | --- |
| FID (↓) | 27.16 | 26.72 | 28.42 |
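Here is a minimal sketch of that exponential moving average of the generator weights; the decay value of 0.999 is an illustrative assumption, not the value used for the released models.

```python
import copy
import torch
import torch.nn as nn

@torch.no_grad()
def update_ema(ema_model, model, decay=0.999):
    """Blend the frozen EMA copy's parameters toward the live model's parameters."""
    for ema_p, p in zip(ema_model.parameters(), model.parameters()):
        ema_p.mul_(decay).add_(p, alpha=1 - decay)

# Toy usage: keep a frozen copy of the network and update it after every step.
net = nn.Linear(4, 4)
net_ema = copy.deepcopy(net).eval()
# ... inside the training loop, after optimizer.step():
update_ema(net_ema, net)
```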
```
@article{ChenSPARNet,
  author  = {Chen, Chaofeng and Gong, Dihong and Wang, Hao and Li, Zhifeng and Wong, Kwan-Yee~K.},
  title   = {Learning Spatial Attention for Face Super-Resolution},
  journal = {IEEE Transactions on Image Processing (TIP)},
  year    = {2020}
}
```
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
The codes are based on CycleGAN. The project also benefits from DICNet.