First place solution for Open Cities DrivenData challenge
This is the code for the winning solution to the "Open Cities AI Challenge: Segmenting Buildings for Disaster Resilience".
Semantic Segmentation track: Build models to identify building footprints from aerial imagery across diverse African cities.
My solution is based on Unet-like CNN models. Below you will find a description of the full pipeline and instructions on how to run training and inference on the competition data, or inference on your own data.
An archive with pretrained models can be downloaded here (Google Drive).
The solution has been packed using Docker to simplify environment preparation. All other packages and software are specified in Dockerfile and requirements.txt.
Recommended minimal configuration:
* during inference it is possible to reduce the batch size to lower memory consumption; however, the training configuration needs at least 16GB
Step 1. Starting the service
Build the docker image, start the docker-compose service in daemon mode and install the requirements inside the container.
$ make build && make start && make install
Step 2. Starting pipelines inside the container
Data preparation

Place the competition data archives (train_tier_1.tgz and test.tgz are required) and, if you want to run inference only, pretrained models in the following layout:

<project_dir>/
    data/
        ...
        raw/
            train_tier_1.tgz
            test.tgz
    models/
        stage1/
            ...
        stage2/
            ...
        stage3/
            ...

Start only inference (the models/ directory should be provided with pretrained models):
$ make inference
Start training and inference
$ make train-inference
After pipeline execution the final prediction will appear in the data/predictions/stage3/sliced/ directory.
Step 3. Stopping the service
After everything is done, stop the docker container:
$ make clean
Before starting, please make sure:

* the competition data archives (train_tier_1.tgz and test.tgz) are placed in the data/raw/ folder
* pretrained models are placed in the models/ folder (needed for inference-only runs)

The whole pipeline consists of 4 stages (stage 0 to stage 3) with 1 to 3 steps each. The whole structure is as follows:

* stage 0: prepare the test.tgz data
* stage 1: train models on train_tier_1 data, predict the test data and prepare the predictions as train data (pseudolabels)
* stage 2: train models on train_tier_1 and stage_2 data (pseudolabels), predict the test data and prepare the predictions as train data (pseudolabels round 2)
* stage 3: train models on train_tier_1 and stage_3 data (pseudolabels round 2), predict the test data

After the whole pipeline execution the final prediction will appear in the data/predictions/stage3/sliced/ directory. In case you have pretrained models, you can run just the stage 0 step 0 and stage 3 step 3 blocks.
Command to run:
$ make stage0
Consists of one step: preparing the test.tgz data.
We extract the data and create a mosaic from the test tiles (it is better to stitch separate tiles into big images (scenes), so the prediction network has more context). A CSV file with data about the mosaic is located in data/processed/test_mosaic.csv and was created by the Jupyter notebook notebooks/mosaic.ipynb. You don't need to generate it again; it already exists.
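The stitching idea can be sketched as follows. This is a minimal NumPy illustration with a hypothetical `tiles` dict keyed by grid position; the real mosaic layout is described by data/processed/test_mosaic.csv, not by this code.

```python
import numpy as np

def stitch_tiles(tiles, tile_size, grid_shape):
    """Stitch equally sized square tiles into one big scene.

    tiles      -- dict mapping (row, col) grid position -> HxWxC uint8 array
    tile_size  -- side length of a square tile in pixels
    grid_shape -- (rows, cols) of the mosaic grid
    """
    rows, cols = grid_shape
    channels = next(iter(tiles.values())).shape[2]
    mosaic = np.zeros((rows * tile_size, cols * tile_size, channels), dtype=np.uint8)
    for (r, c), tile in tiles.items():
        # place each tile at its grid position in the big scene
        mosaic[r * tile_size:(r + 1) * tile_size,
               c * tile_size:(c + 1) * tile_size] = tile
    return mosaic

# usage: four 256x256 RGB tiles stitched into one 512x512 scene
tiles = {(r, c): np.full((256, 256, 3), r * 2 + c, dtype=np.uint8)
         for r in range(2) for c in range(2)}
scene = stitch_tiles(tiles, 256, (2, 2))
```

The payoff is that the network sees whole scenes at prediction time instead of isolated tiles with cut-off buildings at the borders.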
Train data preparation, models training and prediction with an ensemble of models (consists of three steps).
Command to run:
$ make stage1
Preparing data: the train_tier_1 data is extracted and cut into 1024 x 1024 tiles (more convenient for training the models).

On this step 10 Unet models are going to be trained: 5 Unet models with the efficientnet-b1 encoder and 5 with the se_resnext_32x4d encoder (all encoders are pretrained on ImageNet). We train 5 models per encoder because of the 5-fold validation scheme. Models are trained with hard augmentations using the albumentations library and random data sampling. Training lasts 50 epochs with continuous learning rate decay from 0.0001 to 0.
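The decay from 0.0001 to 0 over 50 epochs can be sketched as a simple linear schedule. This is one plausible implementation, not necessarily the exact curve used in the solution:

```python
BASE_LR = 1e-4   # starting learning rate
EPOCHS = 50      # total training epochs

def linear_lr(epoch):
    """Linearly decay the learning rate from BASE_LR at epoch 0 down to 0 at EPOCHS."""
    return BASE_LR * (1 - epoch / EPOCHS)

# in PyTorch this kind of schedule is typically wired up as:
# scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lambda e: 1 - e / EPOCHS)
```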
Pretrained models are aggregated into an EnsembleModel with test time augmentation (flip, rotate90): all predictions are averaged by a simple mean and thresholded at 0.5. First, prediction is made for the stitched test images, then for the others (which are not present in the mosaic).
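The averaging scheme can be sketched like this. It is a NumPy illustration with stand-in `models` passed as plain callables; the actual EnsembleModel in the repo wraps PyTorch modules.

```python
import numpy as np

def tta_ensemble_predict(models, image, threshold=0.5):
    """Average predictions over all models and flip/rot90 TTA, then binarize."""
    # (forward transform, inverse transform) pairs:
    # identity, horizontal flip, rotate by 90 degrees
    ttas = [
        (lambda x: x, lambda y: y),
        (lambda x: x[:, ::-1], lambda y: y[:, ::-1]),
        (lambda x: np.rot90(x, 1), lambda y: np.rot90(y, -1)),
    ]
    preds = []
    for model in models:
        for fwd, inv in ttas:
            # predict on the transformed image, map the mask back
            preds.append(inv(model(fwd(image))))
    prob = np.mean(preds, axis=0)          # simple mean over all predictions
    return (prob > threshold).astype(np.uint8)

# usage with dummy "models" that output constant probability maps
models = [lambda x: np.full(x.shape, 0.7), lambda x: np.full(x.shape, 0.6)]
mask = tta_ensemble_predict(models, np.zeros((4, 4)))
```

Averaging over both models and augmented views smooths out orientation-dependent mistakes before the single 0.5 threshold is applied.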
Train data preparation (adding pseudolabels), models training and prediction with an ensemble of models (consists of three steps).
Command to run:
$ make stage2
Take predictions from the previous stage and prepare them to be used as training data for the current stage. This technique is called pseudolabeling (using model predictions as labels for training). I used all test data because the leaderboard score was high enough (~0.845 Jaccard score), but usually you should take only confident predictions.
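Keeping only confident predictions can be sketched as follows. This is illustrative NumPy code with hypothetical thresholds; the solution itself kept all test predictions.

```python
import numpy as np

def confident_pseudolabel(prob, low=0.3, high=0.7):
    """Turn a probability map into a pseudolabel mask plus a confidence mask.

    Pixels with prob >= high become positive labels and pixels with
    prob <= low become negative labels; everything in between is marked
    unreliable and can be ignored in the training loss.
    """
    label = (prob >= high).astype(np.uint8)
    confident = (prob >= high) | (prob <= low)
    return label, confident

prob = np.array([[0.9, 0.5],
                 [0.1, 0.8]])
label, confident = confident_pseudolabel(prob)
# the 0.5 pixel is neither clearly building nor clearly background,
# so it is excluded from the confident mask
```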
(same as stage 1 step 1, but with extra data labeled on the previous stage)
On this step 10 Unet models are going to be trained: 5 Unet models with the efficientnet-b1 encoder and 5 with the se_resnext_32x4d encoder (all encoders are pretrained on ImageNet). We train 5 models per encoder because of the 5-fold validation scheme. Models are trained with hard augmentations using the albumentations library and random data sampling. Training lasts 50 epochs with continuous learning rate decay from 0.0001 to 0.
(same as stage 1 step 3)
Pretrained models are aggregated into an EnsembleModel with test time augmentation (flip, rotate90): all predictions are averaged by a simple mean and thresholded at 0.5. First, prediction is made for the stitched test images, then for the others (which are not present in the mosaic).
This stage is the same as the previous one, just another round of pseudolabeling with better trained models. Train data preparation (pseudolabels), models training and prediction with an ensemble of models (consists of three steps).
Command to run:
$ make stage3
Take predictions from the previous stage and prepare them to be used as training data for the current stage (same as stage 2 step 1).
(same as stage 2 step 1, but with extra data labeled on the previous stage)
On this step 4 Unet models and 1 FPN model are going to be trained on tier 1 data and pseudolabels round 2. Models are trained with hard augmentations using the albumentations library and random data sampling. Training lasts 50 epochs with continuous learning rate decay from 0.0001 to 0.
(same as stage 2 step 3)
Pretrained models are aggregated into an EnsembleModel with test time augmentation (flip, rotate90): all predictions are averaged by a simple mean and thresholded at 0.5. First, prediction is made for the stitched test images, then for the others (which are not present in the mosaic).
$ make build && make start && make install
Put your .tif file somewhere in the data folder (make sure you reproject it to a UTM zone and resample it to 0.1 m/pixel; you can use GDAL, for example). Then run:

$ docker exec open-cities-dev \
python -m src.predict_tif \
--configs configs/stage3-srx50-2-f0.yaml \
--src_path <path/to/your/tif/file.tif> \
--dst_path <path/for/result.tif> \
--batch_size 8 \
--gpu 0 \
--tta
To see all available options:

$ docker exec open-cities-dev python -m src.predict_tif --help
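Under the hood, predicting a whole GeoTIFF means running the model window by window over the raster. The idea can be sketched as below; this is a simplified NumPy version without overlap or blending, and the actual src.predict_tif implementation may differ.

```python
import numpy as np

def predict_large_image(model, image, window=1024):
    """Run `model` over a big HxW(xC) image window by window and stitch the mask."""
    h, w = image.shape[:2]
    mask = np.zeros((h, w), dtype=np.uint8)
    for top in range(0, h, window):
        for left in range(0, w, window):
            # border windows are simply smaller; the model output must
            # match the patch's spatial shape
            patch = image[top:top + window, left:left + window]
            mask[top:top + window, left:left + window] = model(patch)
    return mask

# usage with a dummy "model" that marks every pixel as building
model = lambda patch: np.ones(patch.shape[:2], dtype=np.uint8)
mask = predict_large_image(model, np.zeros((2048, 1500, 3)), window=1024)
```

Production implementations usually add overlapping windows and average the overlaps to avoid visible seams at window borders.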