Dissolving Is Amplifying: Towards Fine-Grained Anomaly Detection [ECCV 2024]

Paper: https://arxiv.org/pdf/2302.14696

King Abdullah University of Science and Technology · NEC Laboratories China

DIA is a fine-grained anomaly detection framework for medical images. It consists of two novel components:

  • First, the dissolving transformations. Our main observation is that generative diffusion models are feature-aware: applying them to medical images in a certain manner can remove or diminish fine-grained discriminative features such as tumors or hemorrhages. More visual demonstrations of the dissolving effects are available here.
  • Second, an amplifying framework. It is based on contrastive learning and learns a semantically meaningful representation of medical images in a self-supervised manner.

The amplifying framework contrasts additional pairs of images with and without dissolving transformations applied, thereby boosting the learning of fine-grained feature representations. DIA significantly improves medical anomaly detection performance, with an AUC boost of around 18.40% over the baseline method, and achieves overall SOTA performance against other benchmark methods.
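Conceptually, a dissolving transformation feeds a clean image through a single reverse step of a pretrained diffusion model, as if the image were a noisy sample at timestep t. The following is a minimal sketch of this idea, assuming a DDPM noise-prediction model eps_model(x, t) and its alphas_cumprod schedule; it is an illustration, not the repo's exact code.

import torch

def dissolve(x, eps_model, alphas_cumprod, t):
    # Minimal sketch (assumed API): treat the clean image batch x as if
    # it were a noisy sample at timestep t and recover the "denoised"
    # estimate in one shot; this estimate removes or diminishes
    # fine-grained features, i.e. it "dissolves" them.
    a_bar = alphas_cumprod[t]
    t_batch = torch.full((x.shape[0],), t, device=x.device, dtype=torch.long)
    eps = eps_model(x, t_batch)  # predicted noise
    x0_hat = (x - torch.sqrt(1.0 - a_bar) * eps) / torch.sqrt(a_bar)
    return x0_hat.clamp(-1.0, 1.0)

Intuitively, a larger timestep t dissolves features more aggressively.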


News

  • [Sept. 2024] We released a Hugging Face Space for you to play with dissolving transformations.
  • [July 2024] We are integrated into Kornia! You may access StableDiffusion-based dissolving transformations with kornia.filters.StableDiffusionDissolving or use them as an augmentation via kornia.augmentation.RandomDissolving (see the usage sketch after this list). Kornia welcomes contributors for their AI-based lightweight operations!
  • [July 2024] Our paper is accepted to ECCV2024!
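Below is a minimal usage sketch of the Kornia APIs mentioned above; the exact constructor and call arguments may vary across Kornia versions, so treat this as an assumption to check against the Kornia documentation.

import torch
import kornia

img = torch.rand(1, 3, 512, 512)  # dummy RGB image in [0, 1]

# Filter-style API: dissolve with a chosen diffusion step number.
dissolver = kornia.filters.StableDiffusionDissolving(version="2.1")
dissolved = dissolver(img, 500)  # second argument: diffusion step number

# Augmentation-style API: random dissolving inside an augmentation pipeline.
aug = kornia.augmentation.RandomDissolving(p=1.0, version="2.1")
augmented = aug(img)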

1. Requirements

Environments

$ pip install -r requirements.txt

Datasets

We mainly use the following datasets to benchmark our method: SARS-COV-2, Kvasir, Retinal OCT, and APTOS 2019.

2. Training

Pretraining of diffusion models

Modify the data paths in diffusion_models/train_eval_ddpm.py, then run:

python train_eval_ddpm.py --mode train --dataset <DATASET>

In our experiments, the diffusion models become usable within 5 epochs of training.
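For example, a concrete invocation might look like this (the dataset key kvasir here is only illustrative; use whichever dataset keys diffusion_models/train_eval_ddpm.py actually accepts):

python train_eval_ddpm.py --mode train --dataset kvasir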

Training DIA

To train unlabeled one-class & multi-class models in the paper, run this command:

python train.py --data_root ../data --dataset <DATASET> --model resnet18_imagenet --mode simclr_DIA --shift_trans_type diffusion_rotation --diff_resolution 32 --batch_size 32 --one_class_idx 0 --save_step 1 --diffusion_model_path <DIFFUSION_MODEL_PATH>
  • Other transformation types are available, e.g. diffusion_rotation, diffusion_cutperm, blurgaussian_rotation, blurgaussian_cutperm, blurmedian_rotation, blurmedian_cutperm. Note that cutperm may perform better if your dataset is more rotation-invariant, while rotation may perform better if the dataset is more permutation-invariant. A conceptual sketch of how these shift families pair plain and dissolved views follows this list.
  • To test different resolutions, remember to re-train the diffusion models at the corresponding resolution.
  • For low-resolution datasets, e.g. CIFAR10, use --model resnet18 instead of --model resnet18_imagenet.
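As referenced in the list above, here is a conceptual sketch of how a diffusion_rotation-style shift family can be assembled: each rotation, plain and dissolved, becomes its own shift class for CSI-style contrastive training. dissolve_fn stands in for a dissolving transformation (e.g. the earlier sketch); this is our simplification, not the repo's exact implementation.

import torch

def build_shifted_batch(x, dissolve_fn):
    # Each rotation, with and without the dissolving transformation,
    # is treated as a distinct shift class for contrastive training.
    views, labels = [], []
    for k in range(4):                    # 0/90/180/270 degree rotations
        rot = torch.rot90(x, k, dims=(2, 3))
        views += [rot, dissolve_fn(rot)]  # plain and dissolved pair
        labels += [2 * k, 2 * k + 1]
    y = torch.cat([torch.full((x.size(0),), l) for l in labels])
    return torch.cat(views, dim=0), y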

3. Evaluation

During evaluation, we use only the rotation shift transformation:

python eval.py --mode ood_pre --dataset <DATASET> --model resnet18_imagenet --ood_score CSI --shift_trans_type rotation --print_score --diffusion_upper_offset 0. --diffusion_offset 0. --ood_samples 10 --resize_factor 0.54 --resize_fix --one_class_idx 0 --load_path <MODEL_PATH>
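For intuition, a CSI-style OOD score (selected by --ood_score CSI) can be summarized as the cosine similarity to the nearest training feature, scaled by the feature norm. The sketch below is our simplification of that idea, not the exact scoring code; the actual evaluation additionally aggregates over shifted views and multiple samples (--ood_samples).

import torch.nn.functional as F

def csi_score(feats, train_feats):
    # Simplified CSI-style score: cosine similarity to the nearest
    # training feature, scaled by the feature norm. Lower scores
    # indicate likely anomalies.
    sim = F.normalize(feats, dim=1) @ F.normalize(train_feats, dim=1).T
    return sim.max(dim=1).values * feats.norm(dim=1)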

4. Results

In general, our method achieves SOTA performance on fine-grained anomaly detection tasks. We recommend using 32x32 diffusion_rotation for most tasks, with the following results.

Different transformations

In general, diffusion_rotation performs best. All values are AUC.

| Dataset     | Transform | Gaussian | Median | Diffusion |
|-------------|-----------|----------|--------|-----------|
| SARS-COV-2  | Perm      | 0.788    | 0.826  | 0.841     |
| SARS-COV-2  | Rotate    | 0.847    | 0.837  | 0.851     |
| Kvasir      | Perm      | 0.712    | 0.663  | 0.840     |
| Kvasir      | Rotate    | 0.739    | 0.687  | 0.860     |
| Retinal OCT | Perm      | 0.754    | 0.747  | 0.890     |
| Retinal OCT | Rotate    | 0.895    | 0.876  | 0.919     |
| APTOS 2019  | Perm      | 0.942    | 0.929  | 0.926     |
| APTOS 2019  | Rotate    | 0.922    | 0.918  | 0.934     |

Resolutions

Higher feature-dissolving resolutions dramatically increase processing time while barely improving detection performance.

| Dataset     | Resolution | MACs (G) | AUC   |
|-------------|------------|----------|-------|
| SARS-COV-2  | 32         | 2.33     | 0.851 |
| SARS-COV-2  | 64         | 3.84     | 0.803 |
| SARS-COV-2  | 128        | 9.90     | 0.807 |
| Kvasir      | 32         | 2.33     | 0.860 |
| Kvasir      | 64         | 3.84     | 0.721 |
| Kvasir      | 128        | 9.90     | 0.730 |
| Retinal OCT | 32         | 2.33     | 0.919 |
| Retinal OCT | 64         | 3.84     | 0.922 |
| Retinal OCT | 128        | 9.90     | 0.930 |
| APTOS 2019  | 32         | 2.33     | 0.934 |
| APTOS 2019  | 64         | 3.84     | 0.937 |
| APTOS 2019  | 128        | 9.90     | 0.905 |

Citation

@misc{2302.14696,
  Author = {Jian Shi and Pengyi Zhang and Ni Zhang and Hakim Ghazzai and Peter Wonka},
  Title = {Dissolving Is Amplifying: Towards Fine-Grained Anomaly Detection},
  Year = {2023},
  Eprint = {arXiv:2302.14696},
}

Acknowledgement

Our method is based on CSI.