This README describes how to reproduce results for the paper "Towards Reliable AI: Bias Identification, Prevention and Quality Improvement in Otoscopic Images".
Download all three public datasets.
Rename their folders to 'Chile', 'Ohio', and 'Turkey', respectively.
Place all three datasets in DATA_MAIN_DIR=../data/eardrum_public_data.
data_bias_evaluation_framework/metadata/metadata.csv is a dataframe containing the relative path, source, class, and binary class for each image. To reproduce this dataframe, run data_bias_evaluation_framework/data_bias_evaluation_framework/prepare_dataset/generate_metadata.ipynb.
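For reference, the metadata can be inspected with pandas as in the sketch below; the column names used there ('path', 'source', 'class', 'binary_class') are assumptions, so check the actual header of metadata.csv before relying on them.

# Minimal sketch: inspect metadata.csv with pandas.
# Column names are assumptions; verify them against the actual file.
import pandas as pd

metadata = pd.read_csv('data_bias_evaluation_framework/metadata/metadata.csv')
print(metadata.head())
# Each row is expected to hold: relative image path, source dataset
# (Chile / Ohio / Turkey), original class label, and binary class label.
print(metadata['source'].value_counts())  # images per source dataset (assumed column name)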
DATA_MAIN_DIR
├── Chile
│ ├── Testing
│ │ ├── Chronic otitis media
│ │ │ ├── Image1
│ │ │ ├── ...
│ │ ├── Earwax plug
│ │ │ ├── Image1
│ │ │ ├── ...
│ │ ├── Myringosclerosis
│ │ │ ├── Image1
│ │ │ ├── ...
│ │ ├── Normal
│ │ │ ├── Image1
│ │ │ ├── ...
│ ├── Training-validation
│ │ ├── Chronic otitis media
│ │ │ ├── Image1
│ │ │ ├── ...
│ │ ├── Earwax plug
│ │ │ ├── Image1
│ │ │ ├── ...
│ │ ├── Myringosclerosis
│ │ │ ├── Image1
│ │ │ ├── ...
│ │ ├── Normal
│ │ │ ├── Image1
│ │ │ ├── ...
├── Ohio
│ ├── Tube_Effusion_Normal - 11_7_19
│ │ ├── Effusion
│ │ │ ├── Image1
│ │ │ ├── ...
│ │ ├── Normal
│ │ │ ├── Image1
│ │ │ ├── ...
│ │ ├── Tube
│ │ │ ├── Image1
│ │ │ ├── ...
├── Turkey
│ ├── abnormal
│ │ ├── aom
│ │ │ ├── Test_aom
│ │ │ │ ├── Image1
│ │ │ │ ├── ...
│ │ │ ├── Image1
│ │ │ ├── ...
│ │ ├── csom
│ │ │ ├── Test_cosm
│ │ │ │ ├── Image1
│ │ │ │ ├── ...
│ │ │ ├── Image1
│ │ │ ├── ...
│ │ ├── earVentilationTube
│ │ │ ├── Image1
│ │ │ ├── ...
│ │ ├── earwax
│ │ │ ├── Test_earwax
│ │ │ │ ├── Image1
│ │ │ │ ├── ...
│ │ │ ├── Image1
│ │ │ ├── ...
│ │ ├── foreignObjectEar
│ │ │ ├── Image1
│ │ │ ├── ...
│ │ ├── otitisexterna
│ │ │ ├── Image1
│ │ │ ├── ...
│ │ ├── pseudoMembranes
│ │ │ ├── Image1
│ │ │ ├── ...
│ │ ├── tympanoskleros
│ │ │ ├── Image1
│ │ │ ├── ...
│ ├── normal
│ │ ├── Test_normal
│ │ │ ├── Image1
│ │ │ ├── ...
│ │ ├── Image1
│ │ ├── ...
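After downloading, a quick sanity check such as the sketch below can confirm that images are found under the expected layout; the DATA_MAIN_DIR value and the accepted file extensions are assumptions to adjust for your setup.

# Sanity check: count image files per dataset under DATA_MAIN_DIR.
# The directory value and accepted extensions are assumptions; adjust as needed.
import os

DATA_MAIN_DIR = '../data/eardrum_public_data'
IMAGE_EXTENSIONS = ('.png', '.jpg', '.jpeg', '.tiff', '.bmp')

for dataset in ('Chile', 'Ohio', 'Turkey'):
    root = os.path.join(DATA_MAIN_DIR, dataset)
    n_images = sum(
        1
        for dirpath, _, filenames in os.walk(root)
        for name in filenames
        if name.lower().endswith(IMAGE_EXTENSIONS)
    )
    print(f'{dataset}: {n_images} images')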
Our Python environment is summarized in requirements.txt. Note that our system uses CUDA 11.0; you may need to adjust the file to match your CUDA version.
To reproduce Counterfactual Experiment I's results on the three public datasets, run the following commands to train a model on the Eclipsed Dataset.
cd data_bias_evaluation_framework/train_model
python run_binary_classification_gen.py --model_name 'vit_b_16_384' --num_epoch 100 --eclipse --eclipse_extent 1.0 --cudaID 0 --elastic_tf --lr 0.01
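For intuition only, the sketch below shows one plausible interpretation of the eclipse transform, in which a central circular region of each image is masked out and --eclipse_extent scales the mask radius. This is an illustrative assumption, not the transform implemented in run_binary_classification_gen.py; consult that script for the actual operation.

# Illustrative sketch only: one plausible 'eclipse' transform, where a central
# circular region is blacked out and eclipse_extent in [0, 1] scales the mask.
# The repository's actual transform may differ.
import numpy as np
from PIL import Image

def eclipse_image(image: Image.Image, eclipse_extent: float) -> Image.Image:
    arr = np.array(image)
    h, w = arr.shape[:2]
    yy, xx = np.ogrid[:h, :w]
    # Mask radius grows with eclipse_extent; 1.0 covers the full inscribed circle.
    radius = eclipse_extent * min(h, w) / 2
    mask = (yy - h / 2) ** 2 + (xx - w / 2) ** 2 <= radius ** 2
    arr[mask] = 0
    return Image.fromarray(arr)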
Run the following commands to train a model on the original dataset (Eclipsed Extent = 0).
cd data_bias_evaluation_framework/train_model
python run_binary_classification_gen.py --model_name 'vit_b_16_384' --num_epoch 100 --cudaID 0 --elastic_tf --lr 0.01
Run the notebook data_bias_evaluation_framework/post_training/summarize_result.ipynb to reproduce the visualization.
To reproduce Counterfactual Experiment II's results, run the notebook data_bias_evaluation_framework/train_model/logistic_regression.ipynb.
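If it helps to see the general shape of such an analysis, below is a minimal, heavily hedged sketch of a logistic-regression probe. It is not the notebook's code: the choice of features (per-image mean colour), the target (source dataset), and the metadata column names are assumptions made purely for illustration.

# Illustrative sketch only: a logistic-regression probe checking whether a
# simple per-image statistic already predicts the source dataset.
import numpy as np
import pandas as pd
from PIL import Image
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

DATA_MAIN_DIR = '../data/eardrum_public_data'  # assumed location
metadata = pd.read_csv('data_bias_evaluation_framework/metadata/metadata.csv')

def mean_rgb(relative_path):
    img = Image.open(f'{DATA_MAIN_DIR}/{relative_path}').convert('RGB')
    return np.asarray(img, dtype=np.float32).reshape(-1, 3).mean(axis=0)

X = np.stack([mean_rgb(p) for p in metadata['path']])  # 'path' column assumed
y = metadata['source'].values                          # 'source' column assumed
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5)
print('source prediction accuracy from mean colour alone:', scores.mean())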
Run the notebook data_bias_evaluation_framework/post_training/qualitative_databias_assess.ipynb.
Note that the feature embeddings were extracted from models stored in data_bias_evaluation_framework/experiment/vit_b_16_384_False_0.0_False_32_1234_100_True_False_0.05_0.01_0_0.9. To train your own models, run the following commands.
cd data_bias_evaluation_framework/train_model
python run_binary_classification_cv.py --model_name 'vit_b_16_384' --num_epoch 100 --cudaID 0 --elastic_tf --lr 0.01
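If you want to extract feature embeddings from your own checkpoints, the sketch below shows one generic way to obtain ViT embeddings with torchvision. The checkpoint path, the use of torchvision's vit_b_16, and the 224x224 input size are assumptions; adapt them to the checkpoints and input resolution produced by your training run.

# Illustrative sketch only: extract feature embeddings from a ViT by removing
# the classification head. Checkpoint path and state-dict layout are assumptions.
import torch
import torchvision

model = torchvision.models.vit_b_16()
# model.load_state_dict(torch.load('path/to/your_checkpoint.pth'))  # adjust to your checkpoint
model.heads = torch.nn.Identity()  # drop the classification head to expose embeddings
model.eval()

with torch.no_grad():
    dummy = torch.randn(1, 3, 224, 224)  # the paper uses 384x384 inputs; 224 is the torchvision default
    embedding = model(dummy)
print(embedding.shape)  # (1, 768) class-token embedding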
This part of the paper is based on a private dataset. To prepare your own dataset, use /active_labeling/prepare_dataset/prepare_hierch_dataset.py. The multitask model is available in /active_labeling/models/models_hierch.py. We used the function train_model_multitask in /active_labeling/train_model/train_model.py to train the multitask model.
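For orientation, the sketch below shows a generic shared-backbone, two-head multitask classifier. The backbone choice and head layout are assumptions made for illustration; the actual architecture is the one defined in /active_labeling/models/models_hierch.py.

# Illustrative sketch only: a shared backbone with two heads, one for the
# binary (normal vs. abnormal) task and one for the fine-grained class task.
import torch
import torch.nn as nn
import torchvision

class MultitaskEardrumModel(nn.Module):
    def __init__(self, num_fine_classes: int):
        super().__init__()
        backbone = torchvision.models.resnet18(weights=None)  # backbone choice is an assumption
        feature_dim = backbone.fc.in_features
        backbone.fc = nn.Identity()
        self.backbone = backbone
        self.binary_head = nn.Linear(feature_dim, 2)
        self.fine_head = nn.Linear(feature_dim, num_fine_classes)

    def forward(self, x):
        features = self.backbone(x)
        return self.binary_head(features), self.fine_head(features)

# A multitask training step typically sums the per-task losses, e.g.
# loss = ce(binary_logits, binary_labels) + ce(fine_logits, fine_labels)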