
D-MASTER: Mask Annealed Transformer for Unsupervised Domain Adaptation in Breast Cancer Detection from Mammograms
NEWS
- [July 2024] We publicly release source code and pre-trained D-MASTER model weights!
- [June 2024] D-MASTER is accepted at MICCAI 2024. Congratulations to all the authors. See you all at MICCAI 2024 under the Moroccan sun!
- [June 2024] We released an arXiv version. See more details in our updated arXiv paper!
- [June 2024] We release the RSNA-BSD1K dataset, a bounding box annotated subset of 1,000 mammograms from the RSNA Breast Screening Dataset, to support further research in BCDM!
- [May 2024] We release the D-MASTER benchmark.
What is D-MASTER?
D-MASTER is a transformer-based Domain-invariant Mask Annealed Student Teacher Autoencoder Framework for cross-domain breast cancer detection from mammograms (BCDM). It integrates a novel mask-annealing technique and an adaptive confidence refinement module. Unlike traditional pretraining with Masked Autoencoders (MAEs), which leverages massive datasets before fine-tuning on smaller ones, D-MASTER introduces a learnable masking technique for the MAE branch. This technique generates masks of varying complexity, which are then reconstructed by the Deformable DETR (DefDETR) encoder and decoder. Applying this self-supervised task to target images enables the encoder to acquire domain-invariant features and improve target representations.
Check out our website for an overview!
What is RSNA-BSD1K Data?
RSNA-BSD1K is a bounding box annotated subset of 1,000 mammograms from the RSNA Breast Screening Dataset, designed to support further research in breast cancer detection from mammograms (BCDM). The original RSNA dataset consists of 54,706 screening mammograms, containing 1,000 malignancies from 8,000 patients. From this, we curated RSNA-BSD1K, which includes 1,000 mammograms with 200 malignant cases, annotated at the bounding box level by two expert radiologists.
Since the images come from the existing RSNA dataset, please contact us for the clinically verified annotations needed to run experiments. Cheers!
Access benchmark RSNA-BSD1K Dataset
- Structure
  └─ rsna-bsd1k
      └─ annotations
          └─ instances_full.json
          └─ instances_val.json
      └─ images
          └─ train
          └─ val
- Put the dataset in the DATA_ROOT folder.
- Add rsna dataset in datasets/coco_style_dataset.py.
- Done! You can now use the dataset for training and evaluation.
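Before training, it can help to confirm that the folder matches the structure listed above. The following is a minimal sketch under that assumption only; the helper name `missing_rsna_paths` and the constant `EXPECTED_RSNA_PATHS` are illustrative and are not part of this repository.

```python
from pathlib import Path

# Relative paths expected under DATA_ROOT, copied from the structure listed above.
EXPECTED_RSNA_PATHS = [
    "rsna-bsd1k/annotations/instances_full.json",
    "rsna-bsd1k/annotations/instances_val.json",
    "rsna-bsd1k/images/train",
    "rsna-bsd1k/images/val",
]


def missing_rsna_paths(data_root):
    """Return the expected RSNA-BSD1K paths that do not exist under data_root.

    Hypothetical helper for illustration; it only checks that the files and
    folders exist, not that the annotations are valid.
    """
    root = Path(data_root)
    return [rel for rel in EXPECTED_RSNA_PATHS if not (root / rel).exists()]
```

Calling `missing_rsna_paths("/path/to/DATA_ROOT")` returns an empty list when everything is in place, and otherwise the relative paths that still need to be created or copied.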
1. Installation
1.1 Requirements
- Linux, CUDA >= 11.1, GCC >= 8.4
- Python >= 3.8
- torch >= 1.10.1, torchvision >= 0.11.2
- Other requirements
pip install -r requirements.txt
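The Python-side requirements above can be checked with a short script before compiling the CUDA operators. This is a sketch, not a tool shipped with the repository: `version_tuple`, `meets_minimum`, and `check_environment` are hypothetical helper names, and the CUDA and GCC versions are not checked here.

```python
import re
import sys


def version_tuple(version):
    """Turn a string such as '1.10.1+cu111' into (1, 10, 1)."""
    core = str(version).split("+")[0]  # drop local build suffixes like '+cu111'
    numbers = []
    for piece in core.split(".")[:3]:
        match = re.match(r"\d+", piece)
        if match is None:
            break
        numbers.append(int(match.group()))
    return tuple(numbers)


def meets_minimum(version, minimum):
    """Compare versions numerically, so that 1.10.1 counts as newer than 1.9.0."""
    return version_tuple(version) >= version_tuple(minimum)


def check_environment():
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if sys.version_info < (3, 8):
        problems.append("Python >= 3.8 is required")
    try:
        import torch
        import torchvision
    except ImportError:
        problems.append("torch and torchvision must be installed")
        return problems
    if not meets_minimum(torch.__version__, "1.10.1"):
        problems.append("torch >= 1.10.1 is required, found " + str(torch.__version__))
    if not meets_minimum(torchvision.__version__, "0.11.2"):
        problems.append("torchvision >= 0.11.2 is required, found " + str(torchvision.__version__))
    return problems
```

Comparing integer tuples rather than raw strings avoids the common pitfall where "1.9.0" sorts after "1.10.1" alphabetically.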
1.2 Compiling Deformable DETR CUDA operators
cd ./models/ops
sh ./make.sh
# unit test (every check should report True)
python test.py
2. Usage
2.1 Data preparation
We provide the following benchmarks from our paper:
- city2foggy: the cityscapes dataset is used as the source domain, and foggy_cityscapes(0.02) is used as the target domain.
- sim2city: the sim10k dataset is used as the source domain, and cityscapes, for which only the AP of cars is recorded, is used as the target domain.
- city2bdd: the cityscapes dataset is used as the source domain, and bdd100k-daytime is used as the target domain.
You can download the raw data from the official websites: cityscapes, foggy_cityscapes, sim10k, bdd100k. We provide the annotations converted into COCO style; download them from here and organize the datasets and annotations as follows:
[data_root]
└─ inbreast
    └─ annotations
        └─ instances_train.json
        └─ instances_val.json
    └─ images
        └─ train
        └─ val
└─ ddsm
    └─ annotations
        └─ instances_train.json
        └─ instances_val.json
    └─ images
        └─ train
        └─ val
└─ rsna-bsd1k
    └─ annotations
        └─ instances_full.json
        └─ instances_val.json
    └─ images
        └─ train
        └─ val
└─ cityscapes
    └─ annotations
        └─ cityscapes_train_cocostyle.json
        └─ cityscapes_train_caronly_cocostyle.json
        └─ cityscapes_val_cocostyle.json
        └─ cityscapes_val_caronly_cocostyle.json
    └─ leftImg8bit
        └─ train
        └─ val
└─ foggy_cityscapes
    └─ annotations
        └─ foggy_cityscapes_train_cocostyle.json
        └─ foggy_cityscapes_val_cocostyle.json
    └─ leftImg8bit_foggy
        └─ train
        └─ val
└─ sim10k
    └─ annotations
        └─ sim10k_train_cocostyle.json
        └─ sim10k_val_cocostyle.json
    └─ JPEGImages
└─ bdd10k
    └─ annotations
        └─ bdd100k_daytime_train_cocostyle.json
        └─ bdd100k_daytime_val_cocostyle.json
    └─ JPEGImages
To use additional datasets, you can edit datasets/coco_style_dataset.py and add key-value pairs to CocoStyleDataset.img_dirs and CocoStyleDataset.anno_files.
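As an illustration of what adding key-value pairs can look like, here is a self-contained sketch. The class below is only a stand-in: the nesting it uses (dataset name, then split, then a path relative to the data root) is an assumption, and the real dictionaries in datasets/coco_style_dataset.py may be organized differently, so check that file before copying anything. The dataset name `my_dataset` and the helper `resolve_paths` are hypothetical.

```python
import os


class CocoStyleDataset:
    """Stand-in for the class in datasets/coco_style_dataset.py.

    The nesting (dataset name -> split -> relative path) is an assumption
    made for this example; the real class may use a different layout.
    """

    img_dirs = {
        "ddsm": {
            "train": "ddsm/images/train",
            "val": "ddsm/images/val",
        },
    }
    anno_files = {
        "ddsm": {
            "train": "ddsm/annotations/instances_train.json",
            "val": "ddsm/annotations/instances_val.json",
        },
    }


# Register a hypothetical extra dataset by adding one key to each dictionary.
CocoStyleDataset.img_dirs["my_dataset"] = {
    "train": "my_dataset/images/train",
    "val": "my_dataset/images/val",
}
CocoStyleDataset.anno_files["my_dataset"] = {
    "train": "my_dataset/annotations/instances_train.json",
    "val": "my_dataset/annotations/instances_val.json",
}


def resolve_paths(data_root, dataset_name, split):
    """Return (image_dir, annotation_file) for one split of a registered dataset."""
    img_dir = os.path.join(data_root, CocoStyleDataset.img_dirs[dataset_name][split])
    anno_file = os.path.join(data_root, CocoStyleDataset.anno_files[dataset_name][split])
    return img_dir, anno_file
```

Keeping the image folder and the annotation file under the same dataset key means a missing entry fails early with a `KeyError` instead of silently pairing images with the wrong annotations.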
2.2 Training and evaluation
As discussed in the implementation details of the paper, our method is designed as a three-stage paradigm to save computation cost. We first perform source_only training, which trains the model in the standard supervised way on the labeled source domain. Then we perform cross_domain_mae, which trains the model with the MAE branch. Finally, we perform teaching, which uses a teacher-student framework with the MAE branch and selective retraining.
For example, for ddsm2inbreast benchmark, first edit the files in configs/def-detr-base/ddsm2inbreast/ to specify your own DATA_ROOT and OUTPUT_DIR, then run:
sh configs/def-detr-base/ddsm2inbreast/source_only.sh
sh configs/def-detr-base/ddsm2inbreast/cross_domain_mae.sh
sh configs/def-detr-base/ddsm2inbreast/teaching.sh
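Because each stage builds on the previous one, the three scripts must run in this order, and a later stage should not start if an earlier one fails. A small wrapper can enforce that; the sketch below is not part of the repository, the names `stage_commands` and `run_stages` are hypothetical, and it assumes the scripts have already been edited with your DATA_ROOT, OUTPUT_DIR, and any checkpoint paths they need.

```python
import subprocess

# Stage names in the order described above.
STAGES = ["source_only", "cross_domain_mae", "teaching"]


def stage_commands(benchmark, config_root="configs/def-detr-base"):
    """Build the shell command for each stage, in the order they must run."""
    return [["sh", f"{config_root}/{benchmark}/{stage}.sh"] for stage in STAGES]


def run_stages(benchmark):
    """Run the stages one after another and stop at the first failure."""
    for command in stage_commands(benchmark):
        print("Running:", " ".join(command))
        # check=True raises CalledProcessError if the script exits with a non-zero status.
        subprocess.run(command, check=True)
```

For the example above, `run_stages("ddsm2inbreast")` issues the same three commands from the repository root.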
We use tensorboard to record the loss and results. Run the following command to see the curves during training:
tensorboard --logdir=<YOUR/LOG/DIR>
To evaluate the trained model and get the predicted results, run:
sh configs/def-detr-base/city2foggy/evaluation.sh
2.2.1 Inference on classification datasets
If the model is adapted on a classification dataset, the predictions produced during inference are stored in the ./outputs/outputs.csv file. To generate predictions, set --csv True in the evaluation.sh script and run:
sh configs/def-detr-base/mammo/evaluation.sh
The ./outputs/outputs.csv file can then be used to compute the required metrics for the target classification dataset on which the model was adapted. Then run:
python match_id_csv_json.py
Finally, run:
python eval_cview_csv.py
This will give you the TN, TP, FN, FP, AUC, and NPV scores.
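For readers unfamiliar with these metrics, the sketch below shows how they are commonly defined for binary labels and per-image scores. It is not the code in eval_cview_csv.py: the function names, the 0.5 threshold, and the plain Python lists used as input are all assumptions made for illustration.

```python
def confusion_counts(labels, scores, threshold=0.5):
    """Return (tn, tp, fn, fp) after thresholding the scores.

    labels are 0 (benign) or 1 (malignant); the threshold is an assumed default.
    """
    tn = tp = fn = fp = 0
    for label, score in zip(labels, scores):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 1:
            fn += 1
        else:
            tn += 1
    return tn, tp, fn, fp


def npv(tn, fn):
    """Negative predictive value: the share of negative predictions that are correct."""
    return tn / (tn + fn) if (tn + fn) else float("nan")


def auc(labels, scores):
    """AUC as the chance that a random positive outscores a random negative.

    Ties count as half. This pairwise form is quadratic in the number of
    samples, which is fine for a sketch but slow for large files.
    """
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        return float("nan")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy example with made-up values, not results from the paper.
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.2, 0.4, 0.6, 0.1]
tn, tp, fn, fp = confusion_counts(labels, scores)
print(tn, tp, fn, fp, npv(tn, fn), auc(labels, scores))
```

NPV matters in screening settings because it measures how far a negative prediction can be trusted, while AUC summarizes ranking quality without committing to a single threshold.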
3. Results and Model Parameters
We conduct all experiments with batch size 8 (for the source_only stage, 8 labeled samples; for the cross_domain_mae and MRT teaching stages, 8 labeled samples and 8 unlabeled samples) on 4 NVIDIA A100 GPUs.
inhouse2inbreast: Inhouse → INBreast
| backbone | encoder layers | decoder layers | training stage | R@0.3 | logs & weights |
|---|---|---|---|---|---|
| resnet50 | 6 | 6 | source_only | 64.3 | logs & weights |
| resnet50 | 6 | 6 | cross_domain_mae | 67.3 | logs & weights |
| resnet50 | 6 | 6 | MRT teaching | 71.9 | logs & weights |
inhouse2rsna: Inhouse → RSNA-BSD1K
| backbone | encoder layers | decoder layers | training stage | R@0.3 | logs & weights |
|---|---|---|---|---|---|
| resnet50 | 6 | 6 | source_only | 53.2 | logs & weights |
| resnet50 | 6 | 6 | cross_domain_mae | 54.6 | logs & weights |
| resnet50 | 6 | 6 | MRT teaching | 58.7 | logs & weights |
ddsm2inhouse: DDSM → Inhouse
| backbone | encoder layers | decoder layers | training stage | R@0.3 | logs & weights |
|---|---|---|---|---|---|
| resnet50 | 6 | 6 | source_only | 29.6 | logs & weights |
| resnet50 | 6 | 6 | cross_domain_mae | 31.1 | logs & weights |
| resnet50 | 6 | 6 | MRT teaching | 33.7 | logs & weights |
ddsm2inbreast: DDSM → INBreast
| backbone | encoder layers | decoder layers | training stage | R@0.3 | logs & weights |
|---|---|---|---|---|---|
| resnet50 | 6 | 6 | source_only | 29.6 | logs & weights |
| resnet50 | 6 | 6 | cross_domain_mae | 31.1 | logs & weights |
| resnet50 | 6 | 6 | MRT teaching | 33.7 | logs & weights |
city2foggy: cityscapes → foggy cityscapes(0.02)
| backbone | encoder layers | decoder layers | training stage | AP@50 | logs & weights |
|---|---|---|---|---|---|
| resnet50 | 6 | 6 | source_only | 29.5 | logs & weights |
| resnet50 | 6 | 6 | cross_domain_mae | 35.8 | logs & weights |
| resnet50 | 6 | 6 | MRT teaching | 51.2 | logs & weights |
sim2city: sim10k → cityscapes(car only)
| backbone | encoder layers | decoder layers | training stage | AP@50 | logs & weights |
|---|---|---|---|---|---|
| resnet50 | 6 | 6 | source_only | 53.2 | logs & weights |
| resnet50 | 6 | 6 | cross_domain_mae | 57.1 | logs & weights |
| resnet50 | 6 | 6 | MRT teaching | 62.0 | logs & weights |
city2bdd: cityscapes → bdd100k(daytime)
| backbone | encoder layers | decoder layers | training stage | AP@50 | logs & weights |
|---|---|---|---|---|---|
| resnet50 | 6 | 6 | source_only | 29.6 | logs & weights |
| resnet50 | 6 | 6 | cross_domain_mae | 31.1 | logs & weights |
| resnet50 | 6 | 6 | MRT teaching | 33.7 | logs & weights |
4. Citation
This repository is constructed and maintained by Tajamul Ashraf.
If you find our paper or project useful, please cite our work using the following BibTeX:
@article{ashraf2024dmastermaskannealedtransformer,
title={D-MASTER: Mask Annealed Transformer for Unsupervised Domain Adaptation in Breast Cancer Detection from Mammograms},
author={Tajamul Ashraf and Krithika Rangarajan and Mohit Gambhir and Richa Gabha and Chetan Arora},
year={2024},
eprint={2407.06585},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2407.06585},
}
Thanks for your attention.