Title: Causal Unsupervised Semantic Segmentation
August 5, 2025
Pattern Recognition, Journal
This is a PyTorch implementation of the technical part of CAusal Unsupervised Semantic sEgmentation (CAUSE), a method for improving the performance of unsupervised semantic segmentation. The code builds on two baseline codebases: HP (Leveraging Hidden Positives for Unsupervised Semantic Segmentation, CVPR 2023) and STEGO (Unsupervised Semantic Segmentation by Distilling Feature Correspondences, ICLR 2022).
A bundle of qualitative result images is available in the Appendix, which also explains concrete implementation details beyond the description in the main paper.
Download Visual Quality, Seg Head Parameter, and Concept ClusterBook of CAUSE
The checkpoint files contain CAUSE-trained parameters built on self-supervised vision transformer backbones: DINO, DINOv2, iBOT, MSN, and MAE. The pretrained DINO models in the various structures that CAUSE uses can be downloaded from the following links:
| Dataset | Method | Baseline | mIoU(%) | pAcc(%) | Visual Quality | Seg Head Parameter | Concept ClusterBook |
|---|---|---|---|---|---|---|---|
| COCO-Stuff | DINO+CAUSE-MLP | ViT-S/8 | 27.9 | 66.8 | [link] | [link] | [link] |
| COCO-Stuff | DINO+CAUSE-TR | ViT-S/8 | 32.4 | 69.6 | [link] | [link] | [link] |
| COCO-Stuff | DINO+CAUSE-MLP | ViT-S/16 | 25.9 | 66.3 | [link] | [link] | [link] |
| COCO-Stuff | DINO+CAUSE-TR | ViT-S/16 | 33.1 | 70.4 | [link] | [link] | [link] |
| COCO-Stuff | DINO+CAUSE-MLP | ViT-B/8 | 34.3 | 72.8 | [link] | [link] | [link] |
| COCO-Stuff | DINO+CAUSE-TR | ViT-B/8 | 41.9 | 74.9 | [link] | [link] | [link] |
| COCO-Stuff | DINOv2+CAUSE-TR | ViT-B/14 | 45.3 | 78.0 | [link] | [link] | [link] |
| COCO-Stuff | iBOT+CAUSE-TR | ViT-B/16 | 39.5 | 73.8 | [link] | [link] | [link] |
| COCO-Stuff | MSN+CAUSE-TR | ViT-S/16 | 34.1 | 72.1 | [link] | [link] | [link] |
| COCO-Stuff | MAE+CAUSE-TR | ViT-B/16 | 21.5 | 59.1 | [link] | [link] | [link] |
| Dataset | Method | Baseline | mIoU(%) | pAcc(%) | Visual Quality | Seg Head Parameter | Concept ClusterBook |
|---|---|---|---|---|---|---|---|
| Cityscapes | DINO+CAUSE-MLP | ViT-S/8 | 21.7 | 87.7 | [link] | [link] | [link] |
| Cityscapes | DINO+CAUSE-TR | ViT-S/8 | 24.6 | 89.4 | [link] | [link] | [link] |
| Cityscapes | DINO+CAUSE-MLP | ViT-B/8 | 25.7 | 90.3 | [link] | [link] | [link] |
| Cityscapes | DINO+CAUSE-TR | ViT-B/8 | 28.0 | 90.8 | [link] | [link] | [link] |
| Cityscapes | DINOv2+CAUSE-TR | ViT-B/14 | 29.9 | 89.8 | [link] | [link] | [link] |
| Cityscapes | iBOT+CAUSE-TR | ViT-B/16 | 23.0 | 89.1 | [link] | [link] | [link] |
| Cityscapes | MSN+CAUSE-TR | ViT-S/16 | 21.2 | 89.1 | [link] | [link] | [link] |
| Cityscapes | MAE+CAUSE-TR | ViT-B/16 | 12.5 | 82.0 | [link] | [link] | [link] |
| Dataset | Method | Baseline | mIoU(%) | pAcc(%) | Visual Quality | Seg Head Parameter | Concept ClusterBook |
|---|---|---|---|---|---|---|---|
| Pascal VOC | DINO+CAUSE-MLP | ViT-S/8 | 46.0 | - | [link] | [link] | [link] |
| Pascal VOC | DINO+CAUSE-TR | ViT-S/8 | 50.0 | - | [link] | [link] | [link] |
| Pascal VOC | DINO+CAUSE-MLP | ViT-B/8 | 47.9 | - | [link] | [link] | [link] |
| Pascal VOC | DINO+CAUSE-TR | ViT-B/8 | 53.3 | - | [link] | [link] | [link] |
| Pascal VOC | DINOv2+CAUSE-TR | ViT-B/14 | 53.2 | 91.5 | [link] | [link] | [link] |
| Pascal VOC | iBOT+CAUSE-TR | ViT-B/16 | 53.4 | 89.6 | [link] | [link] | [link] |
| Pascal VOC | MSN+CAUSE-TR | ViT-S/16 | 30.2 | 84.2 | [link] | [link] | [link] |
| Pascal VOC | MAE+CAUSE-TR | ViT-B/16 | 25.8 | 83.7 | [link] | [link] | [link] |
| Dataset | Method | Baseline | mIoU(%) | pAcc(%) | Visual Quality | Seg Head Parameter | Concept ClusterBook |
|---|---|---|---|---|---|---|---|
| COCO-81 | DINO+CAUSE-MLP | ViT-S/8 | 19.1 | 78.8 | [link] | [link] | [link] |
| COCO-81 | DINO+CAUSE-TR | ViT-S/8 | 21.2 | 75.2 | [link] | [link] | [link] |
| COCO-171 | DINO+CAUSE-MLP | ViT-S/8 | 10.6 | 44.9 | [link] | [link] | [link] |
| COCO-171 | DINO+CAUSE-TR | ViT-S/8 | 15.2 | 46.6 | [link] | [link] | [link] |
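For readers unfamiliar with the two metric columns, the sketch below shows the standard definitions of mean intersection-over-union (mIoU) and pixel accuracy (pAcc) computed from a confusion matrix. It is a minimal pure-Python illustration, not the repository's evaluation code: the function names are hypothetical, and it assumes the predicted cluster indices have already been matched to ground-truth classes.

```python
def confusion_matrix(preds, labels, num_classes):
    """Count pixels: rows are ground-truth classes, columns are predictions."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(preds, labels):
        matrix[t][p] += 1
    return matrix


def pixel_accuracy(matrix):
    """Fraction of pixels whose predicted class equals the ground truth."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total if total else 0.0


def mean_iou(matrix):
    """Average per-class IoU = TP / (TP + FP + FN), skipping absent classes."""
    n = len(matrix)
    ious = []
    for c in range(n):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp                       # class c pixels predicted as something else
        fp = sum(matrix[r][c] for r in range(n)) - tp  # other pixels predicted as class c
        denom = tp + fp + fn
        if denom:
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: six pixels, three classes (flattened label maps).
labels = [0, 0, 1, 1, 2, 2]
preds = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(preds, labels, num_classes=3)
print(f"pAcc = {pixel_accuracy(cm):.3f}, mIoU = {mean_iou(cm):.3f}")
```

The tables report both values as percentages, so the fractions returned here would be multiplied by 100.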
CAUSE Framework (Top-Level File Directory Layout)
.
├── loader
│   ├── netloader.py        # Self-Supervised Pretrained Model Loader & Segmentation Head Loader
│   └── dataloader.py       # Dataloader, Thanks to STEGO [ICLR 2022]
│
├── models                  # Model Design of Self-Supervised Pretrained: [DINO/DINOv2/iBOT/MAE/MSN]
│   ├── dinomaevit.py       # ViT Structure of DINO and MAE
│   ├── dinov2vit.py        # ViT Structure of DINOv2
│   ├── ibotvit.py          # ViT Structure of iBOT
│   └── msnvit.py           # ViT Structure of MSN
│
├── modules                 # Segmentation Head and Its Necessary Functions
│   ├── segment_module.py   # Including Tools for Generating the Concept Book and Contrastive Learning
│   └── segment.py          # [MLP & TR] Including Tools for Generating the Concept Book and Contrastive Learning
│
├── utils
│   └── utils.py            # Utility for Auxiliary Tools
│
├── test_mlp.py             # [MLP] Evaluating Unsupervised Semantic Segmentation Performance (Post-Processing)
├── test_tr.py              # [TR] Evaluating Unsupervised Semantic Segmentation Performance (Post-Processing)
│
├── requirements.txt
└── README.md
How to Run CAUSE?
bash run
The run script contains the following:
#!/bin/bash
######################################
# [OPTION] DATASET
# cocostuff27
dataset="cocostuff27"
# cityscapes
# dataset="cityscapes"
# pascalvoc
# dataset="pascalvoc"
# coco-81
# dataset="coco81"
# coco-171
# dataset="coco171"
######################################
######################################
# [OPTION] STRUCTURE
# structure="MLP"
structure="TR"
######################################
######################################
# [OPTION] Self-Supervised Method
# DINO
# ckpt="checkpoint/dino_vit_small_8.pth"
# ckpt="checkpoint/dino_vit_small_16.pth"
ckpt="checkpoint/dino_vit_base_8.pth"
# ckpt="checkpoint/dino_vit_base_16.pth"
# DINOv2
# ckpt="checkpoint/dinov2_vit_base_14.pth"
# iBOT
# ckpt="checkpoint/ibot_vit_base_16.pth"
# MSN
# ckpt="checkpoint/msn_vit_small_16.pth"
# MAE
# ckpt="checkpoint/mae_vit_base_16.pth"
######################################
######################################
# GPU and PORT
test_gpu="0"
port=$(($RANDOM%800+1200))
######################################
######################################
# TEST
if [ "$structure" = "MLP" ]
then
    python test_mlp.py --dataset "$dataset" --ckpt "$ckpt" --gpu "$test_gpu"
elif [ "$structure" = "TR" ]
then
    python test_tr.py --dataset "$dataset" --ckpt "$ckpt" --gpu "$test_gpu"
fi
######################################
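The script passes three flags to the test scripts: --dataset, --ckpt, and --gpu. As a rough sketch of how such flags could be parsed, and how a checkpoint filename such as dino_vit_base_8.pth could be split into method, model size, and patch size, consider the following. The helper names (`build_parser`, `parse_ckpt_name`), the default values, and the filename convention are assumptions inferred from the paths in the script, not code taken from the repository.

```python
import argparse
import os

# Dataset names taken from the options listed in the shell script.
DATASETS = ["cocostuff27", "cityscapes", "pascalvoc", "coco81", "coco171"]


def build_parser():
    """Hypothetical parser mirroring the flags used in the shell script."""
    parser = argparse.ArgumentParser(description="CAUSE evaluation (sketch)")
    parser.add_argument("--dataset", choices=DATASETS, default="cocostuff27")
    parser.add_argument("--ckpt", default="checkpoint/dino_vit_base_8.pth")
    parser.add_argument("--gpu", default="0")
    return parser


def parse_ckpt_name(path):
    """Split '<method>_vit_<size>_<patch>.pth' into parts (assumed convention)."""
    stem = os.path.splitext(os.path.basename(path))[0]
    parts = stem.split("_")
    if len(parts) != 4 or parts[1] != "vit" or not parts[3].isdigit():
        raise ValueError(f"unexpected checkpoint name: {path}")
    method, _, size, patch = parts
    return {"method": method, "size": size, "patch_size": int(patch)}


# Parse an explicit argument list so the sketch runs without a command line.
args = build_parser().parse_args(
    ["--dataset", "cityscapes", "--ckpt", "checkpoint/dinov2_vit_base_14.pth"]
)
info = parse_ckpt_name(args.ckpt)
print(args.dataset, args.gpu, info)
```

Restricting --dataset to a fixed list of choices makes a mistyped dataset name fail immediately rather than partway through evaluation.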
Testing CAUSE
python test_mlp.py # CAUSE-MLP
# or
python test_tr.py # CAUSE-TR
Environment Settings
- Creating Virtual Environment by Anaconda
conda create -y -n neurips python=3.9
- Installing PyTorch Package in Virtual Environment
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
- Installing Pip Package
pip install -r requirements.txt
- [Optional] Removing Conda and PIP Cache if Conda and PIP have been locked for unknown reasons
conda clean -a && pip cache purge
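After installation, a quick sanity check can confirm which packages are importable in the active environment. The sketch below uses only the standard library. The helper name `missing_packages` is hypothetical, and the package list is an assumption based on the pip3 install command above (the contents of requirements.txt are not reproduced here).

```python
import importlib.util


def missing_packages(names):
    """Return the subset of top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Modules installed by the pip3 command above (import names, not pip names).
required = ["torch", "torchvision", "torchaudio"]
missing = missing_packages(required)
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("All listed packages are importable.")
```

Because `find_spec` only locates a module without importing it, the check stays fast even for large packages.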
Download Datasets
Available Datasets
Note: Pascal VOC does not need to be downloaded manually, because the dataloader downloads it automatically into your dataset path.
Try the following scripts
If the above does not work, download azcopy and run the following commands
- azcopy copy "https://marhamilresearch4.blob.core.windows.net/stego-public/pytorch_data/cityscapes.zip" "custom_path" --recursive
- azcopy copy "https://marhamilresearch4.blob.core.windows.net/stego-public/pytorch_data/cocostuff.zip" "custom_path" --recursive
Unzip Datasets
unzip cocostuff.zip && unzip cityscapes.zip
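On systems without the unzip utility, the same step can be done with Python's standard library. This is a minimal sketch: the helper name `extract_archives` is hypothetical, and custom_path stands in for whatever dataset directory you chose above.

```python
import zipfile
from pathlib import Path


def extract_archives(archive_paths, dest_dir):
    """Extract each existing zip archive into dest_dir; return those extracted."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    extracted = []
    for archive in map(Path, archive_paths):
        if not archive.is_file():
            print(f"Skipping missing archive: {archive}")
            continue
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(dest)
        extracted.append(archive.name)
    return extracted


if __name__ == "__main__":
    # Archive names come from the unzip command above.
    done = extract_archives(["cocostuff.zip", "cityscapes.zip"], "custom_path")
    print("Extracted:", done)
```

Missing archives are skipped with a message rather than raising, so the helper can be rerun after a partial download.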