[ICCV 2025] Stronger, Steadier & Superior: Geometric Consistency in Depth VFM Forges Domain Generalized Semantic Segmentation
July 31, 2025
Installation & Environment Setup
Clone the repository:
git clone --recursive https://github.com/anonymouse-xzrptkvyqc/DepthForge.git
Follow these steps to set up your environment:
conda create -n depthforge python=3.11 -y
conda activate depthforge
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124 #2.6.0
pip install -U openmim
mim install mmengine
# Install mmcv from source
git clone https://github.com/open-mmlab/mmcv.git
cd mmcv
pip install -r requirements/optional.txt
pip install -e . -v
pip install mmsegmentation
pip install mmdet
pip install xformers=='0.0.30' # optional for DINOv2
pip install -r requirements.txt
pip install future tensorboard
Please also read Initial_DA before continuing.
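After installation, a quick import check can confirm that the packages are visible to the interpreter. The following is a minimal sketch, not part of the repository: the helper name `check_packages` is hypothetical, and the list uses the usual import names of the packages installed above (mmsegmentation imports as `mmseg`).

```python
import importlib.util

# Import names of the packages installed in the steps above.
# xformers is optional, so a missing entry there is not necessarily an error.
PACKAGES = ["torch", "torchvision", "mmengine", "mmcv", "mmseg", "mmdet", "xformers"]


def check_packages(names=PACKAGES):
    """Return a dict mapping each top-level module name to True if it can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # Print one status line per package without importing any of them.
    for name, found in check_packages().items():
        print(f"{name:12s} {'ok' if found else 'MISSING'}")
```

The check only looks up each module and does not import it, so it says nothing about whether CUDA or compiled extensions work at runtime.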
Dataset Preparation
Prepare the datasets by converting them to the required formats. Run the following commands:
cd DepthForge
mkdir data
# Convert GTA dataset (source domain)
python tools/convert_datasets/gta.py data/gta
# Prepare Cityscapes dataset
python tools/convert_datasets/cityscapes.py data/cityscapes
# Convert Mapillary to Cityscapes format (training data)
python tools/convert_datasets/mapillary2cityscape.py data/mapillary data/mapillary/cityscapes_trainIdLabel --train_id
# Resize Mapillary validation images to Cityscapes format
python tools/convert_datasets/mapillary_resize.py data/mapillary/validation/images data/mapillary/cityscapes_trainIdLabel/val/label data/mapillary/half/val_img data/mapillary/half/val_label
The final folder structure should look like this:
DepthForge
├── ...
├── checkpoints
│ ├── dinov2_vitl14_pretrain.pth
│ ├── depth_anything_v2_vitl.pth
│ ├── dinov2_converted.pth
├── data
│ ├── cityscapes
│ │ ├── leftImg8bit
│ │ │ ├── train
│ │ │ ├── val
│ │ ├── gtFine
│ │ │ ├── train
│ │ │ ├── val
│ ├── bdd100k
│ │ ├── images
│ │ │ ├── 10k
│ │ │ │ ├── train
│ │ │ │ ├── val
│ │ ├── labels
│ │ │ ├── sem_seg
│ │ │ │ ├── masks
│ │ │ │ │ ├── train
│ │ │ │ │ ├── val
│ ├── mapillary
│ │ ├── training
│ │ ├── cityscapes_trainIdLabel
│ │ ├── half
│ │ │ ├── val_img
│ │ │ ├── val_label
│ ├── gta
│ │ ├── images
│ │ ├── labels
│ ├── adac
│ │ ├── gt
│ │ │ ├── fog
│ │ │ ├── night
│ │ │ ├── rain
│ │ │ ├── snow
│ │ ├── rgb_anon
│ │ │ ├── fog
│ │ │ ├── night
│ │ │ ├── rain
│ │ │ ├── snow
├── ...
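Before training, it may help to confirm that the layout above is in place. The sketch below is an illustrative helper, not a script shipped with the repository: `missing_dirs` and `EXPECTED_DIRS` are hypothetical names, and the list covers only a subset of the directories shown in the tree.

```python
from pathlib import Path

# Subset of the directories from the tree above, relative to the repository root.
EXPECTED_DIRS = [
    "data/cityscapes/leftImg8bit/train",
    "data/cityscapes/leftImg8bit/val",
    "data/cityscapes/gtFine/train",
    "data/cityscapes/gtFine/val",
    "data/bdd100k/images/10k/val",
    "data/bdd100k/labels/sem_seg/masks/val",
    "data/mapillary/half/val_img",
    "data/mapillary/half/val_label",
    "data/gta/images",
    "data/gta/labels",
]


def missing_dirs(root, expected=EXPECTED_DIRS):
    """Return the expected directories that do not exist under root."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).is_dir()]


if __name__ == "__main__":
    # Run from the DepthForge directory; prints nothing if every directory is present.
    for rel in missing_dirs("."):
        print(f"missing: {rel}")
```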
Pre-trained Weights & Dataset Downloads
Download: Download the DINOv2 pre-trained weights from facebookresearch. Keep the file name unchanged and place the file in the project directory; the commands below expect it at checkpoints/dinov2_vitl14_pretrain.pth. Also download the Depth Anything V2 weights (depth_anything_v2_vitl.pth) from the DepthAnything GitHub repository.
Convert:
Convert the pre-trained weights for training or evaluation by running:
python tools/convert_models/convert_dinov2_depth.py checkpoints/dinov2_vitl14_pretrain.pth checkpoints/depth_anything_v2_vitl.pth checkpoints/dinov2_converted_depth.pth
Optional: Converting for 1024×1024 Resolution
python tools/convert_models/convert_dinov2_depth.py checkpoints/dinov2_vitl14_pretrain.pth checkpoints/depth_anything_v2_vitl.pth checkpoints/dinov2_converted_depth_1024x1024.pth --height 1024 --width 1024
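The two conversion commands above differ only in the output path and the optional size flags. As a minimal sketch under that assumption, the hypothetical helper `convert_command` below assembles either command from the paths and the `--height`/`--width` flags shown above; it does not run the conversion itself.

```python
import shlex

# Paths taken from the commands above.
SCRIPT = "tools/convert_models/convert_dinov2_depth.py"
DINOV2 = "checkpoints/dinov2_vitl14_pretrain.pth"
DEPTH = "checkpoints/depth_anything_v2_vitl.pth"


def convert_command(output, height=None, width=None):
    """Build the argument list for the conversion script.

    The size flags are appended only when both height and width are given.
    """
    cmd = ["python", SCRIPT, DINOV2, DEPTH, output]
    if height is not None and width is not None:
        cmd += ["--height", str(height), "--width", str(width)]
    return cmd


if __name__ == "__main__":
    # Print both commands so they can be inspected or pasted into a shell.
    print(shlex.join(convert_command("checkpoints/dinov2_converted_depth.pth")))
    print(shlex.join(convert_command(
        "checkpoints/dinov2_converted_depth_1024x1024.pth", 1024, 1024)))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root if you prefer to launch the conversion from Python.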
Convert the pre-trained weights for DepthForge V2 training or evaluation by running:
python tools/convert_models/convert_dinov2_depthv2.py checkpoints/dinov2_vitl14_pretrain.pth checkpoints/promptda_vitl.ckpt checkpoints/dinov2_converted_depthv2.pth
Optional: DepthForge V2 Conversion for 1024×1024 Resolution
python tools/convert_models/convert_dinov2_depthv2.py checkpoints/dinov2_vitl14_pretrain.pth checkpoints/promptda_vitl.ckpt checkpoints/dinov2_converted_depth_1024x1024.pth
Training
Use the following commands to start training with different configurations. To resume training from a checkpoint, append --resume to the command.
Tip: If resuming training appears to hang or produces no output for a long time, refer to this issue for potential solutions.
- Cityscapes → BDD100K + Mapillary + ADAC (fog, night, rain, snow):
python tools/train.py configs/dinov2/depthforge_dinov2_mask2former_512x512_bs1x4_citys.py
# To resume training, use:
# python tools/train.py configs/dinov2/depthforge_dinov2_mask2former_512x512_bs1x4_citys.py --resume
- GTAV → BDD100K + Mapillary + Cityscapes:
python tools/train.py configs/dinov2/depthforge_dinov2_mask2former_512x512_bs1x4.py
# To resume training, use:
# python tools/train.py configs/dinov2/depthforge_dinov2_mask2former_512x512_bs1x4.py --resume
For the updated DepthForge V2 architecture, use these commands:
- Cityscapes Configuration (DepthForge V2):
python tools/train.py configs/dinov2/depthforgev2_dinov2_mask2former_512x512_bs1x4_citys.py
# To resume training, use:
# python tools/train.py configs/dinov2/depthforgev2_dinov2_mask2former_512x512_bs1x4_citys.py --resume
- GTAV Configuration (DepthForge V2):
python tools/train.py configs/dinov2/depthforgev2_dinov2_mask2former_512x512_bs1x4.py
# To resume training, use:
# python tools/train.py configs/dinov2/depthforgev2_dinov2_mask2former_512x512_bs1x4.py --resume
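The four training commands follow one pattern: a config file chosen by architecture version and source domain, plus an optional `--resume`. The sketch below captures that pattern; the `CONFIGS` table keys and the `train_command` helper are hypothetical names introduced for illustration, while the config paths are the ones listed above.

```python
# Config files from the commands above, keyed by (version, source domain).
CONFIGS = {
    ("v1", "citys"): "configs/dinov2/depthforge_dinov2_mask2former_512x512_bs1x4_citys.py",
    ("v1", "gta"): "configs/dinov2/depthforge_dinov2_mask2former_512x512_bs1x4.py",
    ("v2", "citys"): "configs/dinov2/depthforgev2_dinov2_mask2former_512x512_bs1x4_citys.py",
    ("v2", "gta"): "configs/dinov2/depthforgev2_dinov2_mask2former_512x512_bs1x4.py",
}


def train_command(version, source, resume=False):
    """Build the argument list for tools/train.py; raises KeyError for unknown keys."""
    cmd = ["python", "tools/train.py", CONFIGS[(version, source)]]
    if resume:
        # Appending --resume is how the README says to resume training.
        cmd.append("--resume")
    return cmd


if __name__ == "__main__":
    print(" ".join(train_command("v2", "citys", resume=True)))
```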
Evaluation
To evaluate a trained model, replace <DepthForge model>.pth with your model file and run the corresponding command. The --backbone argument takes the converted backbone checkpoint: checkpoints/dinov2_converted_depth.pth for DepthForge and checkpoints/dinov2_converted_depthv2.pth for DepthForge V2:
- Evaluation with GTAV-based Configuration:
python tools/test.py configs/dinov2/depthforge_dinov2_mask2former_512x512_bs1x4.py <DepthForge model>.pth --backbone checkpoints/dinov2_converted_depth.pth
- Evaluation with Cityscapes-based Configuration:
python tools/test.py configs/dinov2/depthforge_dinov2_mask2former_512x512_bs1x4_citys.py <DepthForge model>.pth --backbone checkpoints/dinov2_converted_depth.pth
- Evaluation with DepthForge V2 Cityscapes Configuration:
python tools/test.py configs/dinov2/depthforgev2_dinov2_mask2former_512x512_bs1x4_citys.py <DepthForge model>.pth --backbone checkpoints/dinov2_converted_depthv2.pth
- Evaluation with DepthForge V2 GTAV Configuration:
python tools/test.py configs/dinov2/depthforgev2_dinov2_mask2former_512x512_bs1x4.py <DepthForge model>.pth --backbone checkpoints/dinov2_converted_depthv2.pth
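In the commands above, the backbone checkpoint depends only on whether the config belongs to DepthForge or DepthForge V2. The sketch below encodes that pairing so the two cannot be mismatched by accident. The helpers `backbone_for` and `eval_command` are hypothetical, and selecting by the `depthforgev2_` file-name prefix is an assumption drawn from the config names listed here.

```python
from pathlib import PurePosixPath

# Converted backbone checkpoints used in the evaluation commands above.
BACKBONE_V1 = "checkpoints/dinov2_converted_depth.pth"
BACKBONE_V2 = "checkpoints/dinov2_converted_depthv2.pth"


def backbone_for(config):
    """Pick the backbone checkpoint from the config file name (assumed naming rule)."""
    name = PurePosixPath(config).name
    return BACKBONE_V2 if name.startswith("depthforgev2_") else BACKBONE_V1


def eval_command(config, model):
    """Build the argument list for tools/test.py with the matching backbone."""
    return ["python", "tools/test.py", config, model, "--backbone", backbone_for(config)]


if __name__ == "__main__":
    # "my_model.pth" is a placeholder for your trained DepthForge checkpoint.
    cfg = "configs/dinov2/depthforgev2_dinov2_mask2former_512x512_bs1x4.py"
    print(" ".join(eval_command(cfg, "my_model.pth")))
```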
Acknowledgment
Our implementation is mainly based on the following repositories. Thanks to their authors.