OmniWaterMask Training

January 29, 2026

Training code for the deep learning model used in OmniWaterMask - a Python library for detecting water bodies in satellite and aerial imagery.

Model Architecture

  • Architecture: U-Net segmentation model
  • Input: 4 bands (Red, Green, Blue, NIR)
  • Output: 2 classes (water / not water)
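As a rough illustration of the input/output contract above, the sketch below turns a 2-class, per-pixel score map into a boolean water mask. It assumes the model emits an array shaped (classes, height, width) and that class index 1 means water; both are assumptions made for this example, not confirmed details of the OmniWaterMask model. The helper name `logits_to_water_mask` is hypothetical.

```python
import numpy as np

NUM_BANDS = 4    # Red, Green, Blue, NIR (from the model description)
NUM_CLASSES = 2  # assumed order: index 0 = not water, index 1 = water


def logits_to_water_mask(logits: np.ndarray) -> np.ndarray:
    """Convert (NUM_CLASSES, H, W) class scores into a boolean (H, W) water mask."""
    if logits.ndim != 3 or logits.shape[0] != NUM_CLASSES:
        raise ValueError(f"expected shape ({NUM_CLASSES}, H, W), got {logits.shape}")
    # Pick the highest-scoring class per pixel, then keep the water class.
    return logits.argmax(axis=0) == 1


# A dummy input patch with the expected band count (values are arbitrary).
patch = np.zeros((NUM_BANDS, 2, 2), dtype=np.float32)

# Stand-in scores in place of a real model forward pass.
scores = np.zeros((NUM_CLASSES, 2, 2), dtype=np.float32)
scores[1, 0, 0] = 5.0  # only the top-left pixel scores higher for water

mask = logits_to_water_mask(scores)
```

Pixels where both classes tie fall back to index 0 (not water), because `argmax` returns the first maximum.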

Training Data

The model is trained on two datasets:

  1. FLAIR Dataset - French aerial imagery patches containing water bodies
  2. S1S2-Water Dataset - Sentinel-1/Sentinel-2 water body dataset

Installation

uv sync

Notebooks

Training

Dataset Preparation

FLAIR:

S1S2-Water:

Training

Key training features:

  • Custom augmentations for remote sensing data (rotation, flip, resampling, random cropping)
  • Dynamic Z-score normalization
  • Distance-transform-weighted loss for improved boundary detection
  • Gradient accumulation for larger effective batch sizes
  • BF16 mixed precision training
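One plausible reading of "dynamic Z-score normalization" is that statistics are computed per image and per band at load time, rather than taken from fixed dataset-wide values. The sketch below shows that interpretation with NumPy. It is an assumption about the approach, not the repository's actual implementation, and the function name `zscore_per_band` is hypothetical.

```python
import numpy as np


def zscore_per_band(image: np.ndarray, eps: float = 1e-6) -> np.ndarray:
    """Normalize a (bands, H, W) array so each band has zero mean and unit std.

    Statistics come from the image itself, so no dataset-wide constants are needed.
    """
    image = image.astype(np.float32)
    mean = image.mean(axis=(1, 2), keepdims=True)
    std = image.std(axis=(1, 2), keepdims=True)
    # eps keeps the division finite for constant bands (for example, all-nodata tiles).
    return (image - mean) / (std + eps)


# Synthetic 4-band patch (Red, Green, Blue, NIR) with reflectance-like values.
rng = np.random.default_rng(0)
patch = rng.integers(0, 10000, size=(4, 64, 64)).astype(np.float32)

normalized = zscore_per_band(patch)
```

Normalizing per image makes the input scale independent of sensor bit depth, which can help when aerial and satellite sources are mixed, at the cost of discarding absolute brightness.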

Model Export

Trained models are exported in multiple formats:

  • PyTorch full model (.pth)
  • PyTorch state dict (.pth)
  • Safetensors (.safetensors)

License

MIT License