Datasets

June 22, 2026 · View on GitHub

Caribou Aerial Survey Dataset

Point-annotated 512×512 px aerial image patches for caribou detection and counting from overhead survey imagery. This dataset accompanies the OWL paper and enables reproducible evaluation of point-based object detectors on aerial wildlife imagery.

➜ Download the dataset and model weights from Zenodo

!!! tip "Try it in one command" The Caribou Demo downloads the patches + weights, runs OWL-C inference (GPU or CPU), and visualizes the predictions (tools/demo_caribou.sh). To run and compare all four pretrained models on caribou, use tools/demo_owl_models.sh.


Overview

SplitSource herdYearPatchesAnnotatedBackgroundPoint annotations
TrainPorcupine Caribou Herd (PCH), Alaska201723,51718,3225,195273,268
TestCentral Arctic Herd (CAH), Alaska20222,6071,85275512,456

This is a strict cross-herd and cross-temporal generalization benchmark: models trained on PCH 2017 are evaluated on CAH 2022 without any per-deployment retraining.


Contents

FileDescription
train.zip23,517 training patches (512×512 PNG) + gt.csv (273,268 annotations)
test.zip2,607 test patches (512×512 PNG) + gt.csv (12,456 annotations)
Caribou-OWL-C.pthCaribou-specific OWL-C (DLA-34, epoch 14, val F1 = 0.937); reproduces the F1 = 0.965 headline below
OWL-C.pthOWL-C general overhead-benchmark model (DLA-34 detection branch)
OWL-T.pthOWL-T general overhead-benchmark model (DLA-34 + Swin multi-scale residual)
OWL-D.pthOWL-D general overhead-benchmark model (DINOv3 ViT-H+/16 + DPT decoder)
README.mdFull dataset documentation, annotation format, and benchmark results

The OWL-C / OWL-T / OWL-D checkpoints are trained on public overhead datasets, not caribou; see the Model Zoo for details.


Annotation format

Each split contains a gt.csv with point annotations in the following format:

ColumnDescription
imagesPatch filename (e.g., patch_00001.png)
xHorizontal pixel coordinate of the animal center
yVertical pixel coordinate of the animal center

This format is directly compatible with the animaloc training package used in this repository. See Training, Evaluation, and Inference for usage.


Benchmark results

The pre-trained Caribou-OWL-C.pth weights reproduce the paper headline on the test split:

MetricValue
F1 score (τ = 20 px, c* = 0.20)0.965
Precision0.975
Recall0.955

!!! note All OWL pretrained checkpoints are now released — the caribou-specific Caribou-OWL-C.pth plus the three general overhead-benchmark models (OWL-C.pth, OWL-T.pth, OWL-D.pth). The general models are trained on public overhead datasets, not caribou, so evaluating them on the caribou test set is a zero-shot, cross-domain check (expect lower numbers than the in-domain Caribou-OWL-C). The Caribou Demo runs and compares all four.


Citation

If you use this dataset or code, please cite:

@article{chacon2026overhead,
  title={Overhead Wildlife Locator (OWL): Benchmarking Weakly Supervised Learning for Aerial Wildlife Surveys},
  author={Chac{\'o}n, Isai Daniel and Miao, Zhongqi and Demuro, Bruno and Robinson, Caleb and Dodhia, Rahul and Otarashvili, Lasha and Holmberg, Jason and Larsen, Kirk and Frederick, Howard and Pamperin, Nathan J and others},
  journal={arXiv preprint arXiv:2606.13911},
  year={2026}
}