Benchmarking PySceneDetect

May 28, 2026

Benchmarks PySceneDetect's detection accuracy and latency against public shot-boundary-detection corpora. Scoring follows the TRECVID-SBD convention (greedy 1-to-1 nearest-neighbor matching with a configurable frame tolerance for hard cuts; point-in-interval matching for fade transitions; mean absolute frame offset on matched events) so numbers are comparable to published SBD results.
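The hard-cut matching described above can be sketched roughly as follows. This is a minimal illustration, not the benchmark's actual code: the function name `match_cuts`, its return shape, and the "closest pairs first" ordering of the greedy pass are assumptions made for the example.

```python
from typing import List, Tuple


def match_cuts(
    predicted: List[int], truth: List[int], tolerance: int = 0
) -> Tuple[int, int, int, float]:
    """Greedily pair predicted and ground-truth cut frames (hypothetical helper).

    Returns (true_positives, false_positives, false_negatives, mean_abs_offset).
    """
    # Collect every candidate pair within the tolerance, closest pairs first.
    candidates = sorted(
        (abs(p - t), pi, ti)
        for pi, p in enumerate(predicted)
        for ti, t in enumerate(truth)
        if abs(p - t) <= tolerance
    )
    used_pred, used_truth, offsets = set(), set(), []
    for offset, pi, ti in candidates:
        if pi in used_pred or ti in used_truth:
            continue  # 1-to-1: each event can be matched at most once
        used_pred.add(pi)
        used_truth.add(ti)
        offsets.append(offset)
    tp = len(offsets)
    fp = len(predicted) - tp  # detections with no ground-truth partner
    fn = len(truth) - tp      # ground-truth cuts that were missed
    mean_offset = sum(offsets) / tp if tp else 0.0
    return tp, fp, fn, mean_offset
```

With `tolerance=0` only frame-exact detections count as matches; a larger tolerance accepts near misses and reports how far off they were on average.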

Supported datasets:

  • BBC Planet Earth: 11 long-form broadcast clips; hard cuts only
  • AutoShot: Short-form web clips; hard cuts only
  • ClipShots: Short-form web clips; hard cuts and typed gradual transitions (fades/dissolves)

Usage

# Single detector x single dataset:
python -m benchmark --detector detect-content --dataset BBC

Pass --help for --dataset-root, --backend, --tolerance, and --out options.
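To sweep several detectors over several datasets, the single-run command above can be expanded into a list of invocations. The sketch below only builds the command lines; `benchmark_commands` is a hypothetical helper, and any detector or dataset name other than `detect-content` and `BBC` is an assumption that should be checked against `--help`.

```python
import itertools
import sys
from typing import Iterable, List


def benchmark_commands(
    detectors: Iterable[str], datasets: Iterable[str]
) -> List[List[str]]:
    """Build one `python -m benchmark` command per detector/dataset pair."""
    return [
        [sys.executable, "-m", "benchmark", "--detector", detector, "--dataset", dataset]
        for detector, dataset in itertools.product(detectors, datasets)
    ]


if __name__ == "__main__":
    # "AutoShot" as a --dataset value is assumed from the folder name, not confirmed.
    for command in benchmark_commands(["detect-content"], ["BBC", "AutoShot"]):
        print(" ".join(command))
```

Each returned list can be passed to `subprocess.run(command, check=True)` from the repository root to execute the run.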

Dataset Download

BBC

# annotations
mkdir -p BBC
wget -O BBC/fixed.zip https://zenodo.org/records/14873790/files/fixed.zip
unzip BBC/fixed.zip -d BBC
rm BBC/fixed.zip

# videos
wget -O BBC/videos.zip https://zenodo.org/records/14873790/files/videos.zip
unzip BBC/videos.zip -d BBC
rm BBC/videos.zip

AutoShot

Download AutoShot_test.tar.gz from Google Drive.

tar -zxvf AutoShot_test.tar.gz
rm AutoShot_test.tar.gz

ClipShots

ClipShots is gated behind a dataset request form; direct wget-style download links are not published. See the download instructions to obtain the annotations and videos. The expected on-disk layout is:

ClipShots/
  annotations/{train,test,only_gradual}.json
  video_lists/{train,test,only_gradual}.txt
  videos/*.mp4

The loader defaults to the test split (500 videos). The full corpus is ~46 GB.

By default, all datasets are expected inside the benchmark folder (e.g. benchmark/BBC, benchmark/AutoShot, benchmark/ClipShots). Pass --dataset-root /path/to/datasets to use a different location.
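Before a long run it can help to confirm that the ClipShots files are where the loader expects them. The following is a minimal sketch based only on the layout listed above; `missing_clipshots_paths` is a hypothetical helper and is not part of the benchmark.

```python
from pathlib import Path
from typing import List

SPLITS = ("train", "test", "only_gradual")


def missing_clipshots_paths(dataset_root: str) -> List[str]:
    """Return the expected ClipShots paths that are absent under dataset_root."""
    root = Path(dataset_root) / "ClipShots"
    expected = [root / "annotations" / f"{split}.json" for split in SPLITS]
    expected += [root / "video_lists" / f"{split}.txt" for split in SPLITS]
    missing = [str(path) for path in expected if not path.is_file()]
    # The videos folder must exist and contain at least one .mp4 file.
    videos = root / "videos"
    if not videos.is_dir() or not any(videos.glob("*.mp4")):
        missing.append(str(videos / "*.mp4"))
    return missing
```

An empty list means the layout matches; otherwise the returned paths show what still needs to be downloaded or moved.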

Results (defaults)

Generated by scripts/benchmark_defaults.sh at tolerance=0 (strict frame-exact matching). Recall, precision, and F1 are percentages; "Mean s/video" is the mean wall-clock time in seconds per video.
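For reference, the three accuracy columns relate in the standard way: F1 is the harmonic mean of precision and recall. The sketch below uses hypothetical helper names and assumes the counts are pooled before the ratios are taken; how the benchmark aggregates across videos is not stated here.

```python
from typing import Tuple


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) as percentages from match counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return 100 * precision, 100 * recall, 100 * f1


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same unit in, same unit out)."""
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0


if __name__ == "__main__":
    # Cross-check against the BBC AdaptiveDetector row (precision 96.55, recall 87.12).
    print(round(f1_score(96.55, 87.12), 2))
```

Recomputing F1 from a row's precision and recall is a quick way to spot transcription errors in the tables.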

BBC

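| Detector | Recall (%) | Precision (%) | F1 (%) | Mean s/video |
|---|---|---|---|---|
| AdaptiveDetector | 87.12 | 96.55 | 91.59 | 36.12 |
| ContentDetector | 84.70 | 88.77 | 86.69 | 37.02 |
| HashDetector | 92.30 | 75.56 | 83.10 | 25.51 |
| HistogramDetector | 89.84 | 72.03 | 79.96 | 22.29 |
| ThresholdDetector | 0.06 | 0.70 | 0.11 | 16.05 |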

AutoShot

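| Detector | Recall (%) | Precision (%) | F1 (%) | Mean s/video |
|---|---|---|---|---|
| AdaptiveDetector | 70.59 | 77.46 | 73.86 | 3.52 |
| ContentDetector | 63.49 | 76.19 | 69.26 | 4.80 |
| HashDetector | 56.48 | 76.11 | 64.84 | 4.14 |
| HistogramDetector | 63.27 | 53.23 | 57.82 | 3.76 |
| ThresholdDetector | 0.75 | 38.64 | 1.47 | 3.28 |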

ClipShots (hard cuts)

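| Detector | Recall (%) | Precision (%) | F1 (%) | Mean s/video |
|---|---|---|---|---|
| AdaptiveDetector | 85.97 | 41.25 | 55.75 | 1.81 |
| ContentDetector | 81.93 | 42.36 | 55.84 | 2.52 |
| HashDetector | 81.34 | 30.14 | 43.98 | 1.04 |
| HistogramDetector | 72.20 | 11.47 | 19.80 | 0.71 |
| ThresholdDetector | 0.08 | 0.58 | 0.14 | 0.64 |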

ClipShots (fades)

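| Detector | Recall (%) | Precision (%) | F1 (%) |
|---|---|---|---|
| AdaptiveDetector | 13.65 | 98.12 | 23.96 |
| ContentDetector | 26.03 | 98.04 | 41.14 |
| HashDetector | 18.77 | 94.53 | 31.33 |
| HistogramDetector | 69.67 | 81.99 | 75.33 |
| ThresholdDetector | 5.69 | 99.24 | 10.77 |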
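Fade transitions are scored with point-in-interval matching: an annotated fade counts as detected when a predicted cut falls inside its frame range. The sketch below illustrates the idea under stated assumptions: intervals are inclusive `(start, end)` frame pairs, each prediction can satisfy at most one fade, and `match_fades` is a hypothetical name. How detections outside every interval are counted toward precision is not specified here, so the sketch reports only matched and missed fades.

```python
from typing import List, Tuple


def match_fades(
    predicted: List[int], intervals: List[Tuple[int, int]]
) -> Tuple[int, int]:
    """Return (matched_fades, missed_fades) using point-in-interval matching."""
    used = set()
    matched = 0
    for start, end in intervals:
        for index, frame in enumerate(predicted):
            # A prediction matches when it lies inside the annotated fade range.
            if index not in used and start <= frame <= end:
                used.add(index)
                matched += 1
                break  # one detection is enough to cover this fade
    return matched, len(intervals) - matched
```

Recall for fades is then `matched / (matched + missed)`.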

Citations

BBC

@InProceedings{bbc_dataset,
  author    = {Lorenzo Baraldi and Costantino Grana and Rita Cucchiara},
  title     = {A Deep Siamese Network for Scene Detection in Broadcast Videos},
  booktitle = {Proceedings of the 23rd ACM International Conference on Multimedia},
  year      = {2015},
}

AutoShot

@InProceedings{autoshot_dataset,
  author    = {Wentao Zhu and Yufang Huang and Xiufeng Xie and Wenxian Liu and Jincan Deng and Debing Zhang and Zhangyang Wang and Ji Liu},
  title     = {AutoShot: A Short Video Dataset and State-of-the-Art Shot Boundary Detection},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
  year      = {2023},
}

ClipShots

@InProceedings{clipshots_dataset,
  author    = {Shitao Tang and Litong Feng and Zhanghui Kuang and Yimin Chen and Wei Zhang},
  title     = {Fast Video Shot Transition Localization with Deep Structured Models},
  booktitle = {Asian Conference on Computer Vision (ACCV)},
  year      = {2018},
}