Benchmarks

July 31, 2025

This project includes a benchmarking suite to measure performance improvements and identify bottlenecks.

Benchmark Models and Images

The benchmarks performed in this project use the same models and images as those included in the YoloDotNet demo projects. This ensures consistency and transparency when evaluating performance metrics across different execution providers.

Running Benchmarks

  • Build the solution in Release mode.
  • Run the YoloDotNet.Benchmarks project.

Note: While it is possible to run benchmarks in Debug mode by uncommenting specific sections, this approach is not recommended for obtaining accurate performance data. Debug mode should be reserved primarily for stepping through and diagnosing the benchmark code.

Benchmark Results

  • Results are listed for the baseline (previous version) and for the current version so that the two can be compared directly.

  • Benchmark outcomes depend heavily on hardware, system load, driver versions, and other environmental factors. For consistent results, run all comparisons on the same hardware configuration whenever possible, and treat the published figures as a general guide rather than absolute metrics.
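As a rough illustration of a baseline-versus-current comparison, the Python sketch below computes the relative change in mean time between two versions. The helper name `percent_change` is hypothetical and is not part of YoloDotNet or BenchmarkDotNet; the sample values are the Segmentation V8_Seg_GPU and Classification V11_Cls_GPU means reported in this document.

```python
def percent_change(baseline: float, current: float) -> float:
    """Relative change from baseline to current, in percent.

    Negative values mean the current version is faster (lower mean time).
    """
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (current - baseline) / baseline * 100.0


# Mean times copied from the result tables in this document.
# Each entry: (name, baseline mean, current mean, unit)
samples = [
    ("Segmentation V8_Seg_GPU", 10.21, 7.373, "ms"),
    ("Classification V11_Cls_GPU", 1007.8, 1173.4, "us"),
]

for name, baseline, current, unit in samples:
    change = percent_change(baseline, current)
    print(f"{name}: {baseline} {unit} -> {current} {unit} ({change:+.1f}%)")
```

When reading such numbers, a change of similar size to the reported StdDev is best treated as run-to-run noise rather than a real difference.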

Hardware Used for Benchmarks

NVIDIA GeForce RTX 4070 Ti GPU

YoloDotNet 3.0 Benchmark Results (Baseline)

BenchmarkDotNet v0.15.2, Windows 11 (10.0.26100.4652/24H2/2024Update/HudsonValley)
Intel Core i7-14700KF 3.40GHz, 1 CPU, 28 logical and 20 physical cores
.NET SDK 9.0.301
  [Host]     : .NET 8.0.17 (8.0.1725.26602), X64 RyuJIT AVX2
  DefaultJob : .NET 8.0.17 (8.0.1725.26602), X64 RyuJIT AVX2

CLASSIFICATION (input image size: 1280x844)

| Method         | YoloParam   | Mean       | Error    | StdDev   | Gen0   | Allocated |
|----------------|-------------|-----------:|---------:|---------:|-------:|----------:|
| Classification | V8_Cls_CPU  | 2,703.9 us | 51.32 us | 50.41 us | -      | 32.32 KB  |
| Classification | V8_Cls_GPU  | 826.8 us   | 15.40 us | 14.40 us | 0.9766 | 32.32 KB  |
| Classification | V11_Cls_CPU | 3,075.7 us | 61.02 us | 65.29 us | -      | 32.32 KB  |
| Classification | V11_Cls_GPU | 1,007.8 us | 4.72 us  | 4.41 us  | -      | 32.32 KB  |

OBJECT DETECTION (input image size: 1280x851)

| Method          | YoloParam   | Mean      | Error     | StdDev    | Allocated |
|-----------------|-------------|----------:|----------:|----------:|----------:|
| ObjectDetection | V5u_Obj_CPU | 29.169 ms | 0.5444 ms | 0.8791 ms | 31.85 KB  |
| ObjectDetection | V5u_Obj_GPU | 4.659 ms  | 0.0313 ms | 0.0293 ms | 31.85 KB  |
| ObjectDetection | V8_Obj_CPU  | 33.040 ms | 0.5668 ms | 0.4733 ms | 34.18 KB  |
| ObjectDetection | V8_Obj_GPU  | 5.001 ms  | 0.0321 ms | 0.0285 ms | 34.18 KB  |
| ObjectDetection | V9_Obj_CPU  | 38.676 ms | 0.7520 ms | 1.2769 ms | 25.66 KB  |
| ObjectDetection | V9_Obj_GPU  | 6.711 ms  | 0.0330 ms | 0.0309 ms | 25.44 KB  |
| ObjectDetection | V10_Obj_CPU | 30.650 ms | 0.2735 ms | 0.2558 ms | 19.33 KB  |
| ObjectDetection | V10_Obj_GPU | 5.206 ms  | 0.0458 ms | 0.0428 ms | 19.19 KB  |
| ObjectDetection | V11_Obj_CPU | 30.662 ms | 0.3778 ms | 0.3534 ms | 30.84 KB  |
| ObjectDetection | V11_Obj_GPU | 4.948 ms  | 0.0214 ms | 0.0190 ms | 30.7 KB   |
| ObjectDetection | V12_Obj_CPU | 30.601 ms | 0.3743 ms | 0.3501 ms | 30.84 KB  |
| ObjectDetection | V12_Obj_GPU | 5.050 ms  | 0.0319 ms | 0.0266 ms | 30.7 KB   |

ORIENTED OBJECT DETECTION (OBB) (input image size: 1280x720)

| Method       | YoloParam   | Mean      | Error     | StdDev    | Allocated |
|--------------|-------------|----------:|----------:|----------:|----------:|
| ObbDetection | V8_Obb_CPU  | 89.218 ms | 0.9932 ms | 0.9291 ms | 8.15 KB   |
| ObbDetection | V8_Obb_GPU  | 9.492 ms  | 0.0575 ms | 0.0538 ms | 8.15 KB   |
| ObbDetection | V11_Obb_CPU | 81.068 ms | 0.5582 ms | 0.4948 ms | 7.9 KB    |
| ObbDetection | V11_Obb_GPU | 9.622 ms  | 0.0508 ms | 0.0475 ms | 7.9 KB    |

POSE ESTIMATION (input image size: 1280x720)

| Method         | YoloParam   | Mean      | Error     | StdDev    | Allocated |
|----------------|-------------|----------:|----------:|----------:|----------:|
| PoseEstimation | V8_Pos_CPU  | 34.846 ms | 0.2279 ms | 0.2132 ms | 23.81 KB  |
| PoseEstimation | V8_Pos_GPU  | 4.793 ms  | 0.0258 ms | 0.0215 ms | 23.81 KB  |
| PoseEstimation | V11_Pos_CPU | 31.511 ms | 0.3839 ms | 0.3591 ms | 23.69 KB  |
| PoseEstimation | V11_Pos_GPU | 5.662 ms  | 0.1122 ms | 0.1049 ms | 23.69 KB  |

SEGMENTATION (input image size: 1280x853)

| Method       | YoloParam   | Mean     | Error    | StdDev   | Median   | Gen0    | Allocated |
|--------------|-------------|---------:|---------:|---------:|---------:|--------:|----------:|
| Segmentation | V8_Seg_CPU  | 52.52 ms | 1.015 ms | 0.949 ms | 52.43 ms | -       | 547.73 KB |
| Segmentation | V8_Seg_GPU  | 10.21 ms | 0.020 ms | 0.017 ms | 10.21 ms | 31.2500 | 547.73 KB |
| Segmentation | V11_Seg_CPU | 50.20 ms | 1.088 ms | 3.207 ms | 49.11 ms | -       | 533.65 KB |
| Segmentation | V11_Seg_GPU | 10.53 ms | 0.213 ms | 0.306 ms | 10.69 ms | 31.2500 | 533.65 KB |

YoloDotNet 3.1.1 Benchmark Results (Current version)

BenchmarkDotNet v0.15.2, Windows 11 (10.0.26100.4652/24H2/2024Update/HudsonValley)
Intel Core i7-14700KF 3.40GHz, 1 CPU, 28 logical and 20 physical cores
.NET SDK 9.0.301
  [Host]     : .NET 8.0.17 (8.0.1725.26602), X64 RyuJIT AVX2
  DefaultJob : .NET 8.0.17 (8.0.1725.26602), X64 RyuJIT AVX2

CLASSIFICATION (input image size: 1280x844)

| Method         | YoloParam     | Mean       | Error    | StdDev    | Gen0   | Allocated |
|----------------|---------------|-----------:|---------:|----------:|-------:|----------:|
| Classification | V8_Cls_CPU    | 2,662.5 us | 37.55 us | 35.12 us  | -      | 32.32 KB  |
| Classification | V8_Cls_GPU    | 831.4 us   | 14.87 us | 17.13 us  | 0.9766 | 32.32 KB  |
| Classification | V8_Cls_TRT32  | 653.0 us   | 10.85 us | 10.15 us  | 0.9766 | 32.32 KB  |
| Classification | V8_Cls_TRT16  | 459.6 us   | 3.97 us  | 3.31 us   | 1.4648 | 32.32 KB  |
| Classification | V8_Cls_TRT8   | 411.7 us   | 3.54 us  | 3.31 us   | 1.4648 | 32.32 KB  |
| Classification | V11_Cls_CPU   | 3,062.4 us | 56.81 us | 53.14 us  | -      | 32.32 KB  |
| Classification | V11_Cls_GPU   | 1,173.4 us | 41.59 us | 122.62 us | -      | 32.32 KB  |
| Classification | V11_Cls_TRT32 | 765.1 us   | 14.64 us | 16.86 us  | 0.9766 | 32.32 KB  |
| Classification | V11_Cls_TRT16 | 547.5 us   | 4.97 us  | 4.65 us   | 0.9766 | 32.32 KB  |
| Classification | V11_Cls_TRT8  | 485.3 us   | 6.96 us  | 6.17 us   | 1.4648 | 32.32 KB  |

OBJECT DETECTION (input image size: 1280x851)

| Method          | YoloParam     | Mean      | Error     | StdDev    | Median    | Allocated |
|-----------------|---------------|----------:|----------:|----------:|----------:|----------:|
| ObjectDetection | V5u_Obj_CPU   | 29.441 ms | 0.5508 ms | 0.5152 ms | 29.324 ms | 31.85 KB  |
| ObjectDetection | V5u_Obj_GPU   | 4.696 ms  | 0.0214 ms | 0.0189 ms | 4.690 ms  | 31.85 KB  |
| ObjectDetection | V5u_Obj_TRT32 | 3.899 ms  | 0.0762 ms | 0.0712 ms | 3.931 ms  | 31.85 KB  |
| ObjectDetection | V5u_Obj_TRT16 | 3.130 ms  | 0.0126 ms | 0.0111 ms | 3.131 ms  | 31.73 KB  |
| ObjectDetection | V5u_Obj_TRT8  | 2.776 ms  | 0.0104 ms | 0.0097 ms | 2.775 ms  | 29.82 KB  |
| ObjectDetection | V8_Obj_CPU    | 33.898 ms | 0.2453 ms | 0.2294 ms | 33.935 ms | 34.18 KB  |
| ObjectDetection | V8_Obj_GPU    | 5.350 ms  | 0.0133 ms | 0.0118 ms | 5.349 ms  | 34.18 KB  |
| ObjectDetection | V8_Obj_TRT32  | 4.061 ms  | 0.0086 ms | 0.0072 ms | 4.061 ms  | 34.08 KB  |
| ObjectDetection | V8_Obj_TRT16  | 2.965 ms  | 0.0138 ms | 0.0115 ms | 2.964 ms  | 34.08 KB  |
| ObjectDetection | V8_Obj_TRT8   | 2.739 ms  | 0.0150 ms | 0.0141 ms | 2.742 ms  | 35.28 KB  |
| ObjectDetection | V9_Obj_CPU    | 38.662 ms | 0.5644 ms | 0.5003 ms | 38.689 ms | 25.66 KB  |
| ObjectDetection | V9_Obj_GPU    | 8.705 ms  | 0.1611 ms | 0.1507 ms | 8.765 ms  | 25.44 KB  |
| ObjectDetection | V9_Obj_TRT32  | 5.594 ms  | 0.0333 ms | 0.0312 ms | 5.592 ms  | 25.44 KB  |
| ObjectDetection | V9_Obj_TRT16  | 4.195 ms  | 0.0388 ms | 0.0344 ms | 4.198 ms  | 25.89 KB  |
| ObjectDetection | V9_Obj_TRT8   | 3.947 ms  | 0.0390 ms | 0.0365 ms | 3.946 ms  | 24.01 KB  |
| ObjectDetection | V10_Obj_CPU   | 31.415 ms | 0.2898 ms | 0.2710 ms | 31.383 ms | 23.16 KB  |
| ObjectDetection | V10_Obj_GPU   | 4.586 ms  | 0.0534 ms | 0.0525 ms | 4.581 ms  | 23.16 KB  |
| ObjectDetection | V10_Obj_TRT32 | 3.233 ms  | 0.0109 ms | 0.0102 ms | 3.232 ms  | 23.16 KB  |
| ObjectDetection | V10_Obj_TRT16 | 2.766 ms  | 0.0414 ms | 0.0387 ms | 2.774 ms  | 23.16 KB  |
| ObjectDetection | V10_Obj_TRT8  | 2.551 ms  | 0.0147 ms | 0.0130 ms | 2.547 ms  | 23.51 KB  |
| ObjectDetection | V11_Obj_CPU   | 31.552 ms | 0.6212 ms | 0.9106 ms | 31.419 ms | 30.7 KB   |
| ObjectDetection | V11_Obj_GPU   | 5.827 ms  | 0.0214 ms | 0.0179 ms | 5.831 ms  | 30.7 KB   |
| ObjectDetection | V11_Obj_TRT32 | 4.200 ms  | 0.0710 ms | 0.0664 ms | 4.216 ms  | 30.7 KB   |
| ObjectDetection | V11_Obj_TRT16 | 3.264 ms  | 0.0276 ms | 0.0258 ms | 3.272 ms  | 30.57 KB  |
| ObjectDetection | V11_Obj_TRT8  | 3.125 ms  | 0.0623 ms | 0.1154 ms | 3.179 ms  | 30.32 KB  |
| ObjectDetection | V12_Obj_CPU   | 36.709 ms | 0.6856 ms | 0.8914 ms | 36.489 ms | 32.82 KB  |
| ObjectDetection | V12_Obj_GPU   | 7.021 ms  | 0.0500 ms | 0.0467 ms | 7.028 ms  | 32.51 KB  |
| ObjectDetection | V12_Obj_TRT32 | 4.864 ms  | 0.0159 ms | 0.0149 ms | 4.862 ms  | 32.51 KB  |
| ObjectDetection | V12_Obj_TRT16 | 3.618 ms  | 0.0454 ms | 0.0379 ms | 3.610 ms  | 32.26 KB  |
| ObjectDetection | V12_Obj_TRT8  | 3.644 ms  | 0.0262 ms | 0.0219 ms | 3.639 ms  | 31.1 KB   |

ORIENTED OBJECT DETECTION (OBB) (input image size: 1280x720)

| Method       | YoloParam     | Mean      | Error     | StdDev    | Allocated |
|--------------|---------------|----------:|----------:|----------:|----------:|
| ObbDetection | V8_Obb_CPU    | 92.457 ms | 1.7035 ms | 2.7019 ms | 8.15 KB   |
| ObbDetection | V8_Obb_GPU    | 9.655 ms  | 0.0450 ms | 0.0376 ms | 8.15 KB   |
| ObbDetection | V8_Obb_TRT32  | 7.643 ms  | 0.0151 ms | 0.0126 ms | 8.15 KB   |
| ObbDetection | V8_Obb_TRT16  | 5.520 ms  | 0.0370 ms | 0.0346 ms | 8.15 KB   |
| ObbDetection | V8_Obb_TRT8   | 5.041 ms  | 0.0191 ms | 0.0169 ms | 8.27 KB   |
| ObbDetection | V11_Obb_CPU   | 86.191 ms | 1.7069 ms | 2.4479 ms | 7.9 KB    |
| ObbDetection | V11_Obb_GPU   | 9.537 ms  | 0.0300 ms | 0.0266 ms | 7.9 KB    |
| ObbDetection | V11_Obb_TRT32 | 7.934 ms  | 0.0429 ms | 0.0401 ms | 7.9 KB    |
| ObbDetection | V11_Obb_TRT16 | 5.736 ms  | 0.0166 ms | 0.0147 ms | 7.9 KB    |
| ObbDetection | V11_Obb_TRT8  | 5.027 ms  | 0.0161 ms | 0.0142 ms | 8.26 KB   |

POSE ESTIMATION (input image size: 1280x720)

| Method         | YoloParam     | Mean      | Error     | StdDev    | Allocated |
|----------------|---------------|----------:|----------:|----------:|----------:|
| PoseEstimation | V8_Pos_CPU    | 35.362 ms | 0.7067 ms | 0.9434 ms | 23.81 KB  |
| PoseEstimation | V8_Pos_GPU    | 5.246 ms  | 0.0200 ms | 0.0167 ms | 23.81 KB  |
| PoseEstimation | V8_Pos_TRT32  | 3.511 ms  | 0.0535 ms | 0.0446 ms | 23.81 KB  |
| PoseEstimation | V8_Pos_TRT16  | 2.424 ms  | 0.0152 ms | 0.0135 ms | 23.81 KB  |
| PoseEstimation | V8_Pos_TRT8   | 2.145 ms  | 0.0215 ms | 0.0201 ms | 21.84 KB  |
| PoseEstimation | V11_Pos_CPU   | 31.776 ms | 0.6348 ms | 0.7796 ms | 23.69 KB  |
| PoseEstimation | V11_Pos_GPU   | 4.860 ms  | 0.0147 ms | 0.0137 ms | 23.69 KB  |
| PoseEstimation | V11_Pos_TRT32 | 3.540 ms  | 0.0148 ms | 0.0138 ms | 23.69 KB  |
| PoseEstimation | V11_Pos_TRT16 | 2.853 ms  | 0.0260 ms | 0.0230 ms | 23.81 KB  |
| PoseEstimation | V11_Pos_TRT8  | 2.259 ms  | 0.0115 ms | 0.0204 ms | 24.44 KB  |

SEGMENTATION (input image size: 1280x853)

| Method       | YoloParam     | Mean      | Error     | StdDev    | Allocated |
|--------------|---------------|----------:|----------:|----------:|----------:|
| Segmentation | V8_Seg_CPU    | 47.793 ms | 0.7903 ms | 0.7392 ms | 84.92 KB  |
| Segmentation | V8_Seg_GPU    | 7.373 ms  | 0.1047 ms | 0.0979 ms | 84.92 KB  |
| Segmentation | V8_Seg_TRT32  | 6.259 ms  | 0.0937 ms | 0.0876 ms | 84.92 KB  |
| Segmentation | V8_Seg_TRT16  | 4.886 ms  | 0.0505 ms | 0.0519 ms | 84.89 KB  |
| Segmentation | V11_Seg_CPU   | 44.502 ms | 0.8490 ms | 0.7941 ms | 79.53 KB  |
| Segmentation | V11_Seg_GPU   | 7.805 ms  | 0.1083 ms | 0.0904 ms | 79.53 KB  |
| Segmentation | V11_Seg_TRT32 | 6.363 ms  | 0.0287 ms | 0.0255 ms | 79.53 KB  |
| Segmentation | V11_Seg_TRT16 | 5.315 ms  | 0.1049 ms | 0.1571 ms | 79.58 KB  |

Note: Segmentation on TensorRT with INT8 precision is not supported, so the segmentation results contain no TRT8 rows.
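To read the results across execution providers, one option is to express each mean time as a speedup over the CPU run. The Python sketch below does this for the ObjectDetection V8 rows of the 3.1.1 results above; the helper name `speedups` is hypothetical, and the numbers hold only for the hardware listed in this document.

```python
# Current-version (3.1.1) mean times in milliseconds for ObjectDetection V8,
# copied from the object detection results in this document.
means_ms = {
    "CPU": 33.898,
    "GPU": 5.350,
    "TRT32": 4.061,
    "TRT16": 2.965,
    "TRT8": 2.739,
}


def speedups(means: dict[str, float], reference: str = "CPU") -> dict[str, float]:
    """Return how many times faster each provider is than the reference provider."""
    ref = means[reference]
    return {name: ref / mean for name, mean in means.items()}


for provider, factor in speedups(means_ms).items():
    print(f"{provider:>5}: {factor:.1f}x")
```

The same dictionary can be refilled with any other model's row group to compare providers for that task.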