AIIA DNN Benchmark Overview

April 2, 2020

The goal of the alliance is to provide a selection reference for application companies and third-party evaluation results for chip companies.

The goal of the AIIA DNN benchmarks is to objectively reflect the current state of AI accelerator capabilities; every metric is designed to provide an objective dimension for comparison.

We follow the principles of continuously iterating versions, continuously enriching the scenarios, and continuously expanding the supported AI chip types, with the aim of eventually forming an evaluation environment for training and inference that covers both terminal devices and the cloud.

Evaluation & Results

Edge / Inference

How To Use

This is an example of an image classification application powered by AIIA. Please feel free to try it on your device.

1. Android Image Classification

   This app, based on the TensorFlow Lite engine, classifies images from your device.

   Download the resources from App Resources Hub (password: k04t).

   Build in Android Studio with the TensorFlow Lite AAR from JCenter.

   Import the resource files into the device.

   Also refer to the TFLITE Models.

adb shell mkdir /sdcard/Android/data/com.xintongyuan.aibench/files
adb shell mkdir /sdcard/Android/data/com.xintongyuan.aibench/files/images
adb shell mkdir /sdcard/Android/data/com.xintongyuan.aibench/files/models
adb shell mkdir /sdcard/Android/data/com.xintongyuan.aibench/files/models/tflite

adb push ./images/. /sdcard/Android/data/com.xintongyuan.aibench/files/images/
adb push ./tflite/. /sdcard/Android/data/com.xintongyuan.aibench/files/models/tflite/

2. Adding a model to run on the existing architecture

   Create a model class that inherits from ImageClassifierTF.

   Bind the new class dynamically in the main program.

   Please refer to the TensorFlow Lite example.

3. Adding a new AI framework

   AIBench currently supports several deep learning frameworks (SNPE, HIAI, TENGINE and TensorFlow Lite), which may require the following dependencies:

   Download SNPE, HIAI, TENGINE and TensorFlow Lite as needed, and refer to the demo and API of each.

   More content will be added over time.

Five typical application scenarios

Test1: Object_Classification

  • Neural Network: Mobilenetv2 / Resnet101 / VGG16 / Inceptionv3
  • Image Resolution: 224 x 224 px | 299 x 299 px
  • Metrics: fps / top1 / top5
  • Dataset: ImageNet (1k frames)
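The top1 and top5 metrics above can be illustrated with a minimal sketch. It assumes each sample comes with a list of per-class scores and an integer ground-truth label; the function name `topk_accuracy` and the toy data are hypothetical and are not taken from the AIIA scripts.

```python
def topk_accuracy(scores, labels, k=1):
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    hits = 0
    for row, label in zip(scores, labels):
        # Indices of the k largest scores for this sample.
        topk = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += label in topk
    return hits / len(labels)


# Toy data (assumed): 3 samples, 3 classes.
scores = [
    [0.1, 0.7, 0.2],
    [0.5, 0.3, 0.2],
    [0.2, 0.3, 0.5],
]
labels = [1, 1, 0]

print("top1:", topk_accuracy(scores, labels, k=1))
print("top2:", topk_accuracy(scores, labels, k=2))
```

On the 1k-frame ImageNet subset the same function would be called with k=1 and k=5 over all model outputs.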

Test2: Object_Detection

  • Neural Network: ssd_mobilenetv1 / ssd_mobilenetv2 / ssd_vgg16
  • Image Resolution: 300 x 300 px
  • Metrics: fps / mAP / mIoU
  • Dataset: PASCAL VOC2012 (1k frames)
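The document does not specify how mAP and mIoU are computed for detection. As a rough illustration of the overlap measure that both usually build on, the sketch below computes the intersection over union of two boxes. The `(x_min, y_min, x_max, y_max)` box layout and the name `box_iou` are assumptions.

```python
def box_iou(a, b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    # Width and height of the overlapping rectangle (zero if the boxes are disjoint).
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Two 2x2 boxes overlapping in a 1x1 square: IoU = 1 / (4 + 4 - 1).
print(box_iou((0, 0, 2, 2), (1, 1, 3, 3)))
```

Averaging this value over matched prediction and ground-truth pairs gives one possible mIoU; the matching rule itself is outside this sketch.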

Test3: Image_Super_Resolution

  • Neural Network: vdsr
  • Image Resolution: 256 x 256 px
  • Metrics: fps / PSNR(dB)
  • Dataset: PASCAL VOC2012 (1k frames)
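PSNR, the quality metric for this test, follows from the mean squared error between the reference image and the reconstructed one. The sketch below assumes 8-bit single-channel images stored as nested lists; the function name `psnr` and the `max_value` default are assumptions rather than part of the AIIA tooling.

```python
import math


def psnr(reference, reconstructed, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    ref = [p for row in reference for p in row]
    rec = [p for row in reconstructed for p in row]
    if len(ref) != len(rec):
        raise ValueError("images must have the same number of pixels")
    mse = sum((a - b) ** 2 for a, b in zip(ref, rec)) / len(ref)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Every pixel is off by one, so MSE = 1 and PSNR = 20 * log10(255).
reference = [[10, 20], [30, 40]]
reconstructed = [[11, 21], [31, 41]]
print(round(psnr(reference, reconstructed), 2))
```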

Test4: Image_Segmentation

  • Neural Network: fcn
  • Image Resolution: 224 x 224 px
  • Metrics: fps / mIoU
  • Dataset: PASCAL VOC2012 (1k frames)
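For segmentation, mIoU is commonly taken as the per-class intersection over union averaged across classes. The following sketch assumes predictions and ground truth are label maps of equal size with integer class ids, and it skips classes that appear in neither map; that skipping rule and the name `mean_iou` are assumptions.

```python
def mean_iou(pred, target, num_classes):
    """Mean of per-class IoU over two label maps given as nested lists of class ids."""
    inter = [0] * num_classes
    union = [0] * num_classes
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p == t:
                inter[p] += 1
                union[p] += 1
            else:
                # A mismatched pixel enlarges the union of both classes involved.
                union[p] += 1
                union[t] += 1
    ious = [i / u for i, u in zip(inter, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0


pred = [[0, 0], [1, 1]]
target = [[0, 1], [1, 1]]
# Class 0: IoU 1/2, class 1: IoU 2/3.
print(mean_iou(pred, target, num_classes=2))
```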

Test5: Face_Recognition

  • Neural Network: vgg16
  • Image Resolution: 224 x 224 px
  • Metrics: fps / Accuracy
  • Dataset: LFW (1k frames)

Benchmark Results

INT8 Inference

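Devices under test:

| Product | Platform | Device | Framework | System |
|---|---|---|---|---|
| Huawei_Mate_20 | kirin_980 | NPU | HIAI | Android |
| ROC_RK3399_PC | CortexA72_x_2 CortexA53_x_4 | CPU | TENGINE | Android |

Each result cell lists its values in the order given in the Metrics column.

| Test | Model | Metrics | Huawei_Mate_20 | ROC_RK3399_PC |
|---|---|---|---|---|
| Test1: Object_Classification | mobilenet_v2 | FPS / TOP1 / TOP5 | 101.90 / 71.3% / 88.3% | 17.41 / 73.30% / 91.30% |
| Test1: Object_Classification | resnet101 | FPS / TOP1 / TOP5 | 43.78 / 71.9% / 88.4% | 1.94 / 75.1% / 93.1% |
| Test1: Object_Classification | vgg16 | FPS / TOP1 / TOP5 | 32.38 / 64.3% / 85% | 1.115 / 68.2% / 89.4% |
| Test1: Object_Classification | inception_v3 | FPS / TOP1 / TOP5 | 58.32 / 75.8% / 91.5% | 2.2 / 77.5% / 93.5% |
| Test2: Object_Detection | ssd_mobilenetv1 | FPS / mAP / mIoU | 65.68 / 0.84 / 0.83 | - / - / - |
| Test2: Object_Detection | ssd_mobilenetv2 | FPS / mAP / mIoU | 52.39 / 0.55 / 0.80 | - / - / - |
| Test2: Object_Detection | ssd_vgg16 | FPS / mAP / mIoU | 14.06 / 0.89 / 0.79 | - / - / - |
| Test3: Image_Super_Resolution | vdsr | FPS / PSNR(dB) | 12.42 / 24.92 | - / - |
| Test4: Image_Segmentation | fcn | FPS / mAP / mIoU | - / - / - | - / - / - |
| Test5: Face_Recognition | vgg16 | FPS / Accuracy | - / - | - / - |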

FLOAT16 Inference

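Device under test:

| Product | Platform | Device | Framework | System |
|---|---|---|---|---|
| Huawei_Mate_20 | kirin_980 | NPU | HIAI | Android |

Each result cell lists its values in the order given in the Metrics column.

| Test | Model | Metrics | Huawei_Mate_20 |
|---|---|---|---|
| Test1: Object_Classification | mobilenet_v2 | FPS / TOP1 / TOP5 | 54.2 / 70.7% / 88.2% |
| Test1: Object_Classification | resnet101 | FPS / TOP1 / TOP5 | 21.98 / 72.3% / 89.2% |
| Test1: Object_Classification | vgg16 | FPS / TOP1 / TOP5 | 13.53 / 66.1% / 85.2% |
| Test1: Object_Classification | inception_v3 | FPS / TOP1 / TOP5 | 32.93 / 75.7% / 92.3% |
| Test2: Object_Detection | ssd_mobilenetv1 | FPS / mAP / mIoU | 35 / 0.86 / 0.84 |
| Test2: Object_Detection | ssd_mobilenetv2 | FPS / mAP / mIoU | 29.97 / 0.62 / 0.78 |
| Test2: Object_Detection | ssd_vgg16 | FPS / mAP / mIoU | 7.276 / 0.96 / 0.84 |
| Test3: Image_Super_Resolution | vdsr | FPS / PSNR(dB) | 7.64 / 24.92 |
| Test4: Image_Segmentation | fcn | FPS / mAP / mIoU | 1.39 / - / - |
| Test5: Face_Recognition | vgg16 | FPS / Accuracy | - / - |

Cloud / Inference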

Environment requirement

To keep the AI chip evaluation process objective and fair, the party under test is required to run a self-test and submit a test report according to the following requirements.

  1. Hardware environment requirements

| No. | Hardware | Requirements |
|---|---|---|
| 1 | Computing Configuration | Single node & single card |
| 2 | CPU | Intel(R) Xeon(R) Silver 4114 CPU @2.20GHz |
| 3 | Memory | 64G DDR4 |
| 4 | Storage | 512G SSD |

  2. Software environment requirements

| No. | Option | Requirements |
|---|---|---|
| 1 | Test data set | ILSVRC2015 validation on ImageNet (50k frames) |
| 2 | Application scenario (including but not limited to other scenarios) | Object_Classification |
| 3 | Neural Network (including but not limited to other models) | VGG16/Resnet50/Resnet152/MobileNet_v1 (Offered by AIIA) |
| 4 | Acceleration framework | Adapt to the AI card |
| 5 | Metrics | Latency, Accuracy, Throughput, Power, Computing power per watt (frame/sec/w). The calculation of all test indicators is based on the test data set and can be calculated in multiple scripts. |

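The throughput and per-watt metrics can be derived from the measured latency and power. The sketch below uses the throughput definition from the log format in this document (batch size / latency * 1000, with latency in ms per batch) and divides by average power to get frame/sec/w. The function names and the sample measurements are hypothetical.

```python
def throughput_fps(batch_size, latency_ms):
    """Frames per second: batch size / latency * 1000, latency in ms per batch."""
    if latency_ms <= 0:
        raise ValueError("latency_ms must be positive")
    return batch_size / latency_ms * 1000.0


def frames_per_sec_per_watt(batch_size, latency_ms, power_w):
    """Computing power per watt (frame/sec/w): throughput divided by average power."""
    if power_w <= 0:
        raise ValueError("power_w must be positive")
    return throughput_fps(batch_size, latency_ms) / power_w


# Hypothetical measurements: (batch size, latency in ms per batch, average power in W).
for batch_size, latency_ms, power_w in [(1, 2.0, 30.0), (8, 10.0, 40.0)]:
    fps = throughput_fps(batch_size, latency_ms)
    per_watt = frames_per_sec_per_watt(batch_size, latency_ms, power_w)
    print(f"batch={batch_size} throughput={fps:.1f} frame/sec/w={per_watt:.2f}")
```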
  3. Procedure requirements

| No. | Option | Requirements |
|---|---|---|
| 1 | Pre-processing | Standardize with z-score (non-crop) |
| 2 | Batch size | 1/2/4/8/16/32/64/128 |
| 3 | Inference latency | Inference time, excluding pre-processing and post-processing |
| 4 | Power | Average power during inference, excluding the power of other peripheral modules |
| 5 | Program running sequence | Task initialization (quantization model, loading model) ---> Pre-processing ---> Start monitoring power ---> Start the timer ---> Inference ---> Stop the timer ---> Stop monitoring power ---> Post-processing ---> Metrics output |
| 6 | Log format | See the template below |

Log format template:

    ###################
    processor_name:
    test_name:
    model_name:
    batch size:
    power:
    latency:(ms/batch)
    throughput:(batch size/latency*1000)
    top1:
    top5:
    ###################
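The running sequence and log format can be sketched as a small harness. This is an illustration under stated assumptions, not the AIIA test script: `load_model` and `read_power_w` are hypothetical callables supplied by the party under test, power is sampled once before the timed loop and once after each batch rather than continuously, and the z-score uses each image's own mean and standard deviation because the document does not say which statistics to use.

```python
import time


def zscore(values):
    """Standardize a flat list of pixel values to zero mean and unit variance."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    std = var ** 0.5 or 1.0  # avoid dividing by zero on constant input
    return [(v - mean) / std for v in values]


def run_once(load_model, batches, read_power_w):
    """Follow the running sequence; only the inference call is timed."""
    infer = load_model()                                    # task initialization
    inputs = [[zscore(img) for img in b] for b in batches]  # pre-processing
    power_samples = [read_power_w()]                        # start monitoring power
    elapsed = 0.0
    outputs = []
    for batch in inputs:
        start = time.perf_counter()                         # start the timer
        outputs.append(infer(batch))                        # inference
        elapsed += time.perf_counter() - start              # stop the timer
        power_samples.append(read_power_w())
    # Power monitoring ends here; post-processing would follow.
    latency_ms = elapsed / len(inputs) * 1000.0
    power_w = sum(power_samples) / len(power_samples)
    return outputs, latency_ms, power_w


def format_log(processor_name, test_name, model_name, batch_size,
               power_w, latency_ms, top1, top5):
    """Render one record in the log format, filling in measured values."""
    throughput = batch_size / latency_ms * 1000.0
    return "\n".join([
        "###################",
        f"processor_name: {processor_name}",
        f"test_name: {test_name}",
        f"model_name: {model_name}",
        f"batch size: {batch_size}",
        f"power: {power_w:.2f}",
        f"latency: {latency_ms:.3f}",
        f"throughput: {throughput:.2f}",
        f"top1: {top1:.4f}",
        f"top5: {top5:.4f}",
        "###################",
    ])


# Hypothetical stand-ins for a real model and a real power meter.
outputs, latency_ms, power_w = run_once(
    load_model=lambda: (lambda batch: [sum(img) for img in batch]),
    batches=[[[1.0, 2.0, 3.0]]],
    read_power_w=lambda: 35.0,
)
print(format_log("chip_x", "Object_Classification", "Resnet50",
                 8, 35.0, 4.0, 0.75, 0.92))
```

Keeping pre-processing and post-processing outside the timed region matches the latency definition in the procedure requirements.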
  4. Sample results

+---------------------------------------------------------------------------------------+
|                                 Resnet50(INT8)                                      |
+---------------------------------------------------------------------------------------+
| top1/top5 | batch size | Latency(ms) | Throughput | Power(w) | Perf/W (frame/sec/w)   |
|-----------|---------------------------------------------------------------------------|
|           | 1          |             |            |          |                        |
|           |---------------------------------------------------------------------------|
|           | 2          |             |            |          |                        |
|           |---------------------------------------------------------------------------|
|           | 4          |             |            |          |                        |
|           |---------------------------------------------------------------------------|
|           | 8          |             |            |          |                        |
|           |---------------------------------------------------------------------------|
|           | 16         |             |            |          |                        |
|           |---------------------------------------------------------------------------|
|           | 32         |             |            |          |                        |
|           |---------------------------------------------------------------------------|
|           | 64         |             |            |          |                        |
|           |---------------------------------------------------------------------------|
|           | 128        |             |            |          |                        |
+---------------------------------------------------------------------------------------+

License

Apache License 2.0.