coreml_modelc_profling
August 9, 2024
per-op profiling using Core ML MLComputePlan
On iOS 17.4 / macOS 14.4 or later, it's possible to get per-op profiling with MLComputePlan's estimatedCostOfMLProgramOperation: method.
- prepare a compiled Core ML model (with xcrun coremlc compile foo.mlpackage /tmp/)
- run coreml_profiling /tmp/foo.mlmodelc
E.g., I got
2024-06-13 16:59:10.628 coreml_profiling[67391:1771143] the mlmodelc directory: file:///tmp/MobilenetEdgeTPU.mlmodelc/
2024-06-13 16:59:10.784 coreml_profiling[67391:1771143] ML Program
2024-06-13 16:59:10.785 coreml_profiling[67391:1771143] operation ios17.mul, device <MLCPUComputeDevice: 0x6000036e8170>, cost 4.6175%
2024-06-13 16:59:10.785 coreml_profiling[67391:1771143] operation ios17.add, device <MLCPUComputeDevice: 0x6000036e8170>, cost 3.6490%
2024-06-13 16:59:10.785 coreml_profiling[67391:1771143] operation ios17.cast, device <MLCPUComputeDevice: 0x6000036e8170>, cost 2.2767%
2024-06-13 16:59:10.785 coreml_profiling[67391:1771143] operation ios17.conv, device <MLNeuralEngineComputeDevice: 0x6000036ec0e0>, cost 0.5662%
2024-06-13 16:59:10.785 coreml_profiling[67391:1771143] operation ios16.relu, device <MLNeuralEngineComputeDevice: 0x6000036ec0e0>, cost 2.4627%
2024-06-13 16:59:10.785 coreml_profiling[67391:1771143] operation ios17.conv, device <MLNeuralEngineComputeDevice: 0x6000036ec0e0>, cost 4.5811%
2024-06-13 16:59:10.785 coreml_profiling[67391:1771143] operation ios17.conv, device <MLNeuralEngineComputeDevice: 0x6000036ec0e0>, cost 2.9749%
2024-06-13 16:59:10.785 coreml_profiling[67391:1771143] operation ios16.relu, device <MLNeuralEngineComputeDevice: 0x6000036ec0e0>, cost 2.4627%
2024-06-13 16:59:10.785 coreml_profiling[67391:1771143] operation ios17.conv, device <MLNeuralEngineComputeDevice: 0x6000036ec0e0>, cost 0.6614%
2024-06-13 16:59:10.785 coreml_profiling[67391:1771143] operation ios17.conv, device <MLNeuralEngineComputeDevice: 0x6000036ec0e0>, cost 3.0671%
2024-06-13 16:59:10.785 coreml_profiling[67391:1771143] operation ios16.relu, device <MLNeuralEngineComputeDevice: 0x6000036ec0e0>, cost 2.4627%
2024-06-13 16:59:10.785 coreml_profiling[67391:1771143] operation ios17.conv, device <MLNeuralEngineComputeDevice: 0x6000036ec0e0>, cost 0.6614%
2024-06-13 16:59:10.785 coreml_profiling[67391:1771143] operation ios17.add, device <MLNeuralEngineComputeDevice: 0x6000036ec0e0>, cost 1.8685%
....
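To compare how much of the estimated cost lands on each compute device, the log lines above can be summarized with a small script. The following is only a sketch in Python: it assumes every relevant line ends with the "operation ..., device <Class: 0x...>, cost N%" pattern shown in the sample, and the helper names (parse_log, cost_by_device) are made up for illustration and are not part of this repository.

```python
import re
from collections import defaultdict

# Assumed line shape, taken from the sample output above:
#   ... operation ios17.conv, device <MLNeuralEngineComputeDevice: 0x...>, cost 0.5662%
LINE_RE = re.compile(
    r"operation (?P<op>[\w.]+), "
    r"device <(?P<device>\w+): 0x[0-9a-fA-F]+>, "
    r"cost (?P<cost>[\d.]+)%"
)


def parse_log(text):
    """Return (operation, device_class, cost_percent) tuples for matching lines."""
    rows = []
    for line in text.splitlines():
        match = LINE_RE.search(line)
        if match:  # lines such as "ML Program" are skipped
            rows.append((match["op"], match["device"], float(match["cost"])))
    return rows


def cost_by_device(rows):
    """Sum the estimated cost percentages per compute device class."""
    totals = defaultdict(float)
    for _op, device, cost in rows:
        totals[device] += cost
    return dict(totals)


# Three lines copied from the sample output above.
SAMPLE = """\
2024-06-13 16:59:10.785 coreml_profiling[67391:1771143] operation ios17.mul, device <MLCPUComputeDevice: 0x6000036e8170>, cost 4.6175%
2024-06-13 16:59:10.785 coreml_profiling[67391:1771143] operation ios17.conv, device <MLNeuralEngineComputeDevice: 0x6000036ec0e0>, cost 0.5662%
2024-06-13 16:59:10.785 coreml_profiling[67391:1771143] operation ios16.relu, device <MLNeuralEngineComputeDevice: 0x6000036ec0e0>, cost 2.4627%
"""

if __name__ == "__main__":
    for device, total in sorted(cost_by_device(parse_log(SAMPLE)).items()):
        print(f"{device}: {total:.4f}%")
```

Since the log excerpt here is truncated, totals computed from it cover only the listed operations, not the whole model. To summarize a full run, feed the complete captured output to parse_log instead of SAMPLE.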
Notes:
- add -s for synchronous
get profiling results without using MLComputePlan
It turns out it's pretty straightforward to get profiling results without using the MLComputePlan-related classes and methods. Some reverse engineering showed that, with some undocumented methods, we can get more detailed information, as shown in without_compute_plan_output.txt. The example program is here.
take 2
With commit 85b7501 (https://github.com/freedomtan/coreml_modelc_profling/commit/85b7501d4b2226d61c7b66aed5bb457f79d8d720), the program prints its output to stdout instead of logging to stderr with NSLog.
https://github.com/freedomtan/coreml_modelc_profling/blob/85b7501d4b2226d61c7b66aed5bb457f79d8d720/without_compute_plan_2.csv#L1-L12