ACL Execution Provider

The ACL Execution Provider accelerates inference on Arm®-based CPUs using the Arm Compute Library (ACL).

Build

For build instructions, please see the build page.

Usage

C/C++

#include <onnxruntime_cxx_api.h>

Ort::Env env = Ort::Env{ORT_LOGGING_LEVEL_ERROR, "Default"};
Ort::SessionOptions sf;
bool enable_fast_math = true;
Ort::ThrowOnError(OrtSessionOptionsAppendExecutionProvider_ACL(sf, enable_fast_math));

The C API details are here.

Python

import onnxruntime

providers = [("ACLExecutionProvider", {"enable_fast_math": "true"})]
sess = onnxruntime.InferenceSession("model.onnx", providers=providers)
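When a model contains operators that ACL does not support, ONNX Runtime assigns those nodes to the next provider in the list. A minimal sketch of a provider list with an explicit CPU fallback (the model path is a placeholder, and an ACL-enabled build of onnxruntime is assumed):

```python
# Providers are tried in priority order; nodes the ACL Execution Provider
# cannot handle fall back to the CPU Execution Provider.
providers = [
    ("ACLExecutionProvider", {"enable_fast_math": "true"}),
    "CPUExecutionProvider",  # fallback for unsupported operators
]

# With an ACL-enabled build, the list is passed to the session as usual:
# sess = onnxruntime.InferenceSession("model.onnx", providers=providers)
```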

Performance Tuning

Arm Compute Library has a fast math mode that can improve performance of the MatMul and Conv operators, at the cost of some numerical accuracy. It is disabled by default.

When using onnxruntime_perf_test, use the flag -e acl to enable the ACL Execution Provider. You can additionally use -i 'enable_fast_math|true' to enable fast math.

Arm Compute Library uses the ONNX Runtime intra-operator thread pool when running via the execution provider. You can control the size of this thread pool using the -x option.
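Putting the flags above together, an invocation might look like the following sketch (the model path, thread count, and repeat count are placeholders):

```shell
# Benchmark model.onnx with the ACL Execution Provider (-e acl),
# fast math enabled (-i), and 4 intra-op threads (-x).
./onnxruntime_perf_test -e acl -i 'enable_fast_math|true' -x 4 model.onnx
```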

Supported Operators

Operator              Supported types
AveragePool           float
BatchNormalization    float
Concat                float
Conv                  float, float16
FusedConv             float
FusedMatMul           float, float16
Gemm                  float
GlobalAveragePool     float
GlobalMaxPool         float
MatMul                float, float16
MatMulIntegerToFloat  uint8, int8, uint8+int8
MaxPool               float
NhwcConv              float
QLinearConv           uint8, int8, uint8+int8
Relu                  float