The Xentara ONNX Engine v2.0
User Manual
The ONNX Engine plug-in supports a number of execution providers to accelerate the computing process. Each execution provider is described by a JSON object whose members are described at the end of this section. The available execution providers are CUDA, TensorRT, CANN, MIGraphX, ROCm, OpenVINO, and DirectML; the options for each are listed below. More supported execution providers and options can be found on the ONNX Runtime website.
All options and option values must be written as JSON strings. Even when an option value logically has another type, such as a boolean or an integer, it must still be written as a string. For example, the CUDA option "device_id" is an integer, but a device id of 0 must be written as the string "0".
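To illustrate, a hypothetical `options` fragment for the CUDA execution provider might look like the following sketch (the option values are illustrative assumptions, not recommendations). Note that the integer and boolean values are all quoted as strings:

```json
"options": {
    "device_id": "0",
    "cudnn_conv_algo_search": "1",
    "do_copy_in_default_stream": "True"
}
```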
Below is the list of options for CUDA:
| Option | Type | Values |
|---|---|---|
| device_id | int | Default: 0 |
| user_compute_stream | string | Set a custom compute stream for GPU operations. |
| do_copy_in_default_stream | bool | Default: True, True/False |
| use_ep_level_unified_stream | bool | Default: False, True/False |
| gpu_mem_limit | int | Default: max |
| arena_extend_strategy | int | Default: kNextPowerOfTwo = 0, kSameAsRequested = 1 |
| cudnn_conv_algo_search | int | Default: EXHAUSTIVE = 0, HEURISTIC = 1, DEFAULT = 2 |
| cudnn_conv_use_max_workspace | int | Default: 1, 0 = false, nonzero = true |
| cudnn_conv1d_pad_to_nc1d | int | Default: 0, 0 = false, nonzero = true |
| enable_cuda_graph | int | Default: 0, 0 = false, nonzero = true |
| enable_skip_layer_norm_strict_mode | int | Default: 0, 0 = false, nonzero = true |
| use_tf32 | int | Default: 1, 0 = false, nonzero = true |
| gpu_external_[alloc\|free\|empty_cache] | int | Default: 0 |
| prefer_nhwc | int | Default: 0, 0 = false, nonzero = true |
Below is the list of options for TensorRT:
| Option | Type | Values |
|---|---|---|
| device_id | int | Default: 0 |
| user_compute_stream | string | Set a custom compute stream for GPU operations. |
| trt_engine_cache_enable | bool | Default: False. Enable caching of built TensorRT engines, True/False |
| trt_engine_cache_path | string | Set the path to store cached TensorRT engines. |
| trt_engine_cache_prefix | string | Set the prefix for cached engine files. |
| trt_engine_hw_compatible | bool | Maximize engine compatibility across Ampere+ GPUs, True/False |
| trt_max_workspace_size | int | Default: 1073741824 (1 GB). Maximum workspace size for TensorRT. |
| trt_fp16_enable | bool | Enable TensorRT FP16 precision, True/False |
| trt_int8_enable | bool | Enable TensorRT INT8 precision, True/False |
| trt_int8_calibration_table_name | string | Specify the INT8 calibration table name for TensorRT. |
| trt_int8_use_native_calibration_table | bool | Use the native TensorRT-generated calibration table, True/False |
| trt_build_heuristics_enable | bool | Build the engine using heuristics to reduce build time, True/False |
| trt_sparsity_enable | bool | Control whether sparsity can be used by TensorRT, True/False |
| trt_dla_enable | bool | Default: False. Enable DLA (Deep Learning Accelerator), True/False |
| trt_dla_core | int | Default: 0. Specify the DLA core to execute on. |
| trt_max_partition_iterations | int | Default: 1000. Maximum iterations for the TensorRT parser to get capability. |
| trt_min_subgraph_size | int | Default: 1. Minimum size of TensorRT subgraphs. |
| trt_dump_subgraphs | bool | Dump optimized subgraphs for debugging, True/False |
| trt_force_sequential_engine_build | bool | Force sequential engine builds under multi-GPU, True/False |
| trt_context_memory_sharing_enable | bool | Share execution context memory between TensorRT subgraphs, True/False |
| trt_layer_norm_fp32_fallback | bool | Force layer norm calculations to FP32, True/False |
| trt_cuda_graph_enable | bool | Capture a CUDA graph for reduced launch overhead, True/False |
| trt_builder_optimization_level | int | Default: 3, valid range [0-5]. Set the optimization level for the TensorRT builder. |
| trt_auxiliary_streams | int | Default: -1. Set the number of auxiliary streams for computation. |
| trt_tactic_sources | string | Specify tactic sources for TensorRT. Example: "-CUDNN,+CUBLAS"; available keys: "CUBLAS", "CUBLAS_LT", "CUDNN" or "EDGE_MASK_CONVOLUTIONS". |
| trt_extra_plugin_lib_paths | string | Add additional plug-in library paths for TensorRT. |
| trt_detailed_build_log | bool | Enable detailed logging of build steps, True/False |
| trt_timing_cache_enable | bool | Enable use of the timing cache to speed up builds, True/False |
| trt_timing_cache_path | string | Set the path for storing the timing cache. |
| trt_force_timing_cache | bool | Force use of the timing cache regardless of GPU match, True/False |
| trt_profile_min_shapes | string | Specify the minimum input shapes of the optimization profile for dynamic-shape inputs. |
| trt_profile_max_shapes | string | Specify the maximum input shapes of the optimization profile for dynamic-shape inputs. |
| trt_profile_opt_shapes | string | Specify the optimal input shapes of the optimization profile for dynamic-shape inputs. |
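As an illustrative sketch, the TensorRT engine cache could be enabled with an `options` object like the following (the cache path is a placeholder and the values shown are assumptions, not defaults from this manual):

```json
"options": {
    "trt_fp16_enable": "True",
    "trt_engine_cache_enable": "True",
    "trt_engine_cache_path": "/path/to/engine/cache"
}
```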
Below is the list of options for CANN:
| Option | Type | Values |
|---|---|---|
| device_id | int | Default: 0 |
| npu_mem_limit | int | Default: max |
| arena_extend_strategy | int | Default: kNextPowerOfTwo = 0, kSameAsRequested = 1 |
| enable_cann_graph | bool | Default: True. Whether to use the graph inference engine to speed up performance, True/False |
| dump_graphs | bool | Default: False. Whether to dump subgraphs in ONNX format for analysis of subgraph segmentation, True/False |
| dump_om_model | bool | Default: True. Whether to dump the offline model for the Ascend AI Processor to an .om file, True/False |
| precision_mode | string | Default: force_fp16. Options: force_fp16, force_fp32/cube_fp16in_fp32out, allow_fp32_to_fp16, must_keep_origin_dtype, allow_mix_precision/allow_mix_precision_fp16 |
| op_select_impl_mode | string | Default: high_performance. Options: high_performance, high_precision |
| optypelist_for_implmode | string | Default: None. Options: Pooling, SoftmaxV2, LRN, ROIAlign |
Below is the list of options for MIGraphX:
| Option | Type | Values |
|---|---|---|
| device_id | int | Default: 0 |
| migraphx_int8_enable | int | Default: 0, 0 = false, nonzero = true. Enable MIGraphX INT8 precision. |
| migraphx_fp16_enable | int | Default: 0, 0 = false, nonzero = true. Enable MIGraphX FP16 precision. |
| migraphx_use_native_calibration_table | int | Default: 0, 0 = false, nonzero = true. Use the native MIGraphX INT8 calibration table. |
| migraphx_int8_calibration_table_name | string | Specify the MIGraphX INT8 calibration table name. |
Below is the list of options for ROCm:
| Option | Type | Values |
|---|---|---|
| device_id | int | Default: 0 |
| arena_extend_strategy | int | Default: kNextPowerOfTwo = 0, kSameAsRequested = 1 |
| do_copy_in_default_stream | int | Default: 1, 0 = false, nonzero = true |
| gpu_mem_limit | int | Default: max |
| has_user_compute_stream | int | Default: 0, 0 = false, nonzero = true. Whether a user-provided compute stream is used. |
| miopen_conv_exhaustive_search | int | Default: 0, 0 = false, nonzero = true. Enable exhaustive search for MIOpen convolution algorithms. |
| tunable_op_enable | int | Default: 1, 0 = false, nonzero = true. Set to use TunableOp. |
| enable_hip_graph | int | Default: 0, 0 = false, nonzero = true |
Below is the list of options for OpenVINO:
| Option | Type | Values |
|---|---|---|
| device_id | int | Default: 0 |
| cache_dir | string | Any valid directory path on the hardware target. |
| device_type | string | CPU, NPU, GPU, or GPU.0, GPU.1, etc. based on the available GPUs; any valid HETERO combination; any valid MULTI or AUTO device combination. |
| enable_dynamic_shapes | bool | True/False |
| enable_opencl_throttling | bool | True/False |
| enable_npu_fast_compile | bool | True/False |
| num_of_threads | int | Any positive integer (greater than 0). |
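As a sketch, an OpenVINO `options` object might select the first GPU and limit the thread count (the device choice and thread count here are illustrative assumptions):

```json
"options": {
    "device_type": "GPU.0",
    "num_of_threads": "4",
    "enable_dynamic_shapes": "True"
}
```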
Below is the list of options for DirectML:
| Option | Type | Values |
|---|---|---|
| device_id | int | Default: 0 |
Each execution provider element is a JSON object with the following members:

| Member | Content |
|---|---|
| name | A string containing the name of the execution provider. |
| options | An optional JSON object containing pairs of string values. The first string of each pair names the option and the second string defines the option value. |
Please remember that each element block requires two layers of braces ({}) due to the syntax restrictions of the JSON format.
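Putting this together, a single execution provider element, before the extra wrapping layer of braces described above is applied, might look like the following sketch. The provider name "CUDAExecutionProvider" and the option values are assumptions for illustration, not taken from this manual:

```json
{
    "name": "CUDAExecutionProvider",
    "options": {
        "device_id": "0",
        "gpu_mem_limit": "2147483648"
    }
}
```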