The Xentara ONNX Engine v2.0
User Manual
A JSON object describing the Session Options has the following syntax:
| Option Key | Type | Values |
|---|---|---|
| session.disable_prepacking | int | Default: "0" = prepacking is enabled, "1" = prepacking is disabled. |
| session.use_env_allocators | int | A value of "1" means allocators registered in the env will be used. "0" means the allocators created in the session will be used. Use this to override the usage of env allocators on a per session level. |
| session.load_model_format | string | If unset, the model type will default to ONNX unless it can be inferred from the filename ('.ort' = ORT format) or from the model bytes being in ORT format. |
| session.save_model_format | string | If unset, the format will default to ONNX unless optimized_model_filepath ends in '.ort'. Set to 'ORT' (case sensitive) to save the optimized model in ORT format when SessionOptions.optimized_model_filepath is set. |
| session.set_denormal_as_zero | int | If the value is "1", flush-to-zero and denormal-as-zero are applied. The default is "0". |
| session.disable_quant_qdq | int | Default: "0" unless the DirectML execution provider is registered, in which case it defaults to "1". Controls whether the quantized model is run in QDQ (QuantizeLinear/DequantizeLinear) format. |
| session.disable_double_qdq_remover | int | Default: "0" = not disabled: ORT removes the middle two nodes from Q->(DQ->Q)->DQ pairs. "1" = disabled: ORT does not remove the middle two nodes from Q->(DQ->Q)->DQ pairs. This controls whether the Double QDQ remover and Identical Children Consolidation are enabled. |
| session.enable_quant_qdq_cleanup | int | Default: "0" = Disable, "1" = Enable the removal of QuantizeLinear/DequantizeLinear node pairs once all QDQ handling has been completed. |
| optimization.enable_gelu_approximation | int | Default: "0" = Disable, "1" = Enable gelu approximation in graph optimization. |
| session.disable_aot_function_inlining | int | Default: "0" = AOT function inlining is performed, "1" = AOT function inlining is disabled. Ahead-of-time (AOT) function inlining examines the graph and attempts to inline as many locally defined functions in the model as possible with the help of the enabled execution providers. |
| optimization.disable_specified_optimizers | string | Specifies the config for detecting subgraphs for memory footprint reduction. The value should be a string containing integers separated by commas. The default value is "0:0". |
| session.use_device_allocator_for_initializers | int | Default: "0" = Disable. "1" = Enable using the device allocator for allocating initialized tensor memory. |
| session.inter_op.allow_spinning | int | Default: "1" = threads will spin a number of times before blocking. "0" = threads will block if they find no job to run. Configures whether the inter_op threads are allowed to spin a number of times before blocking. |
| session.intra_op.allow_spinning | int | Default: "1" = threads will spin a number of times before blocking. "0" = threads will block if they find no job to run. Configures whether the intra_op threads are allowed to spin a number of times before blocking. |
| session.use_ort_model_bytes_directly | int | Default: "0" = copy the model bytes at the time of session creation to ensure the model bytes buffer remains valid. "1" = do not copy the model bytes; use the model bytes directly. |
| session.use_ort_model_bytes_for_initializers | int | Key for using the ORT format model flatbuffer bytes directly for initializers. This avoids copying the bytes and reduces peak memory usage during model loading and initialization. Requires session.use_ort_model_bytes_directly to be set to "1". |
| session.qdqisint8allowed | int | If the ORT format model will be used on ARM platforms set to "1". For other platforms set to "0". This should only be specified when exporting an ORT format model for use on a different platform. |
| session.x64quantprecision | string | x64 SSE4.1/AVX2/AVX512 (without VNNI) has an overflow problem with quantized matrix multiplication with U8S8. Only effective on AVX2 or AVX512 platforms. |
| optimization.minimal_build_optimizations | string | "save": Save runtime optimizations when saving an ORT format model. "apply": Only apply optimizations available in a minimal build. |
| ep.nnapi.partitioning_stop_ops | string | Specifies a list of stop op types for the NNAPI execution provider. Nodes of a listed op type, and nodes downstream of them, will not be run by the NNAPI EP. The value is a comma-separated list of op types. |
| session.dynamic_block_base | int | Disabled by default; specify any positive integer to enable dynamic block sizing for multithreading. With a positive value, the thread pool will split a task of N iterations into blocks of a size starting from N/(num_of_threads * dynamic_block_base). |
| session.force_spinning_stop | int | This option allows decreasing CPU usage between infrequent requests: it forces any thread pool threads that are spinning to stop immediately when the last of the concurrent Run() calls returns. Spinning is restarted on the next Run() call. |
| session.strict_shape_type_inference | int | Default: "0": in some cases warnings will be logged but processing will continue. "1": all inconsistencies encountered during shape and type inference will result in failures. |
| session.allow_released_opsets_only | int | "1": every model using a more recent opset than the latest released one will fail. "0": the model may or may not work if onnxruntime cannot find an implementation; this option is intended for development purposes. |
| session.node_partition_config_file | string | Path to a file containing the configuration for partitioning nodes among logic streams. |
| session.intra_op_thread_affinities | string | This option allows setting affinities for intra-op threads. The affinity string has the format logical_processor_id,logical_processor_id;logical_processor_id,logical_processor_id, for example "1,2,3;4,5". Intervals may also be specified, e.g. "1-8;8-16;17-24". |
| session.debug_layout_transformation | int | Default: "0" = Disable, "1" = Enable. This option will dump out the model to assist debugging any issues with layout transformation |
| session.disable_cpu_ep_fallback | int | Default: "0" = Disable, "1" = Enable. If this option is set to "1", session creation will fail if the model cannot be executed entirely by the execution providers other than the CPU EP, i.e. no fallback to the CPU EP takes place. |
| session.optimized_model_external_initializers_file_name | string | Use this config when serializing a large model after optimization to specify an external initializers file. |
| session.optimized_model_external_initializers_min_size_in_bytes | int | Use this config to control the minimum size of an initializer for it to be externalized during serialization. |
| ep.context_enable | int | Default: "0" = Disable, "1" = Enable EP context feature to dump the partitioned graph which includes the EP context into Onnx file. |
| ep.context_file_path | string | Default: the original file name with "_ctx.onnx" appended, if not specified. Specifies the file path for the Onnx model that contains the EP context. |
| ep.context_embed_mode | int | Flag to specify whether to dump the EP context into the Onnx model. Default: "1" = dump the EP context into the Onnx model; "0" = dump the EP context into a separate file and keep that file name in the Onnx model. |
| mlas.enable_gemm_fastmath_arm64_bfloat16 | int | Default: "0" = Disable, "1" = Enable. GEMM fastmath mode provides fp32 GEMM acceleration using bfloat16-based matmul. |
| sessionOptions | object | A JSON object containing pairs of string values: the first string names the option and the second string defines the option value. |
Please remember that each element block requires two layers of {} due to the syntax restrictions of the JSON format, as illustrated in the example below.
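The following sketch is illustrative only: the option keys and values are taken from the table above, but the exact placement and nesting of the sessionOptions element within the enclosing element configuration (including whether individual entries must be wrapped in their own braces, as noted above) is determined by the surrounding configuration schema.

```json
{
    "sessionOptions": {
        "session.intra_op.allow_spinning": "0",
        "session.use_ort_model_bytes_directly": "1",
        "session.intra_op_thread_affinities": "1,2,3;4,5"
    }
}
```

Note that every value is written as a string, even for options listed with type int, since each option value is passed to ONNX Runtime as a string.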