From 3ab9d9d2756474d20f3a0d3b9f6d10d28f28fb05 Mon Sep 17 00:00:00 2001
From: Hashim Sharif <hsharif3@miranda.cs.illinois.edu>
Date: Sat, 27 Mar 2021 21:35:22 -0500
Subject: [PATCH] Starting with Configuration format discussion

---
 .../developerdocs/configuration-format.rst    | 30 +++++++++++++++++++
 hpvm/docs/developerdocs/index.rst             |  1 +
 2 files changed, 31 insertions(+)
 create mode 100644 hpvm/docs/developerdocs/configuration-format.rst

diff --git a/hpvm/docs/developerdocs/configuration-format.rst b/hpvm/docs/developerdocs/configuration-format.rst
new file mode 100644
index 0000000000..53cc91dc0c
--- /dev/null
+++ b/hpvm/docs/developerdocs/configuration-format.rst
@@ -0,0 +1,30 @@
+
+Approximation Configuration Format
+==================================
+
+HPVM binaries generated by the Keras and PyTorch frontends can load a configuration file (`HPVM_binary -c ${config_file_path}`) that specifies an approximation knob for each tensor operation in the program. This configuration file is the output of the autotuner (`predtuner`), which selects an approximation knob for each tensor operation while respecting the accuracy degradation budget given for autotuning. The HPVM tensor runtime uses the configuration to dispatch to the corresponding approximate variants with the appropriate arguments.
+
+
+The configuration includes one line per fused HPVM node. A fused node often contains multiple tensor operations. For instance, a Convolution, an Add, and a ReLU are fused into a single HPVM node, since together they are semantically a convolution layer. This fusion facilitates code generation for accelerators and libraries that expose higher-level abstractions such as "Convolution Layer" or "Dense Layer" in their APIs.
+
+File Format
+--------------
+
+`+++++`
+`${config_id} ${predicted_speedup} ${predicted_energy} ${real_accuracy} ${accuracy_degradation}`
+`${hpvm_node_id} ${device=cpu|gpu} ${tensor_op_type} ${approximation_knob} ....`
+`${hpvm_node_id} .....`
+`-----`
+
+The delimiters `+++++` and `-----` mark the beginning and end of a configuration.
+
+`${config_id}` is the ID of the configuration within the configuration file. A configuration file is a list of configurations, and the runtime can select any of them at run time. The default behavior is to use the first configuration in the file.
+
+`${predicted_speedup}` is the "hardware-agnostic" speedup predicted by the autotuner using a performance heuristic (no performance is measured on a hardware device).
+
+`${predicted_energy}` is a hardware-agnostic predicted energy metric. Currently, the tuner sets this to 0, since energy estimation is not yet supported.
+
+`${real_accuracy}` is the accuracy of the program on the tune set (the inputs used for tuning) when no approximations are applied, and `${accuracy_degradation}` is the drop in accuracy when applying the configuration, that is, the specific knob settings that follow.
+
+`${hpvm_node_id}` specifies the ID of the node to which the approximation knobs apply, `${device}` specifies the device to offload to, `${tensor_op_type}` specifies the type of tensor operation (conv, mul, add, relu, etc.), and `${approximation_knob}` is the knob setting for that tensor operation. The autotuner selects these knobs.
+
diff --git a/hpvm/docs/developerdocs/index.rst b/hpvm/docs/developerdocs/index.rst
index 77083aa7d1..225fbdfd49 100644
--- a/hpvm/docs/developerdocs/index.rst
+++ b/hpvm/docs/developerdocs/index.rst
@@ -6,3 +6,4 @@ Developer Documents
 
    approximation-implementation
    cnn-models
+   configuration-format
-- 
GitLab
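
The layout documented in configuration-format.rst (a `+++++` line, one header line with five fields, one line per fused HPVM node, then `-----`) can be read with a short parser. The following is a minimal Python sketch, not the HPVM runtime's actual loader. The names `Configuration`, `NodeEntry`, and `parse_configurations` are hypothetical, and the sample values are invented for illustration. It assumes fields are whitespace-separated, and it leaves everything after the device field as uninterpreted tokens, because the patch elides the exact per-operation layout with `....`.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class NodeEntry:
    node_id: str
    device: str           # "cpu" or "gpu" per the documented format
    op_tokens: List[str]  # tensor op types and knob settings, left uninterpreted


@dataclass
class Configuration:
    config_id: str
    predicted_speedup: float
    predicted_energy: float
    real_accuracy: float
    accuracy_degradation: float
    nodes: List[NodeEntry] = field(default_factory=list)


def parse_configurations(text: str) -> List[Configuration]:
    """Parse every configuration delimited by '+++++' and '-----'."""
    configs: List[Configuration] = []
    current: Optional[Configuration] = None
    expect_header = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line == "+++++":
            # Start of a configuration; the next non-empty line is its header.
            expect_header = True
            current = None
        elif line == "-----":
            # End of a configuration.
            if current is not None:
                configs.append(current)
            current = None
            expect_header = False
        elif expect_header:
            cid, speedup, energy, accuracy, degradation = line.split()
            current = Configuration(cid, float(speedup), float(energy),
                                    float(accuracy), float(degradation))
            expect_header = False
        elif current is not None:
            tokens = line.split()
            current.nodes.append(NodeEntry(tokens[0], tokens[1], tokens[2:]))
    return configs


# Hypothetical sample; the IDs, metrics, and knob names are made up.
SAMPLE = """\
+++++
conf1 1.5 0 84.0 0.5
1 gpu conv knobA add knobB
2 cpu mul knobC
-----
"""

configs = parse_configurations(SAMPLE)
default_config = configs[0]  # mirrors the documented default: first configuration in the file
```

Keeping the per-node tokens raw avoids guessing how operation types and knob settings are grouped. A real loader would interpret them according to the tensor runtime's knob definitions.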