* Greedy search script for post-training quantization settings
  * Iterates over each layer in the model in order. For each layer, checks a user-defined set of quantization settings and chooses the best one based on validation accuracy (see the first sketch after this list)
  * Provided a sample that searches for the best activations-clipping mode per layer, on image classification models
* Proper handling of mixed quantization settings in post-training quantization:
  * By default, the quantization settings for each layer apply only to output quantization
  * Propagate quantization settings for activation tensors through the model during execution
  * For non-quantized inputs to layers that require quantized inputs, fall back to quantizing according to the settings used for the output
  * In addition, provide a mechanism to override input quantization settings via the YAML configuration file
  * By default, all modules are now quantized. For module types that don't have a dedicated quantized implementation, "fake" quantization is performed (see the second sketch after this list)
* Misc. changes:
  * Fuse ReLU/ReLU6 into the predecessor layer during post-training quantization
  * Fixes to ACIQ clipping in the half-range case
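
A minimal sketch of the greedy per-layer search described above, not the actual Distiller script: `apply_setting` and `validation_accuracy` are hypothetical callables standing in for "configure one layer's quantization" and "evaluate the model on the validation set".

```python
from collections import OrderedDict

def greedy_search(layers, candidate_settings, apply_setting, validation_accuracy):
    """Pick, layer by layer (in model order), the candidate setting that
    yields the best validation accuracy, keeping earlier choices fixed."""
    chosen = OrderedDict()
    for layer in layers:
        best_acc, best_setting = float('-inf'), None
        for setting in candidate_settings:
            apply_setting(layer, setting)   # try this setting on the current layer only
            acc = validation_accuracy()     # earlier layers keep their already-chosen settings
            if acc > best_acc:
                best_acc, best_setting = acc, setting
        apply_setting(layer, best_setting)  # freeze the winner before moving to the next layer
        chosen[layer] = best_setting
    return chosen
```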
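The "fake" quantization fallback amounts to a quantize-dequantize pass: the tensor stays in floating point, but its values are snapped to the quantization grid. A rough illustration (not Distiller's exact code), assuming the asymmetric convention q = round(x * scale + zero_point):

```python
import torch

def fake_quantize(x, scale, zero_point, num_bits=8):
    """Quantize-dequantize: return a float tensor whose values lie on the
    2**num_bits-level grid defined by scale and zero_point."""
    qmin, qmax = 0, 2 ** num_bits - 1
    q = torch.clamp(torch.round(x * scale + zero_point), qmin, qmax)
    return (q - zero_point) / scale
```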

Co-authored-by: Lev Zlotnik <lev.zlotnik@intel.com>
Co-authored-by: Guy Jacob <guy.jacob@intel.com>