    Post-Train Quant: Greedy Search + Proper mixed-settings handling (#402) · 9e7ef987
    Guy Jacob authored
    
    * Greedy search script for post-training quantization settings
  * Iterates over each layer in the model in order. For each layer,
    evaluates a user-defined set of quantization settings and chooses
    the best one based on validation accuracy (a sketch of this loop
    follows after this list)
  * Provides a sample that searches for the best activations-clipping
    mode per layer on image classification models
    
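A minimal sketch of the per-layer greedy loop described above, assuming a simple
dict-based representation of per-layer settings; 'layers', 'candidate_settings'
and 'evaluate' are hypothetical stand-ins, not the actual script's API:

    def greedy_search(model, layers, candidate_settings, evaluate):
        """Pick, layer by layer, the quantization setting that gives the best
        validation accuracy, keeping the choices made for earlier layers."""
        chosen = {}
        for layer in layers:                        # iterate in model order
            best_acc, best_setting = float("-inf"), None
            for setting in candidate_settings:      # e.g. different clipping modes
                trial = {**chosen, layer: setting}  # earlier choices + current candidate
                acc = evaluate(model, trial)        # quantize with 'trial', run validation
                if acc > best_acc:
                    best_acc, best_setting = acc, setting
            chosen[layer] = best_setting
        return chosen
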
    * Proper handling of mixed-quantization settings in post-train quant:
      * By default, the quantization settings for each layer apply only
        to output quantization
  * Propagate quantization settings for activation tensors through
    the model during execution
  * For non-quantized inputs to layers that require quantized inputs,
    fall back to quantizing according to the settings used for the
    output
  * In addition, provide a mechanism to override input quantization
    settings via the YAML configuration file (a sketch follows after
    this list)
  * By default, all modules are now quantized. For module types that
    don't have a dedicated quantized implementation, "fake"
    quantization is performed
    
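A hedged sketch of how such a per-layer input override might look in the
quantizer YAML. The 'class', 'bits_*', 'clip_acts' and 'overrides' keys follow
Distiller's existing post-train quantization configs; the 'input_overrides'
layout below is an assumption about the new mechanism, not a verbatim copy:

    import yaml  # PyYAML

    cfg_text = """
    quantizers:
      post_train_quantizer:
        class: PostTrainLinearQuantizer
        bits_activations: 8
        bits_parameters: 8
        clip_acts: AVG                  # default output-quantization settings
        overrides:
          fc:
            clip_acts: NONE             # output settings for this layer only
            input_overrides:
              0:
                clip_acts: NONE         # settings for this layer's first input
    """
    cfg = yaml.safe_load(cfg_text)
    print(cfg["quantizers"]["post_train_quantizer"]["overrides"]["fc"])
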
    * Misc. Changes
  * Fuse ReLU/ReLU6 into the preceding layer during post-training
    quantization (a sketch follows after this list)
      * Fixes to ACIQ clipping in the half-range case
    
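One way such a ReLU/ReLU6 fusion can be realized is sketched below for a plain
nn.Sequential model: the activation module is dropped and its predecessor is
marked so that output quantization covers only the non-negative (half) range,
letting the quantizer's clamp do the ReLU's job. The 'clip_half_range' and
'output_clip_max' attribute names are illustrative, not the commit's actual
implementation:

    import torch.nn as nn

    def fuse_relu_into_predecessor(seq):
        """Drop ReLU/ReLU6 modules that directly follow another layer and mark
        that layer to quantize its output over the half (non-negative) range."""
        fused, prev = [], None
        for m in seq:
            if isinstance(m, (nn.ReLU, nn.ReLU6)) and prev is not None:
                prev.clip_half_range = True       # outputs quantized as non-negative
                if isinstance(m, nn.ReLU6):
                    prev.output_clip_max = 6.0    # fold ReLU6's upper clip as well
                continue                          # the activation module itself is removed
            fused.append(m)
            prev = m
        return nn.Sequential(*fused)

    # e.g. fuse_relu_into_predecessor(nn.Sequential(nn.Conv2d(3, 8, 3), nn.ReLU()))

The half-range marking is presumably also where the ACIQ fix above applies,
since the optimal clipping value for a non-negative (post-ReLU) distribution
differs from the symmetric case.
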
Co-authored-by: Lev Zlotnik <lev.zlotnik@intel.com>
Co-authored-by: Guy Jacob <guy.jacob@intel.com>