Commit 9e7ef987 authored by Guy Jacob, committed by GitHub

Post-Train Quant: Greedy Search + Proper mixed-settings handling (#402)


* Greedy search script for post-training quantization settings
  * Iterates over each layer of the model in order. For each layer,
    checks a user-defined set of quantization settings and chooses
    the one that yields the best validation accuracy (see the sketch
    below)
  * Provides a sample that searches for the best activations-clipping
    mode per layer on image classification models
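
A minimal sketch of the greedy procedure described above. The helpers
`build_quantized_model` (applies a per-layer settings dict as a
post-train quantization config) and `evaluate` (returns accuracy on a
validation set) are hypothetical stand-ins, not the actual script's API:

```python
def greedy_search(model, layer_names, candidate_settings, val_loader):
    """Choose quantization settings per layer, one layer at a time,
    keeping the choices already made for earlier layers fixed."""
    chosen = {}  # layer name -> best settings found so far
    for name in layer_names:  # iterate layers in model order
        best_acc, best_setting = -1.0, None
        for setting in candidate_settings:  # e.g. clipping modes
            trial = dict(chosen)
            trial[name] = setting
            quantized = build_quantized_model(model, trial)  # hypothetical
            acc = evaluate(quantized, val_loader)            # hypothetical
            if acc > best_acc:
                best_acc, best_setting = acc, setting
        chosen[name] = best_setting
    return chosen
```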

* Proper handling of mixed quantization settings in post-train quant:
  * By default, the quantization settings for each layer apply only
    to output quantization
  * Propagate quantization settings for activation tensors through
    the model during execution
  * For non-quantized inputs to layers that require quantized inputs,
    fall back to quantizing according to the settings used for the
    output (see the sketch below)
  * In addition, provide a mechanism to override input quantization
    settings via the YAML configuration file
  * All modules are now quantized by default. For module types that
    don't have a dedicated quantized implementation, "fake"
    quantization is performed
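
A minimal sketch of the propagation and fallback behavior, assuming
quantization metadata is attached to each tensor as it flows through
the model; `quantize` and the `quant_settings` attribute are
illustrative assumptions, not the actual mechanism in the code:

```python
def quantized_forward(layer, inp, output_settings, input_override=None):
    """Run one layer that expects quantized inputs, applying the
    fallback and override rules for its input."""
    in_settings = getattr(inp, 'quant_settings', None)
    if in_settings is None:
        # Input came from a non-quantized producer: fall back to this
        # layer's output settings, unless the YAML config supplied an
        # explicit override for this layer's inputs.
        in_settings = input_override or output_settings
        inp = quantize(inp, in_settings)  # hypothetical helper
    out = quantize(layer(inp), output_settings)
    out.quant_settings = output_settings  # propagate downstream
    return out
```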

* Misc. Changes
  * Fuse ReLU/ReLU6 into the predecessor during post-training
    quantization (see the sketch below)
  * Fixes to ACIQ clipping in the half-range case
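
A minimal sketch of the ReLU/ReLU6 fusion, assuming it amounts to
dropping the activation module and recording its clipping range on the
predecessor, whose output quantizer then clips at 0 from below (and at
6 from above for ReLU6); the `clip_half_range`/`clip_max` attributes
are illustrative only. The resulting one-sided output range is
presumably the "half-range" case the ACIQ fix refers to:

```python
import torch.nn as nn

def fuse_activations(seq: nn.Sequential) -> nn.Sequential:
    """Absorb each ReLU/ReLU6 into the module that precedes it."""
    fused = []
    for module in seq:
        if isinstance(module, (nn.ReLU, nn.ReLU6)) and fused:
            # Record the activation's effect on the predecessor so its
            # output quantizer clips at 0 (and at 6 for ReLU6), then
            # drop the activation module from the graph.
            fused[-1].clip_half_range = True  # hypothetical attribute
            fused[-1].clip_max = 6.0 if isinstance(module, nn.ReLU6) else None
            continue
        fused.append(module)
    return nn.Sequential(*fused)
```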

Co-authored-by: Lev Zlotnik <lev.zlotnik@intel.com>
Co-authored-by: Guy Jacob <guy.jacob@intel.com>
parent 738d57f4
Showing 1370 additions and 372 deletions