- Aug 07, 2019
Guy Jacob authored
- Jul 23, 2019
Lev Zlotnik authored
And use it when calling prepare_model when loading from a checkpoint
- Jul 10, 2019
Guy Jacob authored
* "Net-aware quantization" - using the term coined in https://arxiv.org/abs/1811.09886. (section 3.2.2). Refers to considering sequences of modules when quantizing. This isn't exactly layer fusion - we modify activation stats prior to setting quantization parameters, to make sure that when a module is followed by certain activation functions, only the relevant ranges are quantized. We do this for: * ReLU - Clip all negative values * Tanh / Sigmoid - Clip according to the (approximated) saturation values for these functions. We use [-4, 4] for tanh and [-6, 6] for sigmoid. * Perform batch-norm folding before post-training quantization. Batch-norm parameters are folded into the parameters of the previous layer and the BN layer is replaced with an identity module. * Both BN folding and "net-aware" are now automatically executed in PostTrainLinearQuantizer (details of this change below) * BN folding enabled by new generic mechanism to "fuse" module sequences (at the Python API level) * First module in sequence is replaced/modified by a user-provided function, rest of moudles replaced with nn.Identity * Quantizer changes: * Optionally create adjacency map during prepare_model * Subclasses may enforce adjacency map creation * Refatcoring: Replace _prepare_model_impl with pre and post override-able "callbacks", so core functionality is always executed * PostTrainLinearQuantizer Changes: * Enforce creation of adjacency map. This means users must now pass a dummy input to PostTrainLinearQuantizer.prepare_model * Before module replacement - Apply BN folding and stats updates according to net-aware quantization * Updated the language model quantization tutorial to reflect the new functionality * Updated the image classification post-train quantization samples (command line and YAML) * Other changes: * Distller LSTM implementation: Replace the ModuleList for cells with a plain list. The PyTorch trace mechanism doesn't "see" ModuleList objects, it only sees the contained modules. This means that the "scopeName" of these modules isn't complete, which makes it impossible to match op names in SummaryGraph to modules in the Python model. * ActivationStatsCollector: Ignore nn.Identity modules
- May 27, 2019
Lev Zlotnik authored
* Fixed a bug where a shared module which was supposed to be skipped wasn't skipped on the second reference
* Added tests for the bug fix
- May 20, 2019
Guy Jacob authored
Lev Zlotnik authored
* Made quantizer.replacement_factory a defaultdict and removed the 'except KeyError' in pre_process_container (see the sketch below)
* Added explanations and a type hint for replace_fn
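A conceptual sketch of what this amounts to (not the actual Distiller code): a defaultdict that returns None for unregistered module types lets the container traversal test the lookup result instead of wrapping it in try/except KeyError. The registered replace function below is hypothetical.

```python
from collections import defaultdict
import torch.nn as nn

def _replace_relu(module, name):
    # Hypothetical replace_fn registered for nn.ReLU
    return nn.Identity()

replacement_factory = defaultdict(lambda: None)  # unregistered types map to None
replacement_factory[nn.ReLU] = _replace_relu

container = nn.Sequential(nn.Linear(4, 4), nn.ReLU())
for name, module in container.named_children():
    replace_fn = replacement_factory[type(module)]
    if replace_fn is not None:            # no 'except KeyError' needed anymore
        setattr(container, name, replace_fn(module, name))
```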
- May 19, 2019
Guy Jacob authored
- May 02, 2019
Lev Zlotnik authored
- Apr 01, 2019
Lev Zlotnik authored
* Bias handling:
  * Add a 'bits_bias' parameter to explicitly specify the number of bits for the bias, similar to weights and activations
  * BREAKING: Remove the now redundant 'quantize_bias' boolean parameter
* Custom overrides:
  * Expand the semantics of the overrides dict to allow overriding of other parameters in addition to bit-widths
  * Functions registered in the quantizer's 'replacement_factory' can define keyword arguments. Non bit-width entries in the overrides dict are checked against the function signature and passed through
  * BREAKING:
    * Changed the name of 'bits_overrides' to simply 'overrides'
    * Bit-width overrides must now be defined using the full parameter names - 'bits_activations' / 'bits_weights' / 'bits_bias' - instead of the short-hands 'acts' and 'wts' used so far (see the sketch after this list)
* Added/updated the relevant tests
* Modified all quantization YAMLs under 'examples' to reflect these changes
* Updated docs
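To illustrate the new format, an overrides dict might now look like the sketch below. The layer-name patterns, bit-widths and the extra keyword argument are hypothetical; only the full 'bits_*' key names and the pass-through of non bit-width entries come from this change.

```python
from collections import OrderedDict

overrides = OrderedDict([
    # Full parameter names replace the old 'acts' / 'wts' short-hands
    ('conv1', {'bits_activations': 8, 'bits_weights': 8, 'bits_bias': 32}),
    # Non bit-width entries are matched against the registered replace_fn's
    # keyword arguments and passed through to it (keyword name hypothetical)
    ('fc.*', {'bits_activations': 8, 'bits_weights': 4, 'clip_acts': True}),
])
```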
- Feb 11, 2019
Guy Jacob authored
Summary of changes:
1. Post-train quantization based on pre-collected statistics (see the sketch after this list)
2. Quantized concat, element-wise addition / multiplication and embeddings
3. Move post-train quantization command line args out of the sample code
4. Configure post-train quantization from YAML for more fine-grained control
(See PR #136 for more detailed change descriptions)
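As a rough illustration of what "pre-collected statistics" means here (a generic sketch, not Distiller's collector API): run a calibration set through the FP32 model, record per-layer activation ranges via forward hooks, and later derive quantization parameters offline from the saved ranges.

```python
import torch

def collect_minmax_stats(model, calib_loader):
    """Record per-layer activation min/max over a calibration set (generic sketch)."""
    stats, handles = {}, []

    def make_hook(name):
        def hook(module, inputs, output):
            rec = stats.setdefault(name, {'min': float('inf'), 'max': float('-inf')})
            rec['min'] = min(rec['min'], output.min().item())
            rec['max'] = max(rec['max'], output.max().item())
        return hook

    for name, module in model.named_modules():
        if not list(module.children()):          # leaf modules only
            handles.append(module.register_forward_hook(make_hook(name)))

    model.eval()
    with torch.no_grad():
        for images, _ in calib_loader:
            model(images)

    for h in handles:
        h.remove()
    return stats
```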
- Jan 23, 2019
Guy Jacob authored
- Dec 04, 2018
Guy Jacob authored
* Asymmetric post-training quantization (only symmetric was supported until now; see the sketch after this list)
* Quantization-aware training for range-based (min-max) symmetric and asymmetric quantization
* Per-channel quantization support in both training and post-training
* Added tests and examples
* Updated documentation
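For reference, range-based asymmetric linear quantization (the scheme added here) maps the observed [min, max] of a tensor onto an unsigned integer grid via a scale and zero-point. The helpers below are a generic illustration of that math, not the actual Distiller routines.

```python
import torch

def asymmetric_quantize(x, num_bits=8):
    """Generic min/max (asymmetric) linear quantization sketch."""
    qmin, qmax = 0, 2 ** num_bits - 1
    x_min, x_max = x.min(), x.max()
    scale = (x_max - x_min).clamp(min=1e-8) / (qmax - qmin)
    zero_point = torch.round(qmin - x_min / scale)
    q = torch.clamp(torch.round(x / scale + zero_point), qmin, qmax)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Map integer values back to (approximate) real values."""
    return (q - zero_point) * scale
```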
- Jul 22, 2018
Gal Novik authored
* Adding the PACT quantization method (see the sketch after this list)
* Move the logic that modifies the optimizer to account for changes the quantizer makes into the Quantizer itself
* Updated documentation and tests
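For context, the core PACT idea is an activation clipping level alpha that is learned together with the weights, which is also why the quantizer now has to adjust the optimizer: alpha is a new trainable parameter. The module below is a minimal, illustrative sketch rather than Distiller's PACT implementation; the straight-through estimator for the rounding step is omitted.

```python
import torch
import torch.nn as nn

class PACTActivation(nn.Module):
    """Minimal PACT-style activation: clip to [0, alpha] (learnable), then quantize."""
    def __init__(self, num_bits=4, init_alpha=6.0):
        super().__init__()
        self.num_bits = num_bits
        self.alpha = nn.Parameter(torch.tensor(init_alpha))  # learned by the optimizer

    def forward(self, x):
        # Clip to [0, alpha]: 0.5 * (|x| - |x - alpha| + alpha), as in the PACT paper
        y = 0.5 * (x.abs() - (x - self.alpha).abs() + self.alpha)
        # Linear quantization of the clipped range (no straight-through estimator here)
        scale = self.alpha / (2 ** self.num_bits - 1)
        return torch.round(y / scale) * scale
```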
- Jul 17, 2018
Guy Jacob authored
* Add Quantizer unit tests
* Require 'bits_overrides' to be an OrderedDict to support overlapping patterns in a predictable manner, and update the documentation to reflect this (see the sketch after this list)
* Quantizer class cleanup:
  * Use "public" nn.Module APIs instead of protected attributes
  * Call the builtins setattr/getattr/delattr instead of the class special methods (__***__)
* Fix issues reported in #24
* Bug in RangeLinearQuantParamLayerWrapper - add an explicit override of pre_quantized_forward accepting a single input (#15)
* Add DoReFa test to full_flow_tests
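An illustrative sketch of why ordering matters (layer names and bit-widths below are hypothetical; the 'acts'/'wts' keys reflect the short-hands in use at the time): when regex patterns overlap, the order of the OrderedDict determines which entry takes effect for a given layer, whereas a plain dict (unordered before Python 3.7) would make the outcome unpredictable.

```python
from collections import OrderedDict

# Hypothetical overlapping patterns: 'conv1' also matches 'conv.*'.
# With an OrderedDict the precedence is explicit and reproducible.
bits_overrides = OrderedDict([
    ('conv1',  {'acts': 8, 'wts': 8}),   # specific layer listed first
    ('conv.*', {'acts': 4, 'wts': 4}),   # broader pattern afterwards
])
```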
- Jul 05, 2018
Robert Muchsel authored
- Jun 21, 2018
Guy Jacob authored
- Apr 24, 2018
Neta Zmora authored