- Aug 07, 2019
Guy Jacob authored
- Jul 23, 2019
Lev Zlotnik authored
And use it when calling prepare_model when loading from a checkpoint
- Jul 10, 2019
Guy Jacob authored
* "Net-aware quantization" - using the term coined in https://arxiv.org/abs/1811.09886. (section 3.2.2). Refers to considering sequences of modules when quantizing. This isn't exactly layer fusion - we modify activation stats prior to setting quantization parameters, to make sure that when a module is followed by certain activation functions, only the relevant ranges are quantized. We do this for: * ReLU - Clip all negative values * Tanh / Sigmoid - Clip according to the (approximated) saturation values for these functions. We use [-4, 4] for tanh and [-6, 6] for sigmoid. * Perform batch-norm folding before post-training quantization. Batch-norm parameters are folded into the parameters of the previous layer and the BN layer is replaced with an identity module. * Both BN folding and "net-aware" are now automatically executed in PostTrainLinearQuantizer (details of this change below) * BN folding enabled by new generic mechanism to "fuse" module sequences (at the Python API level) * First module in sequence is replaced/modified by a user-provided function, rest of moudles replaced with nn.Identity * Quantizer changes: * Optionally create adjacency map during prepare_model * Subclasses may enforce adjacency map creation * Refatcoring: Replace _prepare_model_impl with pre and post override-able "callbacks", so core functionality is always executed * PostTrainLinearQuantizer Changes: * Enforce creation of adjacency map. This means users must now pass a dummy input to PostTrainLinearQuantizer.prepare_model * Before module replacement - Apply BN folding and stats updates according to net-aware quantization * Updated the language model quantization tutorial to reflect the new functionality * Updated the image classification post-train quantization samples (command line and YAML) * Other changes: * Distller LSTM implementation: Replace the ModuleList for cells with a plain list. The PyTorch trace mechanism doesn't "see" ModuleList objects, it only sees the contained modules. This means that the "scopeName" of these modules isn't complete, which makes it impossible to match op names in SummaryGraph to modules in the Python model. * ActivationStatsCollector: Ignore nn.Identity modules
- May 27, 2019
Lev Zlotnik authored
* Fixed a bug where a shared module which was supposed to be skipped wasn't skipped on the second reference
* Added tests for the bug fix
- May 20, 2019
Guy Jacob authored
Lev Zlotnik authored
* Made quantizer.replacement_factory a defaultdict and removed the 'except KeyError' in pre_process_container (see the sketch below)
* Added explanations and a type hint for replace_fn
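A conceptual sketch of what this amounts to (not the actual Distiller code): a defaultdict that returns None for unregistered module types lets the container traversal test the lookup result instead of wrapping it in try/except KeyError. The registered replace function below is hypothetical.

```python
from collections import defaultdict
import torch.nn as nn

def _replace_relu(module, name):
    # Hypothetical replace_fn registered for nn.ReLU
    return nn.Identity()

replacement_factory = defaultdict(lambda: None)  # unregistered types map to None
replacement_factory[nn.ReLU] = _replace_relu

container = nn.Sequential(nn.Linear(4, 4), nn.ReLU())
for name, module in container.named_children():
    replace_fn = replacement_factory[type(module)]
    if replace_fn is not None:            # no 'except KeyError' needed anymore
        setattr(container, name, replace_fn(module, name))
```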
- May 19, 2019
Guy Jacob authored
- May 02, 2019
Lev Zlotnik authored
- Apr 01, 2019
Lev Zlotnik authored
* Bias handling:
  * Add a 'bits_bias' parameter to explicitly specify the number of bits for the bias, similar to weights and activations
  * BREAKING: Remove the now redundant 'quantize_bias' boolean parameter
* Custom overrides:
  * Expand the semantics of the overrides dict to allow overriding of other parameters in addition to bit-widths
  * Functions registered in the quantizer's 'replacement_factory' can define keyword arguments. Non bit-width entries in the overrides dict are checked against the function signature and passed through
  * BREAKING:
    * Changed the name of 'bits_overrides' to simply 'overrides'
    * Bit-width overrides must now be defined using the full parameter names - 'bits_activations' / 'bits_weights' / 'bits_bias' - instead of the short-hands 'acts' and 'wts' used so far (see the sketch after this list)
* Added/updated the relevant tests
* Modified all quantization YAMLs under 'examples' to reflect these changes
* Updated docs
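To illustrate the new format, an overrides dict might now look like the sketch below. The layer-name patterns, bit-widths and the extra keyword argument are hypothetical; only the full 'bits_*' key names and the pass-through of non bit-width entries come from this change.

```python
from collections import OrderedDict

overrides = OrderedDict([
    # Full parameter names replace the old 'acts' / 'wts' short-hands
    ('conv1', {'bits_activations': 8, 'bits_weights': 8, 'bits_bias': 32}),
    # Non bit-width entries are matched against the registered replace_fn's
    # keyword arguments and passed through to it (keyword name hypothetical)
    ('fc.*', {'bits_activations': 8, 'bits_weights': 4, 'clip_acts': True}),
])
```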
- Feb 11, 2019
Guy Jacob authored
Summary of changes:
1. Post-train quantization based on pre-collected statistics (see the sketch after this list)
2. Quantized concat, element-wise addition / multiplication and embeddings
3. Move post-train quantization command line args out of the sample code
4. Configure post-train quantization from YAML for more fine-grained control
(See PR #136 for more detailed change descriptions)
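As a rough illustration of what "pre-collected statistics" means here (a generic sketch, not Distiller's collector API): run a calibration set through the FP32 model, record per-layer activation ranges via forward hooks, and later derive quantization parameters offline from the saved ranges.

```python
import torch

def collect_minmax_stats(model, calib_loader):
    """Record per-layer activation min/max over a calibration set (generic sketch)."""
    stats, handles = {}, []

    def make_hook(name):
        def hook(module, inputs, output):
            rec = stats.setdefault(name, {'min': float('inf'), 'max': float('-inf')})
            rec['min'] = min(rec['min'], output.min().item())
            rec['max'] = max(rec['max'], output.max().item())
        return hook

    for name, module in model.named_modules():
        if not list(module.children()):          # leaf modules only
            handles.append(module.register_forward_hook(make_hook(name)))

    model.eval()
    with torch.no_grad():
        for images, _ in calib_loader:
            model(images)

    for h in handles:
        h.remove()
    return stats
```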
- Jan 23, 2019
Guy Jacob authored
- Dec 04, 2018
Guy Jacob authored
* Asymmetric post-training quantization (only symmetric was supported until now; see the sketch after this list)
* Quantization-aware training for range-based (min-max) symmetric and asymmetric quantization
* Per-channel quantization support in both training and post-training
* Added tests and examples
* Updated documentation
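For reference, range-based asymmetric linear quantization (the scheme added here) maps the observed [min, max] of a tensor onto an unsigned integer grid via a scale and zero-point. The helpers below are a generic illustration of that math, not the actual Distiller routines.

```python
import torch

def asymmetric_quantize(x, num_bits=8):
    """Generic min/max (asymmetric) linear quantization sketch."""
    qmin, qmax = 0, 2 ** num_bits - 1
    x_min, x_max = x.min(), x.max()
    scale = (x_max - x_min).clamp(min=1e-8) / (qmax - qmin)
    zero_point = torch.round(qmin - x_min / scale)
    q = torch.clamp(torch.round(x / scale + zero_point), qmin, qmax)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Map integer values back to (approximate) real values."""
    return (q - zero_point) * scale
```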
- Jul 22, 2018
Gal Novik authored
* Adding the PACT quantization method (see the sketch after this list)
* Move the logic that modifies the optimizer to account for changes the quantizer makes into the Quantizer itself
* Updated documentation and tests
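For context, the core PACT idea is an activation clipping level alpha that is learned together with the weights, which is also why the quantizer now has to adjust the optimizer: alpha is a new trainable parameter. The module below is a minimal, illustrative sketch rather than Distiller's PACT implementation; the straight-through estimator for the rounding step is omitted.

```python
import torch
import torch.nn as nn

class PACTActivation(nn.Module):
    """Minimal PACT-style activation: clip to [0, alpha] (learnable), then quantize."""
    def __init__(self, num_bits=4, init_alpha=6.0):
        super().__init__()
        self.num_bits = num_bits
        self.alpha = nn.Parameter(torch.tensor(init_alpha))  # learned by the optimizer

    def forward(self, x):
        # Clip to [0, alpha]: 0.5 * (|x| - |x - alpha| + alpha), as in the PACT paper
        y = 0.5 * (x.abs() - (x - self.alpha).abs() + self.alpha)
        # Linear quantization of the clipped range (no straight-through estimator here)
        scale = self.alpha / (2 ** self.num_bits - 1)
        return torch.round(y / scale) * scale
```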
- Jul 17, 2018
Guy Jacob authored
* Add Quantizer unit tests
* Require 'bits_overrides' to be an OrderedDict to support overlapping patterns in a predictable manner, and update the documentation to reflect this (see the sketch after this list)
* Quantizer class cleanup:
  * Use "public" nn.Module APIs instead of protected attributes
  * Call the builtins setattr/getattr/delattr instead of the class special methods (__***__)
* Fix issues reported in #24
* Bug in RangeLinearQuantParamLayerWrapper - add an explicit override of pre_quantized_forward accepting a single input (#15)
* Add DoReFa test to full_flow_tests
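An illustrative sketch of why ordering matters (layer names and bit-widths below are hypothetical; the 'acts'/'wts' keys reflect the short-hands in use at the time): when regex patterns overlap, the order of the OrderedDict determines which entry takes effect for a given layer, whereas a plain dict (unordered before Python 3.7) would make the outcome unpredictable.

```python
from collections import OrderedDict

# Hypothetical overlapping patterns: 'conv1' also matches 'conv.*'.
# With an OrderedDict the precedence is explicit and reproducible.
bits_overrides = OrderedDict([
    ('conv1',  {'acts': 8, 'wts': 8}),   # specific layer listed first
    ('conv.*', {'acts': 4, 'wts': 4}),   # broader pattern afterwards
])
```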
- Jul 05, 2018
Robert Muchsel authored
- Jun 21, 2018
Guy Jacob authored
- Apr 24, 2018
Neta Zmora authored