  1. Aug 07, 2019
  2. Jul 10, 2019
      Update post-train quant command line example · 112163eb
      Guy Jacob authored
      Post-Train Quantization: BN folding and "net-aware quantization" (#313) · 43548deb
      Guy Jacob authored
      * "Net-aware quantization" - using the term coined in
        https://arxiv.org/abs/1811.09886. (section 3.2.2).
        Refers to considering sequences of modules when quantizing. This 
        isn't exactly layer fusion - we modify activation stats prior to
        setting quantization parameters, to make sure that when a module
        is followed by certain activation functions, only the relevant
        ranges are quantized. We do this for:
          * ReLU - Clip all negative values
          * Tanh / Sigmoid - Clip according to the (approximated) saturation
            values for these functions. We use [-4, 4] for tanh and [-6, 6]
            for sigmoid.
      
      * Perform batch-norm folding before post-training quantization.
        Batch-norm parameters are folded into the parameters of the previous
        layer and the BN layer is replaced with an identity module (see the
        folding sketch after this commit message).
      
      * Both BN folding and "net-aware" are now automatically executed
        in PostTrainLinearQuantizer (details of this change below)
      
      * BN folding is enabled by a new generic mechanism to "fuse" module
        sequences (at the Python API level)
          * First module in sequence is replaced/modified by a user-provided
            function, rest of modules replaced with nn.Identity
      
      * Quantizer changes:
        * Optionally create adjacency map during prepare_model
        * Subclasses may enforce adjacency map creation
        * Refactoring: Replace _prepare_model_impl with pre and post
          override-able "callbacks", so core functionality is always executed
      
      * PostTrainLinearQuantizer Changes:
        * Enforce creation of adjacency map. This means users must now pass a
          dummy input to PostTrainLinearQuantizer.prepare_model
        * Before module replacement - Apply BN folding and stats updates according
          to net-aware quantization
      
      * Updated the language model quantization tutorial to reflect the new
        functionality
      
      * Updated the image classification post-train quantization samples
        (command line and YAML)
      
      * Other changes:
        * Distiller LSTM implementation:
          Replace the ModuleList for cells with a plain list. The PyTorch trace
          mechanism doesn't "see" ModuleList objects; it only sees the
          contained modules. This means that the "scopeName" of these modules
          isn't complete, which makes it impossible to match op names in
          SummaryGraph to modules in the Python model.
        * ActivationStatsCollector: Ignore nn.Identity modules
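      The stat-clipping sketch referenced in the first bullet above - a minimal,
      illustrative version assuming a simple per-layer stats dict; the names here
      are assumptions, not Distiller's actual internals:

          # Hypothetical "net-aware" stat clipping: if a layer feeds a known
          # activation function, clip its recorded output range accordingly
          # before quantization parameters are derived from it.
          SATURATION_RANGES = {
              'relu':    (0.0, None),   # clip all negative values
              'tanh':    (-4.0, 4.0),   # approximate tanh saturation range
              'sigmoid': (-6.0, 6.0),   # approximate sigmoid saturation range
          }

          def clip_output_stats(stats, successor):
              """Clip a layer's recorded output min/max if it feeds relu/tanh/sigmoid."""
              if successor not in SATURATION_RANGES:
                  return stats
              lo, hi = SATURATION_RANGES[successor]
              if lo is not None:
                  stats['min'] = max(stats['min'], lo)
              if hi is not None:
                  stats['max'] = min(stats['max'], hi)
              return stats

          # A conv layer followed by ReLU: only the non-negative range is kept.
          print(clip_output_stats({'min': -2.7, 'max': 5.3}, 'relu'))  # {'min': 0.0, 'max': 5.3}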
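      The batch-norm folding sketch referenced above: the BN parameters are merged
      into the preceding layer and the BN module is then swapped for nn.Identity,
      which is also what the generic "fusion" mechanism produces. This is a
      simplified, hedged sketch, not Distiller's implementation (it assumes the
      layer feeds the BN directly):

          import torch
          import torch.nn as nn

          def fold_batch_norm(layer, bn):
              """Fold BatchNorm running stats and affine params into a Conv/Linear layer."""
              scale = bn.weight / torch.sqrt(bn.running_var + bn.eps)   # per output channel
              with torch.no_grad():
                  layer.weight.mul_(scale.reshape(-1, *([1] * (layer.weight.dim() - 1))))
                  old_bias = layer.bias if layer.bias is not None else torch.zeros_like(bn.running_mean)
                  new_bias = (old_bias - bn.running_mean) * scale + bn.bias
                  if layer.bias is None:
                      layer.bias = nn.Parameter(new_bias)
                  else:
                      layer.bias.copy_(new_bias)
              return layer

          # "Fusing" the sequence: the first module is modified by the user-provided
          # function above, the rest of the sequence is replaced with nn.Identity.
          model = nn.Sequential(nn.Conv2d(3, 8, 3, bias=False), nn.BatchNorm2d(8)).eval()
          model[0] = fold_batch_norm(model[0], model[1])
          model[1] = nn.Identity()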
  3. Jun 03, 2019
      [Breaking] PTQ: Removed special handling of clipping overrides · 3cde6c5e
      Lev Zlotnik authored
      * In PostTrainLinearQuantizer - moved 'clip_acts' and 'clip_n_stds'
        to overrides, removed the 'no_clip_layers' parameter from __init__
        (see the sketch below)
      * The 'no_clip_layers' command line argument REMAINS, handled in 
        PostTrainLinearQuantizer.from_args()
      * Removed old code from comments, fixed warnings in 
        test_post_train_quant.py
      * Updated tests
      * Update post-train quant sample YAML
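      The sketch referenced above: a hedged illustration of per-layer clipping
      settings expressed through the generic overrides dict. The layer names, the
      value strings, and the commented-out constructor call are assumptions, not
      the verified Distiller API - the sample YAML updated in this commit is the
      authoritative reference:

          # Illustrative only - check Distiller's docs/sample YAMLs for the exact keys.
          overrides = {
              'fc1':     {'clip_acts': 'AVG'},                       # clip this layer's activations
              'fc2':     {'clip_acts': 'N_STD', 'clip_n_stds': 3},   # clip at mean + 3*std
              'softmax': {'clip_acts': 'NONE'},                      # what 'no_clip_layers' used to express
          }
          # quantizer = PostTrainLinearQuantizer(model, clip_acts='AVG', overrides=overrides)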
  4. May 19, 2019
  5. Apr 14, 2019
      Post-train quant: Extend acts clipping functionality (#225) · 437e270b
      Guy Jacob authored
      * Some refactoring to enable multiple clipping methods
      * BREAKING: clip_acts as a boolean flag (either on the command line
        or in a function signature) will now fail. An error message listing
        the valid values is displayed.
      * Implemented clipping activations at mean + N * std
        (N is user configurable; sketched below)
      * Additional tests
      * Updated docs
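      A small sketch of the mean + N*std clipping mode mentioned above, assuming
      per-tensor stats with 'min'/'max'/'mean'/'std' entries (illustrative, not
      Distiller's code):

          def n_std_clip_range(stats, n_stds):
              """Limit the quantization range to mean +/- n_stds standard deviations."""
              low = max(stats['min'], stats['mean'] - n_stds * stats['std'])
              high = min(stats['max'], stats['mean'] + n_stds * stats['std'])
              return low, high

          acts = {'min': -11.2, 'max': 14.9, 'mean': 0.3, 'std': 2.1}
          print(n_std_clip_range(acts, n_stds=2))   # roughly (-3.9, 4.5): outliers are clipped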
  6. Apr 01, 2019
      Quantizer: Specify # bias bits + custom overrides (BREAKING) (#178) · 5271625a
      Lev Zlotnik authored
      * Bias handling:
        * Add 'bits_bias' parameter to explicitly specify # of bits for bias,
          similar to weights and activations.
        * BREAKING: Remove the now redundant 'quantize_bias' boolean parameter
      * Custom overrides:
        * Expand the semantics of the overrides dict to allow overriding of
          other parameters in addition to bit-widths (see the example after
          this list)
        * Functions registered in the quantizer's 'replacement_factory' can
          define keyword arguments. Non-bit-width entries in the overrides
          dict will be checked against the function signature and passed through
        * BREAKING:
          * Changed the name of 'bits_overrides' to simply 'overrides'
          * Bit-width overrides must now be defined using the full parameter
            names - 'bits_activations/weights/bias' instead of the short-hands
            'acts' and 'wts' which were used so far.
        * Added/updated relevant tests
        * Modified all quantization YAMLs under 'examples' to reflect 
          these changes
        * Updated docs
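      The example referenced above: the expanded overrides dict. The keys
      'bits_activations' / 'bits_weights' / 'bits_bias' come from this commit's
      description, while the layer names and the extra keyword entry are purely
      illustrative:

          overrides = {
              'conv1': {
                  'bits_weights': 4,       # previously 'wts: 4' under 'bits_overrides'
                  'bits_activations': 8,   # previously 'acts: 8'
                  'bits_bias': 16,
              },
              'classifier': {
                  # A non-bit-width entry: matched against the keyword arguments of the
                  # function registered in 'replacement_factory' and passed through to it.
                  'some_replacement_kwarg': True,
              },
          }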
  7. Mar 27, 2019
  8. Feb 11, 2019
      Post-train quant based on stats + additional modules quantized (#136) · 28a8ee18
      Guy Jacob authored
      Summary of changes:
      (1) Post-train quantization based on pre-collected statistics
      (2) Quantized concat, element-wise addition / multiplication and embeddings
      (3) Move post-train quantization command line args out of sample code
      (4) Configure post-train quantization from YAML for more fine-grained control
      
      (See PR #136 for more detailed changes descriptions)
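      A conceptual sketch of point (1): with pre-collected statistics, quantization
      parameters are derived offline from the recorded ranges instead of being
      measured on the fly at inference time. Names and file layout here are
      illustrative, not Distiller's stats format:

          # e.g. loaded from the YAML file produced by a calibration run
          collected_stats = {
              'features.conv1': {
                  'output': {'min': -1.8, 'max': 7.4, 'mean': 0.6, 'std': 1.1},
              },
          }

          def scale_from_stats(layer_stats, num_bits=8):
              """Derive an asymmetric scale factor from a layer's recorded output range."""
              rng = layer_stats['output']['max'] - layer_stats['output']['min']
              return rng / (2 ** num_bits - 1)

          print(scale_from_stats(collected_stats['features.conv1']))   # ~0.036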
  9. Dec 04, 2018
      Range-Based Linear Quantization Features (#95) · 907a6f04
      Guy Jacob authored
      * Asymmetric post-training quantization (only symmetric was supported until now)
      * Quantization-aware training for range-based (min-max) symmetric and asymmetric quantization
      * Per-channel quantization support in both training and post-training (see the sketch below)
      * Added tests and examples
      * Updated documentation
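      The sketch referenced above: range-based asymmetric quantization with optional
      per-channel ranges. This is an illustration of the technique, not Distiller's
      implementation:

          import torch

          def quantize_asymmetric(x, num_bits=8, per_channel=False):
              """Map x to unsigned integers using a scale and zero-point derived from its range."""
              if per_channel:
                  dims = tuple(range(1, x.dim()))
                  x_min = x.amin(dim=dims, keepdim=True)
                  x_max = x.amax(dim=dims, keepdim=True)
              else:
                  x_min, x_max = x.min(), x.max()
              qmin, qmax = 0, 2 ** num_bits - 1
              scale = (x_max - x_min) / (qmax - qmin)
              zero_point = torch.round(qmin - x_min / scale)
              q = torch.clamp(torch.round(x / scale + zero_point), qmin, qmax)
              return q, scale, zero_point

          w = torch.randn(8, 3, 3, 3)                                # e.g. conv weights
          q, scale, zp = quantize_asymmetric(w, per_channel=True)    # one scale/zero-point per filter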
  10. Dec 02, 2018
  11. Jul 22, 2018
      PACT quantizer (#30) · df9a00ce
      Gal Novik authored
      * Adding the PACT quantization method (see the sketch below)
      * Move the logic that modifies the optimizer, needed because of changes the quantizer makes to the model, into the Quantizer itself
      * Updated documentation and tests
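      A minimal sketch of the PACT idea (a learnable clipping level for activations);
      this illustrates the method from the PACT paper rather than Distiller's
      implementation:

          import torch
          import torch.nn as nn

          class PACTClip(nn.Module):
              """Clamp activations to [0, alpha], where alpha is learned during training."""
              def __init__(self, init_alpha=6.0):
                  super().__init__()
                  # alpha is a new trainable parameter - the reason the quantizer now has
                  # to take care of updating the optimizer itself (second bullet above)
                  self.alpha = nn.Parameter(torch.tensor(init_alpha))

              def forward(self, x):
                  # equivalent to clamp(x, 0, alpha); quantization of [0, alpha] would follow
                  return torch.minimum(torch.relu(x), self.alpha)

          model = nn.Sequential(nn.Linear(16, 32), PACTClip())
          optimizer = torch.optim.SGD(model.parameters(), lr=0.1)   # alpha is trained alongside the weights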
  12. Jun 21, 2018