  Feb 06, 2020
    • Convert Distiller PTQ models to "native" PyTorch PTQ (#458) · cdc1775f
      Guy Jacob authored
      
      * New API: distiller.quantization.convert_distiller_ptq_model_to_pytorch()
      * Can also be called from a PostTrainLinearQuantizer instance:
          quantizer.convert_to_pytorch()
        (a usage sketch follows this list)
      * Can also be triggered from the command line in the image classification sample
      * Converted modules can be saved/loaded via apputils.load/save_checkpoint
      * Added a Jupyter notebook tutorial
      
      * Converted modules have only the absolutely necessary quant-dequant
        operations. For a fully quantized model, this means just quantization
        of model input and de-quantization of model output. If a user keeps
        specific internal layers in FP32, quant-dequant operations are added
        as needed
      * Can configure either 'fbgemm' or 'qnnpack' backend. For 'fbgemm' we
        take care of preventing overflows (aka "reduce_range" in the PyTorch
        API)
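      For illustration, here is a minimal usage sketch of the conversion flow
      described above. Only the names mentioned in this message
      (convert_distiller_ptq_model_to_pytorch, PostTrainLinearQuantizer,
      convert_to_pytorch, the 'fbgemm'/'qnnpack' backends) come from the
      commit; the argument names, the quantizer setup and the dummy-input
      shape are assumptions.

```python
# Hedged sketch, not an exact API reference: the 'backend' keyword and the
# prepare_model() argument shown here are assumptions.
import torch
import torchvision
from distiller.quantization import PostTrainLinearQuantizer

model = torchvision.models.resnet18()            # any FP32 model
quantizer = PostTrainLinearQuantizer(model)      # Distiller post-train quantizer
quantizer.prepare_model(torch.randn(1, 3, 224, 224))

# Convert the Distiller PTQ model to a "native" PyTorch quantized model.
# 'fbgemm' targets x86 (with overflow prevention, a.k.a. "reduce_range"),
# 'qnnpack' targets ARM.
pytorch_model = quantizer.convert_to_pytorch(backend='fbgemm')
```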
  Feb 02, 2020
    • Loss Aware Post Train Quantization search (#432) · 0b493fd3
      Lev Zlotnik authored
      "Loss Aware Post-Training Quantization" (Nahshan et al., 2019)
      
      Paper: https://arxiv.org/abs/1911.07190 
      Reference implementation:
        https://github.com/ynahshan/nn-quantization-pytorch/tree/master/lapq
      
      Proper documentation is still TODO; for now, see the example YAML file
      at 'examples/quantization/post_train_quant/resnet18_imagenet_post_train_lapq.yaml'
      
      * Implemented in distiller/quantization/ptq_coordinate_search.py
        (an illustrative sketch of the coordinate-search idea follows this list)
      * At the moment that file contains both the model-independent algorithm
        implementation and an image-classification-specific sample script.
        Still TODO: refactor that
      
      * Post train quantization changes (range_linear):
        * Added getters/setters for quantization parameters (scale/zero_point)
          and clipping values
        * Add option to save a backup of the FP32 weights to allow
          re-quantization after the quantizer was created.
        * Add option to clip weights in addition to activations
        * Fix fusions to not occur only when activations aren't quantized
        * RangeLinearFakeQuantWrapper:
          * Make input quantization optional
          * In case of ReLU + ACIQ, clip according to input stats
      
      * Data loaders:
        * Add option to not load train set at all from disk (to speed up
          loading time in post-training runs)
        * Modified "image_classifier.py" accordingly
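      As a rough illustration of the coordinate-search idea referenced in the
      list above (not of the actual ptq_coordinate_search.py implementation),
      the sketch below tunes per-tensor clipping values one coordinate at a
      time to minimize a calibration loss. All names in it are hypothetical.

```python
import torch

def fake_quant(x, clip_val, num_bits=8):
    # Symmetric, restricted-range fake-quantization clipped at +/- clip_val.
    q = 2 ** (num_bits - 1) - 1
    scale = clip_val / q
    return torch.clamp(torch.round(x / scale), -q, q) * scale

def coordinate_search(calib_loss_fn, clip_vals, candidate_grids):
    # Sweep each clipping value over its candidate grid, keep the best value,
    # and repeat until no single coordinate improves the calibration loss.
    # calib_loss_fn is expected to fake-quantize the model with the given
    # clipping values and return the task loss on a small calibration batch.
    best_loss = calib_loss_fn(clip_vals)
    improved = True
    while improved:
        improved = False
        for i, grid in enumerate(candidate_grids):
            for cand in grid:
                trial = list(clip_vals)
                trial[i] = cand
                loss = calib_loss_fn(trial)
                if loss < best_loss:
                    best_loss, clip_vals, improved = loss, trial, True
    return clip_vals, best_loss
```

      The getters/setters for scale/zero_point and clipping values listed
      above are presumably what make this kind of re-quantization loop
      possible on an already-created quantizer.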
  Jan 15, 2020
    • Fix scale factor calculation in symmetric quantization (#463) · 78255ee0
      Guy Jacob authored
      (we use 8-bit values below, but this applies to any bit-width)
      * We use the notion of "full" and "restricted" quantized range for
        symmetric quantization (see section 2.2 in https://arxiv.org/abs/1806.08342)
      * "Full" quantized range ==> [-128, 127], "restircted" ==> [-127, 127]
      * Until now, when doing symmetric quantization we assumed a "full"
        range when saturating after quantization, but calculated the scale
        factor as if the range was restricted. This means we weren't making
        full utilization of the quantized range.
      * On the other hand, in some other implementations of quantization (e.g.
        TensorFlow), the "restricted" range is used.
      * So, we make it an option to use either the proper "full" range
        (q_min = -128) or the "restricted" range (q_min = -127); a worked
        example follows this list.
      * LinearQuantMode.SYMMETRIC now means the "full" range is used, and
        added LinearQuantMode.SYMMETRIC_RESTRICTED for using the "restricted"
        range.
      * Updated tests and documentation.
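      A small worked example of the two options, under the (assumed here)
      convention that the quantized value is q = clamp(round(scale * x),
      q_min, 127):

```python
# 8-bit symmetric quantization of a tensor whose max absolute value is 2.54.
sat_val = 2.54

# "Full" range, q_min = -128 (LinearQuantMode.SYMMETRIC after this change):
scale_full = 128 / sat_val        # ~50.39

# "Restricted" range, q_min = -127 (LinearQuantMode.SYMMETRIC_RESTRICTED):
scale_restricted = 127 / sat_val  # ~50.00

# The bug being fixed: the scale was computed as in the restricted case
# (127 / sat_val) while values were still saturated to [-128, 127], so part
# of the quantized range went unused.
```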
  Dec 18, 2019
    • IFM sparsity collector (#443) · cc50035e
      Bar authored
      Add directionality to SummaryActivationStatsCollector to allow collection of statistics on incoming and outgoing activations/feature-maps, instead of just outgoing activations.
      
      Also includes some code refactoring.
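      A generic PyTorch sketch of the directionality idea (this is not the
      SummaryActivationStatsCollector API; the helper below is hypothetical):
      a forward hook sees both the incoming and the outgoing feature-maps, so
      sparsity can be summarized in either direction.

```python
import torch
import torch.nn as nn

def attach_sparsity_hooks(model, stats):
    # Record the fraction of zero elements in the incoming and outgoing
    # feature-maps of every Conv2d/Linear module.
    names = {m: n for n, m in model.named_modules()}

    def hook(module, inputs, output):
        entry = stats.setdefault(names[module], {})
        entry['input_sparsity'] = (inputs[0] == 0).float().mean().item()
        entry['output_sparsity'] = (output == 0).float().mean().item()

    return [m.register_forward_hook(hook)
            for m in model.modules()
            if isinstance(m, (nn.Conv2d, nn.Linear))]
```

      Usage: create `stats = {}`, attach the hooks, run a forward pass, then
      call `handle.remove()` on each returned handle.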
  Dec 08, 2019
    • Enable weights/activations-only PTQ for conv/linear modules (#439) · 952028d0
      Guy Jacob authored
      * Weights-only PTQ:
        * Allow RangeLinearQuantWrapper to accept num_bits_acts = None, in
          which case it'll act as a simple pass-through during forward
        * In RangeLinearQuantParamLayerWrapper, if bits_activations is None
          and num_bits_params > 0, perform quant and de-quant of the
          parameters instead of just quant (see the sketch after this list).
      * Activations-only PTQ:
        * Enable activations-only quantization for conv/linear modules. When
          PostTrainLinearQuantizer detects # bits != None for activations
          and # bits == None for weights, a fake-quantization wrapper will
          be used.
      * Allow passing 0 in the `--qe-bits-acts` and `--qe-bits-wts` command
        line arguments to invoke weights/activations-only quantization,
        respectively.
      * Minor refactoring for clarity in PostTrainLinearQuantizer's replace_*
        functions
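      A generic sketch of the two modes (hypothetical class names, not the
      actual wrapper classes): weights-only PTQ quantizes and de-quantizes the
      parameters once and then runs the layer in FP32, while activations-only
      PTQ leaves the weights untouched and fake-quantizes the layer's output.

```python
import torch
import torch.nn as nn

def fake_quant(t, num_bits):
    # Symmetric quantization followed immediately by de-quantization.
    q_max = 2 ** (num_bits - 1) - 1
    scale = t.abs().max() / q_max
    return torch.clamp(torch.round(t / scale), -q_max - 1, q_max) * scale

class WeightsOnlyPTQ(nn.Module):       # hypothetical; bits for acts is None
    def __init__(self, layer, num_bits_params=8):
        super().__init__()
        self.layer = layer
        with torch.no_grad():
            layer.weight.copy_(fake_quant(layer.weight, num_bits_params))

    def forward(self, x):
        return self.layer(x)           # activations pass through in FP32

class ActsOnlyPTQ(nn.Module):          # hypothetical; bits for weights is None
    def __init__(self, layer, num_bits_acts=8):
        super().__init__()
        self.layer, self.num_bits_acts = layer, num_bits_acts

    def forward(self, x):
        return fake_quant(self.layer(x), self.num_bits_acts)  # FP32 weights
```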
    • Update PTQ ResNet-50 command line results · 326d172f
      Guy Jacob authored
      Results changed following commit 9e7ef987 (#402)