  1. Apr 30, 2020
    • Knowledge distillation fixes (#503) · 32a7e4bf
      Guy Jacob authored
      Fixed two long-standing bugs in knowledge distillation:
       * Distillation loss needs to be scaled by T^2 (#122)
       * Use tensor.clone instead of new_tensor when caching student logits (#234)
      Updated example results and uploaded the script to generate them
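      As a minimal sketch (not Distiller's actual code) of the two fixes above:
      the soft-target loss is scaled by T^2 so its gradients keep the same
      magnitude as the hard-label loss when the temperature changes, and the
      cached logits use clone() because new_tensor() returns a copy that is
      detached from the autograd graph.

          import torch.nn.functional as F

          def soft_distillation_loss(student_logits, teacher_logits, T=4.0):
              # Distributions softened by temperature T
              log_p_student = F.log_softmax(student_logits / T, dim=1)
              p_teacher = F.softmax(teacher_logits / T, dim=1)
              # KL divergence, scaled by T^2 (the #122 fix)
              return F.kl_div(log_p_student, p_teacher,
                              reduction='batchmean') * (T ** 2)

          # The #234 fix, in spirit: clone() stays on the autograd graph,
          # whereas new_tensor() would silently detach the cached value.
          # cached_logits = student_logits.clone()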
  2. Apr 20, 2020
    • Add example code showing schedule specification using code. · 5b01a40c
      Neta Zmora authored
      This script shows how to specify a compression schedule directly
      through Distiller's API, instead of via a YAML specification.
      
      examples/scheduling_api/direct_api_pruning.py
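      A rough sketch of what direct-API scheduling looks like; the exact
      names and signatures below are assumptions and may differ from the
      actual example script.

          import distiller
          from distiller.pruning import AutomatedGradualPruner

          # Gradually prune two (hypothetical) layers from 5% to 85% sparsity.
          pruner = AutomatedGradualPruner(name='agp', initial_sparsity=0.05,
                                          final_sparsity=0.85,
                                          weights=['fc1.weight', 'fc2.weight'])
          scheduler = distiller.CompressionScheduler(model)
          scheduler.add_policy(distiller.PruningPolicy(pruner, pruner_args=None),
                               starting_epoch=0, ending_epoch=30, frequency=2)

          # The training loop drives the schedule through callbacks:
          for epoch in range(30):
              scheduler.on_epoch_begin(epoch)
              # ... train one epoch, calling scheduler.on_minibatch_begin()
              # and scheduler.on_minibatch_end() around each batch ...
              scheduler.on_epoch_end(epoch)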
    • small tensor masking API refactoring (#499) · 68514d17
      Neta Zmora authored
      Added masking primitives:
       - mask_tensor
       - create_mask_threshold_criterion
       - create_mask_level_criterion
       - create_mask_sensitivity_criterion
      
       These APIs have clearer names and communicate their
       responsibility better: create a tensor mask, based on
       some criterion.  Previously,
       distiller.pruning.create_mask_threshold_criterion was
       named distiller.threshold_mask, which did not communicate
       well what the function did.
       Masking functionality is no longer hidden inside the
       Pruner instances, so it can be used directly by an
       application or composed into new Pruner classes (see the
       usage sketch after this entry).
      
      Removed the file distiller.pruning.pruner:
       - The base class _ParameterPruner was useless and added
         needless detail to the implementation.
      
      AGP: Separated the pruning-rate schedule from the
       rest of the logic.  This allows us to mix-and-match different
       pruning-rate schedules (just like LR schedulers).
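      A small usage sketch of the new primitives; the keyword names
      (desired_sparsity, threshold) are assumptions, so check the actual
      signatures.

          import torch
          from distiller.pruning import (create_mask_level_criterion,
                                         create_mask_threshold_criterion,
                                         mask_tensor)

          weights = torch.randn(64, 128)

          # Mask computed so that 60% of the elements get zeroed out...
          mask = create_mask_level_criterion(weights, desired_sparsity=0.6)
          # ...or mask every element whose magnitude falls below a threshold:
          # mask = create_mask_threshold_criterion(weights, threshold=0.1)

          mask_tensor(weights, mask)  # apply the mask to the tensor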
  3. Feb 06, 2020
    • Convert Distiller PTQ models to "native" PyTorch PTQ (#458) · cdc1775f
      Guy Jacob authored
      
      * New API: distiller.quantization.convert_distiller_ptq_model_to_pytorch()
      * Can also be called from PostTrainLinearQuantizer instance:
          quantizer.convert_to_pytorch()
      * Can also be triggered from the command line in the image
        classification sample
      * Can save/load converted modules via apputils.load/save_checkpoint
      * Added Jupyter notebook tutorial
      
      * Converted modules have only the absolutely necessary quant-dequant
        operations. For a fully quantized model, this means just quantization
        of model input and de-quantization of model output. If a user keeps
        specific internal layers in FP32, quant-dequant operations are added
        as needed
      * Can configure either 'fbgemm' or 'qnnpack' backend. For 'fbgemm' we
        take care of preventing overflows (aka "reduce_range" in the PyTorch
        API)
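      Usage, roughly (the dummy-input and backend arguments are assumptions;
      the notebook tutorial has the authoritative call):

          import torch

          # `quantizer` is a PostTrainLinearQuantizer that has already
          # quantized its model.
          dummy_input = torch.randn(1, 3, 224, 224)
          pytorch_model = quantizer.convert_to_pytorch(dummy_input,
                                                       backend='fbgemm')

          # Or through the module-level API named above:
          # from distiller.quantization import (
          #     convert_distiller_ptq_model_to_pytorch)
          # pytorch_model = convert_distiller_ptq_model_to_pytorch(
          #     quantizer.model, dummy_input, backend='fbgemm')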
  4. Feb 02, 2020
    • Loss Aware Post Train Quantization search (#432) · 0b493fd3
      Lev Zlotnik authored
      "Loss Aware Post-Training Quantization" (Nahshan et al., 2019)
      
      Paper: https://arxiv.org/abs/1911.07190 
      Reference implementation:
        https://github.com/ynahshan/nn-quantization-pytorch/tree/master/lapq
      
      Proper documentation is still TODO; for now, see the example YAML file
      at 'examples/quantization/post_train_quant/resnet18_imagenet_post_train_lapq.yaml'
      
      * Implemented in distiller/quantization/ptq_coordinate_search.py
      * At the moment that file contains both the model-independent
        algorithm implementation and an image-classification-specific
        sample script. Still TODO: refactor that
      
      * Post train quantization changes (range_linear):
        * Added getters/setters for quantization parameters (scale/zero_point)
          and clipping values
        * Add option to save a backup of the FP32 weights, to allow
          re-quantization after the quantizer was created
        * Add option to clip weights in addition to activations
        * Fix fusions so they don't occur when activations aren't quantized
        * RangeLinearFakeQuantWrapper:
          * Make inputs quantization optional
          * In case of ReLU + ACIQ, clip according to input stats
      
      * Data loaders:
        * Add option to not load train set at all from disk (to speed up
          loading time in post-training runs)
        * Modified "image_classifier.py" accordingly
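      The gist of the search, as a generic sketch rather than the actual
      ptq_coordinate_search.py implementation: treat the per-layer clipping
      values as one parameter vector and minimize the model's loss on a
      calibration set over a single coordinate at a time.

          import numpy as np
          from scipy.optimize import minimize_scalar

          def coordinate_search(eval_loss, init_params, n_rounds=3):
              # eval_loss maps a vector of per-layer clipping values to the
              # model's loss on a calibration set.
              params = np.asarray(init_params, dtype=float).copy()
              for _ in range(n_rounds):
                  for i in range(len(params)):
                      def objective(x, i=i):
                          trial = params.copy()
                          trial[i] = x
                          return eval_loss(trial)
                      # Heuristic search bounds around the current value
                      res = minimize_scalar(objective,
                                            bounds=(1e-3, 2.0 * params[i]),
                                            method='bounded')
                      params[i] = res.x
              return params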
  5. Jan 15, 2020
    • Fix scale factor calculation in symmetric quantization (#463) · 78255ee0
      Guy Jacob authored
      (we use 8-bit values below, but this applies to any bit-width)
      * We use the notion of "full" and "restricted" quantized range for
        symmetric quantization (see section 2.2 in https://arxiv.org/abs/1806.08342)
      * "Full" quantized range ==> [-128, 127], "restircted" ==> [-127, 127]
      * Until now, when doing symmetric quantization we assumed a "full"
        range when saturating after quantization, but calculated the scale
        factor as if the range was restricted. This means we weren't making
        full utilization of the quantized range.
      * On the other hand, in some other implementations of quantization (e.g.
        TensorFlow), the "restricted" range is used.
      * So, we make it an option to use either the proper "full" range
        (q_min = -128) or the "restricted" range (q_min = -127).
      * LinearQuantMode.SYMMETRIC now means the "full" range is used, and
        added LinearQuantMode.SYMMETRIC_RESTRICTED for using the "restricted"
        range.
      * Updated tests and documentation.
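      Numerically, with 8 bits and a saturation value of 2.0 (an
      illustration, not Distiller's exact function):

          def symmetric_scale(sat_val, num_bits=8, restricted=False):
              # full range:       [-2^(n-1), 2^(n-1) - 1],      e.g. [-128, 127]
              # restricted range: [-(2^(n-1) - 1), 2^(n-1) - 1], e.g. [-127, 127]
              q_abs_max = 2 ** (num_bits - 1) - (1 if restricted else 0)
              return q_abs_max / sat_val

          print(symmetric_scale(2.0))                   # full:       64.0
          print(symmetric_scale(2.0, restricted=True))  # restricted: 63.5

      The old behavior mixed the two: it saturated to [-128, 127] but used
      the 63.5 scale, so the -128 slot was never reached.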