  1. Feb 23, 2020
  2. Feb 09, 2020
  3. Feb 06, 2020
    • Convert Distiller PTQ models to "native" PyTorch PTQ (#458) · cdc1775f
      Guy Jacob authored
      
      * New API: distiller.quantization.convert_distiller_ptq_model_to_pytorch()
      * Can also be called from a PostTrainLinearQuantizer instance:
          quantizer.convert_to_pytorch()
        (see the usage sketch at the end of this entry)
      * Can also be triggered from the command line in the image classification sample
      * Can save/load converted modules via apputils.load/save_checkpoint
      * Added Jupyter notebook tutorial
      
      * Converted modules have only the absolutely necessary quant-dequant
        operations. For a fully quantized model, this means just quantization
        of model input and de-quantization of model output. If a user keeps
        specific internal layers in FP32, quant-dequant operations are added
        as needed
      * Can configure either 'fbgemm' or 'qnnpack' backend. For 'fbgemm' we
        take care of preventing overflows (aka "reduce_range" in the PyTorch
        API)
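
      A minimal usage sketch of the conversion flow described above (a sketch
      only: the exact arguments accepted by prepare_model() and
      convert_to_pytorch(), including the 'backend' keyword, are assumptions,
      not verified signatures):

          import torch
          import torchvision
          from distiller.quantization import PostTrainLinearQuantizer

          model = torchvision.models.resnet18(pretrained=False).eval()
          dummy_input = torch.randn(1, 3, 224, 224)

          quantizer = PostTrainLinearQuantizer(model)  # default 8-bit linear PTQ settings
          quantizer.prepare_model(dummy_input)         # wrap modules, compute quant params

          # Convert to "native" PyTorch PTQ modules; the backend keyword is an assumption
          native_model = quantizer.convert_to_pytorch(dummy_input, backend='fbgemm')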
  4. Feb 03, 2020
  5. Feb 02, 2020
    • Loss Aware Post Train Quantization search (#432) · 0b493fd3
      Lev Zlotnik authored
      "Loss Aware Post-Training Quantization" (Nahshan et al., 2019)
      
      Paper: https://arxiv.org/abs/1911.07190 
      Reference implementation:
        https://github.com/ynahshan/nn-quantization-pytorch/tree/master/lapq
      
      Proper documentation is still TODO; for now, see the example YAML file
      at 'examples/quantization/post_train_quant/resnet18_imagenet_post_train_lapq.yaml'
      
      * Implemented in distiller/quantization/ptq_coordinate_search.py
        (a generic sketch of the coordinate-search idea follows at the end of
        this entry)
      * At the moment that file contains both the model-independent algorithm
        implementation and the image-classification-specific sample script.
        Still TODO: refactor that
      
      * Post train quantization changes (range_linear):
        * Added getters/setters for quantization parameters (scale/zero_point)
          and clipping values
        * Add option to save a backup of the FP32 weights to allow
          re-quantization after the quantizer was created.
        * Add option to clip weights in addition to activations
        * Fix fusions to not occur only when activations aren't quantized
        * RangeLinearFakeQuantWrapper:
          * Make inputs quantization optional
          * In case of ReLU + ACIQ, clip according to input stats
      
      * Data loaders:
        * Add option to not load the train set from disk at all (to speed up
          loading time in post-training runs)
        * Modified "image_classifier.py" accordingly
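
      For reference, a simplified, generic sketch of the coordinate-search idea
      (not the actual ptq_coordinate_search.py implementation; set_clip and
      eval_loss are hypothetical callables standing in for the new
      getters/setters and a calibration-loss evaluation):

          import numpy as np

          def coordinate_search(layers, init_clips, set_clip, eval_loss, num_iters=3):
              # Greedy per-layer search: sweep one layer's clipping value while
              # the others stay fixed; keep whatever minimizes the calibration loss.
              clips = dict(init_clips)                   # layer name -> clip value
              for _ in range(num_iters):
                  for layer in layers:
                      candidates = np.linspace(0.5, 1.0, 11) * init_clips[layer]
                      best_clip, best_loss = clips[layer], float('inf')
                      for c in candidates:
                          set_clip(layer, c)             # re-quantize with clip c
                          loss = eval_loss()             # loss on a calibration batch
                          if loss < best_loss:
                              best_loss, best_clip = loss, c
                      clips[layer] = best_clip
                      set_clip(layer, best_clip)         # keep the best before moving on
              return clips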
  6. Jan 19, 2020
  7. Jan 18, 2020
  8. Jan 15, 2020
    • Fix scale factor calculation in symmetric quantization (#463) · 78255ee0
      Guy Jacob authored
      (we use 8-bit values below, but this applies to any bit-width)
      * We use the notion of "full" and "restricted" quantized range for
        symmetric quantization (see section 2.2 in https://arxiv.org/abs/1806.08342)
      * "Full" quantized range ==> [-128, 127], "restircted" ==> [-127, 127]
      * Until now, when doing symmetric quantization we assumed a "full"
        range when saturating after quantization, but calculated the scale
        factor as if the range was restricted. This means we weren't making
        full utilization of the quantized range.
      * On the other hand, in some other implementations of quantization (e.g.
        TensorFlow), the "restricted" range is used.
      * So, we make it an option to use either the proper "full" range
        (q_min = -128) or the "restricted" range (q_min = -127); a small
        numeric illustration follows at the end of this entry.
      * LinearQuantMode.SYMMETRIC now means the "full" range is used, and
        added LinearQuantMode.SYMMETRIC_RESTRICTED for using the "restricted"
        range.
      * Updated tests and documentation.
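
      A small numeric illustration of the difference (assuming the convention
      scale = saturation_value / q_max; the actual code may use the reciprocal):

          def symmetric_scale(sat_val, num_bits=8, restricted=False):
              # "full" range:       [-2^(b-1), 2^(b-1)-1]     -> divide by 2^(b-1)
              # "restricted" range: [-(2^(b-1)-1), 2^(b-1)-1] -> divide by 2^(b-1)-1
              n = (2 ** (num_bits - 1) - 1) if restricted else 2 ** (num_bits - 1)
              return sat_val / n

          sat = 6.0                                     # saturation value from stats
          print(symmetric_scale(sat))                   # full:       6.0 / 128
          print(symmetric_scale(sat, restricted=True))  # restricted: 6.0 / 127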
  9. Jan 06, 2020
  10. Dec 30, 2019
  11. Dec 29, 2019
  12. Dec 26, 2019
  13. Dec 18, 2019
    • IFM sparsity collector (#443) · cc50035e
      Bar authored
      Add directionality to SummaryActivationStatsCollector to allow collection of statistics on incoming and outgoing activations/feature-maps, instead of just outgoing activations (a generic hook-based sketch follows below).
      
      Also includes some code refactoring.
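
      A generic, hook-based sketch of what "directionality" means here (not
      Distiller's SummaryActivationStatsCollector; all names are illustrative):

          import torch

          def sparsity(t):
              return (t == 0).float().mean().item()    # fraction of zero elements

          def attach_sparsity_hook(module, stats, name, direction='output'):
              # Record sparsity of the incoming ("input") or outgoing ("output")
              # feature-map each time the module runs forward.
              def hook(mod, inputs, output):
                  t = inputs[0] if direction == 'input' else output
                  stats.setdefault(name, []).append(sparsity(t))
              return module.register_forward_hook(hook)

          stats = {}
          relu = torch.nn.ReLU()
          attach_sparsity_hook(relu, stats, 'relu', direction='output')
          relu(torch.randn(1, 8, 4, 4))
          print(stats['relu'])                         # roughly 0.5 for random input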
  14. Dec 12, 2019
  15. Dec 11, 2019
  16. Dec 09, 2019
  17. Dec 08, 2019
    • Enable weights/activations-only PTQ for conv/linear modules (#439) · 952028d0
      Guy Jacob authored
      * Weights-only PTQ (see the sketch at the end of this entry):
        * Allow RangeLinearQuantWrapper to accept num_bits_acts = None, in
          which case it'll act as a simple pass-through during forward
        * In RangeLinearQuantParamLayerWrapper, if bits_activations is None
          and num_bits_params > 0, perform quant and de-quant of the
          parameters instead of just quant.
      * Activations-only PTQ:
        * Enable activations-only quantization for conv/linear modules. When
          PostTrainLinearQuantizer detects # bits != None for activations
          and # bits == None for weights, a fake-quantization wrapper will
          be used.
      * Allow passing 0 in the `--qe-bits-acts` and `--qe-bits-wts` command
        line arguments to invoke weights/activations-only quantization,
        respectively.
      * Minor refactoring for clarity in PostTrainLinearQuantizer's replace_*
        functions
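
      A hedged sketch of the weights-only idea (not the actual
      RangeLinearQuantParamLayerWrapper): quantize and immediately de-quantize
      the weights, and leave the activations in FP32:

          import torch
          import torch.nn.functional as F

          def fake_quant_weights(w, num_bits=8):
              # Symmetric, per-tensor quant followed by immediate de-quant
              qmax = 2 ** (num_bits - 1) - 1
              scale = w.abs().max().clamp_min(1e-8) / qmax
              return torch.clamp(torch.round(w / scale), -qmax, qmax) * scale

          class WeightsOnlyLinear(torch.nn.Linear):
              # Activations pass through untouched; only the weights carry
              # quantization noise
              def forward(self, x):
                  return F.linear(x, fake_quant_weights(self.weight), self.bias)

          layer = WeightsOnlyLinear(16, 4)
          out = layer(torch.randn(2, 16))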
    • Update PTQ ResNet-50 command line results · 326d172f
      Guy Jacob authored
      Results changed following commit 9e7ef987 (#402)
  18. Dec 03, 2019
  19. Dec 02, 2019
  20. Nov 28, 2019
  21. Nov 27, 2019
  22. Nov 25, 2019
    • Resnet50 early-exit update · 8b341593
      Neta Zmora authored
      Update the definition of the exits using info from Haim.
      
      This is still very unsatisfactory because we don't have working
      examples to show users :-(
  23. Nov 17, 2019
  24. Nov 16, 2019
    • Add README.md files for APG and DropFilter · 70e26735
      Neta Zmora authored
    • Remove duplicate YAML file · 49933144
      Neta Zmora authored
    • Cifar models: remove explicit parameters initialization · 6913687f
      Neta Zmora authored
      Except for the case of VGG, our parameter initialization code matched
      the default PyTorch initialization (per torch.nn operation), so writing
      the initialization code ourselves only leads to more code and
      maintenance; we also would not benefit from improvements made at the
      PyTorch level (e.g. if FB finds a better initialization for nn.Conv2d
      than today's Kaiming init).
      The VGG initialization we had was "suspicious", so reverting to the
      default seems reasonable.
  25. Nov 14, 2019