Lev Zlotnik authored
"Loss Aware Post-Training Quantization" (Nahshan et al., 2019) Paper: https://arxiv.org/abs/1911.07190 Reference implementation: https://github.com/ynahshan/nn-quantization-pytorch/tree/master/lapq Proper documentation is still TODO, for now see the example YAML file at 'examples/quantization/post_train_quant/resnet18_imagenet_post_train_lapq.yaml' * Implemented in distiller/quantization/ptq_coordinate_search.py * At the moment that file both the model-independent algorithm implementation and image-classification specific sample script. Still TODO: Refactor that * Post train quantization changes (range_linear): * Added getters/setters for quantization parameters (scale/zero_point) and clipping values * Add option to save backup of FP32 weights to allow re-quantization after quantizer was created. * Add option to clip weights in addition to activations * Fix fusions to not occur only when activations aren't quantized * RangeLinearFakeQuantWrapper: * Make inputs quantization optional * In case of ReLU + ACIQ, clip according to input stats * Data loaders: * Add option to not load train set at all from disk (to speed up loading time in post-training runs) * Modified "image_classifier.py" accordingly
Lev Zlotnik authored"Loss Aware Post-Training Quantization" (Nahshan et al., 2019) Paper: https://arxiv.org/abs/1911.07190 Reference implementation: https://github.com/ynahshan/nn-quantization-pytorch/tree/master/lapq Proper documentation is still TODO, for now see the example YAML file at 'examples/quantization/post_train_quant/resnet18_imagenet_post_train_lapq.yaml' * Implemented in distiller/quantization/ptq_coordinate_search.py * At the moment that file both the model-independent algorithm implementation and image-classification specific sample script. Still TODO: Refactor that * Post train quantization changes (range_linear): * Added getters/setters for quantization parameters (scale/zero_point) and clipping values * Add option to save backup of FP32 weights to allow re-quantization after quantizer was created. * Add option to clip weights in addition to activations * Fix fusions to not occur only when activations aren't quantized * RangeLinearFakeQuantWrapper: * Make inputs quantization optional * In case of ReLU + ACIQ, clip according to input stats * Data loaders: * Add option to not load train set at all from disk (to speed up loading time in post-training runs) * Modified "image_classifier.py" accordingly