    Convert Distiller PTQ models to "native" PyTorch PTQ (#458) · cdc1775f
    Guy Jacob authored
    
    * New API: distiller.quantization.convert_distiller_ptq_model_to_pytorch()
      (see the usage sketch after this list)
    * Can also be called from a PostTrainLinearQuantizer instance:
        quantizer.convert_to_pytorch()
    * Can also be triggered from the command line in the image classification sample
    * Can save/load converted modules via apputils.load/save_checkpoint
    * Added Jupyter notebook tutorial
    
    * Converted modules have only the absolutely necessary quant-dequant
      operations. For a fully quantized model, this means just quantization
      of model input and de-quantization of model output. If a user keeps
      specific internal layers in FP32, quant-dequant operations are added
      as needed
    * Can configure either the 'fbgemm' or 'qnnpack' backend. For 'fbgemm' we
      take care of preventing overflows (aka "reduce_range" in the PyTorch
      API)
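
    A minimal usage sketch of the conversion flow described above. This is
    hypothetical: argument names such as 'dummy_input' and 'backend', and the
    checkpoint arguments, are assumptions based on this description, not
    verified signatures.

        import torch
        import torchvision.models as models
        import distiller.apputils as apputils
        from distiller.quantization import PostTrainLinearQuantizer

        # Start from an FP32 model; quantizer configuration and stats
        # collection are abbreviated in this sketch.
        model = models.resnet50(pretrained=True)
        quantizer = PostTrainLinearQuantizer(model)
        dummy_input = torch.randn(1, 3, 224, 224)
        quantizer.prepare_model(dummy_input)

        # Convert the Distiller PTQ model to native PyTorch quantized modules.
        # Either 'fbgemm' (x86) or 'qnnpack' (ARM) could be selected; per the
        # notes above, 'fbgemm' conversion handles overflow prevention
        # ("reduce_range").
        pytorch_model = quantizer.convert_to_pytorch(dummy_input, backend='fbgemm')

        # Converted modules can be saved/loaded with the usual helpers.
        apputils.save_checkpoint(0, 'resnet50', pytorch_model, name='resnet50_ptq_pytorch')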