    Convert Distiller PTQ models to "native" PyTorch PTQ (#458) · cdc1775f
    Guy Jacob authored
    
    * New API: distiller.quantization.convert_distiller_ptq_model_to_pytorch()
      (see the usage sketch after this list)
    * Can also be called from a PostTrainLinearQuantizer instance:
        quantizer.convert_to_pytorch()
    * Can also be triggered from the command line in the image classification sample
    * Can save/load converted modules via apputils.load/save_checkpoint
    * Added Jupyter notebook tutorial
    
    * Converted modules have only the absolutely necessary quant-dequant
      operations. For a fully quantized model, this means just quantization
      of model input and de-quantization of model output. If a user keeps
      specific internal layers in FP32, quant-dequant operations are added
      as needed
    * Can configure either the 'fbgemm' or 'qnnpack' backend. For 'fbgemm' we
      take care of preventing overflows (aka "reduce_range" in the PyTorch
      API)
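
    A minimal usage sketch of the conversion flow described above. This is
    hypothetical: argument names such as 'dummy_input' and 'backend', and the
    checkpoint arguments, are assumptions based on this description, not
    verified signatures.

        import torch
        import torchvision.models as models
        import distiller.apputils as apputils
        from distiller.quantization import PostTrainLinearQuantizer

        # Start from an FP32 model; quantizer configuration and stats
        # collection are abbreviated in this sketch.
        model = models.resnet50(pretrained=True)
        quantizer = PostTrainLinearQuantizer(model)
        dummy_input = torch.randn(1, 3, 224, 224)
        quantizer.prepare_model(dummy_input)

        # Convert the Distiller PTQ model to native PyTorch quantized modules.
        # Either 'fbgemm' (x86) or 'qnnpack' (ARM) could be selected; per the
        # notes above, 'fbgemm' conversion handles overflow prevention
        # ("reduce_range").
        pytorch_model = quantizer.convert_to_pytorch(dummy_input, backend='fbgemm')

        # Converted modules can be saved/loaded with the usual helpers.
        apputils.save_checkpoint(0, 'resnet50', pytorch_model, name='resnet50_ptq_pytorch')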