Enable weights/activations-only PTQ for conv/linear modules (#439)

* Weights-only PTQ:
  * Allow `RangeLinearQuantWrapper` to accept `num_bits_acts = None`, in which case it acts as a simple pass-through during forward.
  * In `RangeLinearQuantParamLayerWrapper`, if `bits_activations` is None and `num_bits_params > 0`, perform quant and de-quant of the parameters instead of just quant.
* Activations-only PTQ:
  * Enable activations-only quantization for conv/linear modules. When `PostTrainLinearQuantizer` detects # bits != None for activations and # bits == None for weights, a fake-quantization wrapper is used.
* Allow passing 0 in the `--qe-bits-acts` and `--qe-bits-wts` command line arguments to invoke weights-only or activations-only quantization, respectively.
* Minor refactoring for clarity in `PostTrainLinearQuantizer`'s replace_* functions.
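The "quant and de-quant" behavior described above is commonly called fake quantization: the tensor is rounded to the integer grid and immediately mapped back to float, so the model keeps running in floating point but carries the quantization error. A minimal sketch of that idea (symmetric, per-tensor; `fake_quantize` is an illustrative helper, not Distiller's actual API, which also supports asymmetric modes and per-channel scales):

```python
import torch

def fake_quantize(t: torch.Tensor, num_bits: int) -> torch.Tensor:
    """Symmetric linear quantization followed by immediate de-quantization.

    The result is still a float tensor, but its values lie on a
    (2 ** num_bits)-level grid, emulating the error of true quantization.
    """
    qmax = 2 ** (num_bits - 1) - 1          # e.g. 127 for 8 bits
    scale = t.abs().max() / qmax            # per-tensor scale factor
    q = torch.clamp(torch.round(t / scale), -qmax - 1, qmax)
    return q * scale                        # de-quantize back to float

# Weights-only PTQ: fake-quantize the parameters, leave activations alone.
w = torch.randn(8, 8)
w_q = fake_quantize(w, num_bits=8)
```

In a weights-only wrapper the forward pass uses `w_q` in place of `w`; in an activations-only wrapper the same transform is applied to the module's output instead.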
Changed files:

- distiller/quantization/range_linear.py: 125 additions, 73 deletions
- examples/quantization/post_train_quant/command_line.md: 2 additions, 2 deletions
- tests/test_post_train_quant.py: 61 additions, 0 deletions