Enable weights/activations-only PTQ for conv/linear modules (#439)
* Weights-only PTQ:
* Allow RangeLinearQuantWrapper to accept num_bits_acts = None, in
which case it acts as a simple pass-through during forward
* In RangeLinearQuantParamLayerWrapper, if bits_activations is None
and num_bits_params > 0, perform quant and de-quant of the
parameters instead of just quant
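The quant/de-quant step described above can be sketched as follows. This is a minimal illustration of symmetric linear fake-quantization, not Distiller's actual implementation (which lives in the wrapper classes named above): parameters are quantized to signed integers and immediately de-quantized, so the forward pass still runs in floating point but sees the quantization error.

```python
def quant_dequant(values, num_bits):
    """Symmetric linear fake-quantization of a list of floats (sketch)."""
    qmax = 2 ** (num_bits - 1) - 1                # e.g. 127 for 8 bits
    max_abs = max(abs(v) for v in values) or 1.0  # avoid divide-by-zero
    scale = qmax / max_abs
    # Quantize: scale, round, clamp to the signed integer range.
    q = [max(-qmax - 1, min(qmax, round(v * scale))) for v in values]
    # De-quantize: map back to floats in the original range.
    return [x / scale for x in q]
```

With 8 bits the round-trip error per element is bounded by roughly half a quantization step, which is what makes quant-then-de-quant a useful stand-in for true integer execution during evaluation.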
* Activations-only PTQ:
* Enable activations-only quantization for conv/linear modules. When
PostTrainLinearQuantizer detects # bits != None for activations
and # bits == None for weights, a fake-quantization wrapper will
be used
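The activations-only case can be sketched as a wrapper that leaves the float module and its weights untouched and only fake-quantizes the outputs. The helper below is hypothetical (the name and per-call range tracking are assumptions, not Distiller's API); it just illustrates the wrapper idea from the bullet above.

```python
def fake_quant_wrapper(module, num_bits_acts):
    """Hypothetical sketch: run the wrapped float module unchanged,
    then fake-quantize (quant + de-quant) its output activations."""
    qmax = 2 ** (num_bits_acts - 1) - 1

    def wrapped(*args):
        out = module(*args)                       # weights stay in float
        max_abs = max(abs(v) for v in out) or 1.0 # per-call range (assumption)
        scale = qmax / max_abs
        return [max(-qmax - 1, min(qmax, round(v * scale))) / scale
                for v in out]

    return wrapped
```

In practice a post-training quantizer would use pre-collected activation statistics rather than the per-call range shown here; the per-call version keeps the sketch self-contained.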
* Allow passing 0 in the `--qe-bits-acts` and `--qe-bits-wts` command
line arguments to invoke weights/activations-only quantization,
respectively.
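The CLI convention above amounts to mapping a sentinel value of 0 to None before it reaches the quantizer. A hypothetical helper (name and placement assumed, not from the PR) makes the mapping explicit:

```python
def bits_from_cli(arg):
    # A command-line value of 0 means "skip quantization for this
    # tensor type"; the quantizer represents that internally as
    # num_bits = None (hypothetical helper, for illustration only).
    return None if arg == 0 else arg
```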
* Minor refactoring for clarity in PostTrainLinearQuantizer's replace_*
functions
Showing 3 changed files:
- distiller/quantization/range_linear.py (125 additions, 73 deletions)
- examples/quantization/post_train_quant/command_line.md (2 additions, 2 deletions)
- tests/test_post_train_quant.py (61 additions, 0 deletions)