Fix scale factor calculation in symmetric quantization (#463)
(We use 8-bit values below, but this applies to any bit-width.)

* We use the notion of "full" and "restricted" quantized range for symmetric quantization (see section 2.2 in https://arxiv.org/abs/1806.08342).
* "Full" quantized range ==> [-128, 127], "restricted" ==> [-127, 127].
* Until now, when doing symmetric quantization we assumed a "full" range when saturating after quantization, but calculated the scale factor as if the range was restricted. This means we weren't fully utilizing the quantized range.
* On the other hand, some other implementations of quantization (e.g. TensorFlow) use the "restricted" range.
* So, we make it an option to use either the proper "full" range (q_min = -128) or the "restricted" range (q_min = -127).
* LinearQuantMode.SYMMETRIC now means the "full" range is used. Added LinearQuantMode.SYMMETRIC_RESTRICTED for using the "restricted" range.
* Updated tests and documentation.
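The difference between the two ranges can be illustrated with a minimal sketch. This is not the code from this commit: the function names `symmetric_scale` and `quantize` and the `restricted` flag are hypothetical, and the full-range scale assumes the convention of spreading 2^n - 1 quantization steps across [-sat_val, sat_val]. The point is only that the scale factor and the clamping bounds must come from the same range.

```python
def symmetric_scale(num_bits, sat_val, restricted=False):
    """Scale factor s such that q = round(s * x) (illustrative only)."""
    if sat_val <= 0:
        raise ValueError("sat_val must be positive")
    if restricted:
        # Restricted range [-127, 127] for 8 bits: 127 steps per side.
        n = 2 ** (num_bits - 1) - 1
    else:
        # Full range [-128, 127] for 8 bits. Assumed convention:
        # 2^n - 1 = 255 steps over 2 * sat_val, i.e. 127.5 steps per side.
        n = (2 ** num_bits - 1) / 2
    return n / sat_val


def quantize(x, num_bits, sat_val, restricted=False):
    """Quantize a scalar, clamping to the range that matches the scale."""
    q_max = 2 ** (num_bits - 1) - 1
    q_min = -q_max if restricted else -(q_max + 1)
    scale = symmetric_scale(num_bits, sat_val, restricted)
    # Python's round() rounds halves to the nearest even integer.
    q = round(scale * x)
    return max(q_min, min(q_max, q))


# With sat_val = 1.0 and 8 bits:
print(quantize(-1.0, 8, 1.0))                   # -128 (full range)
print(quantize(-1.0, 8, 1.0, restricted=True))  # -127 (restricted range)
```

Mixing the two, that is computing the scale with 127 but clamping to [-128, 127], leaves the value -128 unreachable, which is the mismatch the bullets above describe.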
Showing
- distiller/quantization/q_utils.py (52 additions, 8 deletions)
- distiller/quantization/range_linear.py (32 additions, 16 deletions)
- docs-src/docs/algo_quantization.md (30 additions, 16 deletions)
- docs/algo_quantization.html (57 additions, 14 deletions)
- docs/index.html (1 addition, 1 deletion)
- docs/search/search_index.json (1 addition, 1 deletion)
- docs/sitemap.xml (20 additions, 20 deletions)
- docs/sitemap.xml.gz (0 additions, 0 deletions)
- examples/quantization/post_train_quant/command_line.md (4 additions, 4 deletions)
- tests/test_post_train_quant.py (37 additions, 35 deletions)
- tests/test_quant_utils.py (45 additions, 1 deletion)