docs/algo_quantization.html · 4ad16ef00f9ea90c0d7834667bf86b12e795c12e · llvm / distiller

Fix scale factor calculation in symmetric quantization (#463) · 78255ee0

Guy Jacob authored 5 years ago

(we use 8-bit values below, but this applies to any bit-width)
* We use the notion of "full" and "restricted" quantized range for
  symmetric quantization (see section 2.2 in https://arxiv.org/abs/1806.08342)
* "Full" quantized range ==> [-128, 127], "restricted" ==> [-127, 127]
* Until now, symmetric quantization assumed the "full" range when
  saturating after quantization, but calculated the scale factor as if
  the range were restricted. As a result, the quantized range was not
  fully utilized.
* On the other hand, in some other implementations of quantization (e.g.
  TensorFlow), the "restricted" range is used.
* So, the range is now an option: either the proper "full" range
  (q_min = -128) or the "restricted" range (q_min = -127).
* LinearQuantMode.SYMMETRIC now means the "full" range is used, and
  LinearQuantMode.SYMMETRIC_RESTRICTED was added for using the
  "restricted" range.
* Updated tests and documentation.
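
The difference between the two ranges can be sketched in plain Python.
This is only an illustration, not the code from this commit: the helper
names symmetric_scale and quantize are hypothetical, and it assumes the
scale factor multiplies the float value (q = round(scale * x)), with the
"full" range using (2^n - 1) / 2 levels per side and the "restricted"
range using 2^(n-1) - 1.

```python
def symmetric_scale(max_abs, num_bits=8, restricted=False):
    """Scale factor for symmetric quantization (hypothetical helper).

    Assumes q = round(scale * x), so the scale is levels / max_abs.
    """
    if max_abs <= 0:
        raise ValueError("max_abs must be positive")
    if restricted:
        levels = 2 ** (num_bits - 1) - 1   # 127 for 8 bits
    else:
        levels = (2 ** num_bits - 1) / 2   # 127.5 for 8 bits
    return levels / max_abs


def quantize(x, scale, num_bits=8, restricted=False):
    """Quantize one float and saturate to the chosen range."""
    q_max = 2 ** (num_bits - 1) - 1                 # 127 for 8 bits
    q_min = -q_max if restricted else -q_max - 1    # -127 or -128
    q = round(x * scale)
    return max(q_min, min(q_max, q))


# Values in [-1.0, 1.0], quantized both ways.
full = symmetric_scale(1.0)
restr = symmetric_scale(1.0, restricted=True)
full_lo, full_hi = quantize(-1.0, full), quantize(1.0, full)
restr_lo = quantize(-1.0, restr, restricted=True)
# Scale computed for the restricted range, but saturation to the full
# range: the mismatch this commit describes.
mixed_lo = quantize(-1.0, restr)
```

In this sketch the "full" mode maps -1.0 to -128, while the mismatched
combination maps it to -127, so the q_min = -128 level goes unused for
values inside the saturation range.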
