    Fix scale factor calculation in symmetric quantization (#463) · 78255ee0
    Guy Jacob authored
    (we use 8-bit values below, but this applies to any bit-width)
    * We use the notion of "full" and "restricted" quantized range for
      symmetric quantization (see section 2.2 in https://arxiv.org/abs/1806.08342)
    * "Full" quantized range ==> [-128, 127], "restricted" ==> [-127, 127]
    * Until now, when doing symmetric quantization we assumed a "full"
      range when saturating after quantization, but calculated the scale
      factor as if the range was restricted. This means we weren't making
      full utilization of the quantized range.
    * On the other hand, in some other implementations of quantization (e.g.
      TensorFlow), the "restricted" range is used.
    * So, we make it an option to use either the proper "full" range
      (q_min = -128) or "restricted" range (q_min = -127).
    * LinearQuantMode.SYMMETRIC now means the "full" range is used.
      Added LinearQuantMode.SYMMETRIC_RESTRICTED for using the "restricted"
      range.
    * Updated tests and documentation.
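
    The difference between the two ranges can be sketched as follows. This is
    a minimal illustration, not the code from this commit: the function names
    are hypothetical, and the "full" scale factor of (2^n - 1) / 2 divided by
    the max absolute value is an assumed convention for spreading the float
    range over [-128, 127].

```python
def symmetric_scale(max_abs, num_bits=8, restricted=False):
    """Scale factor mapping [-max_abs, max_abs] onto the quantized range.

    Hypothetical helper; the "full" formula below is an assumption.
    """
    if max_abs <= 0:
        raise ValueError("max_abs must be positive")
    if restricted:
        n_levels = 2 ** (num_bits - 1) - 1   # 127 for 8 bits
    else:
        n_levels = (2 ** num_bits - 1) / 2   # 127.5 for 8 bits
    return n_levels / max_abs


def quantize(x, scale, num_bits=8, restricted=False):
    """Scale, round, then saturate to [q_min, q_max]."""
    q_max = 2 ** (num_bits - 1) - 1                # 127 for 8 bits
    q_min = -q_max if restricted else -q_max - 1   # -127 or -128
    q = round(x * scale)
    return max(q_min, min(q_max, q))
```

    With the restricted scale factor, no input inside [-max_abs, max_abs]
    ever maps to -128, which is the under-utilization described above when
    that scale factor is paired with saturation to the full range.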