Commits · 4ad16ef00f9ea90c0d7834667bf86b12e795c12e · llvm / distiller · GitLab


Jan 15, 2020

Fix scale factor calculation in symmetric quantization (#463) · 78255ee0

Guy Jacob authored 5 years ago

(we use 8-bit values below, but this applies to any bit-width)
* We use the notion of "full" and "restricted" quantized range for
  symmetric quantization (see section 2.2 in https://arxiv.org/abs/1806.08342)
* "Full" quantized range ==> [-128, 127], "restricted" ==> [-127, 127]
* Until now, when doing symmetric quantization we assumed a "full"
  range when saturating after quantization, but calculated the scale
  factor as if the range was restricted. This means we weren't making
  full utilization of the quantized range.
* On the other hand, in some other implementations of quantization (e.g.
  TensorFlow), the "restricted" range is used.
* So, we make it an option to use either the proper "full" range
  (q_min = -128) or "restricted" range (q_min = -127).
* LinearQuantMode.SYMMETRIC now means the "full" range is used, and
  added LinearQuantMode.SYMMETRIC_RESTRICTED for using the "restricted"
  range.
* Updated tests and documentation.
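
As a rough illustration of the two ranges described above, the sketch below computes a symmetric scale factor for either mode. The names `symmetric_scale` and `quantize_value` are hypothetical and are not Distiller's API. Using 127.5 steps for the "full" range (half of 2^8 - 1) is an assumption about one reasonable convention; the actual change in this commit may differ.

```python
def symmetric_scale(sat_val, num_bits=8, restricted=False):
    """Return (scale, q_min, q_max) such that q = round(scale * x).

    sat_val is the absolute value at which the float input saturates.
    """
    if sat_val <= 0:
        raise ValueError("sat_val must be positive")
    if restricted:
        # "Restricted" range: [-127, 127] for 8 bits.
        n = 2 ** (num_bits - 1) - 1
        q_min, q_max = -n, n
    else:
        # "Full" range: [-128, 127] for 8 bits.
        # Assumption: the scale uses half of the 255 available steps.
        n = (2 ** num_bits - 1) / 2
        q_min, q_max = -(2 ** (num_bits - 1)), 2 ** (num_bits - 1) - 1
    return n / sat_val, q_min, q_max


def quantize_value(x, scale, q_min, q_max):
    """Scale, round, then saturate to [q_min, q_max]."""
    return max(q_min, min(q_max, round(x * scale)))


full = symmetric_scale(1.0)
restricted = symmetric_scale(1.0, restricted=True)
```

With a saturation value of 1.0, the restricted mode never produces -128, while the full mode can reach it, which is the utilization gap the commit message describes.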


Aug 08, 2019
- Point to GNMT example in docs · 7b5fdefe
  Guy Jacob authored 6 years ago
  
Aug 04, 2019
- Add documentation section on preparing a model for quantization · 7fee2c9d
  Guy Jacob authored 6 years ago
  
Jul 08, 2019
- Add links to language model quantization notebook in README and docs · 81047f5d
  Guy Jacob authored 6 years ago
  
May 19, 2019

Post-training quantization: Scale factor approximation (#261) · 66c0ad1d

Guy Jacob authored 6 years ago

* Added scale factor approximation in post-training quantization using
  integer multiply + shift. # of bits for integer multiplier is user
  configurable
* Updated documentation
* Updated post-train quant command line examples readme file
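
The integer multiply + shift idea can be sketched as follows: a floating-point scale factor is approximated by `multiplier / 2**shift`, where `multiplier` fits in a configurable number of bits. The function names and the unsigned-multiplier assumption are illustrative only and are not taken from the Distiller code.

```python
def approx_scale(scale, mult_bits=8):
    """Return (multiplier, shift) with scale ~= multiplier / 2**shift.

    Assumption: the multiplier is an unsigned integer of mult_bits bits.
    """
    max_mult = 2 ** mult_bits - 1
    if not 0 < scale <= max_mult:
        raise ValueError("scale must be in (0, 2**mult_bits - 1]")
    # Pick the largest shift for which the multiplier still fits.
    shift = 0
    while scale * 2 ** (shift + 1) <= max_mult:
        shift += 1
    multiplier = int(scale * 2 ** shift)  # truncates toward zero
    return multiplier, shift


def apply_scale(x_int, multiplier, shift):
    """Integer-only rescale: multiply, then arithmetic right shift."""
    return (x_int * multiplier) >> shift


multiplier, shift = approx_scale(0.3, mult_bits=8)
approx = multiplier / 2 ** shift  # close to 0.3, but not exact
```

More multiplier bits allow a larger shift and therefore a smaller approximation error, which is presumably why the bit count is exposed as a user setting.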


Apr 14, 2019

Post-train quant: Extend acts clipping functionality (#225) · 437e270b

Guy Jacob authored 6 years ago

* Some refactoring to enable multiple clipping methods
* BREAKING: clip_acts as a boolean flag (either in command line
  or in function signature) will fail. An error message listing the
  valid values is displayed.
* Implemented clipping activations at mean + N * std
  (N is user configurable)
* Additional tests
* Updated docs
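
A minimal sketch of clipping at mean + N * std is shown below. It assumes the clip range is symmetric around the mean and is never wider than the observed min/max; both points are assumptions, and `clip_range_n_std` is a hypothetical name rather than a Distiller function.

```python
import statistics


def clip_range_n_std(values, n_stds):
    """Return (low, high) clipping bounds at mean -/+ n_stds * std.

    Assumption: bounds are limited to the observed min and max, so
    clipping can only narrow the range.
    """
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation
    low = max(min(values), mean - n_stds * std)
    high = min(max(values), mean + n_stds * std)
    return low, high


# One outlier (10) dominates the raw range of these activations.
acts = [0, 0, 0, 0, 10]
low, high = clip_range_n_std(acts, n_stds=1)
```

Here the mean is 2 and the standard deviation is 4, so with N = 1 the upper bound drops from 10 to 6 and the quantization range is no longer stretched by the outlier.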


Docs: Fix broken images and links · b9207bf7
Guy Jacob authored 6 years ago


Apr 01, 2019

Quantizer: Specify # bias bits + custom overrides (BREAKING) (#178) · 5271625a

Lev Zlotnik authored 6 years ago

* Bias handling:
  * Add 'bits_bias' parameter to explicitly specify # of bits for bias,
    similar to weights and activations.
  * BREAKING: Remove the now redundant 'quantize_bias' boolean parameter
* Custom overrides:
  * Expand the semantics of the overrides dict to allow overriding of
    other parameters in addition to bit-widths
  * Functions registered in the quantizer's 'replacement_factory' can
    define keyword arguments. Non bit-width entries in the overrides
    dict will be checked against the function signature and passed
  * BREAKING:
    * Changed the name of 'bits_overrides' to simply 'overrides'
    * Bit-width overrides must now be defined using the full parameter
      names - 'bits_activations/weights/bias' instead of the short-hands
      'acts' and 'wts' which were used so far.
  * Added/updated relevant tests
  * Modified all quantization YAMLs under 'examples' to reflect 
    these changes
  * Updated docs
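
The signature check described under "Custom overrides" could look roughly like the sketch below. The key names `bits_activations`, `bits_weights` and `bits_bias` come from the commit message; `split_overrides`, `replace_relu` and the `clip_val` argument are hypothetical and only serve the illustration.

```python
import inspect

BIT_KEYS = ("bits_activations", "bits_weights", "bits_bias")


def split_overrides(replace_fn, overrides):
    """Split an overrides dict into bit-widths and extra keyword arguments.

    Extra entries are validated against the signature of replace_fn.
    """
    bits = {k: v for k, v in overrides.items() if k in BIT_KEYS}
    extra = {k: v for k, v in overrides.items() if k not in BIT_KEYS}
    params = inspect.signature(replace_fn).parameters
    unknown = [k for k in extra if k not in params]
    if unknown:
        raise TypeError(f"{replace_fn.__name__} does not accept: {unknown}")
    return bits, extra


def replace_relu(module, name, clip_val=6.0):
    """Hypothetical replacement function with one keyword argument."""
    return (module, name, clip_val)


bits, extra = split_overrides(
    replace_relu, {"bits_activations": 4, "clip_val": 3.0}
)
```

Under this sketch, an old-style short-hand such as `wts` is no longer recognized as a bit-width key and is rejected unless the replacement function happens to declare a parameter with that name.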


Mar 29, 2019
- Fixed a typo in the quantization documentation (#207) · f5987f9a
  Songyi Blair Han authored 6 years ago
  
Feb 11, 2019

Post-train quant based on stats + additional modules quantized (#136) · 28a8ee18

Guy Jacob authored 6 years ago

Summary of changes:
(1) Post-train quantization based on pre-collected statistics
(2) Quantized concat, element-wise addition / multiplication and embeddings
(3) Move post-train quantization command line args out of sample code
(4) Configure post-train quantization from YAML for more fine-grained control

(See PR #136 for more detailed changes descriptions)


Dec 11, 2018
- Updated early-exit docs (from @haim-barad) · 8bcaaa53
  Guy Jacob authored 6 years ago
  
Dec 06, 2018

Documentation refactoring · 178c8c49

Neta Zmora authored 6 years ago

- Moved the Language model and struct pruning tutorials from the Wiki to
the HTML documentation.  Love the ease of Wiki, but GitHub doesn't let
Google crawl these pages, and users can't open PRs on Wiki pages.

- Updated the pruning algorithms documentation


Dec 04, 2018

Range-Based Linear Quantization Features (#95) · 907a6f04

Guy Jacob authored 6 years ago

* Asymmetric post-training quantization (only symmetric was supported until now)
* Quantization aware training for range-based (min-max) symmetric and asymmetric quantization
* Per-channel quantization support in both training and post-training
* Added tests and examples
* Updated documentation
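
For readers unfamiliar with the asymmetric and per-channel variants, here is a small sketch of range-based (min-max) asymmetric quantization to unsigned integers. The convention `q = round(scale * x) - zero_point` and all function names are assumptions made for illustration; they are not necessarily what Distiller implements.

```python
def asymmetric_params(min_val, max_val, num_bits=8):
    """Return (scale, zero_point) mapping [min_val, max_val] to [0, 2**n - 1]."""
    # Widen the range to include zero so that 0.0 is exactly representable.
    min_val = min(min_val, 0.0)
    max_val = max(max_val, 0.0)
    span = max_val - min_val
    scale = (2 ** num_bits - 1) / span if span > 0 else 1.0
    zero_point = round(scale * min_val)
    return scale, zero_point


def quantize(x, scale, zero_point, num_bits=8):
    q = round(scale * x) - zero_point
    return max(0, min(2 ** num_bits - 1, q))


def dequantize(q, scale, zero_point):
    return (q + zero_point) / scale


def per_channel_params(channels, num_bits=8):
    """One (scale, zero_point) pair per channel instead of one per tensor."""
    return [asymmetric_params(min(ch), max(ch), num_bits) for ch in channels]


scale, zp = asymmetric_params(-1.0, 3.0)
weights = [[-0.1, 0.2], [-4.0, 8.0]]
params = per_channel_params(weights)
```

In the per-channel example the first channel gets a much larger scale than the second, so its small values are not crushed by the wide range of the other channel.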


Nov 07, 2018
- Documentation: add github pages documentation for Early Exit · 5681541f
  Neta Zmora authored 6 years ago
  
Sep 03, 2018

Add knowledge distillation flow (#41) · c9794e4a

Guy Jacob authored 7 years ago

* Implemented as a Policy
* Integrated in image classification sample
* Updated docs and README
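
For context, a knowledge distillation loss in the style of Hinton et al. can be sketched as below. This is a generic, framework-free formulation; the weights, the temperature default and the function names are assumptions, and the sketch does not show how the Distiller Policy is structured.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with temperature."""
    scaled = [z / temperature for z in logits]
    top = max(scaled)
    exps = [math.exp(z - top) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, student_wt=0.5, distill_wt=0.5):
    """Weighted sum of hard-label cross-entropy and soft-target KL divergence."""
    # Cross-entropy of the student against the ground-truth label.
    hard = -math.log(softmax(student_logits)[label])
    # KL(teacher || student) on temperature-softened distributions.
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(t * math.log(t / s) for t, s in zip(p_teacher, p_student) if t > 0)
    # The T**2 factor keeps the soft-target gradient scale comparable.
    return student_wt * hard + distill_wt * temperature ** 2 * kl


same = distillation_loss([0.0, 0.0], [0.0, 0.0], label=0)
different = distillation_loss([0.0, 0.0], [5.0, -5.0], label=0)
```

When student and teacher logits match, the KL term vanishes and only the weighted hard-label loss remains.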


Jul 22, 2018

PACT quantizer (#30) · df9a00ce

Gal Novik authored 7 years ago

* Adding PACT quantization method
* Move logic modifying the optimizer due to changes the quantizer makes into the Quantizer itself
* Updated documentation and tests
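
As a reminder of what PACT does, the sketch below shows its forward pass on a single activation: clip to `[0, alpha]`, then quantize uniformly to `2**num_bits - 1` levels. In PACT `alpha` is a learned parameter; the second function gives the straight-through gradient with respect to `alpha`. The function names are hypothetical and this is not the code added in the commit.

```python
def pact_forward(x, alpha, num_bits):
    """Clip x to [0, alpha], then quantize to 2**num_bits - 1 uniform steps."""
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    levels = 2 ** num_bits - 1
    clipped = min(max(x, 0.0), alpha)
    return round(clipped * levels / alpha) * alpha / levels


def pact_alpha_grad(x, alpha):
    """Straight-through estimate of d(output)/d(alpha)."""
    return 1.0 if x >= alpha else 0.0


below = pact_forward(-1.0, alpha=6.0, num_bits=4)   # negative input clips to 0
above = pact_forward(10.0, alpha=6.0, num_bits=4)   # large input clips to alpha
```

Because `alpha` receives gradient only from activations that hit the clipping threshold, it needs to be registered with the optimizer, which fits the commit's note about moving optimizer changes into the Quantizer.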


Jul 01, 2018
- Fix symmetric linear quantization math derivation in docs (#12) · 19d33c50
  Guy Jacob authored 7 years ago
  
  * Scale of bias and parentheses were wrong
Jun 21, 2018
- Updated docs related to quantization · 3658374e
  Guy Jacob authored 7 years ago
  
May 14, 2018
- 8-bit Quantization - Save model + add test + updated docs (#3) · 443e7381
  Guy Jacob authored 7 years ago
  
Apr 30, 2018
- Additional quantization docs + fixes · 7bbfd12b
  Guy Jacob authored 7 years ago
  
Apr 28, 2018
- fix typo: Jupyter spelled as Jupiter · ebb89126
  Neta Zmora authored 7 years ago
  
Apr 24, 2018
- small documentation touchups · 7fbde765
  Neta Zmora authored 7 years ago
  
- small documentation touchups · cb79e100
  Neta Zmora authored 7 years ago
  
- Fix README links · 0ecd205a
  Neta Zmora authored 7 years ago
  
- first commit · 6eef69b5
  Neta Zmora authored 7 years ago
  