- Nov 27, 2019
Neta Zmora authored
Small variances can occur when using different cuDNN versions, even when the environment and Distiller version are the same.
- Nov 14, 2019
Guy Jacob authored
* summary_graph.py:
  * Changed ONNX op.uniqueName() to op.debugName()
  * Removed the scope-naming workaround, which isn't needed in PyTorch 1.3
* Tests:
  * The naming of trace entries changed in 1.3; fixed the SummaryGraph unit test that checked it
  * Adjusted expected values in full_flow_tests
  * Adjusted tolerance in test_sim_bn_fold
  * Filter some new warnings
- Nov 13, 2019
Bar authored
* Previous implementation:
  * Stats collection required a separate run with `--qe-calibration`.
  * Specifying `--quantize-eval` without `--qe-stats-file` triggered dynamic quantization.
  * Running with `--quantize-eval --qe-calibration <num>` only ran stats collection and ignored `--quantize-eval`.
* New implementation:
  * Running `--quantize-eval --qe-calibration <num>` now performs stats collection according to the calibration flag, and then quantizes the model with the collected stats (and runs evaluation).
  * Specifying `--quantize-eval` without `--qe-stats-file` triggers the same flow as in the bullet above, as if `--qe-calibration 0.05` was used (i.e. 5% of the test set is used for stats).
  * Added a new flag: `--qe-dynamic`. From now on, dynamic quantization must be requested explicitly: `--quantize-eval --qe-dynamic`.
  * As before, `--qe-calibration` can still be run without `--quantize-eval` to perform "stand-alone" stats collection.
  * The following flags, which all represent different ways to control the creation of stats or the use of existing stats, are now mutually exclusive (see the sketch below): `--qe-calibration`, `--qe-stats-file`, `--qe-dynamic`, `--qe-config-file`
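A minimal argparse sketch of how mutually exclusive flags like these can be wired; the parser structure and defaulting logic are assumptions for illustration, not Distiller's actual command-line code.

```python
import argparse

# Hypothetical sketch (not Distiller's actual parser code): enforce that only
# one way of controlling stats creation/usage is given on the command line.
parser = argparse.ArgumentParser(description='post-train quantization flags (sketch)')
parser.add_argument('--quantize-eval', action='store_true',
                    help='apply post-training quantization before evaluation')
group = parser.add_mutually_exclusive_group()
group.add_argument('--qe-calibration', type=float, default=None, metavar='PORTION',
                   help='collect stats on this fraction of the test set')
group.add_argument('--qe-stats-file', type=str, default=None,
                   help='use pre-collected activation stats from this file')
group.add_argument('--qe-dynamic', action='store_true',
                   help='perform dynamic quantization (no stats needed)')
group.add_argument('--qe-config-file', type=str, default=None,
                   help='load the full quantizer configuration from a YAML file')

args = parser.parse_args(['--quantize-eval'])
if args.quantize_eval and not (args.qe_stats_file or args.qe_dynamic or args.qe_config_file):
    # No stats source given: fall back to collecting stats on 5% of the test set
    args.qe_calibration = args.qe_calibration or 0.05
```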
- Nov 10, 2019
Neta Zmora authored
Early-exit (EE) runs emit more statistics than the regular classification pipeline, so validating more of the log output makes the correctness check more robust.
- Nov 07, 2019
Neta Zmora authored
Fix the EE code so that it works with the current 'master' branch, and add a test for high-level EE regression
- Oct 07, 2019
Guy Jacob authored
* Greedy search script for post-training quantization settings:
  * Iterates over each layer in the model in order. For each layer, checks a user-defined set of quantization settings and chooses the best one based on validation accuracy (see the sketch below).
  * Provided a sample that searches for the best activations-clipping mode per layer, on image classification models.
* Proper handling of mixed quantization settings in post-train quant:
  * By default, the quantization settings for each layer apply only to output quantization.
  * Quantization settings for activation tensors are propagated through the model during execution.
  * For non-quantized inputs to layers that require quantized inputs, fall back to quantizing according to the settings used for the output.
  * In addition, a mechanism is provided to override input quantization settings via the YAML configuration file.
* By default, all modules are now quantized. For module types that don't have a dedicated quantized implementation, "fake" quantization is performed.
* Misc. changes:
  * Fuse ReLU/ReLU6 into the predecessor during post-training quantization.
  * Fixes to ACIQ clipping in the half-range case.

Co-authored-by: Lev Zlotnik <lev.zlotnik@intel.com>
Co-authored-by: Guy Jacob <guy.jacob@intel.com>
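A minimal sketch of the greedy per-layer search loop described above; `apply_setting` and `evaluate` are assumed helper callables for illustration, not Distiller's actual API.

```python
# Hypothetical sketch of a greedy per-layer search over quantization settings.
# apply_setting(model, layer, setting) and evaluate(model) are assumed helpers.
def greedy_search(model, layers, candidate_settings, apply_setting, evaluate):
    chosen = {}
    for layer in layers:                      # visit layers in model order
        best_setting, best_acc = None, -1.0
        for setting in candidate_settings:    # e.g. different clipping modes
            apply_setting(model, layer, setting)
            acc = evaluate(model)             # validation accuracy
            if acc > best_acc:
                best_setting, best_acc = setting, acc
        apply_setting(model, layer, best_setting)   # keep the winner and move on
        chosen[layer] = best_setting
    return chosen
```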
- Aug 20, 2019
Neta Zmora authored
This test uses MNIST for faster execution and tests various pruners and their scheduling.
- Aug 07, 2019
Neta Zmora authored
- Aug 06, 2019
Neta Zmora authored
* An implementation of AMC (the previous implementation code has moved to a new location under /distiller/examples/auto_compression/amc). AMC is aligned with the 'master' branch of Coach.
* compress_classifier.py is refactored. The base code moved to /distiller/apputils/image_classifier.py. Further refactoring will follow.
  We want to provide a simple and small API to the basic features of a classifier-compression application. This will help applications that want to use the main features of a classifier-compression application without the standard training regimen. AMC is one example of a stand-alone application that needs to leverage the capabilities of a classifier-compression application, but is currently coupled to `compress_classifier.py`. `multi-finetune.py` is another example.
* ranked_structures_pruner.py:
  * Added support for grouping channels/filters. Sometimes we want to prune a group of structures, e.g. groups of 8 channels. This feature does not force the groups to be adjacent, so it is more like a set of structures. E.g. when pruning channels from a 64-channel convolution, grouped by 8 channels, we will prune exactly one of 0/8/16/24/32/40/48/56 channels, i.e. always a multiple of 8 channels, excluding the set of all 64 channels (see the sketch below).
  * Added FMReconstructionChannelPruner: channel pruning that uses L1-magnitude to rank and select the channels to remove, and feature-map reconstruction to improve resilience to the pruning.
* Added a script to run multiple instances of an experiment in different processes: examples/classifier_compression/multi-run.py
* Set the seed value even when it is not specified by the command-line arguments, so that we can try to recreate the session.
* Added pruning ranking noise: ranking noise introduces Gaussian noise when ranking channels/filters using the Lp-norm. The noise is introduced using the epsilon-greedy methodology, where ranking using the exact Lp-norm is considered greedy.
* Added configurable rounding of the pruning level: choose whether to round up or down when rounding the number of structures to prune (rounding is always to an integer).
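A minimal sketch of the grouped rounding described for the 64-channel / groups-of-8 example; the function name and signature are assumptions for illustration, not Distiller's actual pruner code.

```python
import math

# Hypothetical sketch: round the number of channels to prune to a multiple of
# the group size, as in the 64-channel / groups-of-8 example above.
def channels_to_prune(num_channels, fraction, group_size=8, round_up=False):
    raw = fraction * num_channels
    groups = math.ceil(raw / group_size) if round_up else math.floor(raw / group_size)
    n = groups * group_size
    # Never prune all channels; always keep at least one group
    return min(n, num_channels - group_size)

# E.g. pruning 40% of a 64-channel convolution in groups of 8:
assert channels_to_prune(64, 0.4) == 24   # 0.4 * 64 = 25.6 -> rounded down to 24
assert channels_to_prune(64, 0.4, round_up=True) == 32
```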
- Jul 10, 2019
Guy Jacob authored
* "Net-aware quantization" - using the term coined in https://arxiv.org/abs/1811.09886. (section 3.2.2). Refers to considering sequences of modules when quantizing. This isn't exactly layer fusion - we modify activation stats prior to setting quantization parameters, to make sure that when a module is followed by certain activation functions, only the relevant ranges are quantized. We do this for: * ReLU - Clip all negative values * Tanh / Sigmoid - Clip according to the (approximated) saturation values for these functions. We use [-4, 4] for tanh and [-6, 6] for sigmoid. * Perform batch-norm folding before post-training quantization. Batch-norm parameters are folded into the parameters of the previous layer and the BN layer is replaced with an identity module. * Both BN folding and "net-aware" are now automatically executed in PostTrainLinearQuantizer (details of this change below) * BN folding enabled by new generic mechanism to "fuse" module sequences (at the Python API level) * First module in sequence is replaced/modified by a user-provided function, rest of moudles replaced with nn.Identity * Quantizer changes: * Optionally create adjacency map during prepare_model * Subclasses may enforce adjacency map creation * Refatcoring: Replace _prepare_model_impl with pre and post override-able "callbacks", so core functionality is always executed * PostTrainLinearQuantizer Changes: * Enforce creation of adjacency map. This means users must now pass a dummy input to PostTrainLinearQuantizer.prepare_model * Before module replacement - Apply BN folding and stats updates according to net-aware quantization * Updated the language model quantization tutorial to reflect the new functionality * Updated the image classification post-train quantization samples (command line and YAML) * Other changes: * Distller LSTM implementation: Replace the ModuleList for cells with a plain list. The PyTorch trace mechanism doesn't "see" ModuleList objects, it only sees the contained modules. This means that the "scopeName" of these modules isn't complete, which makes it impossible to match op names in SummaryGraph to modules in the Python model. * ActivationStatsCollector: Ignore nn.Identity modules
- Jul 04, 2019
Guy Jacob authored
* PyTorch 1.1.0 is now required - moved other dependencies to up-to-date versions as well
* Adapt the LR scheduler to PyTorch 1.1 API changes:
  - Changed lr_scheduler.step() calls to come after the validate calls during training (see the sketch below)
  - Pass both loss and top1 to the lr_scheduler.step() caller (resolves issue #240)
* Adapt thinning to PyTorch 1.1 semantic changes
  - **KNOWN ISSUE**: When a thinning recipe is applied, in certain cases PyTorch displays this warning: "UserWarning: non-inplace resize is deprecated". To be fixed later.
* SummaryGraph: workaround for the new scope-name issue in PyTorch 1.1.0
* Adapt to the updated PyTest version:
  - Stop using the deprecated 'message' parameter of pytest.raises(); use pytest.fail() instead
  - Make sure there is only a single test case per pytest.raises context
* Moved the PyTorch version check to the root __init__.py - this means the version is checked when Distiller is first imported. A RuntimeError is raised if the version is wrong.
* Updates to the parameter_histograms notebook:
  - Replaced the deprecated normed argument with density
  - Added the sparsity rate to the plot title
  - Load the model on the CPU
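A minimal training-loop skeleton illustrating the PyTorch >= 1.1 ordering referred to above, with lr_scheduler.step() called only after validation; the loop structure itself is an assumption for illustration.

```python
import torch

model = torch.nn.Linear(10, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=30)

for epoch in range(3):
    # train(...) would call optimizer.step() for every batch here
    optimizer.step()              # stand-in for a full training epoch
    # top1, loss = validate(...)  # validation runs before the LR update
    scheduler.step()              # in PyTorch >= 1.1, step the scheduler last
```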
- Jul 03, 2019
Guy Jacob authored
- Jun 03, 2019
Lev Zlotnik authored
* In PostTrainLinearQuantizer - moved 'clip_acts' and 'clip_n_stds' to overrides, removed 'no_clip_layers' parameter from __init__
  * The 'no_clip_layers' command line argument REMAINS, handled in PostTrainLinearQuantizer.from_args()
* Removed old code from comments, fixed warnings in test_post_train_quant.py
* Updated tests
* Update post-train quant sample YAML
- Apr 18, 2019
Bar authored
Also:
* The single-worker limitation is not needed anymore; it has been fixed in PyTorch since v0.4.0 (https://github.com/pytorch/pytorch/pull/4640)
* compress_classifier.py: if run in evaluation mode (--eval), enable deterministic mode
* Call utils.set_deterministic at data-loader creation if the deterministic argument is set (don't assume the user calls it outside)
* Disable CUDNN benchmark mode in utils.set_deterministic (https://pytorch.org/docs/stable/notes/randomness.html#cudnn); see the sketch below
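A minimal sketch of a set_deterministic-style helper using plain PyTorch/NumPy calls; it illustrates the settings mentioned above and is not Distiller's actual utils.set_deterministic.

```python
import random
import numpy as np
import torch

# Hypothetical sketch: seed every RNG and disable the cuDNN auto-tuner,
# trading some speed for reproducible results.
def set_deterministic(seed=0):
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False   # benchmark mode is non-deterministic
```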
- Feb 26, 2019
Lev Zlotnik authored
Not backward compatible - re-installation is required
* Fixes for PyTorch==1.0.0
* Refactoring folder structure
* Update installation section in docs
- Feb 10, 2019
Guy Jacob authored
* For CIFAR-10 / ImageNet only
* Refactored data_loaders.py to reduce code duplication
* Implemented a custom sampler (see the sketch below)
* Integrated it in the image classification sample
* Since we now shuffle the test set, the expected results had to be updated in the 2 full_flow_tests that do evaluation
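A minimal sketch of drawing a shuffled subset of a dataset through a sampler, using plain torch.utils.data; it illustrates the general idea and is not Distiller's actual sampler implementation.

```python
import numpy as np
import torch
from torch.utils.data import DataLoader, SubsetRandomSampler, TensorDataset

# Hypothetical sketch: sample a shuffled subset via a sampler instead of
# relying on the DataLoader's own shuffle flag.
dataset = TensorDataset(torch.randn(1000, 3, 32, 32), torch.randint(0, 10, (1000,)))
indices = np.random.permutation(len(dataset))[:200].tolist()   # e.g. 20% of the data
loader = DataLoader(dataset, batch_size=32, sampler=SubsetRandomSampler(indices))

for images, labels in loader:
    pass  # evaluation / stats collection over the sampled subset
```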
- Dec 04, 2018
Guy Jacob authored
* Asymmetric post-training quantization (only symmetric was supported until now); see the sketch below
* Quantization-aware training for range-based (min-max) symmetric and asymmetric quantization
* Per-channel quantization support in both training and post-training
* Added tests and examples
* Updated documentation
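A minimal sketch of range-based asymmetric (min-max) linear quantization, using generic textbook formulas for scale and zero-point; it is an illustration only, not Distiller's range-linear quantization code.

```python
import torch

# Hypothetical sketch: asymmetric (min-max) linear quantization to unsigned ints.
# scale and zero_point map the float range [min, max] onto [0, 2**num_bits - 1].
def asymmetric_quant_params(tensor, num_bits=8):
    qmin, qmax = 0, 2 ** num_bits - 1
    t_min = torch.clamp(tensor.min(), max=0.0)   # make sure 0.0 is representable
    t_max = torch.clamp(tensor.max(), min=0.0)
    scale = (t_max - t_min) / (qmax - qmin)
    zero_point = torch.round(qmin - t_min / scale)
    return scale, zero_point

def quantize(tensor, scale, zero_point, num_bits=8):
    q = torch.round(tensor / scale + zero_point)
    return torch.clamp(q, 0, 2 ** num_bits - 1)

def dequantize(q, scale, zero_point):
    return (q - zero_point) * scale
```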
- Nov 22, 2018
Neta Zmora authored
* Fix issue #79: change the default values so that the following scheduler meta-data keys are always defined: 'starting_epoch', 'ending_epoch', 'frequency'
* compress_classifier.py: add a new argument that allows specifying, from the command line, the range of pruning levels scanned when doing sensitivity analysis
* Add a regression test for issue #79
- Nov 01, 2018
Guy Jacob authored
* Added command line arguments for this and other post-training quantization settings in the image classification sample.
- Jul 22, 2018
Gal Novik authored
* Added the PACT quantization method (see the sketch below)
* Moved the logic that modifies the optimizer, due to changes the quantizer makes, into the Quantizer itself
* Updated documentation and tests
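A minimal sketch of PACT-style activation clipping with a learnable clipping value alpha, followed by linear quantization of the clipped range; it follows the formulation in the PACT paper and is not Distiller's actual PACT implementation.

```python
import torch
import torch.nn as nn

# Hypothetical sketch of PACT-style clipping: activations are clamped to
# [0, alpha], where alpha is learnable, and the clipped range is linearly
# quantized to num_bits levels. Straight-through gradient estimation for the
# rounding step is omitted for brevity.
class PACTClip(nn.Module):
    def __init__(self, num_bits=4, init_alpha=6.0):
        super().__init__()
        self.num_bits = num_bits
        self.alpha = nn.Parameter(torch.tensor(init_alpha))

    def forward(self, x):
        # clip to [0, alpha] in a way that keeps the gradient w.r.t. alpha
        y = torch.clamp(x, min=0.0) - torch.clamp(x - self.alpha, min=0.0)
        scale = (2 ** self.num_bits - 1) / self.alpha
        return torch.round(y * scale) / scale   # linear quantization of the clipped range
```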
- Jul 17, 2018
Guy Jacob authored
* Add Quantizer unit tests
* Require 'bits_overrides' to be an OrderedDict to support overlapping patterns in a predictable manner (see the sketch below) + update documentation to reflect this
* Quantizer class cleanup:
  * Use "public" nn.Module APIs instead of protected attributes
  * Call the builtins set/get/delattr instead of the class special methods (__***__)
* Fix issues reported in #24
* Bug in RangeLinearQuantParamLayerWrapper - add explicit override of pre_quantized_forward accepting a single input (#15)
* Add DoReFa test to full_flow_tests
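A minimal sketch of why ordering matters for overlapping 'bits_overrides' patterns: with an OrderedDict, the first matching pattern wins, so specific patterns can precede broader ones. The regex-style matching and key names here are assumptions for illustration, not the Quantizer's actual override logic.

```python
import re
from collections import OrderedDict

# Hypothetical sketch: first pattern that matches a module name wins, so a
# specific layer override must be listed before a broader wildcard pattern.
bits_overrides = OrderedDict([
    (r'conv1$', {'bits_weights': 8, 'bits_activations': 8}),  # specific layer first
    (r'conv.*', {'bits_weights': 4, 'bits_activations': 4}),  # broader pattern after
])

def bits_for(module_name):
    for pattern, bits in bits_overrides.items():
        if re.match(pattern, module_name):
            return bits
    return None  # fall back to the quantizer's default bit-widths

print(bits_for('conv1'))   # 8-bit override from the specific pattern
print(bits_for('conv2'))   # 4-bit override from the broad pattern
```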
- May 14, 2018
Guy Jacob authored
- Apr 24, 2018
Neta Zmora authored