- Nov 13, 2019
-
-
Bar authored
* Previous implementation:
  * Stats collection required a separate run with `--qe-calibration`.
  * Specifying `--quantize-eval` without `--qe-stats-file` triggered dynamic quantization.
  * Running with `--quantize-eval --qe-calibration <num>` only ran stats collection and ignored `--quantize-eval`.
* New implementation:
  * Running `--quantize-eval --qe-calibration <num>` now performs stats collection according to the calibration flag, and then quantizes the model with the collected stats (and runs evaluation).
  * Specifying `--quantize-eval` without `--qe-stats-file` triggers the same flow as in the bullet above, as if `--qe-calibration 0.05` had been passed (i.e. 5% of the test set is used for stats).
  * Added a new flag: `--qe-dynamic`. From now on, dynamic quantization must be requested explicitly: `--quantize-eval --qe-dynamic`.
  * As before, `--qe-calibration` can still be run without `--quantize-eval` to perform "stand-alone" stats collection.
  * The following flags, which all represent different ways to control the creation of stats or the use of existing stats, are now mutually exclusive: `--qe-calibration`, `--qe-stats-file`, `--qe-dynamic`, `--qe-config-file`.
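A minimal `argparse` sketch of the flag relationships described above. This is illustrative only, not Distiller's actual implementation; the flag names are taken from this entry, while the dispatch logic is an assumption based on the description:

```python
# Illustrative sketch only: mutually-exclusive quantization flags plus the
# default-calibration fallback described in the entry above.
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--quantize-eval', action='store_true')
group = parser.add_mutually_exclusive_group()
group.add_argument('--qe-calibration', type=float, metavar='PORTION_OF_TEST_SET')
group.add_argument('--qe-stats-file')
group.add_argument('--qe-dynamic', action='store_true')
group.add_argument('--qe-config-file')
args = parser.parse_args()

if args.quantize_eval:
    if args.qe_dynamic:
        print('dynamic quantization, no stats needed')
    elif args.qe_stats_file or args.qe_config_file:
        print('quantize using pre-existing stats / full YAML config')
    else:
        # No stats source given: collect stats first (default: 5% of the test set),
        # then quantize with the collected stats and run evaluation.
        calibration_portion = args.qe_calibration or 0.05
        print(f'collecting stats on {calibration_portion:.0%} of the test set')
```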
-
- Nov 11, 2019
-
-
Neta Zmora authored
* pruning: add an option to virtually fold BN into Conv2D for ranking

  PruningPolicy can be configured using a new control argument, `fold_batchnorm`: when set to `True`, the weights of BatchNorm modules are folded into the weights of Conv2D modules (if Conv2D->BN edges exist in the model graph). Each weight filter is attenuated using a different pair of (gamma, beta) coefficients, so `fold_batchnorm` is relevant for fine-grained and filter-ranking pruning methods. We attenuate using the running values of the mean and variance, as is done in quantization. This control argument is only supported for Conv2D modules (i.e. other convolution operation variants and Linear operations are not supported). E.g.:

      policies:
        - pruner:
            instance_name: low_pruner
            args:
              fold_batchnorm: True
          starting_epoch: 0
          ending_epoch: 30
          frequency: 2

* AGP: non-functional refactoring

  distiller/pruning/automated_gradual_pruner.py: change `prune_to_target_sparsity` to `_set_param_mask_by_sparsity_target`, which is a more appropriate function name since we don't really prune in this function.

* Simplify GEMM weights input-channel ranking logic

  Ranking weight matrices by input channels is similar to ranking 4D Conv weights by input channels, so there is no need for duplicate logic.

  distiller/pruning/ranked_structures_pruner.py:
  - change `prune_to_target_sparsity` to `_set_param_mask_by_sparsity_target`, which is a more appropriate function name since we don't really prune in this function
  - remove the code handling ranking of matrix rows

  distiller/norms.py: remove `rank_cols`.

  distiller/thresholding.py: in `expand_binary_map`, treat the `channels` group_type the same as the `cols` group_type when dealing with 2D weights.

* AGP: add example of ranking filters with virtual BN-folding

  Also update resnet20 AGP examples.
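For reference, the folding arithmetic described above follows the standard Conv->BN identity, using the BN module's running statistics. A minimal sketch (illustrative only, not the PruningPolicy implementation), assuming a `torch.nn.Conv2d` followed by a `torch.nn.BatchNorm2d`:

```python
import torch

def fold_bn_into_conv_weights(conv, bn):
    """Return folded (weight, bias) for a Conv2d followed by a BatchNorm2d."""
    scale = bn.weight / torch.sqrt(bn.running_var + bn.eps)       # gamma / sqrt(var + eps)
    folded_w = conv.weight * scale.reshape(-1, 1, 1, 1)           # scale each output filter
    conv_bias = conv.bias if conv.bias is not None else torch.zeros_like(bn.running_mean)
    folded_b = bn.bias + (conv_bias - bn.running_mean) * scale    # beta + (b - mean) * scale
    return folded_w, folded_b
```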
-
- Nov 10, 2019
-
-
Neta Zmora authored
Refactor EE code and place it in a separate file. Fix resnet50-earlyexit (the inputs of the nn.Linear layers were wrong). Caveats: 1. resnet50-earlyexit still needs to be tested for performance. 2. There is still too much EE code dispersed in apputils/image_classifier.py and compress_classifier.py.
-
Neta Zmora authored
EE runs emit more statistics than the regular classification pipeline, so validating more of the log output makes the correctness check more robust.
-
- Nov 08, 2019
-
-
Neta Zmora authored
Exits can now be attached to any point in the network, by specifying the name of the attachment node and the exit-branch subgraph.
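A hypothetical illustration of the idea (not the actual Distiller EE implementation): one generic way to attach an exit branch at a named node is to hook that module's intermediate activation. The function and variable names below are made up for the example.

```python
import torch.nn as nn

def attach_exit(model, node_name, exit_branch, exit_outputs):
    """Run `exit_branch` on the output of the sub-module named `node_name`."""
    module = dict(model.named_modules())[node_name]

    def hook(_module, _inputs, output):
        exit_outputs.append(exit_branch(output))   # collect this exit's prediction

    return module.register_forward_hook(hook)

# Example: an exit-branch subgraph attached after 'layer1' of a ResNet-style model
# exit_outputs = []
# branch = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, 10))
# handle = attach_exit(model, 'layer1', branch, exit_outputs)
```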
-
- Nov 07, 2019
-
-
Neta Zmora authored
Step 1 of refactoring EE code in order to make it more generic.
-
Neta Zmora authored
-
Neta Zmora authored
Fix the EE code so that it works with the current 'master' branch, and add a test for high-level EE regression
-
Guy Jacob authored
available
-
- Nov 06, 2019
-
-
Guy Jacob authored
Co-authored-by: Bar <29775567+barrh@users.noreply.github.com>
Co-authored-by: Guy Jacob <guy.jacob@intel.com>
-
Lev Zlotnik authored
-
Neta Zmora authored
Remove monthly updates sections (Oct, Nov 2018)
-
- Nov 05, 2019
-
-
Guy Jacob authored
Changed our version to only re-implement BasicBlock and Bottleneck, and duplicate the model creation functions. Other than that, re-use everything from the torchvision implementation.
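A sketch of that reuse pattern (not Distiller's exact code): only the block class and the model-creation function are re-implemented, while the ResNet container itself comes from torchvision. The block tweak shown here is hypothetical, just to motivate why a block would be re-implemented.

```python
import torch.nn as nn
from torchvision.models.resnet import ResNet, BasicBlock

class CustomBasicBlock(BasicBlock):
    # Hypothetical change: give each activation its own module instance.
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.relu1 = nn.ReLU(inplace=True)
        self.relu2 = nn.ReLU(inplace=True)

    def forward(self, x):
        identity = x
        out = self.relu1(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        if self.downsample is not None:
            identity = self.downsample(x)
        return self.relu2(out + identity)

def resnet18(num_classes=1000):
    # Duplicated model-creation function; torchvision's ResNet container is reused as-is.
    return ResNet(CustomBasicBlock, [2, 2, 2, 2], num_classes=num_classes)
```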
-
- Nov 03, 2019
-
-
Neta Zmora authored
add citation and links to published code
-
- Oct 31, 2019
-
-
Guy Jacob authored
* Add a blacklist to the quantizer; in PTQ, put Dropout on the blacklist.
* Update notebooks to use 2-phase stats collection.
* Other small fixes.
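A rough sketch of how such a blacklist can work (illustrative, not Distiller's actual Quantizer code): module types on the blacklist are left untouched when the quantizer walks the model and swaps modules for quantized variants.

```python
import torch.nn as nn

BLACKLIST = (nn.Dropout,)   # e.g. Dropout is a no-op at eval time, so don't quantize it

def replace_modules(model, replacement_factory):
    """replacement_factory: dict mapping module type -> function returning a quantized wrapper."""
    for name, child in model.named_children():
        if isinstance(child, BLACKLIST):
            continue                                      # skip blacklisted module types
        make_quantized = replacement_factory.get(type(child))
        if make_quantized is not None:
            setattr(model, name, make_quantized(child))   # swap in the quantized wrapper
        else:
            replace_modules(child, replacement_factory)   # recurse into containers
```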
-
Neta Zmora authored
- Add function `param_name_2_module_name` to help convert a module's `.weight` or `.bias` parameter tensor name to a fully-qualified module name.
- Remove dead code.
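A minimal sketch of the conversion described above (assumed behavior based on the description, not necessarily the exact implementation):

```python
def param_name_2_module_name(param_name):
    """'layer1.0.conv1.weight' -> 'layer1.0.conv1'"""
    return param_name.rsplit('.', 1)[0]
```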
-
Neta Zmora authored
-
Neta Zmora authored
The number of nnz (non-zero) parameters in the model is printed as a negative number. Fix this issue and also change the label of this field in the log, to better reflect what this value means.
-
Guy Jacob authored
-
Guy Jacob authored
-
- Oct 30, 2019
-
-
Lev Zlotnik authored
* Also added a workaround for stats collection on integer tensors, replacing the previous solution of converting to NumPy.
-
Neta Zmora authored
add citation
-
- Oct 28, 2019
-
-
Neta Zmora authored
Add to the notebook a missing function.
-
- Oct 27, 2019
-
-
Neta Zmora authored
The error logs should only be emitted when there's an error...
-
- Oct 25, 2019
-
-
Neta Zmora authored
Add try/except block around code accessing missing convolution shape information.
-
- Oct 23, 2019
-
-
Neta Zmora authored
Force loading on the CPU, which always has more memory than a single GPU. This is useful for models that cannot be loaded onto a single GPU.
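The standard PyTorch idiom for this behavior maps all storages to the CPU at load time, so a checkpoint saved on GPU can be opened on a memory-rich host (the filename below is just an example):

```python
import torch

checkpoint = torch.load('model_best.pth.tar', map_location='cpu')
```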
-
Neta Zmora authored
-
Neta Zmora authored
As documented in issue #395, some of the command-line examples in the AMC notebooks are incorrect. Also, fix some bugs that were introduced with the refactoring of the low-level pruning API
-
- Oct 22, 2019
-
-
Guy Jacob authored
(since stats collection is now a 2-phase process)
-
Neta Zmora authored
add citation
-
- Oct 07, 2019
-
-
Guy Jacob authored
* Greedy search script for post-training quantization settings
  * Iterates over each layer in the model in order. For each layer, checks a user-defined set of quantization settings and chooses the best one based on validation accuracy.
  * Provided a sample that searches for the best activations-clipping mode per layer, on image classification models.
* Proper handling of mixed-quantization settings in post-train quant:
  * By default, the quantization settings for each layer apply only to output quantization.
  * Propagate quantization settings for activation tensors through the model during execution.
  * For non-quantized inputs to layers that require quantized inputs, fall back to quantizing according to the settings used for the output.
  * In addition, provide a mechanism to override input quantization settings via the YAML configuration file.
  * By default, all modules are now quantized. For module types that don't have a dedicated quantized implementation, "fake" quantization is performed.
* Misc. changes
  * Fuse ReLU/ReLU6 into the predecessor during post-training quantization.
  * Fixes to ACIQ clipping in the half-range case.

Co-authored-by: Lev Zlotnik <lev.zlotnik@intel.com>
Co-authored-by: Guy Jacob <guy.jacob@intel.com>
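A simplified sketch of the greedy per-layer search described above (illustrative only, not the actual script): for each layer in turn, try every candidate setting, keep the one with the best validation accuracy, then move on with that choice fixed.

```python
def greedy_search(layers, candidate_settings, evaluate):
    """`evaluate(chosen)` returns validation accuracy for a dict of per-layer settings."""
    chosen = {}
    for layer in layers:
        best_setting, best_acc = None, float('-inf')
        for setting in candidate_settings:
            acc = evaluate({**chosen, layer: setting})
            if acc > best_acc:
                best_setting, best_acc = setting, acc
        chosen[layer] = best_setting   # fix this layer's setting before moving to the next
    return chosen
```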
-
Neta Zmora authored
-
Neta Zmora authored
As noted in issue #382, logging when a parameter does not have a mask is unnecessary and may confuse users. Therefore, it is removed.
-
Neta Zmora authored
`app_cfg` logs the basic execution environment state, and is deemed important in most circumstances.
-
- Oct 06, 2019
-
-
Neta Zmora authored
Some refactoring of the low-level pruning API.

Added distiller/norms.py for calculating norms of various sub-tensors.

ranked_structures_pruner.py:
- Removed `l1_magnitude` and `l2_magnitude`; use `distiller.norms.l1_norm` instead.
- Lots of refactoring.
- Replaced `LpRankedStructureParameterPruner.ch_binary_map_to_mask` with `distiller.thresholding.expand_binary_map`.
- `FMReconstructionChannelPruner.rank_and_prune_channels` used L2-norm by default and now uses L1-norm (i.e. `magnitude_fn=l2_magnitude` was replaced with `magnitude_fn=distiller.norms.l1_norm`).

thresholding.py:
- Delegated lots of the work to the new norms.py.
- Removed support for 4D structures (entire convolution layers), since that has not been maintained for a long time. This may break some old scripts that remove entire layers.
- Added `expand_binary_map()` explicitly so others can use it. Might need to move it to a different file.
- Removed `threshold_policy()`.

utils.py:
- Use `distiller.norms.xxx` for sparsity stats.
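An illustrative sketch of ranking the input channels of a 4D Conv weight tensor by L1 norm, in the spirit of the norms-based ranking consolidated above (not Distiller's actual code):

```python
import torch

def rank_channels_by_l1(weight):
    """weight: [out_channels, in_channels, kH, kW] -> input-channel indices, weakest first."""
    # L1 norm of each input channel, summed over output filters and kernel spatial dims
    channel_l1 = weight.abs().sum(dim=(0, 2, 3))
    return torch.argsort(channel_l1)

# e.g. select the weakest 50% of input channels for masking:
# weak = rank_channels_by_l1(conv.weight)[: conv.weight.size(1) // 2]
```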
-
Bar authored
Hot-fix for an issue that arises with the FileWriter class on TF v2. Allow only TensorFlow v1.x.
-
Guy Jacob authored
-
Guy Jacob authored
-
- Oct 05, 2019
-
-
Guy Jacob authored
* Create the float copy such that the actual tensor being learned stays the same.
* This way the optimizer doesn't have to be re-created; we just need to add parameter groups if the algorithm requires it (e.g. PACT).
* This also means we don't care about pre-existing parameter groups, as opposed to the previous implementation, which ASSUMED a single existing group.
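A rough sketch of the consequence described above (illustrative, not Distiller's quantizer code): because the learned weight tensors keep their identity, the existing optimizer remains valid, and algorithm-specific parameters (e.g. PACT clipping values) can simply be appended as a new parameter group.

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 10)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# Hypothetical learned clipping parameter, as used by PACT-style quantization-aware training
clip_val = nn.Parameter(torch.tensor(8.0))
optimizer.add_param_group({'params': [clip_val], 'lr': 0.01})
```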
-
Guy Jacob authored
-