- Nov 28, 2019
Neta Zmora authored
- define ALMOST_ONE
- define op_type
- remove sanity assert (need to understand what tolerance value to use in the assert)

Co-authored-by: csc12138
Co-authored-by: wangyidong3
- Nov 17, 2019
Neta Zmora authored
- Nov 16, 2019
Neta Zmora authored
Neta Zmora authored
- Nov 13, 2019
Bar authored
* Previous implementation:
  * Stats collection required a separate run with `--qe-calibration`.
  * Specifying `--quantize-eval` without `--qe-stats-file` triggered dynamic quantization.
  * Running with `--quantize-eval --qe-calibration <num>` only ran stats collection and ignored `--quantize-eval`.
* New implementation:
  * Running `--quantize-eval --qe-calibration <num>` will now perform stats collection according to the calibration flag, and then quantize the model with the collected stats (and run evaluation).
  * Specifying `--quantize-eval` without `--qe-stats-file` triggers the same flow as in the bullet above, as if `--qe-calibration 0.05` was used (i.e. 5% of the test set is used for stats collection).
  * Added a new flag: `--qe-dynamic`. From now on, dynamic quantization must be requested explicitly: `--quantize-eval --qe-dynamic`.
  * As before, `--qe-calibration` can still be used without `--quantize-eval` to perform "stand-alone" stats collection.
  * The following flags, which all represent different ways to control the creation of stats or the use of existing stats, are now mutually exclusive: `--qe-calibration`, `--qe-stats-file`, `--qe-dynamic`, `--qe-config-file`. A sketch of this wiring and of the resulting decision flow follows below.
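A minimal, self-contained sketch of how mutually exclusive flags of this kind can be wired with argparse, together with the decision flow described above. This is not Distiller's actual parser; it only illustrates the pattern:

```python
import argparse

# Not Distiller's actual parser; just the wiring pattern for flags that are
# mutually exclusive, plus the decision flow described above.
parser = argparse.ArgumentParser()
parser.add_argument('--quantize-eval', action='store_true')
group = parser.add_mutually_exclusive_group()
group.add_argument('--qe-calibration', type=float, metavar='PORTION')
group.add_argument('--qe-stats-file')
group.add_argument('--qe-dynamic', action='store_true')
group.add_argument('--qe-config-file')

args = parser.parse_args(['--quantize-eval', '--qe-calibration', '0.05'])

if args.quantize_eval:
    if args.qe_dynamic:
        print('dynamic post-training quantization')
    elif args.qe_stats_file:
        print('quantize using stats from', args.qe_stats_file)
    else:
        portion = args.qe_calibration or 0.05  # default: 5% of the test set
        print(f'collect stats on {portion:.0%} of the test set, then quantize and evaluate')
elif args.qe_calibration:
    print('stand-alone stats collection')
```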
- Nov 11, 2019
Neta Zmora authored
* pruning: add an option to virtually fold BN into Conv2D for ranking

  PruningPolicy can be configured using a new control argument, fold_batchnorm: when set to `True`, the weights of BatchNorm modules are folded into the weights of Conv2D modules (if Conv2D->BN edges exist in the model graph). Each weight filter is attenuated using a different pair of (gamma, beta) coefficients, so `fold_batchnorm` is relevant for fine-grained and filter-ranking pruning methods. We attenuate using the running values of the mean and variance, as is done in quantization. This control argument is only supported for Conv2D modules (i.e. other convolution operation variants and Linear operations are not supported). A sketch of the folding arithmetic appears after this list. E.g.:

  policies:
    - pruner:
        instance_name: low_pruner
        args:
          fold_batchnorm: True
      starting_epoch: 0
      ending_epoch: 30
      frequency: 2

* AGP: non-functional refactoring

  distiller/pruning/automated_gradual_pruner.py – change `prune_to_target_sparsity` to `_set_param_mask_by_sparsity_target`, which is a more appropriate function name, as we don't really prune in this function.

* Simplify GEMM weights input-channel ranking logic

  Ranking weight matrices by input channels is similar to ranking 4D Conv weights by input channels, so there is no need for duplicate logic.

  distiller/pruning/ranked_structures_pruner.py
  - change `prune_to_target_sparsity` to `_set_param_mask_by_sparsity_target`, which is a more appropriate function name, as we don't really prune in this function
  - remove the code handling ranking of matrix rows

  distiller/norms.py – remove rank_cols.

  distiller/thresholding.py – in expand_binary_map, treat the `channels` group_type the same as the `cols` group_type when dealing with 2D weights.

* AGP: add example of ranking filters with virtual BN-folding

  Also update the resnet20 AGP examples.
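A minimal sketch of the folding arithmetic, assuming standard Conv2D->BN folding with running statistics. `fold_bn_into_conv` is an illustrative helper, not Distiller's implementation, and only the gamma/variance scaling that affects weight ranking is shown (the beta term folds into the bias, which does not participate in weight ranking):

```python
import torch
import torch.nn as nn

def fold_bn_into_conv(conv: nn.Conv2d, bn: nn.BatchNorm2d) -> torch.Tensor:
    """Return the conv weights with the BN scale "virtually" folded in,
    using the BN running statistics (as in quantization-time folding).
    This is for ranking only; the model itself is not modified."""
    with torch.no_grad():
        scale = bn.weight / torch.sqrt(bn.running_var + bn.eps)  # per filter
        # Broadcast each filter's scale over its (in_ch, kH, kW) dimensions
        return conv.weight * scale.view(-1, 1, 1, 1)

# Usage: rank filters by the L1 norm of the BN-folded weights
conv, bn = nn.Conv2d(3, 16, 3), nn.BatchNorm2d(16)
scores = fold_bn_into_conv(conv, bn).abs().sum(dim=(1, 2, 3))
```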
- Nov 07, 2019
Neta Zmora authored
Fix the EE (early-exit) code so that it works with the current 'master' branch, and add a regression test for high-level EE functionality.
- Nov 06, 2019
Guy Jacob authored
Co-authored-by: Bar <29775567+barrh@users.noreply.github.com>
Co-authored-by: Guy Jacob <guy.jacob@intel.com>
- Oct 31, 2019
- Oct 23, 2019
Neta Zmora authored
Force loading on the CPU, which always has more memory than a single GPU. This is useful for models that cannot be loaded onto a single GPU. A sketch of the loading call follows below.
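This is the standard `torch.load` `map_location` mechanism; a minimal sketch (the checkpoint path is illustrative):

```python
import torch

# Map all tensors in the checkpoint to the CPU, regardless of the device
# they were saved from. The checkpoint path is illustrative.
checkpoint = torch.load('large_model_checkpoint.pth.tar', map_location='cpu')
```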
Neta Zmora authored
Neta Zmora authored
As documented in issue #395, some of the command-line examples in the AMC notebooks are incorrect. Also fix some bugs that were introduced with the refactoring of the low-level pruning API.
- Oct 07, 2019
Guy Jacob authored
* Greedy search script for post-training quantization settings
  * Iterates over each layer in the model in order. For each layer, checks a user-defined set of quantization settings and chooses the best one based on validation accuracy (a sketch of this greedy loop follows after this list).
  * Provided a sample that searches for the best activations-clipping mode per layer, on image classification models.
* Proper handling of mixed-quantization settings in post-train quant:
  * By default, the quantization settings for each layer apply only to output quantization.
  * Propagate quantization settings for activation tensors through the model during execution.
  * For non-quantized inputs to layers that require quantized inputs, fall back to quantizing according to the settings used for the output.
  * In addition, provide a mechanism to override input quantization settings via the YAML configuration file.
* By default, all modules are now quantized. For module types that don't have a dedicated quantized implementation, "fake" quantization is performed.
* Misc. changes:
  * Fuse ReLU/ReLU6 into the predecessor during post-training quantization.
  * Fixes to ACIQ clipping in the half-range case.

Co-authored-by: Lev Zlotnik <lev.zlotnik@intel.com>
Co-authored-by: Guy Jacob <guy.jacob@intel.com>
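A minimal sketch of the greedy per-layer loop described in the first bullet; `layers`, `candidates`, `apply_setting`, and `eval_fn` are illustrative stand-ins, not the script's actual API:

```python
# Illustrative sketch of the greedy per-layer search; `apply_setting` and
# `eval_fn` are stand-ins for the script's actual machinery.

def greedy_search(layers, candidates, apply_setting, eval_fn):
    """Visit layers in order; for each, keep the candidate quantization
    setting with the best validation accuracy, given the settings already
    committed for earlier layers."""
    chosen = {}
    for layer in layers:
        best_acc, best_setting = float('-inf'), None
        for setting in candidates:
            apply_setting(layer, setting)   # tentatively apply
            acc = eval_fn()                 # validation accuracy
            if acc > best_acc:
                best_acc, best_setting = acc, setting
        apply_setting(layer, best_setting)  # commit the winner for this layer
        chosen[layer] = best_setting
    return chosen
```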
Neta Zmora authored
- Oct 06, 2019
Neta Zmora authored
Some refactoring of the low-level pruning API.

Added distiller/norms.py – for calculating norms of various sub-tensors. A sketch of this kind of sub-tensor norm computation follows below.

ranked_structures_pruner.py:
- Removed l1_magnitude, l2_magnitude. Use distiller.norms.l1_norm instead.
- Lots of refactoring.
- Replaced LpRankedStructureParameterPruner.ch_binary_map_to_mask with distiller.thresholding.expand_binary_map.
- FMReconstructionChannelPruner.rank_and_prune_channels used L2-norm by default and now uses L1-norm (i.e. magnitude_fn=l2_magnitude was replaced with magnitude_fn=distiller.norms.l1_norm).

thresholding.py:
- Delegated lots of the work to the new norms.py.
- Removed support for 4D (entire convolution layers), since that has not been maintained for a long time. This may break some old scripts that remove entire layers.
- Added expand_binary_map() explicitly so others can use it. Might need to move it to a different file.
- Removed threshold_policy().

utils.py:
- Use distiller.norms.xxx for sparsity stats.
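An illustrative example of the kind of sub-tensor norm computation norms.py centralizes (not Distiller's actual code):

```python
import torch

def channel_l1_norms(weights: torch.Tensor) -> torch.Tensor:
    """L1 norm of each input channel of a 4D conv weights tensor
    (filters x channels x kernel-height x kernel-width)."""
    return weights.abs().sum(dim=(0, 2, 3))

def filter_l1_norms(weights: torch.Tensor) -> torch.Tensor:
    """L1 norm of each filter (output channel) of a 4D conv weights tensor."""
    return weights.abs().sum(dim=(1, 2, 3))

w = torch.randn(16, 8, 3, 3)      # 16 filters, 8 input channels, 3x3 kernels
print(channel_l1_norms(w).shape)  # torch.Size([8])
print(filter_l1_norms(w).shape)   # torch.Size([16])
```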
- Sep 27, 2019
Neta Zmora authored
Neta Zmora authored
Neta Zmora authored
Move these files to their true location, instead of using soft links. Also add a short README file to the distiller/examples/baseline_networks directory.
- Sep 24, 2019
Guy Jacob authored
* And removed an unnecessary argument from the execution env logging function
- Sep 18, 2019
Guy Jacob authored
Neta Zmora authored
Neta Zmora authored
A bundle of very small, mostly non-functional changes to the code, mostly unrelated to each other.

../../../distiller/apputils/checkpoint.py – add info to an exception
../../../distiller/apputils/image_classifier.py – remove the unused `--extras` command-line argument
../../../distiller/thinning.py – code refactoring (non-functional), except for adding a new public API: contract_model()
../../classifier_compression/compress_classifier.py – use contract_model() when using `--thinnify`
../../lottery_ticket/README.md – remove illegal characters in the text
- Sep 10, 2019
Yury Nahshan authored
ACIQ clipping method, as described in: "Post training 4-bit quantization of convolutional networks for rapid-deployment" (Ron Banner, Yury Nahshan, Daniel Soudry; NeurIPS 2019), https://arxiv.org/abs/1810.05723. A sketch of the clipping idea follows below.

Co-authored-by: Yury Nahshan <yury.nahshan@intel.com>
Co-authored-by: Lev Zlotnik <lev.zlotnik@intel.com>
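A minimal sketch of the idea for the Laplace case: fit the distribution's scale b from the tensor, then clip at the analytically derived multiple of b that minimizes quantization MSE. The 2/3/4-bit multipliers below are the Laplace values reported in the paper; this is an illustration, not Distiller's implementation:

```python
import torch

# Sketch of the ACIQ idea for the Laplace case: estimate the distribution's
# scale b from the tensor, then clip at alpha* = c * b, where c is the
# analytically derived multiplier that minimizes quantization MSE.
# The multipliers are the Laplace values reported in the paper
# (2.83 / 3.89 / 5.03 for 2 / 3 / 4 bits). Illustration only.
LAPLACE_CLIP_MULT = {2: 2.83, 3: 3.89, 4: 5.03}

def aciq_laplace_clip(x: torch.Tensor, num_bits: int) -> torch.Tensor:
    b = (x - x.mean()).abs().mean().item()  # ML estimate of the Laplace scale
    alpha = LAPLACE_CLIP_MULT[num_bits] * b
    return x.clamp(-alpha, alpha)

activations = torch.randn(1000) * 3
clipped = aciq_laplace_clip(activations, num_bits=4)
```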
- Sep 06, 2019
Neta Zmora authored
Integrate the code for the DDPG agent from https://github.com/mit-han-lab/amc-release. The instructions for cloning HAN's code and then making changes to fit Distiller were too complicated, so we added the integrated files to distiller/examples/auto_compression/amc/rl_lib/hanlab.
- Sep 02, 2019
Neta Zmora authored
Mainly: moved NetworkWrapper to a separate file.
- Sep 01, 2019
Neta Zmora authored
FMReconstructionChannelPruner: add support for nn.Linear layers.

utils.py: add non_zero_channels()
thinning: support removing channels from FC layers preceding Conv layers
test_pruning.py: add test_row_pruning()
scheduler: init from a dictionary of Maskers
coach_if.py: fix imports of Clipped-PPO and TD3
- Aug 28, 2019
Neta Zmora authored
This command-line argument allows us to save the randomly-initialized model before training (useful for the lottery-ticket method). This commit was accidentally left out of the Lottery-Ticket Hypothesis commit from Aug 26.
- Aug 26, 2019
Neta Zmora authored
Added support for saving the randomly initialized network before starting training, and added an implementation showing how to extract a (winning) lottery ticket from the pristine network and the pruned network. A sketch of the extraction step follows below.
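A minimal sketch of lottery-ticket extraction under the usual formulation: keep the pruned network's sparsity pattern, but reset the surviving weights to their pristine (randomly initialized) values. File names are illustrative, and the checkpoints are assumed to be plain dicts of weight tensors:

```python
import torch

# Extract a ticket: the pruned network's sparsity pattern applied to the
# pristine (randomly initialized) weights. File names are illustrative, and
# the checkpoints are assumed to be plain {name: tensor} dicts.
initial = torch.load('net_untrained.pth.tar', map_location='cpu')
pruned = torch.load('net_pruned.pth.tar', map_location='cpu')

ticket = {}
for name, w_pruned in pruned.items():
    mask = (w_pruned != 0).float()       # surviving-weight pattern
    ticket[name] = initial[name] * mask  # pristine values, pruned pattern

torch.save(ticket, 'winning_ticket.pth.tar')
```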
- Aug 13, 2019
Neta Zmora authored
Neta Zmora authored
Neta Zmora authored
- Aug 11, 2019
Neta Zmora authored
When using the `-s` flag, which prints the compression scheduler's pruning-mask keys, we now also print a table with the fine-grained sparsity of each mask. A sketch of the sparsity computation follows below.
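A minimal sketch of the per-mask computation: fine-grained sparsity is simply the fraction of zeroed elements. The masks dict below is illustrative:

```python
import torch

# Fine-grained sparsity of a mask = fraction of elements that are zero.
# The masks dict below is illustrative.
masks = {'module.conv1.weight': torch.ones(16, 3, 3, 3),
         'module.fc.weight': torch.zeros(10, 64)}

for name, mask in masks.items():
    sparsity = (mask == 0).float().mean().item()
    print(f'{name:30s} {sparsity:.2%}')
```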
Neta Zmora authored
Neta Zmora authored
Neta Zmora authored
- Aug 08, 2019