- Nov 28, 2019
Neta Zmora authored
- define ALMOST_ONE
- define op_type
- remove sanity assert (need to understand what tolerance value to use in the assert)

Co-authored-by: csc12138
Co-authored-by: wangyidong3
- Nov 17, 2019
Neta Zmora authored
- Nov 16, 2019
Neta Zmora authored
Neta Zmora authored
- Nov 13, 2019
Bar authored
* Previous implementation:
  * Stats collection required a separate run with `--qe-calibration`.
  * Specifying `--quantize-eval` without `--qe-stats-file` triggered dynamic quantization.
  * Running with `--quantize-eval --qe-calibration <num>` only ran stats collection and ignored `--quantize-eval`.
* New implementation:
  * Running `--quantize-eval --qe-calibration <num>` will now perform stats collection according to the calibration flag, and then quantize the model with the collected stats (and run evaluation).
  * Specifying `--quantize-eval` without `--qe-stats-file` triggers the same flow as in the bullet above, as if `--qe-calibration 0.05` was used (i.e. 5% of the test set is used for stats collection).
  * Added a new flag: `--qe-dynamic`. From now on, dynamic quantization must be requested explicitly: `--quantize-eval --qe-dynamic`.
  * As before, `--qe-calibration` can still be used without `--quantize-eval` to perform "stand-alone" stats collection.
  * The following flags, which all represent different ways to control the creation of stats or the use of existing stats, are now mutually exclusive: `--qe-calibration`, `--qe-stats-file`, `--qe-dynamic`, `--qe-config-file`. A sketch of this wiring and of the resulting decision flow follows below.
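A minimal, self-contained sketch of how mutually exclusive flags of this kind can be wired with argparse, together with the decision flow described above. This is not Distiller's actual parser; it only illustrates the pattern:

```python
import argparse

# Not Distiller's actual parser; just the wiring pattern for flags that are
# mutually exclusive, plus the decision flow described above.
parser = argparse.ArgumentParser()
parser.add_argument('--quantize-eval', action='store_true')
group = parser.add_mutually_exclusive_group()
group.add_argument('--qe-calibration', type=float, metavar='PORTION')
group.add_argument('--qe-stats-file')
group.add_argument('--qe-dynamic', action='store_true')
group.add_argument('--qe-config-file')

args = parser.parse_args(['--quantize-eval', '--qe-calibration', '0.05'])

if args.quantize_eval:
    if args.qe_dynamic:
        print('dynamic post-training quantization')
    elif args.qe_stats_file:
        print('quantize using stats from', args.qe_stats_file)
    else:
        portion = args.qe_calibration or 0.05  # default: 5% of the test set
        print(f'collect stats on {portion:.0%} of the test set, then quantize and evaluate')
elif args.qe_calibration:
    print('stand-alone stats collection')
```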
- Nov 11, 2019
Neta Zmora authored
* pruning: add an option to virtually fold BN into Conv2D for ranking

  PruningPolicy can be configured using a new control argument, fold_batchnorm: when set to `True`, the weights of BatchNorm modules are folded into the weights of Conv2D modules (if Conv2D->BN edges exist in the model graph). Each weight filter is attenuated using a different pair of (gamma, beta) coefficients, so `fold_batchnorm` is relevant for fine-grained and filter-ranking pruning methods. We attenuate using the running values of the mean and variance, as is done in quantization. This control argument is only supported for Conv2D modules (i.e. other convolution operation variants and Linear operations are not supported). A sketch of the folding arithmetic appears after this list. E.g.:

  policies:
    - pruner:
        instance_name: low_pruner
        args:
          fold_batchnorm: True
      starting_epoch: 0
      ending_epoch: 30
      frequency: 2

* AGP: non-functional refactoring

  distiller/pruning/automated_gradual_pruner.py – change `prune_to_target_sparsity` to `_set_param_mask_by_sparsity_target`, which is a more appropriate function name, as we don't really prune in this function.

* Simplify GEMM weights input-channel ranking logic

  Ranking weight matrices by input channels is similar to ranking 4D Conv weights by input channels, so there is no need for duplicate logic.

  distiller/pruning/ranked_structures_pruner.py
  - change `prune_to_target_sparsity` to `_set_param_mask_by_sparsity_target`, which is a more appropriate function name, as we don't really prune in this function
  - remove the code handling ranking of matrix rows

  distiller/norms.py – remove rank_cols.

  distiller/thresholding.py – in expand_binary_map, treat the `channels` group_type the same as the `cols` group_type when dealing with 2D weights.

* AGP: add example of ranking filters with virtual BN-folding

  Also update the resnet20 AGP examples.
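A minimal sketch of the folding arithmetic, assuming standard Conv2D->BN folding with running statistics. `fold_bn_into_conv` is an illustrative helper, not Distiller's implementation, and only the gamma/variance scaling that affects weight ranking is shown (the beta term folds into the bias, which does not participate in weight ranking):

```python
import torch
import torch.nn as nn

def fold_bn_into_conv(conv: nn.Conv2d, bn: nn.BatchNorm2d) -> torch.Tensor:
    """Return the conv weights with the BN scale "virtually" folded in,
    using the BN running statistics (as in quantization-time folding).
    This is for ranking only; the model itself is not modified."""
    with torch.no_grad():
        scale = bn.weight / torch.sqrt(bn.running_var + bn.eps)  # per filter
        # Broadcast each filter's scale over its (in_ch, kH, kW) dimensions
        return conv.weight * scale.view(-1, 1, 1, 1)

# Usage: rank filters by the L1 norm of the BN-folded weights
conv, bn = nn.Conv2d(3, 16, 3), nn.BatchNorm2d(16)
scores = fold_bn_into_conv(conv, bn).abs().sum(dim=(1, 2, 3))
```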
- Nov 07, 2019
Neta Zmora authored
Fix the EE (early-exit) code so that it works with the current 'master' branch, and add a regression test for high-level EE functionality.
- Nov 06, 2019
Guy Jacob authored
Co-authored-by: Bar <29775567+barrh@users.noreply.github.com>
Co-authored-by: Guy Jacob <guy.jacob@intel.com>
- Oct 31, 2019
- Oct 23, 2019
Neta Zmora authored
Force loading on the CPU, which always has more memory than a single GPU. This is useful for models that cannot be loaded onto a single GPU. A sketch of the loading call follows below.
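This is the standard `torch.load` `map_location` mechanism; a minimal sketch (the checkpoint path is illustrative):

```python
import torch

# Map all tensors in the checkpoint to the CPU, regardless of the device
# they were saved from. The checkpoint path is illustrative.
checkpoint = torch.load('large_model_checkpoint.pth.tar', map_location='cpu')
```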
Neta Zmora authored
Neta Zmora authored
As documented in issue #395, some of the command-line examples in the AMC notebooks are incorrect. Also fix some bugs that were introduced with the refactoring of the low-level pruning API.
- Oct 07, 2019
Guy Jacob authored
* Greedy search script for post-training quantization settings
  * Iterates over each layer in the model in order. For each layer, checks a user-defined set of quantization settings and chooses the best one based on validation accuracy (a sketch of this greedy loop follows after this list).
  * Provided a sample that searches for the best activations-clipping mode per layer, on image classification models.
* Proper handling of mixed-quantization settings in post-train quant:
  * By default, the quantization settings for each layer apply only to output quantization.
  * Propagate quantization settings for activation tensors through the model during execution.
  * For non-quantized inputs to layers that require quantized inputs, fall back to quantizing according to the settings used for the output.
  * In addition, provide a mechanism to override input quantization settings via the YAML configuration file.
* By default, all modules are now quantized. For module types that don't have a dedicated quantized implementation, "fake" quantization is performed.
* Misc. changes:
  * Fuse ReLU/ReLU6 into the predecessor during post-training quantization.
  * Fixes to ACIQ clipping in the half-range case.

Co-authored-by: Lev Zlotnik <lev.zlotnik@intel.com>
Co-authored-by: Guy Jacob <guy.jacob@intel.com>
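A minimal sketch of the greedy per-layer loop described in the first bullet; `layers`, `candidates`, `apply_setting`, and `eval_fn` are illustrative stand-ins, not the script's actual API:

```python
# Illustrative sketch of the greedy per-layer search; `apply_setting` and
# `eval_fn` are stand-ins for the script's actual machinery.

def greedy_search(layers, candidates, apply_setting, eval_fn):
    """Visit layers in order; for each, keep the candidate quantization
    setting with the best validation accuracy, given the settings already
    committed for earlier layers."""
    chosen = {}
    for layer in layers:
        best_acc, best_setting = float('-inf'), None
        for setting in candidates:
            apply_setting(layer, setting)   # tentatively apply
            acc = eval_fn()                 # validation accuracy
            if acc > best_acc:
                best_acc, best_setting = acc, setting
        apply_setting(layer, best_setting)  # commit the winner for this layer
        chosen[layer] = best_setting
    return chosen
```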
Neta Zmora authored
- Oct 06, 2019
Neta Zmora authored
Some refactoring of the low-level pruning API.

Added distiller/norms.py – for calculating norms of various sub-tensors. A sketch of this kind of sub-tensor norm computation follows below.

ranked_structures_pruner.py:
- Removed l1_magnitude, l2_magnitude. Use distiller.norms.l1_norm instead.
- Lots of refactoring.
- Replaced LpRankedStructureParameterPruner.ch_binary_map_to_mask with distiller.thresholding.expand_binary_map.
- FMReconstructionChannelPruner.rank_and_prune_channels used L2-norm by default and now uses L1-norm (i.e. magnitude_fn=l2_magnitude was replaced with magnitude_fn=distiller.norms.l1_norm).

thresholding.py:
- Delegated lots of the work to the new norms.py.
- Removed support for 4D (entire convolution layers), since that has not been maintained for a long time. This may break some old scripts that remove entire layers.
- Added expand_binary_map() explicitly so others can use it. Might need to move it to a different file.
- Removed threshold_policy().

utils.py:
- Use distiller.norms.xxx for sparsity stats.
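An illustrative example of the kind of sub-tensor norm computation norms.py centralizes (not Distiller's actual code):

```python
import torch

def channel_l1_norms(weights: torch.Tensor) -> torch.Tensor:
    """L1 norm of each input channel of a 4D conv weights tensor
    (filters x channels x kernel-height x kernel-width)."""
    return weights.abs().sum(dim=(0, 2, 3))

def filter_l1_norms(weights: torch.Tensor) -> torch.Tensor:
    """L1 norm of each filter (output channel) of a 4D conv weights tensor."""
    return weights.abs().sum(dim=(1, 2, 3))

w = torch.randn(16, 8, 3, 3)      # 16 filters, 8 input channels, 3x3 kernels
print(channel_l1_norms(w).shape)  # torch.Size([8])
print(filter_l1_norms(w).shape)   # torch.Size([16])
```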
- Sep 27, 2019
Neta Zmora authored
Neta Zmora authored
Neta Zmora authored
Move these files to their true location, instead of using soft links. Also add a short README file to the distiller/examples/baseline_networks directory.
- Sep 24, 2019
Guy Jacob authored
* And removed an unnecessary argument from the execution env logging function
- Sep 18, 2019
Guy Jacob authored
Neta Zmora authored
Neta Zmora authored
A bundle of very small, mostly non-functional changes to the code, mostly unrelated to each other.

../../../distiller/apputils/checkpoint.py – add info to an exception
../../../distiller/apputils/image_classifier.py – remove the unused `--extras` command-line argument
../../../distiller/thinning.py – code refactoring (non-functional), except for adding a new public API: contract_model()
../../classifier_compression/compress_classifier.py – use contract_model() when using `--thinnify`
../../lottery_ticket/README.md – remove illegal characters in the text
- Sep 10, 2019
Yury Nahshan authored
ACIQ clipping method, as described in: "Post training 4-bit quantization of convolutional networks for rapid-deployment" (Ron Banner, Yury Nahshan, Daniel Soudry; NeurIPS 2019), https://arxiv.org/abs/1810.05723. A sketch of the clipping idea follows below.

Co-authored-by: Yury Nahshan <yury.nahshan@intel.com>
Co-authored-by: Lev Zlotnik <lev.zlotnik@intel.com>
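A minimal sketch of the idea for the Laplace case: fit the distribution's scale b from the tensor, then clip at the analytically derived multiple of b that minimizes quantization MSE. The 2/3/4-bit multipliers below are the Laplace values reported in the paper; this is an illustration, not Distiller's implementation:

```python
import torch

# Sketch of the ACIQ idea for the Laplace case: estimate the distribution's
# scale b from the tensor, then clip at alpha* = c * b, where c is the
# analytically derived multiplier that minimizes quantization MSE.
# The multipliers are the Laplace values reported in the paper
# (2.83 / 3.89 / 5.03 for 2 / 3 / 4 bits). Illustration only.
LAPLACE_CLIP_MULT = {2: 2.83, 3: 3.89, 4: 5.03}

def aciq_laplace_clip(x: torch.Tensor, num_bits: int) -> torch.Tensor:
    b = (x - x.mean()).abs().mean().item()  # ML estimate of the Laplace scale
    alpha = LAPLACE_CLIP_MULT[num_bits] * b
    return x.clamp(-alpha, alpha)

activations = torch.randn(1000) * 3
clipped = aciq_laplace_clip(activations, num_bits=4)
```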
- Sep 06, 2019
Neta Zmora authored
Integrate the code for the DDPG agent from https://github.com/mit-han-lab/amc-release. The instructions for cloning HAN's code and then making changes to fit Distiller were too complicated, so we added the integrated files to distiller/examples/auto_compression/amc/rl_lib/hanlab.
- Sep 02, 2019
Neta Zmora authored
Mainly: moved NetworkWrapper to a separate file.
- Sep 01, 2019
Neta Zmora authored
FMReconstructionChannelPruner: add support for nn.Linear layers.

utils.py: add non_zero_channels()
thinning: support removing channels from FC layers preceding Conv layers
test_pruning.py: add test_row_pruning()
scheduler: init from a dictionary of Maskers
coach_if.py: fix imports of Clipped-PPO and TD3
- Aug 28, 2019
Neta Zmora authored
This command-line argument allows us to save the randomly-initialized model before training (useful for the lottery-ticket method). This commit was accidentally left out of the Lottery-Ticket Hypothesis commit from Aug 26.
- Aug 26, 2019
Neta Zmora authored
Added support for saving the randomly initialized network before starting training, and added an implementation showing how to extract a (winning) lottery ticket from the pristine network and the pruned network. A sketch of the extraction step follows below.
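A minimal sketch of lottery-ticket extraction under the usual formulation: keep the pruned network's sparsity pattern, but reset the surviving weights to their pristine (randomly initialized) values. File names are illustrative, and the checkpoints are assumed to be plain dicts of weight tensors:

```python
import torch

# Extract a ticket: the pruned network's sparsity pattern applied to the
# pristine (randomly initialized) weights. File names are illustrative, and
# the checkpoints are assumed to be plain {name: tensor} dicts.
initial = torch.load('net_untrained.pth.tar', map_location='cpu')
pruned = torch.load('net_pruned.pth.tar', map_location='cpu')

ticket = {}
for name, w_pruned in pruned.items():
    mask = (w_pruned != 0).float()       # surviving-weight pattern
    ticket[name] = initial[name] * mask  # pristine values, pruned pattern

torch.save(ticket, 'winning_ticket.pth.tar')
```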
- Aug 13, 2019
Neta Zmora authored
Neta Zmora authored
Neta Zmora authored
- Aug 11, 2019
Neta Zmora authored
When using the `-s` flag, which prints the compression scheduler's pruning-mask keys, we now also print a table with the fine-grained sparsity of each mask. A sketch of the sparsity computation follows below.
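A minimal sketch of the per-mask computation: fine-grained sparsity is simply the fraction of zeroed elements. The masks dict below is illustrative:

```python
import torch

# Fine-grained sparsity of a mask = fraction of elements that are zero.
# The masks dict below is illustrative.
masks = {'module.conv1.weight': torch.ones(16, 3, 3, 3),
         'module.fc.weight': torch.zeros(10, 64)}

for name, mask in masks.items():
    sparsity = (mask == 0).float().mean().item()
    print(f'{name:30s} {sparsity:.2%}')
```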
Neta Zmora authored
Neta Zmora authored
Neta Zmora authored
- Aug 08, 2019