- Nov 13, 2019
-
-
Bar authored
* Previous implementation:
  * Stats collection required a separate run with `--qe-calibration`.
  * Specifying `--quantize-eval` without `--qe-stats-file` triggered dynamic quantization.
  * Running with `--quantize-eval --qe-calibration <num>` only ran stats collection and ignored `--quantize-eval`.
* New implementation:
  * Running `--quantize-eval --qe-calibration <num>` now performs stats collection according to the calibration flag, and then quantizes the model with the collected stats (and runs evaluation).
  * Specifying `--quantize-eval` without `--qe-stats-file` triggers the same flow as in the bullet above, as if `--qe-calibration 0.05` had been passed (i.e. 5% of the test set is used for stats).
  * Added a new flag: `--qe-dynamic`. From now on, dynamic quantization must be requested explicitly: `--quantize-eval --qe-dynamic`.
  * As before, `--qe-calibration` can still be run without `--quantize-eval` to perform "stand-alone" stats collection.
  * The following flags, which all represent different ways to control the creation of stats or the use of existing stats, are now mutually exclusive: `--qe-calibration`, `--qe-stats-file`, `--qe-dynamic`, `--qe-config-file`.
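A minimal `argparse` sketch of the flag relationships described above. This is illustrative only, not Distiller's actual implementation; the flag names are taken from this entry, while the dispatch logic is an assumption based on the description:

```python
# Illustrative sketch only: mutually-exclusive quantization flags plus the
# default-calibration fallback described in the entry above.
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--quantize-eval', action='store_true')
group = parser.add_mutually_exclusive_group()
group.add_argument('--qe-calibration', type=float, metavar='PORTION_OF_TEST_SET')
group.add_argument('--qe-stats-file')
group.add_argument('--qe-dynamic', action='store_true')
group.add_argument('--qe-config-file')
args = parser.parse_args()

if args.quantize_eval:
    if args.qe_dynamic:
        print('dynamic quantization, no stats needed')
    elif args.qe_stats_file or args.qe_config_file:
        print('quantize using pre-existing stats / full YAML config')
    else:
        # No stats source given: collect stats first (default: 5% of the test set),
        # then quantize with the collected stats and run evaluation.
        calibration_portion = args.qe_calibration or 0.05
        print(f'collecting stats on {calibration_portion:.0%} of the test set')
```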
-
- Nov 11, 2019
-
-
Neta Zmora authored
* pruning: add an option to virtually fold BN into Conv2D for ranking

  PruningPolicy can be configured using a new control argument, `fold_batchnorm`: when set to `True`, the weights of BatchNorm modules are folded into the weights of Conv2D modules (if Conv2D->BN edges exist in the model graph). Each weight filter is attenuated using a different pair of (gamma, beta) coefficients, so `fold_batchnorm` is relevant for fine-grained and filter-ranking pruning methods. We attenuate using the running values of the mean and variance, as is done in quantization. This control argument is only supported for Conv2D modules (i.e. other convolution operation variants and Linear operations are not supported). E.g.:

      policies:
        - pruner:
            instance_name: low_pruner
            args:
              fold_batchnorm: True
          starting_epoch: 0
          ending_epoch: 30
          frequency: 2

* AGP: non-functional refactoring

  distiller/pruning/automated_gradual_pruner.py: change `prune_to_target_sparsity` to `_set_param_mask_by_sparsity_target`, which is a more appropriate function name since we don't really prune in this function.

* Simplify GEMM weights input-channel ranking logic

  Ranking weight matrices by input channels is similar to ranking 4D Conv weights by input channels, so there is no need for duplicate logic.

  distiller/pruning/ranked_structures_pruner.py:
  - change `prune_to_target_sparsity` to `_set_param_mask_by_sparsity_target`, which is a more appropriate function name since we don't really prune in this function
  - remove the code handling ranking of matrix rows

  distiller/norms.py: remove `rank_cols`.

  distiller/thresholding.py: in `expand_binary_map`, treat the `channels` group_type the same as the `cols` group_type when dealing with 2D weights.

* AGP: add example of ranking filters with virtual BN-folding

  Also update resnet20 AGP examples.
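For reference, the folding arithmetic described above follows the standard Conv->BN identity, using the BN module's running statistics. A minimal sketch (illustrative only, not the PruningPolicy implementation), assuming a `torch.nn.Conv2d` followed by a `torch.nn.BatchNorm2d`:

```python
import torch

def fold_bn_into_conv_weights(conv, bn):
    """Return folded (weight, bias) for a Conv2d followed by a BatchNorm2d."""
    scale = bn.weight / torch.sqrt(bn.running_var + bn.eps)       # gamma / sqrt(var + eps)
    folded_w = conv.weight * scale.reshape(-1, 1, 1, 1)           # scale each output filter
    conv_bias = conv.bias if conv.bias is not None else torch.zeros_like(bn.running_mean)
    folded_b = bn.bias + (conv_bias - bn.running_mean) * scale    # beta + (b - mean) * scale
    return folded_w, folded_b
```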
-
- Nov 10, 2019
-
-
Neta Zmora authored
Refactor EE code and place it in a separate file. Fix resnet50-earlyexit (the inputs of the nn.Linear layers were wrong). Caveats: 1. resnet50-earlyexit still needs to be tested for performance. 2. There is still too much EE code dispersed in apputils/image_classifier.py and compress_classifier.py.
-
Neta Zmora authored
EE runs emit more statistics than the regular classification pipeline, so validating more of the log output makes the correctness check more robust.
-
- Nov 08, 2019
-
-
Neta Zmora authored
Exits can now be attached to any point in the network, by specifying the name of the attachment node and the exit-branch subgraph.
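A hypothetical illustration of the idea (not the actual Distiller EE implementation): one generic way to attach an exit branch at a named node is to hook that module's intermediate activation. The function and variable names below are made up for the example.

```python
import torch.nn as nn

def attach_exit(model, node_name, exit_branch, exit_outputs):
    """Run `exit_branch` on the output of the sub-module named `node_name`."""
    module = dict(model.named_modules())[node_name]

    def hook(_module, _inputs, output):
        exit_outputs.append(exit_branch(output))   # collect this exit's prediction

    return module.register_forward_hook(hook)

# Example: an exit-branch subgraph attached after 'layer1' of a ResNet-style model
# exit_outputs = []
# branch = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, 10))
# handle = attach_exit(model, 'layer1', branch, exit_outputs)
```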
-
- Nov 07, 2019
-
-
Neta Zmora authored
Step 1 of refactoring EE code in order to make it more generic.
-
Neta Zmora authored
-
Neta Zmora authored
Fix the EE code so that it works with the current 'master' branch, and add a test for high-level EE regression
-
Guy Jacob authored
available
-
- Nov 06, 2019
-
-
Guy Jacob authored
Co-authored-by: Bar <29775567+barrh@users.noreply.github.com>
Co-authored-by: Guy Jacob <guy.jacob@intel.com>
-
Lev Zlotnik authored
-
Neta Zmora authored
Remove monthly updates sections (Oct, Nov 2018)
-
- Nov 05, 2019
-
-
Guy Jacob authored
Changed our version to only re-implement BasicBlock and Bottleneck, and duplicate the model creation functions. Other than that, re-use everything from the torchvision implementation.
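A sketch of that reuse pattern (not Distiller's exact code): only the block class and the model-creation function are re-implemented, while the ResNet container itself comes from torchvision. The block tweak shown here is hypothetical, just to motivate why a block would be re-implemented.

```python
import torch.nn as nn
from torchvision.models.resnet import ResNet, BasicBlock

class CustomBasicBlock(BasicBlock):
    # Hypothetical change: give each activation its own module instance.
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.relu1 = nn.ReLU(inplace=True)
        self.relu2 = nn.ReLU(inplace=True)

    def forward(self, x):
        identity = x
        out = self.relu1(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        if self.downsample is not None:
            identity = self.downsample(x)
        return self.relu2(out + identity)

def resnet18(num_classes=1000):
    # Duplicated model-creation function; torchvision's ResNet container is reused as-is.
    return ResNet(CustomBasicBlock, [2, 2, 2, 2], num_classes=num_classes)
```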
-
- Nov 03, 2019
-
-
Neta Zmora authored
add citation and links to published code
-
- Oct 31, 2019
-
-
Guy Jacob authored
* Add a blacklist to the quantizer; in PTQ, put Dropout on the blacklist.
* Update notebooks to use 2-phase stats collection.
* Other small fixes.
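A rough sketch of how such a blacklist can work (illustrative, not Distiller's actual Quantizer code): module types on the blacklist are left untouched when the quantizer walks the model and swaps modules for quantized variants.

```python
import torch.nn as nn

BLACKLIST = (nn.Dropout,)   # e.g. Dropout is a no-op at eval time, so don't quantize it

def replace_modules(model, replacement_factory):
    """replacement_factory: dict mapping module type -> function returning a quantized wrapper."""
    for name, child in model.named_children():
        if isinstance(child, BLACKLIST):
            continue                                      # skip blacklisted module types
        make_quantized = replacement_factory.get(type(child))
        if make_quantized is not None:
            setattr(model, name, make_quantized(child))   # swap in the quantized wrapper
        else:
            replace_modules(child, replacement_factory)   # recurse into containers
```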
-
Neta Zmora authored
- Add function `param_name_2_module_name` to help convert a module's `.weight` or `.bias` parameter tensor name to a fully-qualified module name.
- Remove dead code.
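A minimal sketch of the conversion described above (assumed behavior based on the description, not necessarily the exact implementation):

```python
def param_name_2_module_name(param_name):
    """'layer1.0.conv1.weight' -> 'layer1.0.conv1'"""
    return param_name.rsplit('.', 1)[0]
```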
-
Neta Zmora authored
-
Neta Zmora authored
The number of nnz (non-zero) parameters in the model is printed as a negative number. Fix this issue and also change the label of this field in the log, to better reflect what this value means.
-
Guy Jacob authored
-
Guy Jacob authored
-
- Oct 30, 2019
-
-
Lev Zlotnik authored
* Also added a workaround for stats collection on integer tensors, replacing the previous solution of converting to NumPy.
-
Neta Zmora authored
add citation
-
- Oct 28, 2019
-
-
Neta Zmora authored
Add to the notebook a missing function.
-
- Oct 27, 2019
-
-
Neta Zmora authored
The error logs should only be emitted when there's an error...
-
- Oct 25, 2019
-
-
Neta Zmora authored
Add try/except block around code accessing missing convolution shape information.
-
- Oct 23, 2019
-
-
Neta Zmora authored
Force loading on the CPU, which always has more memory than a single GPU. This is useful for models that cannot be loaded onto a single GPU.
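The standard PyTorch idiom for this behavior maps all storages to the CPU at load time, so a checkpoint saved on GPU can be opened on a memory-rich host (the filename below is just an example):

```python
import torch

checkpoint = torch.load('model_best.pth.tar', map_location='cpu')
```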
-
Neta Zmora authored
-
Neta Zmora authored
As documented in issue #395, some of the command-line examples in the AMC notebooks are incorrect. Also, fix some bugs that were introduced with the refactoring of the low-level pruning API
-
- Oct 22, 2019
-
-
Guy Jacob authored
(since stats collection is now a 2-phase process)
-
Neta Zmora authored
add citation
-
- Oct 07, 2019
-
-
Guy Jacob authored
* Greedy search script for post-training quantization settings
  * Iterates over each layer in the model in order. For each layer, checks a user-defined set of quantization settings and chooses the best one based on validation accuracy.
  * Provided a sample that searches for the best activations-clipping mode per layer, on image classification models.
* Proper handling of mixed-quantization settings in post-train quant:
  * By default, the quantization settings for each layer apply only to output quantization.
  * Propagate quantization settings for activation tensors through the model during execution.
  * For non-quantized inputs to layers that require quantized inputs, fall back to quantizing according to the settings used for the output.
  * In addition, provide a mechanism to override input quantization settings via the YAML configuration file.
  * By default, all modules are now quantized. For module types that don't have a dedicated quantized implementation, "fake" quantization is performed.
* Misc. changes
  * Fuse ReLU/ReLU6 into the predecessor during post-training quantization.
  * Fixes to ACIQ clipping in the half-range case.

Co-authored-by: Lev Zlotnik <lev.zlotnik@intel.com>
Co-authored-by: Guy Jacob <guy.jacob@intel.com>
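A simplified sketch of the greedy per-layer search described above (illustrative only, not the actual script): for each layer in turn, try every candidate setting, keep the one with the best validation accuracy, then move on with that choice fixed.

```python
def greedy_search(layers, candidate_settings, evaluate):
    """`evaluate(chosen)` returns validation accuracy for a dict of per-layer settings."""
    chosen = {}
    for layer in layers:
        best_setting, best_acc = None, float('-inf')
        for setting in candidate_settings:
            acc = evaluate({**chosen, layer: setting})
            if acc > best_acc:
                best_setting, best_acc = setting, acc
        chosen[layer] = best_setting   # fix this layer's setting before moving to the next
    return chosen
```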
-
Neta Zmora authored
-
Neta Zmora authored
As noted in issue #382, logging when a parameter does not have a mask is unnecessary and may confuse users. Therefore, it is removed.
-
Neta Zmora authored
`app_cfg` logs the basic execution environment state, and is deemed important in most circumstances.
-
- Oct 06, 2019
-
-
Neta Zmora authored
Some refactoring of the low-level pruning API.

Added distiller/norms.py for calculating norms of various sub-tensors.

ranked_structures_pruner.py:
- Removed `l1_magnitude` and `l2_magnitude`; use `distiller.norms.l1_norm` instead.
- Lots of refactoring.
- Replaced `LpRankedStructureParameterPruner.ch_binary_map_to_mask` with `distiller.thresholding.expand_binary_map`.
- `FMReconstructionChannelPruner.rank_and_prune_channels` used L2-norm by default and now uses L1-norm (i.e. `magnitude_fn=l2_magnitude` was replaced with `magnitude_fn=distiller.norms.l1_norm`).

thresholding.py:
- Delegated lots of the work to the new norms.py.
- Removed support for 4D structures (entire convolution layers), since that has not been maintained for a long time. This may break some old scripts that remove entire layers.
- Added `expand_binary_map()` explicitly so others can use it. Might need to move it to a different file.
- Removed `threshold_policy()`.

utils.py:
- Use `distiller.norms.xxx` for sparsity stats.
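An illustrative sketch of ranking the input channels of a 4D Conv weight tensor by L1 norm, in the spirit of the norms-based ranking consolidated above (not Distiller's actual code):

```python
import torch

def rank_channels_by_l1(weight):
    """weight: [out_channels, in_channels, kH, kW] -> input-channel indices, weakest first."""
    # L1 norm of each input channel, summed over output filters and kernel spatial dims
    channel_l1 = weight.abs().sum(dim=(0, 2, 3))
    return torch.argsort(channel_l1)

# e.g. select the weakest 50% of input channels for masking:
# weak = rank_channels_by_l1(conv.weight)[: conv.weight.size(1) // 2]
```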
-
Bar authored
Hot-fix for an issue that arises with the FileWriter class on TF v2. Allow only TensorFlow v1.x.
-
Guy Jacob authored
-
Guy Jacob authored
-
- Oct 05, 2019
-
-
Guy Jacob authored
* Create the float copy such that the actual tensor being learned stays the same.
* This way the optimizer doesn't have to be re-created; we just need to add parameter groups if the algorithm requires it (e.g. PACT).
* This also means we don't care about pre-existing parameter groups, as opposed to the previous implementation, which ASSUMED a single existing group.
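A rough sketch of the consequence described above (illustrative, not Distiller's quantizer code): because the learned weight tensors keep their identity, the existing optimizer remains valid, and algorithm-specific parameters (e.g. PACT clipping values) can simply be appended as a new parameter group.

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 10)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# Hypothetical learned clipping parameter, as used by PACT-style quantization-aware training
clip_val = nn.Parameter(torch.tensor(8.0))
optimizer.add_param_group({'params': [clip_val], 'lr': 0.01})
```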
-
Guy Jacob authored
-