- Dec 09, 2019
-
-
Lev Zlotnik authored
Added tables of results for 85% sparsity
-
- Dec 08, 2019
-
-
Guy Jacob authored
* Weights-only PTQ:
  * Allow RangeLinearQuantWrapper to accept num_bits_acts = None, in which case it acts as a simple pass-through during forward.
  * In RangeLinearQuantParamLayerWrapper, if bits_activations is None and num_bits_params > 0, perform quant and de-quant of the parameters instead of just quant (a sketch of this step follows below).
* Activations-only PTQ:
  * Enable activations-only quantization for conv/linear modules. When PostTrainLinearQuantizer detects # bits != None for activations and # bits == None for weights, a fake-quantization wrapper is used.
* Allow passing 0 in the `--qe-bits-acts` and `--qe-bits-wts` command line arguments to invoke weights/activations-only quantization, respectively.
* Minor refactoring for clarity in PostTrainLinearQuantizer's replace_* functions.
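A minimal sketch of the parameter quant/de-quant step in the weights-only path, assuming symmetric per-tensor linear quantization; `fake_quantize_weights` is an illustrative helper, not Distiller's actual API:

```python
import torch

def fake_quantize_weights(w: torch.Tensor, num_bits: int = 8) -> torch.Tensor:
    # Symmetric, per-tensor linear quantization: quantize and immediately
    # de-quantize, so downstream layers still see float tensors.
    qmax = 2 ** (num_bits - 1) - 1
    scale = w.abs().max() / qmax
    w_q = torch.clamp(torch.round(w / scale), -qmax - 1, qmax)
    return w_q * scale  # de-quantized weights, carrying the quantization error

# Weights-only PTQ: parameters are fake-quantized, activations pass through untouched.
conv = torch.nn.Conv2d(3, 16, 3)
with torch.no_grad():
    conv.weight.copy_(fake_quantize_weights(conv.weight, num_bits=8))
```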
-
-
- Dec 03, 2019
-
-
SunYiran authored
-
- Dec 02, 2019
-
-
Neta Zmora authored
compute-summary and png-summary currently work with image classifiers only.
-
Neta Zmora authored
When multi-processing, we want only one process to generate the summary, while the other processes do nothing (lazy bums!)
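A minimal sketch of the usual rank-0 guard, assuming `torch.distributed`; the actual check used in Distiller may differ:

```python
import torch.distributed as dist

def is_main_process() -> bool:
    # With no distributed backend initialized there is only one process anyway.
    return not (dist.is_available() and dist.is_initialized()) or dist.get_rank() == 0

if is_main_process():
    # Only one process generates the summary; the others simply skip this block.
    print("generating summary...")
```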
-
levzlotnik authored
-
Lev Zlotnik authored
Add an example of compressing OD pytorch models. In this example we compress torchvision's object detection models - FasterRCNN / MaskRCNN / KeypointRCNN. We've modified the reference code for object detection to allow easy compression scheduling with YAML configuration.
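A hedged sketch of wiring one of those torchvision detection models to a Distiller compression schedule; `schedule.yaml` is a placeholder path and the training-loop callbacks are abbreviated:

```python
import torch
import torchvision
import distiller

model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True)
optimizer = torch.optim.SGD(model.parameters(), lr=0.005, momentum=0.9)

# Build a CompressionScheduler (pruning/regularization policies) from a YAML file.
compression_scheduler = distiller.file_config(model, optimizer, 'schedule.yaml')

for epoch in range(30):
    compression_scheduler.on_epoch_begin(epoch)
    # ... the usual torchvision detection training loop goes here,
    #     calling on_minibatch_begin()/on_minibatch_end() around each batch ...
    compression_scheduler.on_epoch_end(epoch)
```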
-
levzlotnik authored
of a model by name relative to the root of the model.
-
- Nov 28, 2019
-
-
Neta Zmora authored
- define ALMOST_ONE
- define op_type
- remove sanity assert (need to understand what tolerance value to use in the assert)

Co-authored-by: csc12138
Co-authored-by: wangyidong3
-
Neta Zmora authored
- define ALMOST_ONE
- define op_type
- remove sanity assert (need to understand what tolerance value to use in the assert)
- Nov 27, 2019
-
-
Neta Zmora authored
This will help define and use different performance sorting schemes; e.g. it will address the problem raised in issue #411.
-
Neta Zmora authored
Small variances can occur when using different cudnn versions, even when the environment and distiller version are the same.
-
Neta Zmora authored
Said commit was wrong: the default initializations in pytorch are not the same as in our code. For example, the default convolution weight initialization uses Kaiming-uniform, while we used Kaiming-normal. For backward compatibility of the model behavior, we need to revert to the old behavior. This reverts commit 6913687f.
-
- Nov 25, 2019
-
-
Neta Zmora authored
Update the definition of the exits using info from Haim. This is still very unsatisfactory because we don't have working examples to show users :-(
-
- Nov 17, 2019
-
-
Neta Zmora authored
-
- Nov 16, 2019
-
-
Neta Zmora authored
-
Neta Zmora authored
-
Neta Zmora authored
Except for the case of VGG, our parameter initialization code matched the default pytorch initialization (per torch.nn operation), so writing the initialization code ourselves can only lead to more code and maintenance; we also would not benefit from improvements made at the pytorch level (e.g. if FB finds a better initialization for nn.Conv2d than today's Kaiming init, we would not benefit). The VGG initialization we had was "suspicious", so reverting to the default seems reasonable.
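For reference, a short sketch contrasting PyTorch's default Conv2d initialization (Kaiming-uniform, applied inside the module's own `reset_parameters`) with an explicit Kaiming-normal override; this illustrates the difference discussed above rather than Distiller's exact code:

```python
import torch.nn as nn

model = nn.Sequential(nn.Conv2d(3, 64, 3), nn.ReLU(), nn.Conv2d(64, 64, 3))

# PyTorch already initialized the conv weights with kaiming_uniform_ on construction.
# Reverting to the older Kaiming-normal behavior would look like this:
for m in model.modules():
    if isinstance(m, nn.Conv2d):
        nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
        if m.bias is not None:
            nn.init.zeros_(m.bias)
```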
-
- Nov 14, 2019
-
-
Guy Jacob authored
-
Guy Jacob authored
* summary_graph.py:
  * Change ONNX op.uniqueName() to op.debugName()
  * Removed scope-naming workaround which isn't needed in PyTorch 1.3
* Tests:
  * Naming of trace entries changed in 1.3; fixed the SummaryGraph unit test that checked this
  * Adjusted expected values in full_flow_tests
  * Adjusted tolerance in test_sim_bn_fold
  * Filter some new warnings
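A hedged sketch of handling the PyTorch 1.3 rename when walking a traced graph; the attribute check is one possible way to stay compatible across versions, not necessarily what summary_graph.py does:

```python
import torch

def node_output_names(traced_graph):
    names = []
    for node in traced_graph.nodes():
        for out in node.outputs():
            # PyTorch 1.3 renamed Value.uniqueName() to Value.debugName().
            get_name = getattr(out, 'debugName', None) or getattr(out, 'uniqueName')
            names.append(get_name())
    return names

traced = torch.jit.trace(torch.nn.Linear(4, 2), torch.randn(1, 4))
print(node_output_names(traced.graph))
```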
-
- Nov 13, 2019
-
-
Neta Zmora authored
Prevent exception when loading checkpoints from a home directory
-
Neta Zmora authored
Two strings represented the library version: one in distiller.__init__.py and one in setup.py. This can lead to two different version string values. The fix: have distiller.__init__.py read the version string from the package installation. This assumes that distiller is installed properly, but we've been making this assumption for a long time in our code (e.g. in how we import distiller from the `tests` directory).
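One minimal way to read the version from the installed package metadata (a sketch; the actual code in distiller.__init__.py may differ):

```python
# distiller/__init__.py (sketch)
import pkg_resources

try:
    __version__ = pkg_resources.get_distribution('distiller').version
except pkg_resources.DistributionNotFound:
    # Package not installed (e.g. running from a raw source tree).
    __version__ = 'unknown'
```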
-
Neta Zmora authored
When performing EE validation, the validation loop prints an annoying and redundant log of the iteration number - remove this.
-
Guy Jacob authored
-
Bar authored
* Previous implementation:
  * Stats collection required a separate run with `--qe-calibration`.
  * Specifying `--quantize-eval` without `--qe-stats-file` triggered dynamic quantization.
  * Running with `--quantize-eval --qe-calibration <num>` only ran stats collection and ignored `--quantize-eval`.
* New implementation:
  * Running `--quantize-eval --qe-calibration <num>` will now perform stats collection according to the calibration flag, and then quantize the model with the collected stats (and run evaluation).
  * Specifying `--quantize-eval` without `--qe-stats-file` will trigger the same flow as in the bullet above, as if `--qe-calibration 0.05` was used (i.e. 5% of the test set will be used for stats).
  * Added a new flag: `--qe-dynamic`. From now on, dynamic quantization must be requested explicitly: `--quantize-eval --qe-dynamic`.
  * As before, `--qe-calibration` can still be run without `--quantize-eval` to perform "stand-alone" stats collection.
  * The following flags, which all represent different ways to control creation of stats or use of existing stats, are now mutually exclusive: `--qe-calibration`, `--qe-stats-file`, `--qe-dynamic`, `--qe-config-file`.
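A short sketch of how such mutual exclusion can be expressed with argparse; the flag names match the ones above, but the types and defaults here are illustrative, not Distiller's exact argument definitions:

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--quantize-eval', action='store_true')

# The stats-related flags are mutually exclusive: argparse rejects any
# command line that combines two of them.
group = parser.add_mutually_exclusive_group()
group.add_argument('--qe-calibration', type=float, metavar='PORTION_OF_TEST_SET')
group.add_argument('--qe-stats-file', metavar='PATH')
group.add_argument('--qe-dynamic', action='store_true')
group.add_argument('--qe-config-file', metavar='PATH')

args = parser.parse_args(['--quantize-eval', '--qe-calibration', '0.05'])
print(args.qe_calibration)  # 0.05
```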
-
- Nov 11, 2019
-
-
Neta Zmora authored
* Pruning: add an option to virtually fold BN into Conv2D for ranking.
  PruningPolicy can be configured using a new control argument, fold_batchnorm: when set to `True`, the weights of BatchNorm modules are folded into the weights of Conv2D modules (if Conv2D->BN edges exist in the model graph). Each weights filter is attenuated using a different pair of (gamma, beta) coefficients, so `fold_batchnorm` is relevant for fine-grained and filter-ranking pruning methods. We attenuate using the running values of the mean and variance, as is done in quantization. This control argument is only supported for Conv2D modules (i.e. other convolution operation variants and Linear operations are not supported). E.g.:

      policies:
        - pruner:
            instance_name: low_pruner
            args:
              fold_batchnorm: True
          starting_epoch: 0
          ending_epoch: 30
          frequency: 2

* AGP: non-functional refactoring.
  distiller/pruning/automated_gradual_pruner.py - change `prune_to_target_sparsity` to `_set_param_mask_by_sparsity_target`, which is a more appropriate function name as we don't really prune in this function.
* Simplify GEMM weights input-channel ranking logic.
  Ranking weight matrices by input channels is similar to ranking 4D Conv weights by input channels, so there is no need for duplicate logic.
  distiller/pruning/ranked_structures_pruner.py - change `prune_to_target_sparsity` to `_set_param_mask_by_sparsity_target` (same rename as above), and remove the code handling ranking of matrix rows.
  distiller/norms.py - remove rank_cols.
  distiller/thresholding.py - in expand_binary_map, treat the `channels` group_type the same as the `cols` group_type when dealing with 2D weights.
* AGP: add an example of ranking filters with virtual BN-folding, and update the resnet20 AGP examples.
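For reference, the virtual folding uses the standard BN-folding scale, gamma / sqrt(running_var + eps), applied per output filter before ranking. A minimal sketch, illustrative rather than Distiller's actual code:

```python
import torch

def fold_bn_into_conv_weights(conv: torch.nn.Conv2d, bn: torch.nn.BatchNorm2d) -> torch.Tensor:
    # Each output filter k is scaled by gamma[k] / sqrt(running_var[k] + eps),
    # so filter magnitudes reflect the Conv+BN pair when ranking them for pruning.
    scale = bn.weight / torch.sqrt(bn.running_var + bn.eps)
    return conv.weight * scale.reshape(-1, 1, 1, 1)

conv = torch.nn.Conv2d(3, 16, 3)
bn = torch.nn.BatchNorm2d(16)
folded_w = fold_bn_into_conv_weights(conv, bn)
print(folded_w.shape)  # torch.Size([16, 3, 3, 3])
```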
-
- Nov 10, 2019
-
-
Neta Zmora authored
Refactor EE code and place it in a separate file. Fix resnet50-earlyexit (the inputs of the nn.Linear layers were wrong). Caveats: 1. resnet50-earlyexit needs to be tested for performance. 2. There is still too much EE code dispersed in apputils/image_classifier.py and compress_classifier.py.
-
Neta Zmora authored
EE runs emit more statistics than the regular classification pipeline, and it is more robust to validate more of the log output for correctness validation.
-
- Nov 08, 2019
-
-
Neta Zmora authored
Exits can now be attached to any point in the network, by specifying the name of the attachment node and the exit-branch subgraph.
-
- Nov 07, 2019
-
-
Neta Zmora authored
Step 1 of refactoring EE code in order to make it more generic.
-
Neta Zmora authored
-
Neta Zmora authored
Fix the EE code so that it works with the current 'master' branch, and add a test for high-level EE regression
-
Guy Jacob authored
available
-
- Nov 06, 2019
-
-
Guy Jacob authored
Co-authored-by: Bar <29775567+barrh@users.noreply.github.com>
Co-authored-by: Guy Jacob <guy.jacob@intel.com>
-
Lev Zlotnik authored
-
Neta Zmora authored
Remove monthly updates sections (Oct, Nov 2018)
-
- Nov 05, 2019
-
-
Guy Jacob authored
Changed our version to only re-implement BasicBlock and Bottleneck, and duplicate the model creation functions. Other than that, re-use everything from the torchvision implementation.
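A hedged sketch of that approach: re-use torchvision's ResNet container with a locally re-implemented block, duplicating only the model-creation function (the `DistillerBasicBlock` name is illustrative, not the actual class in the repo):

```python
import torch
from torchvision.models.resnet import ResNet, BasicBlock

class DistillerBasicBlock(BasicBlock):
    # Re-implement only the block (e.g. to expose internal ops as distinct
    # modules); everything else comes from torchvision's ResNet.
    pass

def resnet18_distiller(**kwargs):
    return ResNet(DistillerBasicBlock, [2, 2, 2, 2], **kwargs)

model = resnet18_distiller(num_classes=10)
print(model(torch.randn(1, 3, 224, 224)).shape)  # torch.Size([1, 10])
```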
-
- Nov 03, 2019
-
-
Neta Zmora authored
add citation and links to published code
-