- Apr 08, 2019
Neta Zmora authored
Add finer control over the pruning logic, to accommodate more pruning use-cases. The full description of the new logic is available in the updated [documentation of the CompressionScheduler](https://nervanasystems.github.io/distiller/schedule.html#pruning-fine-control), which is also part of this PR. In this PR:
* Added a new callback to the CompressionScheduler: compression_scheduler.before_parameter_optimization, which is invoked after the gradients are computed, but before the weights are updated by the optimizer.
* Added an option to mask the gradients before the weights are updated by the optimizer. We register to the parameter backward hook in order to mask the gradients, which gives us finer control over the parameter updates (see the sketch below).
* Added several DropFilter schedules. DropFilter is a method to regularize networks, and it can also be used to "prepare" a network for permanent filter pruning.
* Added documentation of pruning fine-control.
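The gradient-masking mechanism can be illustrated with a minimal PyTorch sketch (this is not Distiller's code; the layer, mask, and hook function here are hypothetical): a hook registered on a parameter runs after backward() computes its gradient and before optimizer.step() applies it, so pruned weights stay pruned.

```python
import torch
import torch.nn as nn

# Sketch only: zero the gradients of weights that are already masked out,
# so the optimizer step cannot "revive" them.
layer = nn.Linear(8, 4)
mask = (torch.rand_like(layer.weight) > 0.5).float()  # hypothetical binary mask

def mask_gradient(grad, mask=mask):
    # Invoked after autograd computes the gradient, before optimizer.step().
    return grad * mask

layer.weight.register_hook(mask_gradient)

loss = layer(torch.randn(2, 8)).sum()
loss.backward()  # the gradients of masked-out weights are zeroed here
assert torch.all(layer.weight.grad[mask == 0] == 0)
```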
- Apr 01, 2019
Lev Zlotnik authored
* Bias handling:
  * Add 'bits_bias' parameter to explicitly specify the number of bits for bias, similar to weights and activations.
  * BREAKING: Remove the now-redundant 'quantize_bias' boolean parameter.
* Custom overrides:
  * Expand the semantics of the overrides dict to allow overriding of other parameters in addition to bit-widths.
  * Functions registered in the quantizer's 'replacement_factory' can define keyword arguments. Non-bit-width entries in the overrides dict are checked against the function signature and passed to it.
* BREAKING:
  * Changed the name of 'bits_overrides' to simply 'overrides'.
  * Bit-width overrides must now be defined using the full parameter names - 'bits_activations/weights/bias' - instead of the short-hands 'acts' and 'wts' used so far (see the example below).
* Added/updated relevant tests.
* Modified all quantization YAMLs under 'examples' to reflect these changes.
* Updated docs.
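As a minimal illustration of the renamed dict (the layer-name patterns below are hypothetical; the 'bits_*' keys are the full names mentioned above):

```python
from collections import OrderedDict

# Sketch of the new 'overrides' dict: the full 'bits_activations' / 'bits_weights' /
# 'bits_bias' keys replace the old 'acts' / 'wts' short-hands. The layer patterns
# are illustrative only.
overrides = OrderedDict([
    ("conv1",           {"bits_activations": 8, "bits_weights": 4, "bits_bias": 8}),
    ("classifier\\..*", {"bits_activations": 4, "bits_weights": 2}),
])
```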
- Mar 29, 2019
Songyi Blair Han authored
- Feb 26, 2019
Lev Zlotnik authored
Not backward compatible - re-installation is required
* Fixes for PyTorch==1.0.0
* Refactoring folder structure
* Update installation section in docs
- Feb 11, 2019
Guy Jacob authored
Summary of changes:
1. Post-train quantization based on pre-collected statistics (see the sketch below)
2. Quantized concat, element-wise addition / multiplication and embeddings
3. Move post-train quantization command-line args out of the sample code
4. Configure post-train quantization from YAML for more fine-grained control
(See PR #136 for a more detailed description of the changes.)
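One way to picture item 1 is a calibration pass that records per-layer activation ranges up front; the sketch below only illustrates the general idea (the function name, module filter, and stats format are assumptions, not Distiller's API).

```python
import torch
import torch.nn as nn

# Sketch: collect per-layer activation min/max during a calibration pass, so
# post-training quantization parameters can be derived from pre-collected
# statistics rather than computed at inference time.
def collect_activation_stats(model, calibration_batches):
    stats, hooks = {}, []

    def make_hook(name):
        def hook(_module, _inputs, output):
            entry = stats.setdefault(name, {"min": float("inf"), "max": float("-inf")})
            entry["min"] = min(entry["min"], output.min().item())
            entry["max"] = max(entry["max"], output.max().item())
        return hook

    for name, module in model.named_modules():
        if isinstance(module, (nn.Conv2d, nn.Linear, nn.ReLU)):
            hooks.append(module.register_forward_hook(make_hook(name)))

    with torch.no_grad():
        for x in calibration_batches:
            model(x)
    for h in hooks:
        h.remove()
    return stats

model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.ReLU(), nn.Flatten(),
                      nn.Linear(8 * 30 * 30, 10))
stats = collect_activation_stats(model, [torch.randn(2, 3, 32, 32) for _ in range(3)])
```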
- Dec 11, 2018
- Dec 09, 2018
103yiran authored
- Dec 06, 2018
Neta Zmora authored
- Moved the Language model and struct pruning tutorials from the Wiki to the HTML documentation. Love the ease of Wiki, but GitHub doesn't let Google crawl these pages, and users can't open PRs on Wiki pages.
- Updated the pruning algorithms documentation
- Dec 04, 2018
Guy Jacob authored
* Asymmetric post-training quantization (only symmetric was supported until now) - see the sketch below
* Quantization-aware training for range-based (min-max) symmetric and asymmetric quantization
* Per-channel quantization support in both training and post-training
* Added tests and examples
* Updated documentation
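For reference, here is a minimal sketch of range-based asymmetric quantization (per-tensor, unsigned); it illustrates the general scheme and is not Distiller's exact implementation.

```python
import torch

# Sketch: map a float tensor to [0, 2^n - 1] using its min/max range, keeping
# a scale and zero-point so the tensor can be approximately reconstructed.
def asymmetric_quantize(x, num_bits=8):
    qmin, qmax = 0, 2 ** num_bits - 1
    x_min, x_max = x.min(), x.max()
    scale = (x_max - x_min).clamp(min=1e-8) / (qmax - qmin)
    zero_point = torch.round(-x_min / scale)
    q = torch.clamp(torch.round(x / scale) + zero_point, qmin, qmax)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    return (q - zero_point) * scale

x = torch.randn(4, 4)
q, scale, zp = asymmetric_quantize(x)
x_hat = dequantize(q, scale, zp)  # reconstruction error is bounded by ~scale/2
```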
- Nov 25, 2018
Neta Zmora authored
- Nov 24, 2018
Neta Zmora authored
Thanks to Dan Alistarh for bringing this issue to my attention. The activations of Linear layers have shape (batch_size, output_size), while those of Convolution layers have shape (batch_size, num_channels, width, height), and this distinction in shape was not handled correctly. This commit also fixes sparsity computation for very large activations, as seen in VGG16, which can lead to memory exhaustion. One solution is to use smaller batch sizes, but this commit takes a different approach: it counts zeros “manually”, which uses less space (see the sketch below). Also in this commit:
- Added a “caveats” section to the documentation.
- Added more tests.
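The space-saving idea can be sketched as follows (a simplified illustration, not the commit's actual code): keep only running counters per layer instead of holding full activation tensors for a later sparsity computation.

```python
import torch

# Sketch: accumulate zero/total counts per mini-batch so the sparsity of very
# large activations can be computed without storing them.
stats = {"zeros": 0, "total": 0}

def update_sparsity_stats(activation, stats=stats):
    stats["zeros"] += int((activation == 0).sum().item())
    stats["total"] += activation.numel()

for _ in range(4):  # e.g. one update per mini-batch
    act = torch.relu(torch.randn(16, 64, 56, 56))
    update_sparsity_stats(act)

sparsity = stats["zeros"] / stats["total"]  # fraction of zero activations
```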
- Nov 21, 2018
Neta Zmora authored
Neta Zmora authored
Add docs/conditional_computation.md, which was accidentally left out of an earlier commit.
- Nov 08, 2018
Haim Barad authored
* Updated stats computation - fixes issues with validation stats
* Clarification of output (docs)
* Update
* Moved validation stats to separate function
- Nov 07, 2018
Neta Zmora authored
- Nov 06, 2018
Haim Barad authored
* Fixed validation stats and added new summary stats
* Trimmed some comments
* Improved figure for documentation
* Minor updates
- Nov 04, 2018
Neta Zmora authored
- Oct 03, 2018
Neta Zmora authored
Latest versions of Jupyter notebooks have a different syntax for launching the server such that it listens on all network interfaces (this is useful if you are running the Jupyter server on one machine, and connecting to it from a browser on a different machine). So:
jupyter-notebook --ip=* --no-browser
is replaced by:
jupyter-notebook --ip=0.0.0.0 --no-browser
- Sep 16, 2018
Neta Zmora authored
* A temporary fix for issue 36. The thinning code assumes that the sgraph it is using is not data-parallel, because it (currently) accesses the layer-name keys using a "normalized" name ("module." is removed). The bug is that in thinning.py#L73 we create a data_parallel=True model and then give it to sgraph, while in other places the thinning code uses "normalized" keys (for example in thinning.py#L264). The temporary fix configures data_parallel=False in thinning.py#L73. A long-term solution should have SummaryGraph know how to handle both parallel and non-parallel models. This can be done by having SummaryGraph convert the layer names it receives through its API to the non-data-parallel format using normalize_layer_name (see the sketch below), and use the de-normalized format when returning results.
* Fix the documentation error from issue 36.
* Move some logs to debug, and show in logging.conf how to enable DEBUG logs.
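The normalization described above amounts to mapping between the names a DataParallel-wrapped model exposes and the names of the plain model. A minimal sketch of that mapping (not necessarily Distiller's exact normalize_layer_name implementation):

```python
# Sketch: translate layer names between a DataParallel-wrapped model
# ("module.features.0") and the plain model ("features.0").
def normalize_layer_name(name: str) -> str:
    return name[len("module."):] if name.startswith("module.") else name

def denormalize_layer_name(name: str) -> str:
    return name if name.startswith("module.") else "module." + name

assert normalize_layer_name("module.features.0.weight") == "features.0.weight"
assert denormalize_layer_name("features.0.weight") == "module.features.0.weight"
```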
- Sep 03, 2018
Guy Jacob authored
* Implemented as a Policy
* Integrated in image classification sample
* Updated docs and README
- Jul 31, 2018
Haim Barad authored
Enabling Early Exit strategy in image classifier example
- Jul 22, 2018
Gal Novik authored
* Adding the PACT quantization method (see the sketch below)
* Move the logic that modifies the optimizer due to changes the quantizer makes into the Quantizer itself
* Updated documentation and tests
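The core idea behind PACT can be shown with a short sketch (this is the published PACT clipping formulation written in generic PyTorch, not Distiller's implementation): activations are clipped to a learnable upper bound alpha, which is trained together with the weights.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PACTClip(nn.Module):
    # Clip activations to [0, alpha], where alpha is a learnable parameter.
    def __init__(self, init_alpha=6.0):
        super().__init__()
        self.alpha = nn.Parameter(torch.tensor(init_alpha))

    def forward(self, x):
        # y = clip(x, 0, alpha); torch.min routes the gradient to alpha
        # wherever the input exceeds alpha.
        return torch.min(F.relu(x), self.alpha)

act = PACTClip()
y = act(torch.randn(4, 8) * 10)
y.sum().backward()
print(act.alpha.grad)  # non-zero if some inputs were clipped at alpha
```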
- Jul 17, 2018
Guy Jacob authored
* Add Quantizer unit tests
* Require 'bits_overrides' to be an OrderedDict, to support overlapping patterns in a predictable manner (see the sketch below) + update documentation to reflect this
* Quantizer class cleanup:
  * Use "public" nn.Module APIs instead of protected attributes
  * Call the builtins setattr/getattr/delattr instead of the class special methods (__***__)
* Fix issues reported in #24
* Bug in RangeLinearQuantParamLayerWrapper - add an explicit override of pre_quantized_forward accepting a single input (#15)
* Add DoReFa test to full_flow_tests
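Why an OrderedDict: when override patterns overlap, the lookup order determines which entry applies. The sketch below assumes a first-match-wins lookup with regex patterns and illustrative bit-width values; it is not the Quantizer's actual schema.

```python
import re
from collections import OrderedDict

# Sketch: a plain dict gives no guaranteed iteration order (pre-Python 3.7),
# so overlapping patterns could resolve unpredictably. An OrderedDict makes
# "first match wins" deterministic.
bits_overrides = OrderedDict([
    (r"features\.0\.conv", {"bits": 8}),  # specific layer: keep 8 bits
    (r"features\..*",      {"bits": 4}),  # everything else under 'features'
])

def lookup(layer_name):
    for pattern, cfg in bits_overrides.items():
        if re.fullmatch(pattern, layer_name):
            return cfg
    return None

assert lookup("features.0.conv") == {"bits": 8}
assert lookup("features.7.conv") == {"bits": 4}
```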
- Jul 01, 2018
Guy Jacob authored
* Scale of bias and parentheses were wrong
- Jun 22, 2018
Thomas Fan authored
Reviewed and looking good. We have to set a convention for naming files.
- Jun 21, 2018
- Jun 14, 2018
Neta Zmora authored
Neta Zmora authored
Neta Zmora authored
- May 22, 2018
Neta Zmora authored
Two places in the documentation gave the wrong path to the example Alexnet sensitivity pruning schedule.
- May 14, 2018
Guy Jacob authored
- May 13, 2018
Neta Zmora authored
- Apr 30, 2018
Guy Jacob authored
- Apr 28, 2018
Neta Zmora authored
- Apr 24, 2018
Neta Zmora authored
Neta Zmora authored
Neta Zmora authored
Neta Zmora authored