- Oct 31, 2019
Guy Jacob authored
* Add blacklist to quantizer; in PTQ, put Dropout on the blacklist
* Update notebooks to use 2-phase stats collection
* Other small fixes
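A minimal sketch of the blacklist idea described above; the attribute and function names here are illustrative, not the Quantizer's actual API:

```python
import torch.nn as nn

# Illustrative only -- not the Quantizer's actual attribute names. Module types
# on the blacklist are never replaced with quantized variants, even if a
# replacement function is registered for them. nn.Dropout is a natural candidate
# in post-training quantization, since it is a no-op at inference time.
REPLACEMENT_BLACKLIST = (nn.Dropout,)

def should_replace(module: nn.Module) -> bool:
    """Return True if the quantizer may replace this module."""
    return not isinstance(module, REPLACEMENT_BLACKLIST)
```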
- Jul 22, 2019
Guy Jacob authored
The PyTorch trace mechanism doesn't "see" torch.nn.ModuleList modules (since they don't have a forward function). As a result, the mapping from module names at the Python model-definition level to the scope-names at the trace level is not 1:1. This makes it impossible for us to map back from SummaryGraph ops to their respective nn.Modules, which is required for flows like BatchNorm folding and stats fusion in post-training quantization. In #313 we handled this issue specifically in DistillerLSTM, but it makes much more sense to have a generic and automatic solution which doesn't require the user to modify the model. This is such a solution.
* Implemented DistillerModuleList, a replacement for nn.ModuleList which results in full and unique scope-names (see the sketch below)
* See the documentation for this class in summary_graph.py for extensive details on the issue and the solution
* When generating a SummaryGraph, the model is scanned and all instances of torch.nn.ModuleList are replaced with DistillerModuleList
* Added tests for the new functionality
* Partially reverted the changes made to DistillerLSTM in commit 43548deb: kept the refactored _create_cells_list function, but have it create a standard torch.nn.ModuleList (since the ModuleList issue is now handled automatically, there is no need to confuse users with ad-hoc list implementations)
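A rough sketch of the idea (the class name FlatModuleList is hypothetical; the actual DistillerModuleList lives in summary_graph.py and is more involved): keep a plain list-like interface, but register each contained module directly on the parent module so it gets a complete, unique scope-name when traced.

```python
import torch.nn as nn

class FlatModuleList:
    """Illustrative stand-in for the idea behind DistillerModuleList: keep a
    plain-Python list interface, but register each contained module directly
    on the parent under an indexed name. The contained modules then appear as
    regular children of the parent, so their trace-level scope-names are
    complete and unique (unlike with nn.ModuleList, which has no forward())."""

    def __init__(self, name, parent, modules):
        self._list = list(modules)
        for i, m in enumerate(self._list):
            parent.add_module('{}_{}'.format(name, i), m)

    def __getitem__(self, idx):
        return self._list[idx]

    def __iter__(self):
        return iter(self._list)

    def __len__(self):
        return len(self._list)


class Net(nn.Module):
    def __init__(self):
        super().__init__()
        # Instead of: self.layers = nn.ModuleList([...])
        self.layers = FlatModuleList('layers', self,
                                     [nn.Linear(4, 4), nn.Linear(4, 4)])

    def forward(self, x):
        for layer in self.layers:
            x = layer(x)
        return x
```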
- Jul 10, 2019
Guy Jacob authored
* "Net-aware quantization" - using the term coined in https://arxiv.org/abs/1811.09886. (section 3.2.2). Refers to considering sequences of modules when quantizing. This isn't exactly layer fusion - we modify activation stats prior to setting quantization parameters, to make sure that when a module is followed by certain activation functions, only the relevant ranges are quantized. We do this for: * ReLU - Clip all negative values * Tanh / Sigmoid - Clip according to the (approximated) saturation values for these functions. We use [-4, 4] for tanh and [-6, 6] for sigmoid. * Perform batch-norm folding before post-training quantization. Batch-norm parameters are folded into the parameters of the previous layer and the BN layer is replaced with an identity module. * Both BN folding and "net-aware" are now automatically executed in PostTrainLinearQuantizer (details of this change below) * BN folding enabled by new generic mechanism to "fuse" module sequences (at the Python API level) * First module in sequence is replaced/modified by a user-provided function, rest of moudles replaced with nn.Identity * Quantizer changes: * Optionally create adjacency map during prepare_model * Subclasses may enforce adjacency map creation * Refatcoring: Replace _prepare_model_impl with pre and post override-able "callbacks", so core functionality is always executed * PostTrainLinearQuantizer Changes: * Enforce creation of adjacency map. This means users must now pass a dummy input to PostTrainLinearQuantizer.prepare_model * Before module replacement - Apply BN folding and stats updates according to net-aware quantization * Updated the language model quantization tutorial to reflect the new functionality * Updated the image classification post-train quantization samples (command line and YAML) * Other changes: * Distller LSTM implementation: Replace the ModuleList for cells with a plain list. The PyTorch trace mechanism doesn't "see" ModuleList objects, it only sees the contained modules. This means that the "scopeName" of these modules isn't complete, which makes it impossible to match op names in SummaryGraph to modules in the Python model. * ActivationStatsCollector: Ignore nn.Identity modules
- Apr 16, 2019
Lev Zlotnik authored
* Introduce a modular, Python-level implementation of LSTM/LSTMCell using existing PyTorch nn.Modules as building blocks
* This allows quantization of the weights and internal activations of LSTM layers using the existing Quantizer. (In the PyTorch implementation of RNN/LSTM, only the weights are exposed at the Python level, whereas the internal activations are "hidden" in C++ code.)
* Supports stacked (multi-layer) and bi-directional LSTM
* Implemented conversion functions from the PyTorch LSTM module to our LSTM module and vice-versa
* Tests for modular implementation correctness and for the conversions
* Jupyter notebook showing post-training quantization of a language model
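The gist of the modular approach, as a simplified sketch (not the actual DistillerLSTM/DistillerLSTMCell code): build the cell from nn.Linear and element-wise activation modules, so every internal activation is a Python-level module that stats collectors and the Quantizer can see.

```python
import torch
import torch.nn as nn

class ModularLSTMCell(nn.Module):
    """Simplified sketch of an LSTM cell built from plain nn.Modules.
    Each gate activation is its own module so it can be observed/quantized."""

    def __init__(self, input_size, hidden_size):
        super().__init__()
        # One Linear per input produces all four gate pre-activations (i, f, g, o)
        self.fc_gate_x = nn.Linear(input_size, 4 * hidden_size)
        self.fc_gate_h = nn.Linear(hidden_size, 4 * hidden_size)
        self.sigmoid_i = nn.Sigmoid()
        self.sigmoid_f = nn.Sigmoid()
        self.sigmoid_o = nn.Sigmoid()
        self.tanh_g = nn.Tanh()
        self.tanh_c = nn.Tanh()

    def forward(self, x, state):
        h, c = state
        gates = self.fc_gate_x(x) + self.fc_gate_h(h)
        i, f, g, o = gates.chunk(4, dim=-1)
        i, f, o = self.sigmoid_i(i), self.sigmoid_f(f), self.sigmoid_o(o)
        g = self.tanh_g(g)
        c_next = f * c + i * g
        h_next = o * self.tanh_c(c_next)
        return h_next, c_next
```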
- Apr 08, 2019
Lev Zlotnik authored
- Mar 12, 2019
Bar authored
"Peformance" --> "Performance"
- Mar 05, 2019
Neta Zmora authored
- Feb 26, 2019
Lev Zlotnik authored
Not backward compatible - re-installation is required
* Fixes for PyTorch==1.0.0
* Refactor folder structure
* Update installation section in docs
- Aug 09, 2018
Guy Jacob authored
* Instead of a single additive value (which so far represented only the regularizer loss), callbacks return a new overall loss
* Policy callbacks also return the individual loss components used to calculate the new overall loss
* Add a boolean flag to the Scheduler's callback so applications can choose whether they want the individual loss components or just the new overall loss
* In compress_classifier.py, log the individual loss components
* Add a test for the loss-from-callback flow
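An illustrative sketch of the new callback contract; the names PolicyLoss / LossComponent and the callback signature here are simplified and may not match Distiller's exact definitions:

```python
from collections import namedtuple

# Simplified shapes -- not necessarily Distiller's exact definitions.
PolicyLoss = namedtuple('PolicyLoss', ['overall_loss', 'loss_components'])
LossComponent = namedtuple('LossComponent', ['name', 'value'])

class L1RegularizerPolicy:
    """Hypothetical policy: instead of returning just an additive regularization
    term, before_backward_pass returns the new overall loss together with the
    individual components that produced it."""

    def __init__(self, strength):
        self.strength = strength

    def before_backward_pass(self, model, loss):
        reg_loss = self.strength * sum(p.abs().sum() for p in model.parameters())
        return PolicyLoss(overall_loss=loss + reg_loss,
                          loss_components=[LossComponent('l1_reg_loss', reg_loss)])
```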
- Jul 13, 2018
Neta Zmora authored
This is a merge of the ADC branch and master. ADC (using a DDPG RL agent to compress image classifiers) is still WiP and requires an unreleased version of Coach (https://github.com/NervanaSystems/coach).
Small features in this commit:
* Added model_find_module() - find a module object given its name (see the illustrative sketch below)
* Added channel ranking and pruning: pruning/ranked_structures_pruner.py
* Added a CIFAR10 VGG16 model: models/cifar10/vgg_cifar.py
* Thinning: changed the level of some log messages - some of the messages were moved to 'debug' level because they are not usually interesting
* Added a function to print nicely formatted integers - distiller/utils.py
* Sensitivity analysis for channel removal
* compress_classifier.py - handle keyboard interrupts
* compress_classifier.py - fix re-raising of exceptions, so they maintain the call stack
Added tests:
* test_summarygraph.py: test_simplenet() - added a regression test to target a bug that occurs when taking the predecessor of the first node in a graph
* test_ranking.py - test_ch_ranking, test_ranked_channel_pruning
* test_model_summary.py - test_png_generation, test_summary (sparsity/compute/model/modules)
Bug fixes in this commit:
* Thinning bug fix: handle a zero-sized 'indices' tensor. During the thinning process, the 'indices' tensor can become zero-sized and will have an undefined length, so we need to check for this situation when assessing the number of elements in 'indices'
* Language model: adjust main.py to the new distiller.model_summary API
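The model_find_module() helper mentioned above, in spirit (an illustrative version; the real helper is in distiller/utils.py):

```python
import torch.nn as nn

def model_find_module(model: nn.Module, module_name: str):
    """Return the sub-module whose fully-qualified name equals `module_name`,
    or None if no such module exists (illustrative version)."""
    for name, module in model.named_modules():
        if name == module_name:
            return module
    return None
```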
- Jun 21, 2018
Guy Jacob authored
- Jun 14, 2018
Neta Zmora authored
- Jun 13, 2018
Neta Zmora authored
Replace the original "homebrew" optimizer and LR-decay schedule with PyTorch's SGD and ReduceLROnPlateau. SGD with momentum=0 and weight_decay=0, and ReduceLROnPlateau with patience=0 and factor=0.5 will give the same behavior as in the original PyTorch example. Having a standard optimizer and LR-decay schedule gives us the flexibility to experiment with these during the training process.
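The equivalent setup in PyTorch terms (the model and the loop body here are placeholders, not the language-model code):

```python
import torch
import torch.nn as nn
import torch.optim as optim
from torch.optim.lr_scheduler import ReduceLROnPlateau

model = nn.Linear(10, 10)  # placeholder for the language model

# momentum=0 / weight_decay=0 match the original example's plain SGD update;
# patience=0 / factor=0.5 reproduce its "halve the LR whenever validation loss
# fails to improve" behavior.
optimizer = optim.SGD(model.parameters(), lr=20.0,  # the upstream example's default LR
                      momentum=0.0, weight_decay=0.0)
lr_scheduler = ReduceLROnPlateau(optimizer, mode='min', patience=0, factor=0.5)

for epoch in range(3):
    val_loss = torch.rand(1).item()  # placeholder for a real validation pass
    lr_scheduler.step(val_loss)
```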
- Jun 07, 2018
Neta Zmora authored
Added an implementation of Baidu's RNN pruning scheme: Narang, Sharan & Diamos, Gregory & Sengupta, Shubho & Elsen, Erich. (2017). Exploring Sparsity in Recurrent Neural Networks. (https://arxiv.org/abs/1704.05119)
Added an example of word-level language model compression. The language model is based on PyTorch's example: https://github.com/pytorch/examples/tree/master/word_language_model
Added an AGP pruning schedule and an RNN pruning schedule to demonstrate compression of the language model.
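For context, the shape of the AGP sparsity schedule (Zhu & Gurevich, 2017) used by the example, as a standalone sketch rather than Distiller's implementation:

```python
def agp_target_sparsity(s_initial, s_final, epoch, start_epoch, end_epoch):
    """Automated Gradual Pruning (Zhu & Gurevich, 2017): the target sparsity
    ramps from s_initial to s_final along a cubic curve, so most of the
    pruning happens early, while the network can still recover."""
    if epoch <= start_epoch:
        return s_initial
    if epoch >= end_epoch:
        return s_final
    progress = (epoch - start_epoch) / (end_epoch - start_epoch)
    return s_final + (s_initial - s_final) * (1.0 - progress) ** 3
```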