- Oct 31, 2019
Guy Jacob authored
* Add blacklist to quantizer; in PTQ, put Dropout on the blacklist
* Update notebooks to use 2-phase stats collection
* Other small fixes
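A minimal sketch of the blacklist idea described above; the attribute and function names here are illustrative, not the Quantizer's actual API:

```python
import torch.nn as nn

# Illustrative only -- not the Quantizer's actual attribute names. Module types
# on the blacklist are never replaced with quantized variants, even if a
# replacement function is registered for them. nn.Dropout is a natural candidate
# in post-training quantization, since it is a no-op at inference time.
REPLACEMENT_BLACKLIST = (nn.Dropout,)

def should_replace(module: nn.Module) -> bool:
    """Return True if the quantizer may replace this module."""
    return not isinstance(module, REPLACEMENT_BLACKLIST)
```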
- Jul 22, 2019
Guy Jacob authored
The PyTorch trace mechanism doesn't "see" torch.nn.ModuleList modules (since they don't have a forward function). As a result, the mapping from module names at the Python model-definition level to the scope-names at the trace level is not 1:1. This makes it impossible for us to map back from SummaryGraph ops to their respective nn.Modules, which is required for flows like BatchNorm folding and stats fusion in post-training quantization. In #313 we handled this issue specifically in DistillerLSTM, but it makes much more sense to have a generic and automatic solution which doesn't require the user to modify the model. This is such a solution.
* Implemented DistillerModuleList, a replacement for nn.ModuleList which results in full and unique scope-names (see the sketch below)
* See the documentation for this class in summary_graph.py for extensive details on the issue and the solution
* When generating a SummaryGraph, the model is scanned and all instances of torch.nn.ModuleList are replaced with DistillerModuleList
* Added tests for the new functionality
* Partially reverted the changes made to DistillerLSTM in commit 43548deb: kept the refactored _create_cells_list function, but have it create a standard torch.nn.ModuleList (since the ModuleList issue is now handled automatically, there is no need to confuse users with ad-hoc list implementations)
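A rough sketch of the idea (the class name FlatModuleList is hypothetical; the actual DistillerModuleList lives in summary_graph.py and is more involved): keep a plain list-like interface, but register each contained module directly on the parent module so it gets a complete, unique scope-name when traced.

```python
import torch.nn as nn

class FlatModuleList:
    """Illustrative stand-in for the idea behind DistillerModuleList: keep a
    plain-Python list interface, but register each contained module directly
    on the parent under an indexed name. The contained modules then appear as
    regular children of the parent, so their trace-level scope-names are
    complete and unique (unlike with nn.ModuleList, which has no forward())."""

    def __init__(self, name, parent, modules):
        self._list = list(modules)
        for i, m in enumerate(self._list):
            parent.add_module('{}_{}'.format(name, i), m)

    def __getitem__(self, idx):
        return self._list[idx]

    def __iter__(self):
        return iter(self._list)

    def __len__(self):
        return len(self._list)


class Net(nn.Module):
    def __init__(self):
        super().__init__()
        # Instead of: self.layers = nn.ModuleList([...])
        self.layers = FlatModuleList('layers', self,
                                     [nn.Linear(4, 4), nn.Linear(4, 4)])

    def forward(self, x):
        for layer in self.layers:
            x = layer(x)
        return x
```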
- Jul 10, 2019
Guy Jacob authored
* "Net-aware quantization" - using the term coined in https://arxiv.org/abs/1811.09886. (section 3.2.2). Refers to considering sequences of modules when quantizing. This isn't exactly layer fusion - we modify activation stats prior to setting quantization parameters, to make sure that when a module is followed by certain activation functions, only the relevant ranges are quantized. We do this for: * ReLU - Clip all negative values * Tanh / Sigmoid - Clip according to the (approximated) saturation values for these functions. We use [-4, 4] for tanh and [-6, 6] for sigmoid. * Perform batch-norm folding before post-training quantization. Batch-norm parameters are folded into the parameters of the previous layer and the BN layer is replaced with an identity module. * Both BN folding and "net-aware" are now automatically executed in PostTrainLinearQuantizer (details of this change below) * BN folding enabled by new generic mechanism to "fuse" module sequences (at the Python API level) * First module in sequence is replaced/modified by a user-provided function, rest of moudles replaced with nn.Identity * Quantizer changes: * Optionally create adjacency map during prepare_model * Subclasses may enforce adjacency map creation * Refatcoring: Replace _prepare_model_impl with pre and post override-able "callbacks", so core functionality is always executed * PostTrainLinearQuantizer Changes: * Enforce creation of adjacency map. This means users must now pass a dummy input to PostTrainLinearQuantizer.prepare_model * Before module replacement - Apply BN folding and stats updates according to net-aware quantization * Updated the language model quantization tutorial to reflect the new functionality * Updated the image classification post-train quantization samples (command line and YAML) * Other changes: * Distller LSTM implementation: Replace the ModuleList for cells with a plain list. The PyTorch trace mechanism doesn't "see" ModuleList objects, it only sees the contained modules. This means that the "scopeName" of these modules isn't complete, which makes it impossible to match op names in SummaryGraph to modules in the Python model. * ActivationStatsCollector: Ignore nn.Identity modules
- Apr 16, 2019
Lev Zlotnik authored
* Introduce a modular, Python-level implementation of LSTM/LSTMCell using existing PyTorch nn.Modules as building blocks
* This allows quantization of the weights and internal activations of LSTM layers using the existing Quantizer. (In the PyTorch implementation of RNN/LSTM, only the weights are exposed at the Python level, whereas the internal activations are "hidden" in C++ code.)
* Supports stacked (multi-layer) and bi-directional LSTM
* Implemented conversion functions from the PyTorch LSTM module to our LSTM module and vice-versa
* Tests for modular implementation correctness and for the conversions
* Jupyter notebook showing post-training quantization of a language model
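The gist of the modular approach, as a simplified sketch (not the actual DistillerLSTM/DistillerLSTMCell code): build the cell from nn.Linear and element-wise activation modules, so every internal activation is a Python-level module that stats collectors and the Quantizer can see.

```python
import torch
import torch.nn as nn

class ModularLSTMCell(nn.Module):
    """Simplified sketch of an LSTM cell built from plain nn.Modules.
    Each gate activation is its own module so it can be observed/quantized."""

    def __init__(self, input_size, hidden_size):
        super().__init__()
        # One Linear per input produces all four gate pre-activations (i, f, g, o)
        self.fc_gate_x = nn.Linear(input_size, 4 * hidden_size)
        self.fc_gate_h = nn.Linear(hidden_size, 4 * hidden_size)
        self.sigmoid_i = nn.Sigmoid()
        self.sigmoid_f = nn.Sigmoid()
        self.sigmoid_o = nn.Sigmoid()
        self.tanh_g = nn.Tanh()
        self.tanh_c = nn.Tanh()

    def forward(self, x, state):
        h, c = state
        gates = self.fc_gate_x(x) + self.fc_gate_h(h)
        i, f, g, o = gates.chunk(4, dim=-1)
        i, f, o = self.sigmoid_i(i), self.sigmoid_f(f), self.sigmoid_o(o)
        g = self.tanh_g(g)
        c_next = f * c + i * g
        h_next = o * self.tanh_c(c_next)
        return h_next, c_next
```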
- Apr 08, 2019
Lev Zlotnik authored
- Mar 12, 2019
Bar authored
"Peformance" --> "Performance"
- Mar 05, 2019
Neta Zmora authored
- Feb 26, 2019
Lev Zlotnik authored
Not backward compatible - re-installation is required
* Fixes for PyTorch==1.0.0
* Refactor folder structure
* Update installation section in docs
- Aug 09, 2018
Guy Jacob authored
* Instead of a single additive value (which so far represented only the regularizer loss), callbacks return a new overall loss
* Policy callbacks also return the individual loss components used to calculate the new overall loss
* Add a boolean flag to the Scheduler's callback so applications can choose whether they want the individual loss components or just the new overall loss
* In compress_classifier.py, log the individual loss components
* Add a test for the loss-from-callback flow
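An illustrative sketch of the new callback contract; the names PolicyLoss / LossComponent and the callback signature here are simplified and may not match Distiller's exact definitions:

```python
from collections import namedtuple

# Simplified shapes -- not necessarily Distiller's exact definitions.
PolicyLoss = namedtuple('PolicyLoss', ['overall_loss', 'loss_components'])
LossComponent = namedtuple('LossComponent', ['name', 'value'])

class L1RegularizerPolicy:
    """Hypothetical policy: instead of returning just an additive regularization
    term, before_backward_pass returns the new overall loss together with the
    individual components that produced it."""

    def __init__(self, strength):
        self.strength = strength

    def before_backward_pass(self, model, loss):
        reg_loss = self.strength * sum(p.abs().sum() for p in model.parameters())
        return PolicyLoss(overall_loss=loss + reg_loss,
                          loss_components=[LossComponent('l1_reg_loss', reg_loss)])
```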
- Jul 13, 2018
Neta Zmora authored
This is a merge of the ADC branch and master. ADC (using a DDPG RL agent to compress image classifiers) is still WiP and requires an unreleased version of Coach (https://github.com/NervanaSystems/coach).
Small features in this commit:
* Added model_find_module() - find a module object given its name (see the illustrative sketch below)
* Added channel ranking and pruning: pruning/ranked_structures_pruner.py
* Added a CIFAR10 VGG16 model: models/cifar10/vgg_cifar.py
* Thinning: changed the level of some log messages - some of the messages were moved to 'debug' level because they are not usually interesting
* Added a function to print nicely formatted integers - distiller/utils.py
* Sensitivity analysis for channel removal
* compress_classifier.py - handle keyboard interrupts
* compress_classifier.py - fix re-raising of exceptions, so they maintain the call stack
Added tests:
* test_summarygraph.py: test_simplenet() - added a regression test to target a bug that occurs when taking the predecessor of the first node in a graph
* test_ranking.py - test_ch_ranking, test_ranked_channel_pruning
* test_model_summary.py - test_png_generation, test_summary (sparsity/compute/model/modules)
Bug fixes in this commit:
* Thinning bug fix: handle a zero-sized 'indices' tensor. During the thinning process, the 'indices' tensor can become zero-sized and will have an undefined length, so we need to check for this situation when assessing the number of elements in 'indices'
* Language model: adjust main.py to the new distiller.model_summary API
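The model_find_module() helper mentioned above, in spirit (an illustrative version; the real helper is in distiller/utils.py):

```python
import torch.nn as nn

def model_find_module(model: nn.Module, module_name: str):
    """Return the sub-module whose fully-qualified name equals `module_name`,
    or None if no such module exists (illustrative version)."""
    for name, module in model.named_modules():
        if name == module_name:
            return module
    return None
```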
- Jun 21, 2018
Guy Jacob authored
- Jun 14, 2018
Neta Zmora authored
- Jun 13, 2018
Neta Zmora authored
Replace the original "homebrew" optimizer and LR-decay schedule with PyTorch's SGD and ReduceLROnPlateau. SGD with momentum=0 and weight_decay=0, and ReduceLROnPlateau with patience=0 and factor=0.5 will give the same behavior as in the original PyTorch example. Having a standard optimizer and LR-decay schedule gives us the flexibility to experiment with these during the training process.
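The equivalent setup in PyTorch terms (the model and the loop body here are placeholders, not the language-model code):

```python
import torch
import torch.nn as nn
import torch.optim as optim
from torch.optim.lr_scheduler import ReduceLROnPlateau

model = nn.Linear(10, 10)  # placeholder for the language model

# momentum=0 / weight_decay=0 match the original example's plain SGD update;
# patience=0 / factor=0.5 reproduce its "halve the LR whenever validation loss
# fails to improve" behavior.
optimizer = optim.SGD(model.parameters(), lr=20.0,  # the upstream example's default LR
                      momentum=0.0, weight_decay=0.0)
lr_scheduler = ReduceLROnPlateau(optimizer, mode='min', patience=0, factor=0.5)

for epoch in range(3):
    val_loss = torch.rand(1).item()  # placeholder for a real validation pass
    lr_scheduler.step(val_loss)
```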
- Jun 07, 2018
Neta Zmora authored
Added an implementation of Baidu's RNN pruning scheme: Narang, Sharan & Diamos, Gregory & Sengupta, Shubho & Elsen, Erich. (2017). Exploring Sparsity in Recurrent Neural Networks. (https://arxiv.org/abs/1704.05119)
Added an example of word-level language model compression. The language model is based on PyTorch's example: https://github.com/pytorch/examples/tree/master/word_language_model
Added an AGP pruning schedule and an RNN pruning schedule to demonstrate compression of the language model.
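For context, the shape of the AGP sparsity schedule (Zhu & Gurevich, 2017) used by the example, as a standalone sketch rather than Distiller's implementation:

```python
def agp_target_sparsity(s_initial, s_final, epoch, start_epoch, end_epoch):
    """Automated Gradual Pruning (Zhu & Gurevich, 2017): the target sparsity
    ramps from s_initial to s_final along a cubic curve, so most of the
    pruning happens early, while the network can still recover."""
    if epoch <= start_epoch:
        return s_initial
    if epoch >= end_epoch:
        return s_final
    progress = (epoch - start_epoch) / (end_epoch - start_epoch)
    return s_final + (s_initial - s_final) * (1.0 - progress) ** 3
```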