- May 15, 2019
-
-
Guy Jacob authored
Added a collector for activation histograms (a sub-class of ActivationStatsCollector). It is stats-based, meaning it requires pre-computed min/max stats per tensor. This avoids having to save all of the activation tensors throughout the run. The stats are expected in the format generated by QuantCalibrationStatsCollector.
Details:
* Implemented ActivationHistogramsCollector
* Added a Jupyter notebook showcasing activation histograms
* Implemented a helper function that performs the stats collection pass and the histograms pass in one go
* Also added a separate helper function just for quantization stats collection
* Integrated into the image classification sample
* data_loaders.py: Added an option to use a fixed subset throughout the same session. This keeps the same subset between the stats collection and histograms collection phases.
* Other changes:
  * Calling assign_layer_fq_names in the base class of collectors. The collectors, as implemented so far, assume this is done, so it makes sense to do it in the base class instead of expecting the user to do it.
  * Enforcing a non-parallel model for the quantization stats and histograms collectors
  * Jupyter notebooks: added a utility function to enable loggers in notebooks, so that any logging done by Distiller APIs called from notebooks is visible.
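The commit describes the mechanism rather than the exact API, so here is a small self-contained sketch (plain PyTorch, not Distiller code) of the stats-based idea: with pre-computed min/max stats per tensor, per-batch histograms can be accumulated with fixed bin edges via forward hooks, without storing the activation tensors themselves.

```python
import torch
import torch.nn as nn

# Toy model and pre-computed min/max stats per layer output
# (in Distiller these would come from QuantCalibrationStatsCollector).
model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 4))
stats = {'0': (-3.0, 3.0), '1': (0.0, 3.0), '2': (-2.0, 2.0)}  # illustrative values

histograms = {name: torch.zeros(256) for name in stats}

def make_hook(name):
    lo, hi = stats[name]
    def hook(module, inputs, output):
        # Fixed bin edges derived from the pre-computed stats, so histograms from
        # different batches can be accumulated without keeping the activations.
        histograms[name] += torch.histc(output.detach().float(), bins=256, min=lo, max=hi)
    return hook

for name, module in model.named_children():
    module.register_forward_hook(make_hook(name))

for _ in range(10):                     # pretend this is the fixed data subset
    model(torch.randn(32, 8))
print({k: int(v.sum()) for k, v in histograms.items()})
```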
-
Guy Jacob authored
* Set _force_outplace when calling get_trace_graph. This is a workaround for losing scope information for certain in-place operations.
* Switch all dicts to OrderedDicts
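A rough sketch of the workaround. torch.jit.get_trace_graph was a private PyTorch API (present around PyTorch 1.0/1.1 and removed since), and the exact signature shown here is an assumption.

```python
import torch
import torchvision

model = torchvision.models.resnet18()
dummy_input = torch.randn(1, 3, 224, 224)

# _force_outplace makes the tracer replace in-place ops with out-of-place
# equivalents, preserving scope (module-hierarchy) information that is
# otherwise lost for some in-place operations.
trace, _ = torch.jit.get_trace_graph(model, dummy_input, _force_outplace=True)
graph = trace.graph()
```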
-
- May 14, 2019
- May 06, 2019
-
-
Bar authored
In a previous commit, Distiller was changed to accept checkpoints that do not contain an 'optimizer' argument. However, that change was not reflected in the relevant test.
-
- May 05, 2019
-
-
Neta Zmora authored
Support loading a model from a checkpoint file that does not have an Optimizer instance. Before this change, loading such a model required using ```load_lean_checkpoint``` (or --exp-load-weights-from from the compress_classifier.py command line), so this change is for convenience only.
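A minimal sketch of the two loading paths mentioned above; the import path and return values are assumptions based on the surrounding commits, not verified against the repo at this point in history.

```python
import torchvision
# Import path and return values below are assumptions, not quoted from the repo.
from distiller.apputils import load_checkpoint, load_lean_checkpoint

model = torchvision.models.resnet18()

# Full checkpoint load; with this change, 'optimizer' comes back as None when
# the checkpoint was saved without an Optimizer instance.
model, compression_scheduler, optimizer, start_epoch = load_checkpoint(model, 'checkpoint.pth.tar')

# Weights-only load (what --exp-load-weights-from maps to in compress_classifier.py).
model = load_lean_checkpoint(model, 'checkpoint.pth.tar')
```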
-
- May 02, 2019
-
-
Lev Zlotnik authored
-
- May 01, 2019
-
-
Neta Zmora authored
Added a link to the FAQ wiki page.
-
Lev Zlotnik authored
-
- Apr 30, 2019
-
-
Lev Zlotnik authored
* Update test_lstm_impl.py
* Added PackedSequence functionality
* Refactored forward implementation
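For reference, this is how a PackedSequence is fed to a stock nn.LSTM in PyTorch; per the commit, the modular LSTM implementation now handles the same input type. This is a standard PyTorch sketch, not code taken from test_lstm_impl.py.

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

lstm = nn.LSTM(input_size=10, hidden_size=20, num_layers=1)

# Two sequences of lengths 5 and 3, padded to length 5: (seq_len, batch, features).
padded = torch.randn(5, 2, 10)
lengths = torch.tensor([5, 3])  # must be sorted in descending order

packed = pack_padded_sequence(padded, lengths)
packed_out, (h_n, c_n) = lstm(packed)
out, out_lengths = pad_packed_sequence(packed_out)
print(out.shape, out_lengths)  # torch.Size([5, 2, 20]) tensor([5, 3])
```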
-
- Apr 18, 2019
-
-
Bar authored
Also:
* The single-worker limitation is not needed anymore; it has been fixed in PyTorch since v0.4.0 (https://github.com/pytorch/pytorch/pull/4640)
* compress_classifier.py: If run in evaluation mode (--eval), enable deterministic mode
* Call utils.set_deterministic at data-loader creation if the deterministic argument is set (don't assume the user calls it outside)
* Disable CUDNN benchmark mode in utils.set_deterministic (https://pytorch.org/docs/stable/notes/randomness.html#cudnn)
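A rough sketch of what a set_deterministic-style helper typically does in PyTorch; the actual distiller.utils.set_deterministic may differ in details.

```python
import random
import numpy as np
import torch

def set_deterministic(seed=0):
    # Illustrative stand-in: seed all RNGs and configure cuDNN for reproducibility.
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    torch.backends.cudnn.deterministic = True
    # Benchmark mode auto-tunes convolution algorithms per input size, which can
    # select non-deterministic kernels, so it is disabled as well.
    torch.backends.cudnn.benchmark = False

set_deterministic(seed=42)
```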
-
- Apr 16, 2019
-
-
Lev Zlotnik authored
* Introduce a modular, Python-level implementation of LSTM/LSTMCell using existing PyTorch nn.Modules as building blocks
* This allows quantization of weights and internal activations of LSTM layers using the existing Quantizer. (In the PyTorch implementation of RNN/LSTM only the weights are exposed at the Python level, whereas the internal activations are "hidden" in C++ code.)
* Supports stacked (multi-layer) and bi-directional LSTM
* Implemented conversion functions from PyTorch LSTM module to our LSTM module and vice-versa
* Tests for modular implementation correctness and for conversions
* Jupyter notebook showing post-training quantization of a language model
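A sketch of how the conversion functions might be used. The class and method names (DistillerLSTM, from_pytorch_impl, to_pytorch_impl) are assumptions about the new module, not quoted from the commit.

```python
import torch
import torch.nn as nn
from distiller.modules import DistillerLSTM  # name assumed

pt_lstm = nn.LSTM(input_size=300, hidden_size=650, num_layers=2)

# Convert to the modular, Python-level implementation (quantizable with the
# existing Quantizer, since its internal activations are regular nn.Modules)...
modular_lstm = DistillerLSTM.from_pytorch_impl(pt_lstm)
# ...and back again.
pt_lstm_again = modular_lstm.to_pytorch_impl()

x = torch.randn(35, 16, 300)  # (seq_len, batch, input_size)
y_ref, _ = pt_lstm(x)
y_mod, _ = modular_lstm(x)
print(torch.allclose(y_ref, y_mod, atol=1e-5))
```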
-
Lev Zlotnik authored
-
- Apr 14, 2019
-
-
Guy Jacob authored
* Some refactoring to enable multiple clipping methods
* BREAKING: Passing clip_acts as a boolean flag (either on the command line or in a function signature) will now fail. An error message listing the valid values is displayed.
* Implemented clipping of activations at mean + N * std (N is user-configurable)
* Additional tests
* Updated docs
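A standalone sketch of the mean + N * std clipping rule itself (illustrative only; Distiller computes these statistics inside its range-based quantizers, per tensor or per channel).

```python
import torch

def clip_bounds_n_std(t, n_stds=2.0):
    # Clip range derived from the tensor statistics: mean +/- n_stds * std.
    # For a max-only clip (e.g. post-ReLU activations), keep just the upper bound.
    mean, std = t.mean(), t.std()
    return mean - n_stds * std, mean + n_stds * std

acts = torch.randn(1000) * 3
lo, hi = clip_bounds_n_std(acts, n_stds=2.0)
clipped = acts.clamp(lo.item(), hi.item())
```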
-
Guy Jacob authored
-
- Apr 11, 2019
-
-
Guy Jacob authored
* Replace the optional 'best_top1' parameter with a generic optional dict which the caller can populate as needed
* Saved in the checkpoint under the key 'extras'
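A sketch of how the generic dict might be passed; the save_checkpoint signature shown here is an assumption based on the commit description.

```python
import torch
import torchvision
from distiller.apputils import save_checkpoint  # signature below is assumed

model = torchvision.models.resnet18()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# The generic 'extras' dict replaces the old 'best_top1' parameter and is
# stored in the checkpoint under the key 'extras'.
extras = {'best_top1': 76.2, 'best_epoch': 87}
save_checkpoint(epoch=88, arch='resnet18', model=model, optimizer=optimizer,
                extras=extras, name='resnet18_best')
```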
-
Guy Jacob authored
* In all logger types (PythonLogger, TensorBoardLogger, CSVLogger)
* Exact behavior varies per logger type and is documented in the code
* To enable this in CSVLogger, changed its API to take a file-name prefix (optionally empty) instead of the full name, and use a hard-coded name for logging weights sparsity
* Also fixed the signature of log_training_progress in the base DataLogger class to match the signature used in the sub-classes
-
- Apr 09, 2019
-
-
Neta Zmora authored
Also added tests
-
Bar authored
This commit simplifies the SummaryGraph API by removing from the client the burden of handling the differences between models with and without DataParallel layers.

DataParallel layers in PyTorch change the fully-qualified names (FQNs) of PyTorch modules. A module's FQN unambiguously identifies a module within a model by encoding the path to the module from the root of the model. For example, ```module.layer2.1.conv1``` and ```module.layer2.0.conv1``` are FQNs of two different modules named ```conv1```, each nested in a different parent module. Because a module's FQN reflects the module's hierarchy, adding/removing a DataParallel node also changes its FQN. Distiller uses FQNs to refer to modules and parameters (e.g. from YAML files), and non-functional changes to the model hierarchy, such as the use of DataParallel modules, are handled by converting FQNs using ```utils.{de,}normalize_module_name()```.

Before this commit, the SummaryGraph API assumed that the API client would convert layer names using ```utils.normalize_module_name()``` before invoking the API. This led to needlessly verbose client code, which was also error-prone and harder to read and maintain. This commit fixes these shortcomings by relaxing the API and handling the FQN naming differences internally. The thinning implementation is simplified somewhat by refactoring it to the new API's more lenient requirements.

Added a named_params_layers method to SummaryGraph that yields 3-tuples of: layer name, param name, and param. When using the new method, SummaryGraph reports the true layer name with respect to the model it was initialized with.
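For illustration, here is how DataParallel changes module FQNs, and how the helper referenced above strips the prefix. This is a sketch assuming normalize_module_name simply removes the DataParallel 'module' component; the printed values are what that assumption implies.

```python
import torch.nn as nn
import torchvision
from distiller.utils import normalize_module_name  # referenced in the commit message

model = torchvision.models.resnet18()
parallel_model = nn.DataParallel(model)

# Wrapping in DataParallel prefixes every FQN with 'module.'
print([name for name, _ in model.named_modules()][:3])           # ['', 'conv1', 'bn1']
print([name for name, _ in parallel_model.named_modules()][:3])  # ['', 'module', 'module.conv1']

# Normalizing strips the DataParallel prefix so the same name can be used
# regardless of how the model is wrapped:
print(normalize_module_name('module.layer2.1.conv1'))  # 'layer2.1.conv1'
```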
-
- Apr 08, 2019
-
-
Neta Zmora authored
Unfortunately, we maintain two copies of the documentation images (one for the documentation source; another for the generated documentation). We need to solve this, as it makes the repository disproportionately large.
-
Neta Zmora authored
Add finer control over the pruning logic, to accommodate more pruning use-cases. The full description of the new logic is available in the updated [documentation of the CompressionScheduler](https://nervanasystems.github.io/distiller/schedule.html#pruning-fine-control), which is also part of this PR.
In this PR:
* Added a new callback to the CompressionScheduler: compression_scheduler.before_parameter_optimization, which is invoked after the gradients are computed but before the weights are updated by the optimizer.
* We provide an option to mask the gradients before the weights are updated by the optimizer. We register to the parameter backward hook in order to mask the gradients. This gives us finer control over the parameter updates.
* Added several DropFilter schedules. DropFilter is a method to regularize networks, and it can also be used to "prepare" a network for permanent filter pruning.
* Added documentation of pruning fine-control
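A minimal plain-PyTorch sketch of gradient masking via a parameter backward hook, which is the mechanism described above (not the Distiller implementation itself; the mask here is an arbitrary example).

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 10, bias=False)
mask = (torch.rand_like(model.weight) > 0.5).float()  # illustrative binary mask

# The hook runs after backward() computes the gradient but before
# optimizer.step() applies it - the same point at which the new
# before_parameter_optimization callback fires.
model.weight.register_hook(lambda grad: grad * mask)

optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss = model(torch.randn(4, 10)).sum()
loss.backward()
optimizer.step()  # masked gradient entries contribute no weight update
```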
-
tacker-oh authored
Fixes #198. Previously 0s were being mapped to 0, effectively yielding a third quantization level. This fix maps 0s to 1.
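One way the described fix can be expressed (an illustrative sketch, not the actual patched code): torch.sign() maps 0 to 0, so zeros must be remapped to +1 to keep the output strictly binary.

```python
import torch

def binarize(t):
    # sign() would map 0 -> 0, creating a third quantization level;
    # remap zeros to +1 so the result is in {-1, +1} only.
    out = torch.sign(t)
    out[out == 0] = 1
    return out

print(binarize(torch.tensor([-0.5, 0.0, 2.0])))  # tensor([-1., 1., 1.])
```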
-
Lev Zlotnik authored
-
Neta Zmora authored
Dropout layers were not handled properly in SummaryGraph, and caused the indexing of layer names to change. The root cause is that ONNX uses the same node name for Dropout and Linear layers that are processed in sequence. ONNX nodes can be identified by three components: the ONNX node name, type, and instance. In SummaryGraph we ignored the node type when naming a node. Specifically, in AlexNet the Dropout nodes preceding a Linear layer have the same node name and instance, and are only distinguished by their type. SummaryGraph, ignorant of the type, skipped the Dropout layers and gave SG nodes the wrong name. Thus 'classifier.0', which is a Dropout node, became a Linear node. The fix is to stop ignoring duplicate (node name, instance) pairs; instead, the instance is incremented so that each node gets a unique name.
-
- Apr 04, 2019
-
-
Lev Zlotnik authored
-
- Apr 03, 2019
- Apr 01, 2019
-
-
Lev Zlotnik authored
* Bias handling:
  * Add a 'bits_bias' parameter to explicitly specify the number of bits for the bias, similar to weights and activations
  * BREAKING: Remove the now-redundant 'quantize_bias' boolean parameter
* Custom overrides:
  * Expand the semantics of the overrides dict to allow overriding of other parameters in addition to bit-widths
  * Functions registered in the quantizer's 'replacement_factory' can define keyword arguments. Non bit-width entries in the overrides dict are checked against the function signature and passed through.
* BREAKING:
  * Changed the name of 'bits_overrides' to simply 'overrides'
  * Bit-width overrides must now be defined using the full parameter names ('bits_activations/weights/bias') instead of the short-hands 'acts' and 'wts' which were used so far
* Added/updated relevant tests
* Modified all quantization YAMLs under 'examples' to reflect these changes
* Updated docs
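A sketch of what an overrides spec might look like after this change, written in Python. The layer-name patterns and bit-width values are purely illustrative; in practice the spec is usually defined in the quantization YAML and ends up in the quantizer's 'overrides' argument.

```python
from collections import OrderedDict

# Illustrative only - layer-name patterns and bit-widths are made up.
overrides = OrderedDict([
    # Full parameter names are now required ('bits_weights', not the old 'wts' short-hand):
    ('conv1', OrderedDict(bits_weights=8, bits_activations=8)),
    # Entries that are not bit-widths are matched against the keyword arguments of the
    # function registered in the quantizer's 'replacement_factory' and passed through:
    ('fc.*', OrderedDict(bits_weights=4, bits_activations=4, bits_bias=32)),
])
```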
-
Neta Zmora authored
Fix copy-paste mistake
-
Neta Zmora authored
The code that installs distiller tries to import distiller.
-
Neta Zmora authored
-
Bar authored
-
Bar authored
Load optimizer from checkpoint (BREAKING - see details) (#182)
* Fixes issues #70, #145 and replaces PR #74
* checkpoint.py:
  * save_checkpoint will now save the optimizer type in addition to its state
  * load_checkpoint will now instantiate an optimizer based on the saved type and load its state
* config.py: file/dict_config now accept the resumed epoch to pass to LR schedulers
* policy.py: LRPolicy now passes the current epoch to the LR scheduler
* Classifier compression sample:
  * New flag '--resume-from' for properly resuming a saved training session, including optimizer state and epoch #
  * Flag '--reset-optimizer' added to allow discarding a loaded optimizer
* BREAKING:
  * The previous flag '--resume' is deprecated and is mapped to '--resume-from' + '--reset-optimizer'
  * But the old resuming behavior had an inconsistency where the epoch count would continue from the saved epoch, while the LR scheduler was set up as if we were starting from epoch 0
  * Using '--resume-from' + '--reset-optimizer' now simply RESETS the epoch count to 0 for the whole environment
  * This means that scheduling configurations (in YAML or code) which assumed use of '--resume' might need to be changed to reflect the fact that the epoch count now starts from 0
  * All relevant YAML files under 'examples' were modified to reflect this change
* Initial support for ReduceLROnPlateau (#161):
  * Allow passing **kwargs to policies via the scheduler
  * Image classification now passes the validation loss to the scheduler, to be used by ReduceLROnPlateau
  * The current implementation is experimental and subject to change
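Roughly what the checkpoint.py change boils down to, shown with plain PyTorch (an illustrative sketch, not the actual Distiller code).

```python
import torch
import torchvision

model = torchvision.models.resnet18()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)

# Save the optimizer type in addition to its state.
checkpoint = {
    'state_dict': model.state_dict(),
    'optimizer_type': type(optimizer),
    'optimizer_state_dict': optimizer.state_dict(),
    'epoch': 42,
}
torch.save(checkpoint, 'checkpoint.pth.tar')

# On resume, instantiate an optimizer of the saved type and restore its state
# (load_state_dict also restores hyper-parameters such as lr and momentum).
ckpt = torch.load('checkpoint.pth.tar')
model.load_state_dict(ckpt['state_dict'])
optimizer = ckpt['optimizer_type'](model.parameters(), lr=0.1)
optimizer.load_state_dict(ckpt['optimizer_state_dict'])
```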
-
Neta Zmora authored
-
Neta Zmora authored
This fix does not change the behavior. The previous code worked correctly because 'weights' and '.weight' have the same length.
-
- Mar 31, 2019
-
-
Guy Jacob authored
-
- Mar 29, 2019
-
-
Songyi Blair Han authored
-
- Mar 28, 2019
-
-
Lev Zlotnik authored
* Added distiller.utils.convert_recursively_to, and replaced _treetuple2device in SummaryGraph with it
* Renamed to convert_tensors_recursively_to
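An illustrative re-implementation of the idea, to show what such a helper does; the actual distiller.utils function may differ in details.

```python
import torch

def convert_tensors_recursively_to(val, *args, **kwargs):
    # Apply tensor.to(*args, **kwargs) to every tensor inside an arbitrarily
    # nested tuple/list structure, leaving non-tensor values untouched.
    if isinstance(val, torch.Tensor):
        return val.to(*args, **kwargs)
    if isinstance(val, (tuple, list)):
        return type(val)(convert_tensors_recursively_to(item, *args, **kwargs) for item in val)
    return val

nested = (torch.randn(2), [torch.randn(3), (torch.randn(4), 'not-a-tensor')])
as_double = convert_tensors_recursively_to(nested, dtype=torch.float64)
```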
-