- May 19, 2019
Guy Jacob authored
* Added scale factor approximation in post-training quantization using integer multiply + shift. # of bits for integer multiplier is user configurable
* Updated documentation
* Updated post-train quant command line examples readme file
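A rough sketch of the integer multiply + shift idea (illustrative only, not Distiller's actual implementation; the function name and the example scale are made up):

```python
def approx_scale_as_mult_and_shift(fp_scale, mult_bits=8):
    """Approximate fp_scale as (mult / 2**shift), with mult fitting in at most
    mult_bits bits, so scaling becomes an integer multiply plus a right shift.
    Assumes 0 < fp_scale; very large scales would need an extra clamp."""
    assert fp_scale > 0
    shift = 0
    while fp_scale * (2 ** (shift + 1)) < (2 ** mult_bits):
        shift += 1
    mult = int(round(fp_scale * (2 ** shift)))
    return mult, shift

mult, shift = approx_scale_as_mult_and_shift(0.0123, mult_bits=8)
# mult=202, shift=14: 202 / 2**14 ~= 0.012329, i.e. y = (x * 202) >> 14
```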
Neta Zmora authored
1. Add basic annotations to Conv layers when generating model PNG diagrams
2. Refactor: replace dataset_dummy_input with the global utility distiller.get_dummy_input
- May 16, 2019
Neta Zmora authored
Remove the multiple instances of code that generates dummy input per dataset.
Neta Zmora authored
A wrong model was used
Neta Zmora authored
The previous PR merge introduced a couple of small errors when using the --summary flag.
Bar authored
Introduced a new utility function to export image classifiers to ONNX: export_img_classifier_to_onnx. The functionality is not new, just refactored. In the sample application compress_classifier.py, added --export-onnx as a stand-alone cmd-line flag specifically for exporting ONNX models. This new flag can take an optional argument which is used to name the exported ONNX model file. The option to export models was removed from the --summary argument; multiple --summary options can now be used together. Added a basic test for exporting ONNX.
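Under the hood such an export boils down to PyTorch's standard ONNX exporter; a minimal stand-alone sketch (the model, input shape and output file name here are placeholders, not the helper's actual signature):

```python
import torch
import torchvision

model = torchvision.models.resnet18(pretrained=False).eval()
dummy_input = torch.randn(1, 3, 224, 224)   # input shape depends on the dataset
torch.onnx.export(model, dummy_input, "resnet18.onnx")
```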
Guy Jacob authored
- May 15, 2019
Neta Zmora authored
The weights_vol attribute reflects the size (volume) of an SG node’s weights tensor. The calculation of the weights volume was wrong. This does not have any significant impact because this attribute is not used.
Neta Zmora authored
This reverts commit a3f2ce2d.
Lev Zlotnik authored
* Recursively traverses the entire model and replaces all submodules of type `nn.LSTM` and `nn.LSTMCell` with Distiller versions
Neta Zmora authored
The weights_vol attribute reflects the size (volume) of an SG node’s weights tensor. The calculation of the weights volume was wrong. This does not have any significant impact because this attribute is not used.
Guy Jacob authored
Added a collector for activation histograms (sub-class of ActivationStatsCollector). It is stats-based, meaning it requires pre-computed min/max stats per tensor. This is done in order to avoid having to save all of the activation tensors throughout the run. The stats are expected in the format generated by QuantCalibrationStatsCollector. Details:
* Implemented ActivationHistogramsCollector
* Added Jupyter notebook showcasing activation histograms
* Implemented helper function that performs the stats collection pass and histograms pass in one go
* Also added a separate helper function just for quantization stats collection
* Integrated in the image classification sample
* data_loaders.py: Added option to use a fixed subset throughout the same session. Using it to keep the same subset between the stats collection and histograms collection phases.
* Other changes:
  * Calling assign_layer_fq_names in the base class of collectors. We do this since the collectors, as implemented so far, assume this is done, so it makes sense to do it in the base class instead of expecting the user to do it.
  * Enforcing a non-parallel model for quantization stats and histograms collectors
  * Jupyter notebooks: added a utility function to enable loggers in notebooks. This allows us to see any logging done by Distiller APIs called from notebooks.
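A minimal sketch of the stats-based idea: with min/max known in advance, a fixed-range histogram can be accumulated on the fly, so the raw activation tensors never need to be stored (names below are illustrative, not Distiller's API):

```python
import torch

def update_histogram(hist, activation, pre_min, pre_max, nbins=1024):
    # Accumulate a fixed-range histogram using pre-computed min/max stats;
    # values outside [pre_min, pre_max] are ignored by torch.histc.
    new_counts = torch.histc(activation.detach().float().flatten(),
                             bins=nbins, min=pre_min, max=pre_max)
    return hist + new_counts

hist = torch.zeros(1024)
for _ in range(10):                      # stand-in for a forward-pass loop
    act = torch.randn(32, 64)
    hist = update_histogram(hist, act, pre_min=-4.0, pre_max=4.0)
```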
Guy Jacob authored
* Set _force_outplace when calling get_trace_graph. This is a workaround for losing scope information for certain in-place operations.
* Switch all dicts to OrderedDicts
- May 14, 2019
- May 06, 2019
Bar authored
In an earlier commit, distiller was changed to accept checkpoints that do not contain an 'optimizer' entry. However, that change was not reflected in the relevant test.
- May 05, 2019
Neta Zmora authored
This is in contrast to weights-filters removal
Neta Zmora authored
Support loading a model from a checkpoint file that does not have an Optimizer instance. Before this change, loading such a model required using ```load_lean_checkpoint``` (or --exp-load-weights-from from the compress_classifier.py command line), so this change is for convenience only.
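The pattern amounts to treating the optimizer state as optional when restoring a checkpoint. A minimal sketch, assuming the common 'state_dict' / 'optimizer' checkpoint keys (the tiny model and file name are placeholders):

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 2)                                  # stand-in model
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

checkpoint = torch.load("checkpoint.pth.tar", map_location="cpu")
model.load_state_dict(checkpoint["state_dict"])

# Tolerate checkpoints that were saved without optimizer state
opt_state = checkpoint.get("optimizer")
if opt_state is not None:
    optimizer.load_state_dict(opt_state)
```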
- May 02, 2019
Lev Zlotnik authored
- May 01, 2019
Neta Zmora authored
Added a link to the FAQ wiki page.
Lev Zlotnik authored
- Apr 30, 2019
Lev Zlotnik authored
* Update test_lstm_impl.py
* Added PackedSequence functionality
* Refactored forward implementation
- Apr 18, 2019
Bar authored
Also:
* The single-worker limitation is not needed anymore; it has been fixed in PyTorch since v0.4.0 (https://github.com/pytorch/pytorch/pull/4640)
* compress_classifier.py: If run in evaluation mode (--eval), enable deterministic mode.
* Call utils.set_deterministic at data loader creation if the deterministic argument is set (don't assume the user calls it outside)
* Disable CUDNN benchmark mode in utils.set_deterministic (https://pytorch.org/docs/stable/notes/randomness.html#cudnn)
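For reference, a deterministic-mode helper along these lines typically seeds every RNG and disables the cuDNN auto-tuner; a rough sketch (not necessarily Distiller's exact utils.set_deterministic):

```python
import random
import numpy as np
import torch

def set_deterministic(seed=0):
    # Seed all RNGs and make cuDNN behave deterministically
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False   # benchmark mode is non-deterministic
```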
- Apr 16, 2019
Lev Zlotnik authored
* Introduce a modular, Python-level implementation of LSTM/LSTMCell using existing PyTorch nn.Modules as building blocks
* This allows quantization of weights and internal activations of LSTM layers using the existing Quantizer. (In the PyTorch implementation of RNN/LSTM only the weights are exposed at the Python level, whereas the internal activations are "hidden" in C++ code.)
* Supports stacked (multi-layer) and bi-directional LSTM
* Implemented conversion functions from PyTorch LSTM module to our LSTM module and vice-versa
* Tests for modular implementation correctness and for conversions
* Jupyter notebook showing post-training quantization of a language model
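To illustrate why a modular implementation helps, here is a minimal LSTM cell built only from nn.Linear and element-wise ops (a sketch of the idea, not the actual Distiller implementation), so every gate computation is visible to a Python-level quantizer:

```python
import torch
import torch.nn as nn

class SimpleLSTMCell(nn.Module):
    def __init__(self, input_size, hidden_size):
        super().__init__()
        # All weights live in plain nn.Linear modules that a quantizer can wrap
        self.fc_gate_x = nn.Linear(input_size, 4 * hidden_size)
        self.fc_gate_h = nn.Linear(hidden_size, 4 * hidden_size)

    def forward(self, x, state):
        h, c = state
        gates = self.fc_gate_x(x) + self.fc_gate_h(h)
        i, f, g, o = gates.chunk(4, dim=-1)
        i, f, o = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o)
        g = torch.tanh(g)
        c_next = f * c + i * g              # internal activations now visible
        h_next = o * torch.tanh(c_next)
        return h_next, c_next
```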
Lev Zlotnik authored
- Apr 14, 2019
Guy Jacob authored
* Some refactoring to enable multiple clipping methods
* BREAKING: clip_acts as a boolean flag (either in the command line or in a function signature) will fail. An error message with the valid values is displayed.
* Implemented clipping activations at mean + N * std (N is user configurable)
* Additional tests
* Updated docs
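The mean + N * std clipping bound is straightforward to compute from an activation tensor; a small illustrative sketch (the function name is made up):

```python
import torch

def clip_bound_mean_n_stds(activation, n_stds=2.0):
    # Clip at mean + n_stds * std instead of the observed max,
    # which makes the bound less sensitive to rare outliers.
    flat = activation.detach().float().flatten()
    return flat.mean() + n_stds * flat.std()

act = torch.randn(64, 256) * 3.0
print(clip_bound_mean_n_stds(act, n_stds=2.0))   # ~6.0 for this distribution
```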
Guy Jacob authored
- Apr 11, 2019
Guy Jacob authored
* Replace the optional 'best_top1' parameter with a generic optional dict which the caller can populate as needed.
* Saved in the checkpoint under the key 'extras'
Guy Jacob authored
* In all logger types (PythonLogger, TensorBoardLogger, CSVLogger)
* Exact behavior varies per logger type and is documented in the code.
* To enable this in CSVLogger, changed its API to take a file name prefix (optionally empty) instead of the full name, and use a hard-coded name for logging weights sparsity.
* Also fixed the signature of log_training_progress in the base DataLogger class to match the signature used in the sub-classes.
- Apr 09, 2019
Neta Zmora authored
Also added tests
Bar authored
This commit simplifies the SummaryGraph API by removing from the client the burden of handling the differences between models with and without DataParallel layers.

DataParallel layers in PyTorch change the fully-qualified names (FQNs) of PyTorch modules. A module's FQN unambiguously identifies a module within a model, by encoding the path to the module from the root of the model. For example, ```module.layer2.1.conv1``` and ```module.layer2.0.conv1``` are FQNs of two different modules named ```conv1``` in the same model. Because a module's FQN reflects the module's hierarchy, adding/removing a DataParallel node also changes its FQN. Distiller uses FQNs to refer to modules and parameters (e.g. from YAML files), and non-functional changes to the model hierarchy, such as using DataParallel modules, are handled by converting FQNs using ```utils.{de,}normalize_module_name()```.

Before this commit, the SummaryGraph API assumed that the API client would convert layer names using ```utils.normalize_module_name()``` before invoking the API. This led to needlessly verbose client code, which was also error-prone and harder to read and maintain. This commit fixes these shortcomings by relaxing the API and handling the FQN naming differences internally. The thinning implementation is simplified somewhat by refactoring to the new API's lenient requirements.

Added a named_params_layers method to SummaryGraph that yields a 3-tuple of: layer name, param name, and param. When using the new method, SummaryGraph communicates the true layer name with respect to the model it was initiated with.
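For intuition, the normalization essentially strips the ```module.``` prefix that ```nn.DataParallel``` prepends to every submodule name; a simplified sketch of what such helpers do (illustrative, not the exact Distiller implementation):

```python
def normalize_module_name(name):
    # 'module.layer2.0.conv1' -> 'layer2.0.conv1'
    return name[len("module."):] if name.startswith("module.") else name

def denormalize_module_name(model_module_names, normalized_name):
    # Map a normalized FQN back to the name the given model actually uses.
    for candidate in model_module_names:
        if normalize_module_name(candidate) == normalized_name:
            return candidate
    return normalized_name

names = ["module.layer2.0.conv1", "module.layer2.1.conv1"]
print(normalize_module_name(names[0]))                     # layer2.0.conv1
print(denormalize_module_name(names, "layer2.1.conv1"))    # module.layer2.1.conv1
```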
- Apr 08, 2019
Neta Zmora authored
Unfortunately, we maintain 2 copies of documentation images (one for the documentation source; another for the generated documentation). We need to solve this, as it makes the repository disproportionately large.
Neta Zmora authored
Add finer control over the pruning logic, to accommodate more pruning use-cases. The full description of the new logic is available in the updated [documentation of the CompressionScheduler](https://nervanasystems.github.io/distiller/schedule.html#pruning-fine-control), which is also part of this PR. In this PR:
* Added a new callback to the CompressionScheduler: compression_scheduler.before_parameter_optimization, which is invoked after the gradients are computed, but before the weights are updated by the optimizer.
* We provide an option to mask the gradients before the weights are updated by the optimizer. We register to the parameter backward hook in order to mask the gradients. This gives us finer control over the parameter updates.
* Added several DropFilter schedules. DropFilter is a method to regularize networks, and it can also be used to "prepare" a network for permanent filter pruning.
* Added documentation of pruning fine-control
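Conceptually, masking gradients via a parameter backward hook looks like the following generic PyTorch sketch (not the CompressionScheduler code; the layer and mask are placeholders):

```python
import torch
import torch.nn as nn

layer = nn.Linear(8, 4)
mask = (torch.rand_like(layer.weight) > 0.5).float()   # illustrative binary mask

# Register a hook on the parameter so its gradient is masked before the
# optimizer step, rather than masking only the weights after the step.
layer.weight.register_hook(lambda grad: grad * mask)

loss = layer(torch.randn(2, 8)).sum()
loss.backward()          # gradients of masked-out weights are zeroed by the hook
```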
tacker-oh authored
Fixes #198. Previously 0s were being mapped to 0, effectively yielding a third quantization level. This fix maps 0s to 1.
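The usual way to get a strict {-1, +1} binarization (avoiding sign(0) = 0 as a third level) is to send zeros to +1; a generic sketch of that mapping, not the exact code of this fix:

```python
import torch

def binary_sign(x):
    # torch.sign(x) returns 0 for x == 0, which would leak a third level
    # into a binary {-1, +1} quantizer; map zeros to +1 instead.
    return torch.where(x >= 0, torch.ones_like(x), -torch.ones_like(x))

print(binary_sign(torch.tensor([-0.5, 0.0, 0.7])))   # tensor([-1.,  1.,  1.])
```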
Lev Zlotnik authored
Neta Zmora authored
Dropout layers were not handled properly in SummaryGraph, and caused the indexing of layer names to change. The root cause is that ONNX uses the same node name for Dropout and Linear layers that are processed in sequence. ONNX nodes can be identified by three components: the ONNX node name, type, and instance. In SummaryGraph we ignore the node type when naming a node. Specifically in AlexNet, the Dropout layers before a Linear layer have the same node name and instance, and are only distinguished by their type. SummaryGraph, ignorant of the type, skipped the Dropout layers and gave SG nodes the wrong name. Thus 'classifier.0', which is a Dropout node, became a Linear node. The fix is to stop ignoring duplicate (node name, instance) pairs, and instead increment the instance when a duplicate is encountered.
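A toy sketch of the disambiguation idea, bumping the instance whenever a (name, instance) pair repeats (illustrative only, not SummaryGraph's code):

```python
def disambiguate_node_names(node_names):
    # Assign each node a (name, instance) id, incrementing the instance on
    # duplicates so e.g. Dropout 'classifier.0' and Linear 'classifier.0'
    # stay distinct even though they share the same ONNX node name.
    seen = {}
    ids = []
    for name in node_names:
        instance = seen.get(name, -1) + 1
        seen[name] = instance
        ids.append((name, instance))
    return ids

print(disambiguate_node_names(["classifier.0", "classifier.0", "classifier.1"]))
# [('classifier.0', 0), ('classifier.0', 1), ('classifier.1', 0)]
```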