- May 29, 2019
-
-
Neta Zmora authored
Also added a simple network model for MNIST, under distiller/models/mnist.
-
- May 26, 2019
-
-
Neta Zmora authored
Added set_seed() to Distiller, and added support for seeding the PRNGs when setting --deterministic mode (prior to this change, the seed was always set to zero when running in deterministic mode). The PRNGs of PyTorch (CPU and CUDA devices), NumPy and Python are seeded. Added support for ```--seed``` to classifier_compression.py.
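A minimal sketch of what seeding all of the relevant PRNGs typically involves (the actual Distiller set_seed() implementation may differ):
```python
import random
import numpy as np
import torch

def set_seed(seed=0):
    # Seed Python's, NumPy's and PyTorch's PRNGs (CPU and all CUDA devices)
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
```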
-
- May 21, 2019
-
-
Bar authored
Function log_execution_env_state is used to gather information about the execution environment and store this together with the experiment log. Recently we've added saving the compression schedule YAML file in the same logs directory. This commit expands the log_execution_env_state interface to accept a list of paths to arbitrary files that may contribute to the experiment configuration and that you (the experiment owner) deem important for recreating the experiment. In the sample classifier_compression.py app, we now store both the compression schedule YAML file and quantization statistics collateral file (qe_stats_file).
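A self-contained sketch of the underlying idea, with illustrative names only (this is not the actual log_execution_env_state implementation): copy each supplied configuration file into the experiment's log directory so the run can be reproduced later.
```python
import os
import shutil

def archive_config_files(config_paths, logdir):
    # Copy every listed configuration file (e.g. the compression schedule YAML
    # and the qe_stats_file) into a 'configs' sub-directory of the log directory.
    configs_dir = os.path.join(logdir, 'configs')
    os.makedirs(configs_dir, exist_ok=True)
    for path in config_paths:
        if path and os.path.isfile(path):
            shutil.copy(path, configs_dir)
```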
-
- May 16, 2019
-
-
Neta Zmora authored
The previous PR merge introduced a couple of small errors when using the --summary flag.
-
Bar authored
Introduced a new utility function to export image classifiers to ONNX: export_img_classifier_to_onnx. The functionality is not new, just refactored. In the sample application compress_classifier.py, added --export-onnx as a stand-alone cmd-line flag specifically for exporting ONNX models. This new flag can take an optional argument which is used to name the exported ONNX model file. The option to export models was removed from the --summary argument. We now allow multiple --summary options to be passed together. Added a basic test for exporting ONNX.
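For reference, a minimal sketch of exporting an image classifier to ONNX with the stock PyTorch API (export_img_classifier_to_onnx wraps similar functionality; the model and file name below are placeholders):
```python
import torch
import torchvision

model = torchvision.models.resnet18().eval()
dummy_input = torch.randn(1, 3, 224, 224)        # NCHW dummy batch
torch.onnx.export(model, dummy_input, "resnet18.onnx",
                  input_names=["input"], output_names=["logits"])
```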
-
- May 15, 2019
-
-
Guy Jacob authored
Added a collector for activation histograms (sub-class of ActivationStatsCollector). It is stats-based, meaning it requires pre-computed min/max stats per tensor. This is done in order to prevent the need to save all of the activation tensors throughout the run. The stats are expected in the format generated by QuantCalibrationStatsCollector.
Details:
* Implemented ActivationHistogramsCollector
* Added a Jupyter notebook showcasing activation histograms
* Implemented a helper function that performs the stats-collection pass and the histograms pass in one go
* Also added a separate helper function just for quantization stats collection
* Integrated in the image classification sample
* data_loaders.py: added an option to use a fixed subset throughout the same session. We use it to keep the same subset between the stats-collection and histogram-collection phases.
* Other changes:
  * Calling assign_layer_fq_names in the base class of collectors. We do this since the collectors, as implemented so far, assume this is done, so it makes sense to do it in the base class instead of expecting the user to do it.
  * Enforcing a non-parallel model for the quantization stats and histograms collectors
  * Jupyter notebooks: added a utility function to enable loggers in notebooks. This allows us to see any logging done by Distiller APIs called from notebooks.
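A minimal sketch of the stats-based idea, assuming pre-computed per-tensor min/max values: a fixed-range histogram can be accumulated batch by batch, so the activation tensors themselves never need to be stored (names here are illustrative, not Distiller's actual API):
```python
import torch

def update_histogram(hist, activation, t_min, t_max, bins=1024):
    # torch.histc bins values into `bins` equal-width buckets over [t_min, t_max]
    return hist + torch.histc(activation.float().cpu(), bins=bins, min=t_min, max=t_max)

hist = torch.zeros(1024)
for _ in range(3):                    # stand-in for iterating a data loader
    act = torch.randn(32, 64)         # stand-in for a layer's activation
    hist = update_histogram(hist, act, t_min=-4.0, t_max=4.0)
```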
-
- May 14, 2019
-
-
Bar authored
-
- Apr 18, 2019
-
-
Bar authored
Also:
* The single-worker limitation is not needed anymore; it was fixed in PyTorch in v0.4.0 (https://github.com/pytorch/pytorch/pull/4640)
* compress_classifier.py: if run in evaluation mode (--eval), enable deterministic mode
* Call utils.set_deterministic at data-loader creation if the deterministic argument is set (don't assume the user calls it outside)
* Disable cuDNN benchmark mode in utils.set_deterministic (https://pytorch.org/docs/stable/notes/randomness.html#cudnn)
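A sketch of the kind of settings involved, following the PyTorch reproducibility notes (the real utils.set_deterministic may do more than this):
```python
import torch

def set_deterministic():
    # Make cuDNN choose deterministic algorithms, and disable benchmark mode,
    # which auto-tunes (and may vary) the selected convolution algorithms.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
```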
-
- Apr 11, 2019
-
-
Guy Jacob authored
* Replace the optional 'best_top1' parameter with a generic optional dict which the caller can populate as needed
* Saved in the checkpoint under the key 'extras'
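A minimal sketch of the idea, with illustrative field names (not the actual checkpoint.py code): arbitrary caller-supplied metadata travels in the checkpoint under 'extras'.
```python
import torch

def save_checkpoint(model, optimizer, epoch, path, extras=None):
    torch.save({'epoch': epoch,
                'state_dict': model.state_dict(),
                'optimizer_state_dict': optimizer.state_dict(),
                'extras': extras or {}},
               path)

# e.g. save_checkpoint(model, optimizer, 10, 'best.pth.tar', extras={'best_top1': 76.1})
```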
-
- Apr 08, 2019
-
-
Neta Zmora authored
Add finer control over the pruning logic, to accommodate more pruning use-cases. The full description of the new logic is available in the updated [documentation of the CompressionScheduler](https://nervanasystems.github.io/distiller/schedule.html#pruning-fine-control), which is also part of this PR.
In this PR:
* Added a new callback to the CompressionScheduler: compression_scheduler.before_parameter_optimization, which is invoked after the gradients are computed, but before the weights are updated by the optimizer.
* We provide an option to mask the gradients before the weights are updated by the optimizer. We register to the parameter backward hook in order to mask the gradients. This gives us finer control over the parameter updates.
* Added several DropFilter schedules. DropFilter is a method to regularize networks, and it can also be used to "prepare" a network for permanent filter pruning.
* Added documentation of pruning fine-control
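A minimal, self-contained sketch of masking gradients via a tensor backward hook, so masked weights receive no update from the optimizer (the parameter and mask below are stand-ins, not Distiller's pruning machinery):
```python
import torch

param = torch.nn.Parameter(torch.randn(4, 4))
mask = (torch.rand(4, 4) > 0.5).float()       # 1 = keep, 0 = pruned

# The hook runs on the parameter's gradient during backward()
param.register_hook(lambda grad: grad * mask)

loss = (param ** 2).sum()
loss.backward()
assert torch.all(param.grad[mask == 0] == 0)  # pruned positions get zero gradient
```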
-
- Apr 01, 2019
-
-
Bar authored
Load optimizer from checkpoint (BREAKING - see details) (#182)
* Fixes issues #70, #145 and replaces PR #74
* checkpoint.py
  * save_checkpoint will now save the optimizer type in addition to its state
  * load_checkpoint will now instantiate an optimizer based on the saved type and load its state
* config.py: file/dict_config now accept the resumed epoch to pass to LR schedulers
* policy.py: LRPolicy now passes the current epoch to the LR scheduler
* Classifier compression sample:
  * New flag '--resume-from' for properly resuming a saved training session, incl. optimizer state and epoch #
  * Flag '--reset-optimizer' added to allow discarding of a loaded optimizer
* BREAKING:
  * The previous flag '--resume' is deprecated and is mapped to '--resume-from' + '--reset-optimizer'.
  * But the old resuming behavior had an inconsistency: the epoch count would continue from the saved epoch, while the LR scheduler was set up as if we were starting from epoch 0.
  * Using '--resume-from' + '--reset-optimizer' now simply RESETS the epoch count to 0 for the whole environment.
  * This means that scheduling configurations (in YAML or code) which assumed use of '--resume' might need to be changed to reflect the fact that the epoch count now starts from 0.
  * All relevant YAML files under 'examples' were modified to reflect this change.
* Initial support for ReduceLROnPlateau (#161):
  * Allow passing **kwargs to policies via the scheduler
  * Image classification now passes the validation loss to the scheduler, to be used by ReduceLROnPlateau
  * The current implementation is experimental and subject to change
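A minimal sketch of saving the optimizer type alongside its state and re-instantiating it on resume (the real checkpoint.py logic handles more cases; names here are illustrative):
```python
import torch

def save(model, optimizer, path):
    torch.save({'state_dict': model.state_dict(),
                'optimizer_type': type(optimizer),
                'optimizer_state_dict': optimizer.state_dict()},
               path)

def load(model, path):
    chkpt = torch.load(path)
    model.load_state_dict(chkpt['state_dict'])
    # Re-create an optimizer of the saved type, then restore its state
    # (hyper-parameters such as lr are restored by load_state_dict).
    optimizer = chkpt['optimizer_type'](model.parameters(), lr=0.1)
    optimizer.load_state_dict(chkpt['optimizer_state_dict'])
    return optimizer
```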
-
- Mar 17, 2019
-
-
Neta Zmora authored
In several places we hit an error state and exited using exit() instead of raising a ValueError; this is now fixed.
-
- Mar 12, 2019
-
-
Bar authored
"Peformance" --> "Performance"
-
- Mar 06, 2019
-
-
Neta Zmora authored
A recent commit changed the sorting of the best performing training epochs to be based on the sparsity level of the model, then its Top1 and Top5 scores. When we create thinned models, the sparsity remains low (even zero), while the physical size of the network is smaller. This commit changes the sorting criteria to be based on the count of non-zero (NNZ) parameters. This captures both the sparsity and the parameter-size objectives:
- When sparsity is high, the number of NNZ params is low (params_nnz_cnt = (1 - sparsity) * params_cnt).
- When we remove structures (thinning), the sparsity may remain constant, but the count of params (params_cnt) is lower, and therefore, once again, params_nnz_cnt is lower.
Therefore, params_nnz_cnt is a good proxy for capturing a sparsity objective and/or a thinning objective.
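A minimal sketch of the metric itself (illustrative, not the exact Distiller code):
```python
import torch

def params_nnz_cnt(model):
    # Count the non-zero elements across all of the model's parameter tensors
    return sum(int((p != 0).sum()) for p in model.parameters())
```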
-
- Mar 03, 2019
-
-
Neta Zmora authored
Based on a commit and ideas from @barrh: https://github.com/NervanaSystems/distiller/pull/150/commits/1623db3cdc3a95ab620e2dc6863cff23a91087bd
The sample application compress_classifier.py logs details about the best performing epoch(s) and stores the best epoch in a checkpoint file named ```best.pth.tar``` by default (if you use the ```--name``` application argument, the checkpoint name will be prefixed by ```best```).
Until this fix, the performance of a model was judged solely on its Top1 accuracy. This can be a problem when performing gradual pruning of a pre-trained model, because many times a model's Top1 accuracy increases with light pruning and this is registered as the best performing training epoch. However, we are really interested in the best performing trained model _after_ the pruning phase is done. Even during training, we may be interested in the checkpoint of the best performing model with the highest sparsity.
This fix stores a list of the performance results from all the trained epochs so far. This list is sorted using a hierarchical key: (sparsity, top1, top5, epoch), so that the list is first sorted by sparsity, then top1, followed by top5 and epoch.
But what if you want to sort using a different metric? For example, when quantizing you may want to score the best performance by the total number of bits used to represent the model parameters and feature-maps. In such a case you may want to replace ```sparsity``` by this new metric. Because this is a sample application, we don't load it with all possible control logic, and anyone can make local changes to this logic. To keep your code separated from the main application logic, we plan to refactor the application code sometime in the next few months.
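A minimal sketch of sorting with such a hierarchical key (field names are illustrative):
```python
import operator
from collections import namedtuple

PerfResult = namedtuple('PerfResult', ['sparsity', 'top1', 'top5', 'epoch'])

results = [PerfResult(0.0, 91.2, 99.0, 1),
           PerfResult(35.0, 90.8, 98.9, 7),
           PerfResult(35.0, 91.0, 99.0, 9)]

# Sort by (sparsity, top1, top5, epoch); the best entry ends up last
best = sorted(results, key=operator.attrgetter('sparsity', 'top1', 'top5', 'epoch'))[-1]
print(best)   # PerfResult(sparsity=35.0, top1=91.0, top5=99.0, epoch=9)
```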
-
Neta Zmora authored
Release 0.3 broke the exports to PNG and ONNX, and this is the fix.
-
- Feb 28, 2019
-
-
Neta Zmora authored
-
Neta Zmora authored
-
- Feb 26, 2019
-
-
Lev Zlotnik authored
Not backward compatible - re-installation is required
* Fixes for PyTorch==1.0.0
* Refactoring folder structure
* Update installation section in docs
-
- Feb 17, 2019
-
-
Neta Zmora authored
A small change to support ranking weight filters by the mean mean-value of the feature-map channels. Mean mean-value refers to computing the average value (across many input images) of the mean-value of each channel.
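A minimal sketch of the "mean mean-value" computation for a batch of feature maps shaped (N, C, H, W): take each channel's mean over its spatial dimensions, then average those per-channel means over the N input images (the tensors below are stand-ins):
```python
import torch

fmaps = torch.randn(8, 16, 32, 32)                      # N=8 images, C=16 channels
per_image_channel_means = fmaps.mean(dim=(2, 3))        # shape (N, C)
mean_mean_value = per_image_channel_means.mean(dim=0)   # shape (C,): one score per channel
```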
-
- Feb 14, 2019
-
-
Bar authored
Modified log_execution_env_state() to store the configuration file in the output directory, under a 'configs' sub-directory that it creates. At this time, the only configuration file is the one passed via args.compress.
-
Neta Zmora authored
To use automated compression you need to install several optional packages which are not required for other use-cases. This fix hides the import requirements for users who do not want to install the extra packages.
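A minimal sketch of hiding an optional dependency: attempt the import, and only raise a clear error if the optional feature is actually used (the module and message below are illustrative):
```python
try:
    import gym   # optional package, needed only for automated compression
except ImportError:
    gym = None

def run_automated_compression():
    if gym is None:
        raise ImportError("Automated compression requires optional packages "
                          "that are not installed.")
    # ... automated-compression logic would go here ...
```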
-
- Feb 13, 2019
-
-
Neta Zmora authored
Merging the 'amc' branch with 'master'. This updates the automated compression code in 'master', and adds a greedy filter-pruning algorithm.
-
- Feb 11, 2019
-
-
Guy Jacob authored
Summary of changes:
1. Post-train quantization based on pre-collected statistics
2. Quantized concat, element-wise addition/multiplication and embeddings
3. Move post-train quantization command-line args out of sample code
4. Configure post-train quantization from YAML for more fine-grained control
(See PR #136 for more detailed change descriptions)
-
- Feb 10, 2019
-
-
Guy Jacob authored
* For CIFAR-10 / ImageNet only
* Refactor data_loaders.py, reduce code duplication
* Implemented custom sampler
* Integrated in image classification sample
* Since we now shuffle the test set, had to update expected results in 2 full_flow_tests that do evaluation
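A minimal sketch of drawing a fixed, shuffled subset with stock PyTorch utilities (Distiller's custom sampler is more elaborate; the dataset and subset size here are placeholders):
```python
import numpy as np
from torch.utils.data import DataLoader, SubsetRandomSampler
from torchvision import datasets, transforms

dataset = datasets.CIFAR10('./data', train=False, download=True,
                           transform=transforms.ToTensor())
indices = np.random.RandomState(0).permutation(len(dataset))[:1000]  # fixed 1000-sample subset
loader = DataLoader(dataset, batch_size=64, sampler=SubsetRandomSampler(indices))
```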
-
- Jan 31, 2019
-
-
Neta Zmora authored
-
- Jan 16, 2019
-
-
Bar authored
* Support for multi-phase activations logging: enable logging activations during both training and validation in the same session.
* Refactoring: move the parser to its own file
  * The parser is moved from compress_classifier into its own file.
  * The Torch version check is moved to precede the main() call.
  * The main definition is moved to the top of the file.
  * Parser choices are now case-insensitive.
-
- Jan 15, 2019
-
-
Neta Zmora authored
Fix a mismatch between the location of the model and the computation.
-
- Jan 13, 2019
-
-
Neta Zmora authored
-
- Jan 10, 2019
-
-
Gal Novik authored
In compress_classifier.py we added a new application argument, --cpu, which you can use to force compute (training/inference) to run on the CPU when you invoke compress_classifier.py on a machine which has Nvidia GPUs. If your machine lacks Nvidia GPUs, then compute will now run on the CPU (and you do not need the new flag). Caveat: we did not fully test the CPU support for the code in the Jupyter notebooks. If you find a bug, we apologize and appreciate your feedback.
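A minimal sketch of the device-selection logic such a flag implies (only the --cpu argument name comes from the commit; the rest is illustrative):
```python
import torch

def select_device(force_cpu=False):
    # Fall back to the CPU when requested, or when no CUDA device is available
    if force_cpu or not torch.cuda.is_available():
        return torch.device('cpu')
    return torch.device('cuda')

model = torch.nn.Linear(10, 2).to(select_device(force_cpu=True))
```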
-
- Dec 19, 2018
-
-
Neta Zmora authored
If compression_scheduler==None, then we need to set the value of losses[OVERALL_LOSS_KEY] (so it is the same as losses[OBJECTIVE_LOSS_KEY]). This was overlooked.
-
- Dec 16, 2018
-
-
Taras Sereda authored
-
- Dec 14, 2018
-
-
Neta Zmora authored
Added notebook for visualizing the discovery of compressed networks. Added one-epoch fine-tuning at the end of every episode, which is required for very sensitive models like Plain20.
-
- Dec 11, 2018
-
-
Haim Barad authored
Revert to PyTorch 0.4.0. Also fixed some NumPy calls (for statistics) that needed to be moved back to the CPU.
-
Yi-Syuan Chen authored
-
- Dec 06, 2018
-
-
Guangli Li authored
Update the examples of the earlyexit arguments, which were not consistent with their descriptions.
-
- Dec 04, 2018
-
-
Guy Jacob authored
* Asymmetric post-training quantization (only symmetric was supported until now)
* Quantization-aware training for range-based (min-max) symmetric and asymmetric quantization
* Per-channel quantization support in both training and post-training
* Added tests and examples
* Updated documentation
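A minimal sketch of range-based asymmetric (min-max) quantization to unsigned 8-bit, showing only the arithmetic (not Distiller's quantizer API): the float range [min, max] is mapped onto [0, 255] with a scale and a zero-point.
```python
import torch

def asymmetric_quantize(x, num_bits=8):
    qmin, qmax = 0, 2 ** num_bits - 1
    x_min, x_max = x.min(), x.max()
    scale = (x_max - x_min) / (qmax - qmin)
    zero_point = torch.round(qmin - x_min / scale)
    q = torch.clamp(torch.round(x / scale + zero_point), qmin, qmax)
    return q, scale, zero_point

x = torch.randn(4, 4)
q, scale, zp = asymmetric_quantize(x)
x_hat = (q - zp) * scale        # dequantize to inspect the round-trip error
```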
-
Neta Zmora authored
-
- Dec 01, 2018
-
-
Neta Zmora authored
This commit contains the main fix for issue #85. It contains a couple of changes to the YAML structure-pruning API, with examples. I urge you to read the documentation in the Wiki (https://github.com/NervanaSystems/distiller/wiki/Pruning-Filters-&-Channels).
New syntax for defining Structured AGP. I tried to make the syntax similar to fine-grained (i.e. element-wise) pruning. All you need to do is add: ```group_type: Filters```.
```
low_pruner:
  class: L1RankedStructureParameterPruner_AGP
  initial_sparsity: 0.10
  final_sparsity: 0.50
  group_type: Filters
  weights: [module.layer3.0.conv2.weight, module.layer3.0.downsample.0.weight,
            module.layer3.1.conv2.weight, module.layer3.2.conv2.weight]
```
If you want to define "leader-based" pruning dependencies, add ```group_dependency: Leader```:
```
low_pruner:
  class: L1RankedStructureParameterPruner_AGP
  initial_sparsity: 0.10
  final_sparsity: 0.50
  group_type: Filters
  group_dependency: Leader
  weights: [module.layer3.0.conv2.weight, module.layer3.0.downsample.0.weight,
            module.layer3.1.conv2.weight, module.layer3.2.conv2.weight]
```
Retired the old ```reg_regims``` API for describing one-shot structured pruning. The new YAML API is very similar to AGP structured pruning, which is much better than before. The new API also allows us to describe data dependencies when doing one-shot structure pruning, just like AGP structured pruning. This commit also includes further code refactoring.
Old API:
```
filter_pruner:
  class: 'L1RankedStructureParameterPruner'
  reg_regims:
    'module.layer1.0.conv1.weight': [0.6, '3D']
    'module.layer1.1.conv1.weight': [0.6, '3D']
```
New API:
```
filter_pruner:
  class: 'L1RankedStructureParameterPruner'
  group_type: Filters
  desired_sparsity: 0.6
  weights: [module.layer1.0.conv1.weight, module.layer1.1.conv1.weight]
```
thresholding.py - separate the generation of the binary_map from the pruning_mask, so that we can cache the binary map and share it between several modules.
pruning/automated_gradual_pruner.py - major refactoring to support "leader-based" sub-graph pruning dependencies. The concept is explained in issue #85.
Updated example schedules:
agp-pruning/resnet20_filters.schedule_agp.yaml
agp-pruning/resnet20_filters.schedule_agp_2.yaml
agp-pruning/resnet20_filters.schedule_agp_3.yaml
network_trimming/resnet56_cifar_activation_apoz.yaml
network_trimming/resnet56_cifar_activation_apoz_v2.yaml
-
- Nov 24, 2018
-
-
Guy Jacob authored
-