  1. Mar 17, 2019
  2. Mar 12, 2019
  3. Mar 06, 2019
• compress_classifier.py: sort best scores by count of NNZ weights · 9cb0dd68
      Neta Zmora authored
      A recent commit changed the sorting of the best performing training
      epochs to be based on the sparsity level of the model, then its
      Top1 and Top5 scores.
      When we create thinned models, the sparsity remains low (even zero),
      while the physical size of the network is smaller.
      This commit changes the sorting criteria to be based on the count
      of non-zero (NNZ) parameters.  This captures both sparsity and
      parameter size objectives:
- When sparsity is high, the number of NNZ params is low
(params_nnz_cnt = (1 - sparsity) * params_cnt).
- When we remove structures (thinning), the sparsity may remain
constant, but the count of params (params_cnt) is lower, and therefore,
once again params_nnz_cnt is lower.
      
      Therefore, params_nnz_cnt is a good proxy to capture a sparsity
      objective and/or a thinning objective.
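A minimal sketch of this scoring criterion (illustrative only; the record and variable names are not the application's actual code):
```
import torch

def params_nnz_cnt(model: torch.nn.Module) -> int:
    # Count non-zero weights across all parameters; this single number drops
    # when sparsity rises and also when structures are physically removed.
    return sum(int((p != 0).sum()) for p in model.parameters())

# Rank recorded epochs: fewest NNZ parameters first, ties broken by Top1/Top5.
results = [{"nnz": 270000, "top1": 91.3, "top5": 99.6},
           {"nnz": 120000, "top1": 90.9, "top5": 99.5}]
best = min(results, key=lambda r: (r["nnz"], -r["top1"], -r["top5"]))
```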
      9cb0dd68
  4. Mar 03, 2019
• compress_classifier.py: Fix best_epoch logic · 87055fed
      Neta Zmora authored
      Based on a commit and ideas from @barrh:
      https://github.com/NervanaSystems/distiller/pull/150/commits/1623db3cdc3a95ab620e2dc6863cff23a91087bd
      
      The sample application compress_classifier.py logs details about
      the best performing epoch(s) and stores the best epoch in a checkpoint
      file named ```best.pth.tar``` by default (if you use the ```--name```
      application argument, the checkpoint name will be prefixed by ```best```).
      
      Until this fix, the performance of a model was judged solely on its
      Top1 accuracy.  This can be a problem when performing gradual pruning
      of a pre-trained model, because many times a model's Top1 accuracy
      increases with light pruning and this is registered as the best performing
      training epoch.  However, we are really interested in the best performing
      trained model _after_ the pruning phase is done.  Even during training, we
      may be interested in the checkpoint of the best performing model with the
      highest sparsity.
      This fix stores a list of the performance results from all the trained
      epochs so far.  This list is sorted using a hierarchical key:
      (sparsity, top1, top5, epoch), so that the list is first sorted by sparsity,
      then top1, followed by top5 and epoch.
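A minimal sketch of this hierarchical sort (illustrative record values; not the application's actual data structures):
```
from collections import namedtuple

EpochResult = namedtuple("EpochResult", ["sparsity", "top1", "top5", "epoch"])

results = [EpochResult(0.0, 91.2, 99.6, 0),
           EpochResult(55.4, 90.8, 99.5, 30),
           EpochResult(55.4, 91.0, 99.6, 31)]

# Named tuples compare field-by-field, so sorting the records directly gives
# the (sparsity, top1, top5, epoch) ordering; the last element is the "best"
# epoch: highest sparsity first, then Top1, Top5, and epoch as tie-breakers.
best = sorted(results)[-1]
```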
      
      But what if you want to sort using a different metric?  For example, when
      quantizing you may want to score the best performance by the total number of
      bits used to represent the model parameters and feature-maps.  In such a case
      you may want to replace ```sparsity``` by this new metric.  Because this is a
      sample application, we don't load it with all possible control logic, and
      anyone can make local changes to this logic.  To keep your code separated from
      the main application logic, we plan to refactor the application code sometime
      in the next few months.
      87055fed
• compress_classifier.py: fix PNG and ONNX exports broken in new release · 6567ecec
      Neta Zmora authored
Release 0.3 broke the exports to PNG and ONNX and this is the fix.
      6567ecec
  5. Feb 28, 2019
  6. Feb 26, 2019
  7. Feb 17, 2019
  8. Feb 14, 2019
• Store config files in logdir/configs directory (#156) · b476d028
      Bar authored
Modified log_execution_env_state() to store the
configuration file in the output directory,
under a 'configs' sub-directory that it creates.

At this time, the only configuration file is the one
passed via args.compress.
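A minimal sketch of the idea (hypothetical function and argument names, not the modified log_execution_env_state() itself):
```
import os
import shutil

def save_config_copy(config_path, logdir):
    # Keep a copy of the schedule/config file under <logdir>/configs so the
    # experiment can be reproduced from its output directory alone.
    configs_dir = os.path.join(logdir, "configs")
    os.makedirs(configs_dir, exist_ok=True)
    shutil.copy(config_path, configs_dir)
```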
      b476d028
• Fix automated-compression imports · ac9f61c0
      Neta Zmora authored
      To use automated compression you need to install several optional packages
      which are not required for other use-cases.
      This fix hides the import requirements for users who do not want to install
      the extra packages.
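A common way to hide such imports is a guarded import; a minimal sketch of the pattern (the module name below is a stand-in, not the actual optional dependency):
```
try:
    import rl_coach  # stand-in for the optional automated-compression packages
    HAVE_AUTO_COMPRESSION = True
except ImportError:
    HAVE_AUTO_COMPRESSION = False

def automated_compression(*args, **kwargs):
    # Fail with a clear message only when the feature is actually requested.
    if not HAVE_AUTO_COMPRESSION:
        raise RuntimeError("Automated compression requires the optional packages")
    # ... the actual automated-compression flow would go here
```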
      ac9f61c0
  9. Feb 13, 2019
  10. Feb 11, 2019
• Post-train quant based on stats + additional modules quantized (#136) · 28a8ee18
      Guy Jacob authored
      Summary of changes:
      (1) Post-train quantization based on pre-collected statistics
      (2) Quantized concat, element-wise addition / multiplication and embeddings
      (3) Move post-train quantization command line args out of sample code
      (4) Configure post-train quantization from YAML for more fine-grained control
      
      (See PR #136 for more detailed changes descriptions)
      28a8ee18
  11. Feb 10, 2019
• Load different random subset of dataset on each epoch (#149) · 4b1d0c89
      Guy Jacob authored
      * For CIFAR-10 / ImageNet only
      * Refactor data_loaders.py, reduce code duplication
* Implemented custom sampler (a sketch of the idea follows after this list)
      * Integrated in image classification sample
      * Since we now shuffle the test set, had to update expected results
        in 2 full_flow_tests that do evaluation
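A minimal sketch of the random-subset-per-epoch idea using standard PyTorch pieces (illustrative; the refactored data_loaders.py code is more involved):
```
import numpy as np
import torch
from torch.utils.data import DataLoader, SubsetRandomSampler, TensorDataset

# Toy stand-in for CIFAR-10 / ImageNet.
dataset = TensorDataset(torch.randn(5000, 3, 32, 32), torch.randint(0, 10, (5000,)))
subset_fraction = 0.1  # train on a different random 10% of the data each epoch

for epoch in range(3):
    # Draw a fresh random subset of indices every epoch, so each epoch sees a
    # different slice of the dataset.
    indices = np.random.choice(len(dataset), int(subset_fraction * len(dataset)),
                               replace=False)
    loader = DataLoader(dataset, batch_size=256,
                        sampler=SubsetRandomSampler(indices.tolist()))
    for images, labels in loader:
        pass  # training step goes here
```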
      4b1d0c89
  12. Jan 31, 2019
  13. Jan 16, 2019
• compress_classifier.py refactoring (#126) · cfbc3798
      Bar authored
      * Support for multi-phase activations logging
      
Enable logging activations both during training and validation in
the same session.
      
      * Refactoring: Move parser to its own file
      
      * Parser is moved from compress_classifier into its own file.
      * Torch version check is moved to precede main() call.
      * Move main definition to the top of the file.
* Modify parser choices to be case-insensitive
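A minimal sketch of the parser-in-its-own-file and case-insensitive-choices items above (hypothetical module and argument names, not the actual refactored code):
```
# parser.py (hypothetical module): keep the argparse setup out of the main
# application file, and lower-case choice arguments so matching is
# case-insensitive.
import argparse

def get_parser():
    parser = argparse.ArgumentParser(description="image classification sample")
    parser.add_argument("--optimizer", type=lambda s: s.lower(),
                        choices=["sgd", "adam"], default="sgd")
    return parser

# compress_classifier.py would then simply do:
#     args = get_parser().parse_args()
```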
      cfbc3798
  14. Jan 15, 2019
  15. Jan 13, 2019
  16. Jan 10, 2019
• Enable compute (training/inference) on the CPU · 007b6903
      Gal Novik authored
In compress_classifier.py we added a new application argument, --cpu,
which you can use to force compute (training/inference) to run on the CPU
when you invoke compress_classifier.py on a machine that has Nvidia GPUs.
      
      If your machine lacks Nvidia GPUs, then the compute will now run on the CPU
      (and you do not need the new flag).
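A minimal sketch of the device selection this enables (the --cpu flag name comes from the commit; the rest is an assumption, not the application's exact code):
```
import argparse
import torch

parser = argparse.ArgumentParser()
parser.add_argument("--cpu", action="store_true",
                    help="force training/inference to run on the CPU")
args = parser.parse_args()

# Use the CPU when explicitly requested, or when no CUDA device is available.
device = torch.device("cpu" if args.cpu or not torch.cuda.is_available() else "cuda")
model = torch.nn.Linear(10, 2).to(device)
```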
      
      Caveat: we did not fully test the CPU support for the code in the Jupyter 
      notebooks.  If you find a bug, we apologize and appreciate your feedback.
      007b6903
  17. Dec 19, 2018
  18. Dec 16, 2018
  19. Dec 14, 2018
• AMC: more refactoring · 1ab288ae
      Neta Zmora authored
Added a notebook for visualizing the discovery of compressed networks.
      Added one-epoch fine-tuning at the end of every episode, which is
      required for very sensitive models like Plain20.
      1ab288ae
  20. Dec 11, 2018
  21. Dec 06, 2018
  22. Dec 04, 2018
  23. Dec 01, 2018
• Important changes to pruning channels and filters (#93) · a0bf2a8f
      Neta Zmora authored
This commit contains the main fix for issue #85: a couple of changes to the YAML structure-pruning API, with examples.
      I urge you to read the documentation in the Wiki (https://github.com/NervanaSystems/distiller/wiki/Pruning-Filters-&-Channels).
      
      New syntax for defining Structured AGP.  I tried to make the syntax similar to fine-grained
      (i.e. element-wise) pruning.  All you need to do is add: ```group_type: Filters```.
      ```
        low_pruner:
          class: L1RankedStructureParameterPruner_AGP
          initial_sparsity : 0.10
          final_sparsity: 0.50
          group_type: Filters
          weights: [module.layer3.0.conv2.weight,
                    module.layer3.0.downsample.0.weight,
                    module.layer3.1.conv2.weight,
                    module.layer3.2.conv2.weight]
      ```
      
      If you want to define “leader-based” pruning dependencies, add ```group_dependency: Leader```:
      ```
        low_pruner:
          class: L1RankedStructureParameterPruner_AGP
          initial_sparsity : 0.10
          final_sparsity: 0.50
          group_type: Filters
          group_dependency: Leader
          weights: [module.layer3.0.conv2.weight,
                    module.layer3.0.downsample.0.weight,
                    module.layer3.1.conv2.weight,
                    module.layer3.2.conv2.weight]
      ```
      
      Retired the old ```reg_regims``` API for describing one-shot structured-pruning.
      
      The new YAML API is very similar to AGP structured-pruning, which is much better
      than before.
      The new API also allows us to describe data-dependencies when doing one-shot
      structure pruning, just like AGP structured-pruning.
      
      This commit also includes further code refactoring.
      
      Old API:
      ```
        filter_pruner:
           class: 'L1RankedStructureParameterPruner'
           reg_regims:
             'module.layer1.0.conv1.weight': [0.6, '3D']
             'module.layer1.1.conv1.weight': [0.6, '3D']
      ```
      
      New API:
      ```
       filter_pruner:
          class: 'L1RankedStructureParameterPruner'
          group_type: Filters
          desired_sparsity: 0.6
          weights: [
            module.layer1.0.conv1.weight,
            module.layer1.1.conv1.weight]
      ```
      
      thresholding.py – separate the generation of the binary_map from the pruning_mask so that we
      can cache the binary map and share it between several modules.
      
pruning/automated_gradual_pruner.py – major refactoring to support “leader-based”
sub-graph pruning dependencies.  The concept is explained in issue #85.
      
      
      agp-pruning/resnet20_filters.schedule_agp.yaml
      agp-pruning/resnet20_filters.schedule_agp_2.yaml
      agp-pruning/resnet20_filters.schedule_agp_3.yaml
      network_trimming/resnet56_cifar_activation_apoz.yaml
      network_trimming/resnet56_cifar_activation_apoz_v2.yaml
      a0bf2a8f
  24. Nov 24, 2018
  25. Nov 22, 2018
• Fix Issue 79 (#81) · acbb4b4d
      Neta Zmora authored
      * Fix issue #79
      
      Change the default values so that the following scheduler meta-data keys
      are always defined: 'starting_epoch', 'ending_epoch', 'frequency'
      
      * compress_classifier.py: add a new argument
      
Allow specifying, via command-line arguments, the range of
pruning levels scanned when doing sensitivity analysis.
      
      * Add regression test for issue #79
      acbb4b4d
  26. Nov 21, 2018
  27. Nov 20, 2018
• Bug fix: value of best_top1 stored in the checkpoint may be wrong (#77) · 6242afed
      Neta Zmora authored
      * Bug fix: value of best_top1 stored in the checkpoint may be wrong
      
If you invoke compress_classifier.py with --num-best-scores=n
and n>1, then the value of best_top1 stored in checkpoints is wrong.
      6242afed
• Bug fix: Resuming from checkpoint ignored the masks stored in the checkpoint (#76) · 78e98a51
      Neta Zmora authored
      When we resume from a checkpoint, we usually want to continue using the checkpoint’s
      masks.  I say “usually” because I can see a situation where we want to prune a model
      and checkpoint it, and then resume with the intention of fine-tuning w/o keeping the
      masks.  This is what’s done in Song Han’s Dense-Sparse-Dense (DSD) training
      (https://arxiv.org/abs/1607.04381).  But I didn’t want to add another argument to
      ```compress_classifier.py``` for the time being – so we ignore DSD.
      
      There are two possible situations when we resume a checkpoint that has a serialized
      ```CompressionScheduler``` with pruning masks:
      1. We are planning on using a new ```CompressionScheduler``` that is defined in a
      schedule YAML file.  In this case, we want to copy the masks from the serialized
      ```CompressionScheduler``` to the new ```CompressionScheduler``` that we are
      constructing from the YAML file.  This is one fix.
      2. We are resuming a checkpoint, but without using a YAML schedule file.
      In this case we want to use the ```CompressionScheduler``` that we loaded from the
      checkpoint file.  All this ```CompressionScheduler``` does is keep applying the masks
      as we train, so that we don’t lose them.  This is the second fix.
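A rough sketch of these two cases (the helper and attribute names below are placeholders, not Distiller's actual API):
```
def resume_scheduler(model, loaded_scheduler, compress_yaml=None):
    # Placeholder names -- illustrative only.
    if compress_yaml:
        # Case 1: a new schedule YAML was given. Build a new scheduler from it
        # and copy over the masks loaded from the checkpoint, so previously
        # pruned weights stay pruned.
        scheduler = build_scheduler_from_yaml(model, compress_yaml)
        scheduler.masks = loaded_scheduler.masks
    else:
        # Case 2: no YAML given. Keep using the scheduler loaded from the
        # checkpoint; its only remaining job is to re-apply its masks as we
        # train, so they are not lost.
        scheduler = loaded_scheduler
    return scheduler
```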
      
      For DSD, we would need a new flag that would override using the ```CompressionScheduler```
      that we load from the checkpoint.
      78e98a51
  28. Nov 08, 2018
  29. Nov 06, 2018
  30. Nov 05, 2018
• Dynamic Network Surgery (#69) · 60a4f44a
      Neta Zmora authored
      Added an implementation of:
      
      Dynamic Network Surgery for Efficient DNNs, Yiwen Guo, Anbang Yao, Yurong Chen.
      NIPS 2016, https://arxiv.org/abs/1608.04493.
      
      - Added SplicingPruner: A pruner that both prunes and splices connections.
      - Included an example schedule on ResNet20 CIFAR.
      - New features for compress_classifier.py:
   1. Added the "--masks-sparsity" argument which, when enabled, logs the sparsity
      of the weight masks during training.
   2. Added a new command-line argument to report the top N
      best accuracy scores, instead of just the highest score.
      This is sometimes useful when pruning a pre-trained model,
      which has its best Top1 accuracy in the first few pruning epochs.
      - New features for PruningPolicy:
         1. The pruning policy can use two copies of the weights: one is used during
             the forward-pass, the other during the backward pass.
             This is controlled by the “mask_on_forward_only” argument.
         2. If we enable “mask_on_forward_only”, we probably want to permanently apply
             the mask at some point (usually once the pruning phase is done).
             This is controlled by the “keep_mask” argument.
         3. We introduce a first implementation of scheduling at the training-iteration
             granularity (i.e. at the mini-batch granularity). Until now we could schedule
             pruning at the epoch-granularity. This is controlled by the “mini_batch_pruning_frequency”
             (disable by setting to zero).
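A conceptual sketch, in plain PyTorch, of the “mask_on_forward_only” idea (not Distiller's PruningPolicy code): the forward/backward pass runs with masked weights, while the weight update is applied to a retained dense copy, so pruned connections keep learning and can later be spliced back in.
```
import torch
import torch.nn as nn

model = nn.Linear(16, 4)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
mask = (model.weight.detach().abs() > 1e-2).float()   # toy magnitude mask

for _ in range(3):                                     # stand-in training loop
    # The forward (and backward) pass sees the masked weights ...
    dense_weight = model.weight.detach().clone()       # keep the dense copy
    model.weight.data.mul_(mask)

    loss = model(torch.randn(8, 16)).pow(2).mean()
    loss.backward()

    # ... but the update is applied to the dense copy, so pruned weights keep
    # receiving gradient updates.
    model.weight.data.copy_(dense_weight)
    optimizer.step()
    optimizer.zero_grad()

    # Recompute the mask: small weights get pruned, regrown weights are
    # spliced back in (Dynamic Network Surgery).
    mask = (model.weight.detach().abs() > 1e-2).float()
```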
      
         Some of the abstractions may have leaked from PruningPolicy to CompressionScheduler.
         Need to reexamine this in the future.
      60a4f44a
  31. Nov 01, 2018
  32. Oct 22, 2018
• Activation statistics collection (#61) · 54a5867e
      Neta Zmora authored
Activation statistics can be leveraged to make pruning and quantization decisions, and so
we added support for collecting this data.
      - Two types of activation statistics are supported: summary statistics, and detailed records 
      per activation.
      Currently we support the following summaries: 
      - Average activation sparsity, per layer
      - Average L1-norm for each activation channel, per layer
      - Average sparsity for each activation channel, per layer
      
For the detailed records we collect some statistics per activation and store them in a record.
Using this collection method generates more detailed data, but consumes more time, so
beware.
      
      * You can collect activation data for the different training phases: training/validation/test.
      * You can access the data directly from each module that you chose to collect stats for.  
      * You can also create an Excel workbook with the stats.
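A minimal sketch of collecting an average-activation-sparsity summary with plain PyTorch forward hooks (illustrative; Distiller's collectors are more elaborate):
```
import torch
import torch.nn as nn

model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.ReLU(), nn.Conv2d(8, 8, 3), nn.ReLU())
sparsity_sums, batch_counts = {}, {}

def make_hook(name):
    def hook(module, inputs, output):
        # Fraction of zero elements in this layer's output activations.
        sparsity_sums[name] = sparsity_sums.get(name, 0.0) + float((output == 0).float().mean())
        batch_counts[name] = batch_counts.get(name, 0) + 1
    return hook

for name, module in model.named_modules():
    if isinstance(module, nn.ReLU):
        module.register_forward_hook(make_hook(name))

with torch.no_grad():
    for _ in range(4):                      # stand-in for a validation loop
        model(torch.randn(16, 3, 32, 32))

for name in sparsity_sums:
    print(name, sparsity_sums[name] / batch_counts[name])
```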
      
      To demonstrate use of activation collection we added a sample schedule which prunes 
      weight filters by the activation APoZ according to:
      "Network Trimming: A Data-Driven Neuron Pruning Approach towards 
      Efficient Deep Architectures",
      Hengyuan Hu, Rui Peng, Yu-Wing Tai, Chi-Keung Tang, ICLR 2016
      https://arxiv.org/abs/1607.03250
      
      We also refactored the AGP code (AutomatedGradualPruner) to support structure pruning,
      and specifically we separated the AGP schedule from the filter pruning criterion.  We added
      examples of ranking filter importance based on activation APoZ (ActivationAPoZRankedFilterPruner),
      random (RandomRankedFilterPruner), filter gradients (GradientRankedFilterPruner), 
and filter L1-norm (L1RankedStructureParameterPruner).
      54a5867e