  1. Nov 13, 2019
    • image_classifier.py: PTQ stats collection and eval in same run (#346) · fb98377e
      Bar authored
      * Previous implementation:
        * Stats collection required a separate run with `--qe-calibration`.
        * Specifying `--quantize-eval` without `--qe-stats-file` triggered
          dynamic quantization.
        * Running with `--quantize-eval --qe-calibration <num>` only ran
          stats collection and ignored `--quantize-eval`.
      
      * New implementation:
        * Running `--quantize-eval --qe-calibration <num>` will now 
          perform stats collection according to the calibration flag,
          and then quantize the model with the collected stats (and
          run evaluation).
        * Specifying `--quantize-eval` without `--qe-stats-file` will
          trigger the same flow as in the bullet above, as if 
          `--qe-calibration 0.05` was used (i.e. 5% of the test set will
          be used for stats).
        * Added a new flag: `--qe-dynamic`. From now on, to perform dynamic
          quantization, one must explicitly run:
          `--quantize-eval --qe-dynamic`
        * As before, `--qe-calibration` can still be run without
          `--quantize-eval` to perform "stand-alone" stats collection.
        * The following flags, which all represent different ways to
          control creation of stats or use of existing stats, are now
          mutually exclusive:
          `--qe-calibration`, `--qe-stats-file`, `--qe-dynamic`,
          `--qe-config-file`
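      A minimal sketch of how this mutual exclusion can be enforced with
      Python's argparse; the flag names come from the commit message, but
      the parser setup itself is an assumption, not Distiller's actual code:
      ```python
      import argparse

      parser = argparse.ArgumentParser()
      parser.add_argument('--quantize-eval', action='store_true',
                          help='apply post-training quantization and evaluate')
      # The four stats-related flags share one mutually exclusive group,
      # so argparse rejects any combination of them.
      group = parser.add_mutually_exclusive_group()
      group.add_argument('--qe-calibration', type=float, metavar='PORTION',
                         help='collect stats on this portion of the test set')
      group.add_argument('--qe-stats-file', help='use pre-collected stats (YAML)')
      group.add_argument('--qe-dynamic', action='store_true',
                         help='dynamic quantization (no stats)')
      group.add_argument('--qe-config-file', help='full quantizer config (YAML)')

      args = parser.parse_args(['--quantize-eval', '--qe-calibration', '0.05'])
      print(args.qe_calibration)   # 0.05
      ```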
  2. Nov 07, 2019
    • Fix Early-exit code · fc62caab
      Neta Zmora authored
      Fix the EE code so that it works with the current 'master' branch,
      and add a test for high-level EE regression
  3. Nov 06, 2019
  4. Oct 23, 2019
  5. Sep 18, 2019
    • compress_classifier.py: remove remarked code · f6c48f87
      Neta Zmora authored
    • Odds and ends commit · 8d55ab15
      Neta Zmora authored
      A bundle of very small, and mostly non-functional, changes to the code,
      mostly unrelated to each other.
      ../../../distiller/apputils/checkpoint.py – add info to exception
      
      ../../../distiller/apputils/image_classifier.py – remove unused
      `--extras` command-line argument
      
      ../../../distiller/thinning.py – code refactoring (non-functional)
      except for adding a new public API: contract_model()
      
      ../../classifier_compression/compress_classifier.py – use
      contract_model() when using `--thinnify`
      
      ../../lottery_ticket/README.md – remove illegal characters in
      the text
  6. Aug 28, 2019
  7. Aug 26, 2019
    • Lottery Ticket Hypothesis · 78e2e4c7
      Neta Zmora authored
      Added support for saving the randomly initialized network before
      starting training; and added an implementation showing how to extract
      a (winning) lottery ticket from the pristine network, and the
      pruned network.
  8. Aug 11, 2019
  9. Aug 06, 2019
    • AMC and other refactoring - large merge (#339) · 02054da1
      Neta Zmora authored
      * An implementation of AMC (the previous implementation
      code has moved to a new location under
      /distiller/examples/auto_compression/amc).  AMC is aligned
      with the 'master' branch of Coach.
      * compress_classifier.py is refactored.  The base code moved
      to /distiller/apputils/image_classifier.py.  Further refactoring
      will follow.
      We want to provide a simple and small API to the basic features of
      a classifier-compression application.
      This will help applications that want to use the main features of a
      classifier-compression application, without the standard training
      regimen.
      AMC is one example of a stand-alone application that needs to leverage
      the capabilities of a classifier-compression application, but is currently
      coupled to `compress_classifier.py`.
      `multi-finetune.py` is another example.
      * ranked_structures_pruner.py:
      ** Added support for grouping channels/filters
      Sometimes we want to prune a group of structures: e.g. groups of
      8-channels.  This feature does not force the groups to be adjacent,
      so it is more like a set of structures.  E.g. in the case of pruning
      channels from a 64-channels convolution, grouped by 8 channels, we 
      will prune exactly one of 0/8/16/24/32/40/48/56 channels.  I.e. 
      always a multiple of 8-channels, excluding the set of all 64 channels.
      ** Added FMReconstructionChannelPruner – this is channel
      pruning using L1-magnitude to rank and select channels to
      remove, and feature-map reconstruction to improve the
      resilience to the pruning.
      * Added a script to run multiple instances of an 
      experiment, in different processes:
       examples/classifier_compression/multi-run.py
      * Set the seed value even when not specified by the command-line
      arguments, so that we can try to recreate the session.
      * Added pruning ranking noise -
      Ranking noise introduces Gaussian noise when ranking channels/filters
      using Lp-norm.  The noise is introduced using the epsilon-greedy
      methodology, where ranking using exact Lp-norm is considered greedy.
      * Added configurable rounding of the pruning level: choose whether to
      round up or down when rounding the number of structures to prune
      (rounding is always to an integer).
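      The grouping and rounding logic described above can be sketched as
      follows; num_structures_to_prune is a hypothetical helper, not
      Distiller's API:
      ```python
      import math

      def num_structures_to_prune(total, fraction, group_size=1, round_up=False):
          # round the pruned count to a whole number of groups, and never
          # prune *all* structures
          raw = fraction * total / group_size
          n_groups = math.ceil(raw) if round_up else math.floor(raw)
          return min(n_groups * group_size, total - group_size)

      # pruning a 64-channel conv, grouped by 8 channels:
      print(num_structures_to_prune(64, 0.5, group_size=8))                  # 32
      print(num_structures_to_prune(64, 0.55, group_size=8))                 # 32 (round down)
      print(num_structures_to_prune(64, 0.55, group_size=8, round_up=True))  # 40
      ```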
  10. Jul 10, 2019
    • Post-Train Quantization: BN folding and "net-aware quantization" (#313) · 43548deb
      Guy Jacob authored
      * "Net-aware quantization" - using the term coined in
        https://arxiv.org/abs/1811.09886. (section 3.2.2).
        Refers to considering sequences of modules when quantizing. This 
        isn't exactly layer fusion - we modify activation stats prior to
        setting quantization parameters, to make sure that when a module
        is followed by certain activation functions, only the relevant
        ranges are quantized. We do this for:
          * ReLU - Clip all negative values
          * Tanh / Sigmoid - Clip according to the (approximated) saturation
            values for these functions. We use [-4, 4] for tanh and [-6, 6]
            for sigmoid.
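      A minimal sketch of this stats adjustment, assuming stats are held as
      simple min/max values per module output (illustrative only, not
      Distiller's actual data structures):
      ```python
      # saturation ranges implied by the follower op; None means "no bound"
      CLIP_RANGES = {'relu': (0.0, None), 'tanh': (-4.0, 4.0), 'sigmoid': (-6.0, 6.0)}

      def clip_stats(stats, follower):
          lo, hi = CLIP_RANGES[follower]
          if lo is not None:
              stats['min'] = max(stats['min'], lo)
          if hi is not None:
              stats['max'] = min(stats['max'], hi)
          return stats

      print(clip_stats({'min': -3.2, 'max': 7.1}, 'relu'))   # min clipped to 0.0
      print(clip_stats({'min': -9.0, 'max': 9.0}, 'tanh'))   # clipped to [-4, 4]
      ```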
      
      * Perform batch-norm folding before post-training quantization.
        Batch-norm parameters are folded into the parameters of the previous
        layer and the BN layer is replaced with an identity module.
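      The folding itself follows the standard identity; below is a sketch for
      a Conv2d followed by a BatchNorm2d, not PostTrainLinearQuantizer's
      actual code:
      ```python
      import torch
      import torch.nn as nn

      @torch.no_grad()
      def fold_bn_into_conv(conv: nn.Conv2d, bn: nn.BatchNorm2d) -> None:
          # BN(y) = gamma * (y - mean) / sqrt(var + eps) + beta, with y = w*x + b,
          # folds into: w' = w * scale and b' = (b - mean) * scale + beta
          scale = bn.weight / torch.sqrt(bn.running_var + bn.eps)
          conv.weight.mul_(scale.reshape(-1, 1, 1, 1))
          b = conv.bias if conv.bias is not None else torch.zeros_like(bn.running_mean)
          conv.bias = nn.Parameter((b - bn.running_mean) * scale + bn.bias)

      conv, bn = nn.Conv2d(3, 16, 3), nn.BatchNorm2d(16).eval()
      fold_bn_into_conv(conv, bn)   # bn can now be replaced with an identity module
      ```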
      
      * Both BN folding and "net-aware" are now automatically executed
        in PostTrainLinearQuantizer (details of this change below)
      
      * BN folding enabled by new generic mechanism to "fuse" module
        sequences (at the Python API level)
          * First module in sequence is replaced/modified by a user-provided
            function, rest of modules replaced with nn.Identity
      
      * Quantizer changes:
        * Optionally create adjacency map during prepare_model
        * Subclasses may enforce adjacency map creation
        * Refactoring: Replace _prepare_model_impl with pre and post
          override-able "callbacks", so core functionality is always executed
      
      * PostTrainLinearQuantizer Changes:
        * Enforce creation of adjacency map. This means users must now pass a
          dummy input to PostTrainLinearQuantizer.prepare_model
        * Before module replacement - Apply BN folding and stats updates according
          to net-aware quantization
      
      * Updated the language model quantization tutorial to reflect the new
        functionality
      
      * Updated the image classification post-train quantization samples
        (command line and YAML)
      
      * Other changes:
        * Distiller LSTM implementation:
          Replace the ModuleList for cells with a plain list. The PyTorch trace
          mechanism doesn't "see" ModuleList objects, it only sees the 
          contained modules. This means that the "scopeName" of these modules
          isn't complete, which makes it impossible to match op names in 
          SummaryGraph to modules in the Python model.
        * ActivationStatsCollector: Ignore nn.Identity modules
  11. Jul 04, 2019
    • Switch to PyTorch 1.1.0 (#306) · 032b1f74
      Guy Jacob authored
      * PyTorch 1.1.0 now required
        - Moved other dependencies to up-to-date versions as well
      * Adapt LR scheduler to PyTorch 1.1 API changes:
        - Change lr_scheduler.step() calls to succeed validate calls
          during training (see the sketch after this list)
        - Pass both loss and top1 to the lr_scheduler.step() caller
          (Resolves issue #240)
      * Adapt thinning for PyTorch 1.1 semantic changes
        - **KNOWN ISSUE**: When a thinning recipe is applied, in certain
          cases PyTorch displays this warning:
          "UserWarning: non-inplace resize is deprecated".
          To be fixed later
      * SummaryGraph: Workaround for new scope name issue from PyTorch 1.1.0
      * Adapt to updated PyTest version:
        - Stop using deprecated 'message' parameter of pytest.raises(),
          use pytest.fail() instead
        - Make sure only a single test case per pytest.raises context
      * Move PyTorch version check to root __init__.py 
        - This means the version is checked when Distiller is first
          imported. A RuntimeError is raised if the version is wrong.
      * Updates to parameter_histograms notebook:
        - Replace deprecated normed argument with density
        - Add sparsity rate to plot title
        - Load the model on the CPU
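      A runnable sketch of the new call ordering (toy model; only the
      ordering is the point):
      ```python
      import torch
      import torch.nn as nn

      model = nn.Linear(4, 2)
      optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
      scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10)

      for epoch in range(3):
          optimizer.zero_grad()
          loss = model(torch.randn(8, 4)).pow(2).mean()   # stand-in for a train step
          loss.backward()
          optimizer.step()
          # validation would run here; under PyTorch >= 1.1 scheduler.step()
          # must come after optimizer.step(), and a metric can be passed
          # (e.g. scheduler.step(val_loss) for ReduceLROnPlateau)
          scheduler.step()
      ```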
  12. Jul 03, 2019
  13. Jul 01, 2019
  14. May 29, 2019
  15. May 26, 2019
    • Added support for setting the PRNG seed (#269) · fe27ab90
      Neta Zmora authored
      Added set_seed() to Distiller and added support for seeding the PRNG when setting --deterministic mode (prior to this change, the seed was always set to zero when running in deterministic mode).
      The PRNGs of PyTorch (CPU and CUDA devices), NumPy and Python are set.
      Added support for ```--seed``` to classifier_compression.py.
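      A sketch of what such a set_seed() boils down to; Distiller's actual
      implementation may differ in details:
      ```python
      import random
      import numpy as np
      import torch

      def set_seed(seed: int) -> None:
          random.seed(seed)                 # Python's PRNG
          np.random.seed(seed)              # numpy's PRNG
          torch.manual_seed(seed)           # PyTorch CPU PRNG
          torch.cuda.manual_seed_all(seed)  # PyTorch PRNGs on all CUDA devices

      set_seed(0)
      ```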
  16. May 21, 2019
    • Store qe_stats_file in output directory (#262) · 31536873
      Bar authored
      Function log_execution_env_state is used to gather information about the execution environment and store this together with the experiment log.  Recently we've added saving the compression schedule YAML file in the same logs directory.
      This commit expands the log_execution_env_state interface to accept a list of paths to arbitrary files that may contribute to the experiment configuration and that you (the experiment owner) deem important for recreating the experiment.
      
      In the sample classifier_compression.py app, we now store both the compression schedule YAML file and quantization statistics collateral file (qe_stats_file).
  17. May 16, 2019
    • compress_classifier.py: --summary related fixes · 2ef3eeb6
      Neta Zmora authored
      The previous PR merge introduced a couple of small errors when
      using the --summary flag.
    • Refactor export to ONNX functionality (#258) · 54304810
      Bar authored
      Introduced a new utility function to export image-classifiers
      to ONNX: export_img_classifier_to_onnx.
      The functionality is not new, just refactored.
      
      In the sample application compress_classifier.py we added
      --export-onnx as a stand-alone cmd-line flag specifically for
      exporting ONNX models.
      This new flag can take an optional argument which is used to name the
      exported onnx model file.
      The option to export models was removed from the --summary argument.
      Now we allow multiple --summary options to be called together.
      
      Added a basic test for exporting ONNX.
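      At its core, exporting an image classifier to ONNX boils down to
      something like the sketch below; export_img_classifier_to_onnx wraps
      functionality of this kind, and its exact signature may differ:
      ```python
      import torch
      import torchvision.models as models

      model = models.resnet18().eval()            # any image classifier
      dummy_input = torch.randn(1, 3, 224, 224)   # NCHW shape the model expects
      torch.onnx.export(model, dummy_input, 'model.onnx')
      ```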
  18. May 15, 2019
    • Activation Histograms (#254) · 9405679f
      Guy Jacob authored
      Added a collector for activation histograms (sub-class of
      ActivationStatsCollector). It is stats-based, meaning it requires
      pre-computed min/max stats per tensor. This is done in order to prevent
      the need to save all of the activation tensors throughout the run.
      The stats are expected in the format generated by
      QuantCalibrationStatsCollector.
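      The key idea, sketched below: with pre-computed min/max per tensor,
      each batch can be binned on the fly, so activations never need to be
      stored for the whole run (illustrative only, not the collector's code):
      ```python
      import torch

      def update_histogram(hist, t, t_min, t_max):
          # bin this batch into the fixed [t_min, t_max] range and accumulate;
          # torch.histc ignores values outside the range
          return hist + torch.histc(t.float(), bins=hist.numel(), min=t_min, max=t_max)

      hist = torch.zeros(1024)
      for _ in range(10):   # stand-in for batches of one module's activations
          hist = update_histogram(hist, torch.randn(256), t_min=-4.0, t_max=4.0)
      ```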
      
      Details:
      
      * Implemented ActivationHistogramsCollector
      * Added Jupyter notebook showcasing activation histograms
      * Implemented helper function that performs the stats collection pass
        and histograms pass in one go
      * Also added separate helper function just for quantization stats
        collection
      * Integrated in image classification sample
      * data_loaders.py: Added an option to use a fixed subset throughout
        the same session. Using it to keep the same subset between
        the stats collection and histograms collection phases.
      * Other changes:
        * Calling assign_layer_fq_names in base-class of collectors. We do
          this since the collectors, as implemented so far, assume this is
          done, so it makes sense to just do it in the base class instead of
          expecting the user to do it.
        * Enforcing a non-parallel model for quantization stats and
          histograms collectors
        * Jupyter notebooks - add utility function to enable loggers in
          notebooks. This allows us to see any logging done by Distiller
          APIs called from notebooks.
  19. May 14, 2019
  20. Apr 18, 2019
  21. Apr 11, 2019
  22. Apr 08, 2019
    • Refine pruning logic (#222) · 816a943d
      Neta Zmora authored
      Add finer control over the pruning logic, to accommodate more pruning
      use-cases.
      The full description of the new logic is available in the updated [documentation
      of the CompressionScheduler](https://nervanasystems.github.io/distiller/schedule.html#pruning-fine-control), which is also part of this PR.
      
      In this PR:
      
      * Added a new callback to the CompressionScheduler:
      compression_scheduler.before_parameter_optimization, which is invoked
      after the gradients are computed, but before the weights are updated
      by the optimizer.
      
      * We provide an option to mask the gradients before the weights are
      updated by the optimizer.
      We register a parameter backward hook in order to mask the gradients.
      This gives us finer control over the parameter updates.
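      A minimal sketch of gradient masking via a backward hook (illustrative,
      not the CompressionScheduler implementation):
      ```python
      import torch

      w = torch.nn.Parameter(torch.randn(5))
      mask = torch.tensor([1., 1., 0., 0., 1.])   # zeros mark pruned weights

      # the hook runs after w.grad is computed, before the optimizer updates w
      w.register_hook(lambda grad: grad * mask)

      loss = (w ** 2).sum()
      loss.backward()
      print(w.grad)   # masked positions receive zero gradient
      ```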
      
      * Added several DropFilter schedules.
      DropFilter is a method to regularize networks, and it can also be
      used to "prepare" a network for permanent filter pruning.
      
      * Added documentation of pruning fine-control
  23. Apr 01, 2019
    • Load optimizer from checkpoint (BREAKING - see details) (#182) · 992291cf
      Bar authored
      * Fixes issues #70, #145 and replaces PR #74
      * checkpoint.py
        * save_checkpoint will now save the optimizer type in addition to
          its state
        * load_checkpoint will now instantiate an optimizer based on the
          saved type and load its state (see the sketch after this list)
      * config.py: file/dict_config now accept the resumed epoch to pass to
        LR schedulers
      * policy.py: LRPolicy now passes the current epoch to the LR scheduler
      * Classifier compression sample
        * New flag '--resume-from' for properly resuming a saved training
          session, inc. optimizer state and epoch #
        * Flag '--reset-optimizer' added to allow discarding of a loaded
          optimizer.
        * BREAKING:
          * Previous flag '--resume' is deprecated and is mapped to
            '--resume-from' + '--reset-optimizer'. 
          * But, old resuming behavior had an inconsistency where the epoch
            count would continue from the saved epoch, but the LR scheduler
            was setup as if we were starting from epoch 0.
          * Using '--resume-from' + '--reset-optimizer' now will simply
            RESET the epoch count to 0 for the whole environment.
          * This means that scheduling configurations (in YAML or code)
            which assumed use of '--resume' might need to be changed to
            reflect the fact that the epoch count now starts from 0
          * All relevant YAML files under 'examples' modified to reflect
            this change
      * Initial support for ReduceLROnPlateau (#161):
        * Allow passing **kwargs to policies via the scheduler
        * Image classification now passes the validation loss to the
          scheduler, to be used by ReduceLROnPlateau
        * The current implementation is experimental and subject to change
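      A sketch of the optimizer save/restore idea described above;
      checkpoint.py's actual fields and logic may differ:
      ```python
      import torch

      model = torch.nn.Linear(4, 2)
      optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)

      # save the optimizer *type* along with its state (e.g. inside the
      # checkpoint dict passed to torch.save)
      ckpt = {'optimizer_type': type(optimizer),
              'optimizer_state_dict': optimizer.state_dict()}

      # later, when resuming: rebuild the optimizer from the saved type, then
      # load its state (restoring lr, momentum and per-parameter buffers)
      optimizer = ckpt['optimizer_type'](model.parameters(), lr=0.1)
      optimizer.load_state_dict(ckpt['optimizer_state_dict'])
      ```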
  24. Mar 17, 2019
  25. Mar 12, 2019
  26. Mar 06, 2019
    • compress_classifier.py: sort best scores by count of NNZ weights · 9cb0dd68
      Neta Zmora authored
      A recent commit changed the sorting of the best performing training
      epochs to be based on the sparsity level of the model, then its
      Top1 and Top5 scores.
      When we create thinned models, the sparsity remains low (even zero),
      while the physical size of the network is smaller.
      This commit changes the sorting criteria to be based on the count
      of non-zero (NNZ) parameters.  This captures both sparsity and
      parameter size objectives:
      - When sparsity is high, the number of NNZ params is low
      (params_nnz_cnt = (1 - sparsity) * params_cnt).
      - When we remove structures (thinning), the sparsity may remain
      constant, but the count of params (params_cnt) is lower, and therefore,
      once again params_nnz_cnt is lower.
      
      Therefore, params_nnz_cnt is a good proxy to capture a sparsity
      objective and/or a thinning objective.
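      Sketched for a PyTorch model, the criterion is simply (hypothetical
      helper, not the application's exact code):
      ```python
      import torch

      def params_nnz_cnt(model: torch.nn.Module) -> int:
          # count non-zero parameter elements: (1 - sparsity) * params_cnt
          return sum(int((p != 0).sum()) for p in model.parameters())

      print(params_nnz_cnt(torch.nn.Linear(10, 10)))   # 110 for a dense layer
      ```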
  27. Mar 03, 2019
    • compress_classifier.py: Fix best_epoch logic · 87055fed
      Neta Zmora authored
      Based on a commit and ideas from @barrh:
      https://github.com/NervanaSystems/distiller/pull/150/commits/1623db3cdc3a95ab620e2dc6863cff23a91087bd
      
      The sample application compress_classifier.py logs details about
      the best performing epoch(s) and stores the best epoch in a checkpoint
      file named ```best.pth.tar``` by default (if you use the ```--name```
      application argument, its value will prefix ```best```).
      
      Until this fix, the performance of a model was judged solely on its
      Top1 accuracy.  This can be a problem when performing gradual pruning
      of a pre-trained model, because many times a model's Top1 accuracy
      increases with light pruning and this is registered as the best performing
      training epoch.  However, we are really interested in the best performing
      trained model _after_ the pruning phase is done.  Even during training, we
      may be interested in the checkpoint of the best performing model with the
      highest sparsity.
      This fix stores a list of the performance results from all the trained
      epochs so far.  This list is sorted using a hierarchical key:
      (sparsity, top1, top5, epoch), so that the list is first sorted by sparsity,
      then top1, followed by top5 and epoch.
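      A small sketch of this hierarchical sort; the field names mirror the
      description above, while the application's actual code differs:
      ```python
      perf_log = [
          dict(sparsity=0.50, top1=75.8, top5=92.6, epoch=17),
          dict(sparsity=0.72, top1=74.9, top5=92.0, epoch=30),
      ]
      # highest sparsity wins; ties are broken by top1, then top5, then epoch
      best = max(perf_log,
                 key=lambda e: (e['sparsity'], e['top1'], e['top5'], e['epoch']))
      print(best)   # the 72%-sparsity epoch wins despite its lower top1
      ```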
      
      But what if you want to sort using a different metric?  For example, when
      quantizing you may want to score the best performance by the total number of
      bits used to represent the model parameters and feature-maps.  In such a case
      you may want to replace ```sparsity``` by this new metric.  Because this is a
      sample application, we don't load it with all possible control logic, and
      anyone can make local changes to this logic.  To keep your code separated from
      the main application logic, we plan to refactor the application code sometime
      in the next few months.
    • compress_classifier.py: fix PNG and ONNX exports broken in new release · 6567ecec
      Neta Zmora authored
      Release 0.3 broke the exports to PNG and ONNX and this is the fix.
  28. Feb 28, 2019
  29. Feb 26, 2019
  30. Feb 17, 2019
  31. Feb 14, 2019
    • Store config files in logdir/configs directory (#156) · b476d028
      Bar authored
      Modified log_execution_env_state() to store the
      configuration file in the output directory,
      under a 'configs' sub-directory that it creates.
      
      At this time, the only configuration file is the one
      passed via args.compress.
    • Fix automated-compression imports · ac9f61c0
      Neta Zmora authored
      To use automated compression you need to install several optional packages
      which are not required for other use-cases.
      This fix hides the import requirements for users who do not want to install
      the extra packages.
  32. Feb 13, 2019
  33. Feb 11, 2019
    • Post-train quant based on stats + additional modules quantized (#136) · 28a8ee18
      Guy Jacob authored
      Summary of changes:
      (1) Post-train quantization based on pre-collected statistics
      (2) Quantized concat, element-wise addition / multiplication and embeddings
      (3) Move post-train quantization command line args out of sample code
      (4) Configure post-train quantization from YAML for more fine-grained control
      
      (See PR #136 for more detailed changes descriptions)
  34. Feb 10, 2019
    • Load different random subset of dataset on each epoch (#149) · 4b1d0c89
      Guy Jacob authored
      * For CIFAR-10 / ImageNet only
      * Refactor data_loaders.py, reduce code duplication
      * Implemented custom sampler
      * Integrated in image classification sample
      * Since we now shuffle the test set, we had to update the expected
        results in 2 full_flow_tests that do evaluation
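      A sketch of drawing a different random subset each epoch; the commit's
      custom sampler in data_loaders.py is more involved, this only shows
      the idea:
      ```python
      import torch
      from torch.utils.data import DataLoader, SubsetRandomSampler, TensorDataset

      dataset = TensorDataset(torch.randn(1000, 4))   # stand-in for CIFAR/ImageNet

      for epoch in range(3):
          # a fresh random subset of 100 samples every epoch
          idx = torch.randperm(len(dataset))[:100].tolist()
          loader = DataLoader(dataset, sampler=SubsetRandomSampler(idx),
                              batch_size=32)
          for (batch,) in loader:
              pass   # train / evaluate on this epoch's subset
      ```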
  35. Jan 31, 2019