  1. Oct 31, 2019
  2. Jul 22, 2019
      Fix non 1:1 mapping between model w. ModuleList and SummaryGraph (#328) · b614330c
      Guy Jacob authored
      The PyTorch trace mechanism doesn't "see" torch.nn.ModuleList modules
      (since they don't have a forward function). As a result, the mapping
      from module names at the Python model definition level to the
      scope-names at the trace level is not 1:1. This makes it impossible for
      us to map back from SummaryGraph ops to their respective nn.Modules,
      which is required for flows like BatchNorm folding and stats fusion in
      post-training quantization.
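      To make the mismatch concrete, here is a minimal runnable sketch (the TinyModel class, layer sizes and dummy input are made up for illustration): the Python-level module names include the ModuleList level, while the trace-level scope names, in the PyTorch versions this commit targets, only record the contained modules, so the two cannot be matched 1:1.

```python
import torch
import torch.nn as nn

class TinyModel(nn.Module):
    def __init__(self):
        super().__init__()
        # The layers live inside an nn.ModuleList, which has no forward()
        # of its own, so the tracer never opens a scope for "blocks".
        self.blocks = nn.ModuleList([nn.Linear(8, 8), nn.Linear(8, 8)])

    def forward(self, x):
        for block in self.blocks:
            x = block(x)
        return x

model = TinyModel()

# Python-level names include the ModuleList level: 'blocks', 'blocks.0', 'blocks.1'
print([name for name, _ in model.named_modules()])

# Trace-level scope names (in the PyTorch versions this commit targets) record
# only the contained Linear modules, so they cannot be mapped 1:1 back to the
# named_modules() entries above.
traced = torch.jit.trace(model, torch.randn(1, 8))
print({node.scopeName() for node in traced.graph.nodes()})
```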
      
      In #313 we handled this issue specifically in DistillerLSTM, but it
      makes much more sense to have a generic and automatic solution for this
      issue, which doesn't require the user to modify the model. This is such
      a solution.
          
      * Implemented DistillerModuleList, a replacement for nn.ModuleList
        which results in full and unique scope-names
      * See documentation for this class in summary_graph.py for extensive
        details on the issue and solution
      * When generating a SummaryGraph, the model is scanned and all instances
        of torch.nn.ModuleList are replaced with DistillerModuleList
      * Add tests for new functionality
      * Partially revert changes made to DistillerLSTM in commit 43548deb:
        Keep the refactored _create_cells_list function, but have it create
        a standard torch.nn.ModuleList (since we now handle the ModuleList
        issue automatically, there is no need to confuse users with ad-hoc
        list implementations)
      b614330c
  3. Jul 10, 2019
      Post-Train Quantization: BN folding and "net-aware quantization" (#313) · 43548deb
      Guy Jacob authored
      * "Net-aware quantization" - using the term coined in
        https://arxiv.org/abs/1811.09886 (section 3.2.2).
        Refers to considering sequences of modules when quantizing. This 
        isn't exactly layer fusion - we modify activation stats prior to
        setting quantization parameters, to make sure that when a module
        is followed by certain activation functions, only the relevant
        ranges are quantized. We do this for:
          * ReLU - Clip all negative values
          * Tanh / Sigmoid - Clip according to the (approximated) saturation
            values for these functions. We use [-4, 4] for tanh and [-6, 6]
            for sigmoid.
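      An illustrative sketch of this stats adjustment (the stats-dict layout and the clip_activation_stats helper are assumptions for illustration, not Distiller's actual code):

```python
SATURATION_RANGES = {
    'relu':    (0.0, None),   # clip all negative values
    'tanh':    (-4.0, 4.0),   # approximate saturation range of tanh
    'sigmoid': (-6.0, 6.0),   # approximate saturation range of sigmoid
}

def clip_activation_stats(stats, following_act):
    """Clip a layer's recorded output range when it is known to be followed by
    an activation that zeroes or saturates values outside that range."""
    lo, hi = SATURATION_RANGES.get(following_act, (None, None))
    if lo is not None:
        stats['min'] = max(stats['min'], lo)
    if hi is not None:
        stats['max'] = min(stats['max'], hi)
    return stats

# e.g. a layer whose raw recorded output range was [-7.3, 5.1], followed by ReLU:
print(clip_activation_stats({'min': -7.3, 'max': 5.1}, 'relu'))   # {'min': 0.0, 'max': 5.1}
```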
      
      * Perform batch-norm folding before post-training quantization.
        Batch-norm parameters are folded into the parameters of the previous
        layer and the BN layer is replaced with an identity module.
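      The folding arithmetic itself is standard; a sketch (not necessarily Distiller's exact implementation) for a Conv2d followed by an affine BatchNorm2d:

```python
import torch
import torch.nn as nn

def fold_bn_into_conv(conv: nn.Conv2d, bn: nn.BatchNorm2d) -> nn.Conv2d:
    """Fold BN statistics and affine parameters into the preceding conv:
    w' = w * gamma / sqrt(var + eps),  b' = (b - mean) * gamma / sqrt(var + eps) + beta."""
    w = conv.weight.detach()
    b = conv.bias.detach() if conv.bias is not None else torch.zeros(conv.out_channels)
    scale = bn.weight.detach() / torch.sqrt(bn.running_var + bn.eps)
    folded = nn.Conv2d(conv.in_channels, conv.out_channels, conv.kernel_size,
                       stride=conv.stride, padding=conv.padding,
                       dilation=conv.dilation, groups=conv.groups, bias=True)
    with torch.no_grad():
        folded.weight.copy_(w * scale.reshape(-1, 1, 1, 1))
        folded.bias.copy_((b - bn.running_mean) * scale + bn.bias.detach())
    return folded  # the BN module itself is then replaced with nn.Identity()
```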
      
      * Both BN folding and "net-aware" are now automatically executed
        in PostTrainLinearQuantizer (details of this change below)
      
      * BN folding enabled by new generic mechanism to "fuse" module
        sequences (at the Python API level)
          * First module in sequence is replaced/modified by a user-provided
            function; the rest of the modules are replaced with nn.Identity
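      A sketch of what such a fusion mechanism can look like (the fuse_sequence helper and the setattr-based replacement are illustrative, not the actual Distiller API):

```python
import torch.nn as nn

def fuse_sequence(model: nn.Module, module_names, fuse_fn):
    """Replace the first module in the named sequence with whatever fuse_fn
    returns, and replace every following module with nn.Identity()."""
    modules = dict(model.named_modules())
    replacement = fuse_fn(*[modules[name] for name in module_names])

    def set_module(name, new_module):
        parent_name, _, attr = name.rpartition('.')
        parent = modules[parent_name] if parent_name else model
        setattr(parent, attr, new_module)

    set_module(module_names[0], replacement)
    for name in module_names[1:]:
        set_module(name, nn.Identity())

# e.g. fold a conv+bn pair (using the folding sketch above; module names are hypothetical):
# fuse_sequence(model, ['features.conv1', 'features.bn1'],
#               lambda conv, bn: fold_bn_into_conv(conv, bn))
```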
      
      * Quantizer changes:
        * Optionally create adjacency map during prepare_model
        * Subclasses may enforce adjacency map creation
        * Refactoring: Replace _prepare_model_impl with pre and post
          override-able "callbacks", so core functionality is always executed
      
      * PostTrainLinearQuantizer Changes:
        * Enforce creation of adjacency map. This means users must now pass a
          dummy input to PostTrainLinearQuantizer.prepare_model
        * Before module replacement - Apply BN folding and stats updates according
          to net-aware quantization
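      In usage terms (a sketch; the import path and constructor call are assumptions beyond what the commit states, and `model` stands for an already-trained nn.Module):

```python
import torch
from distiller.quantization import PostTrainLinearQuantizer  # import path assumed

quantizer = PostTrainLinearQuantizer(model)   # constructor arguments omitted (assumed defaults)
dummy_input = torch.randn(1, 3, 224, 224)     # example shape for an ImageNet classifier
# A dummy input is now required so the quantizer can trace the model, build its
# adjacency map, and apply BN folding and net-aware stats updates before
# replacing modules.
quantizer.prepare_model(dummy_input)
```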
      
      * Updated the language model quantization tutorial to reflect the new
        functionality
      
      * Updated the image classification post-train quantization samples
        (command line and YAML)
      
      * Other changes:
        * Distiller LSTM implementation:
          Replace the ModuleList for cells with a plain list. The PyTorch trace
          mechanism doesn't "see" ModuleList objects, it only sees the 
          contained modules. This means that the "scopeName" of these modules
          isn't complete, which makes it impossible to match op names in 
          SummaryGraph to modules in the Python model.
        * ActivationStatsCollector: Ignore nn.Identity modules
      43548deb
  4. Apr 16, 2019
      LSTM: Modular implementation + Post-Train Quantization Sample (#196) · a3c8d86f
      Lev Zlotnik authored
      * Introduce a modular, Python-level implementation of LSTM/LSTMCell
        using existing PyTorch nn.Modules as building blocks (a minimal
        sketch of the idea appears after this list)
      * This allows quantization of weights and internal activations of
        LSTM layers using the existing Quantizer. 
        (In the PyTorch implementation of RNN/LSTM only the weights are 
        exposed at the Python level, whereas the internal activations are 
        "hidden" in C++ code.)
      * Supports stacked (multi-layer) and bi-directional LSTM
      * Implemented conversion functions from PyTorch LSTM module to
        our LSTM module and vice-versa
      * Tests for modular implementation correctness and for conversions
      * Jupyter notebook showing post-training quantization of a language
        model
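      A minimal sketch of the core idea from the first bullet above: the textbook LSTM cell built from nn.Linear layers and elementwise ops, so every internal activation is visible (and therefore quantizable) at the Python level. This is not necessarily DistillerLSTM's exact code.

```python
import torch
import torch.nn as nn

class ModularLSTMCell(nn.Module):
    def __init__(self, input_size, hidden_size):
        super().__init__()
        # One Linear for the input path and one for the recurrent path,
        # each producing all four gates at once.
        self.fc_gate_x = nn.Linear(input_size, 4 * hidden_size)
        self.fc_gate_h = nn.Linear(hidden_size, 4 * hidden_size)

    def forward(self, x, hidden):
        h_prev, c_prev = hidden
        gates = self.fc_gate_x(x) + self.fc_gate_h(h_prev)
        i, f, g, o = gates.chunk(4, dim=-1)
        i, f, o = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o)
        g = torch.tanh(g)
        c = f * c_prev + i * g
        h = o * torch.tanh(c)
        return h, c
```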
      a3c8d86f
  5. Apr 08, 2019
  6. Mar 12, 2019
  7. Mar 05, 2019
  8. Feb 26, 2019
  9. Aug 09, 2018
      Generalize the loss value returned from before_backward_pass callbacks (#38) · a43b9f10
      Guy Jacob authored
      * Instead of a single additive value (which so far represented only the
        regularizer loss), callbacks now return a new overall loss (see the
        sketch after this list)
      * Policy callbacks also return the individual loss components used to
        calculate the new overall loss.
      * Add a boolean flag to the Scheduler's callback so applications can choose
        whether they want to get individual loss components, or just the new overall
        loss
      * In compress_classifier.py, log the individual loss components
      * Add test for the loss-from-callback flow
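      An illustrative sketch of the new callback contract (the class, method and namedtuple names here are assumptions for illustration, not the exact Distiller API):

```python
from collections import namedtuple

PolicyLoss = namedtuple('PolicyLoss', ['overall_loss', 'loss_components'])
LossComponent = namedtuple('LossComponent', ['name', 'value'])

class RegularizationPolicy:
    """Hypothetical policy: adds a regularizer term and reports it as a component."""
    def __init__(self, regularizer_fn, strength=1e-4):
        self.regularizer_fn = regularizer_fn
        self.strength = strength

    def before_backward_pass(self, model, loss):
        reg_loss = self.strength * self.regularizer_fn(model)
        overall = loss + reg_loss
        return PolicyLoss(overall, [LossComponent('regularizer_loss', reg_loss)])

# The scheduler's own callback then exposes a boolean flag so the application
# receives either just the new overall loss or the individual components as well
# (compress_classifier.py uses the latter to log each component).
```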
      a43b9f10
  10. Jul 13, 2018
      ADC (Automatic Deep Compression) example + features, tests, bug fixes (#28) · 718f777b
      Neta Zmora authored
      This is a merge of the ADC branch and master.
      ADC (using a DDPG RL agent to compress image classifiers) is still WIP and
      requires an unreleased version of Coach (https://github.com/NervanaSystems/coach).
      
      Small features in this commit:
      - Added model_find_module() - find a module object given its name
      - Add channel ranking and pruning: pruning/ranked_structures_pruner.py
      - Add a CIFAR10 VGG16 model: models/cifar10/vgg_cifar.py
      - Thinning: change the level of some log messages - some of the messages were
        moved to 'debug' level because they are not usually interesting
      - Add a function to print nicely formatted integers - distiller/utils.py
      - Sensitivity analysis for channel removal
      - compress_classifier.py - handle keyboard interrupts
      - compress_classifier.py - fix re-raising of exceptions, so they maintain the call stack
      
      - Added tests:
        - test_summarygraph.py: test_simplenet() - added a regression test to target a bug
          that occurs when taking the predecessor of the first node in a graph
        - test_ranking.py - test_ch_ranking, test_ranked_channel_pruning
        - test_model_summary.py - test_png_generation, test_summary (sparsity/compute/model/modules)
      
      - Bug fixes in this commit:
        - Thinning bug fix: handle a zero-sized 'indices' tensor.
          During the thinning process, the 'indices' tensor can become zero-sized
          and will have an undefined length, so we need to check for this
          situation when assessing the number of elements in 'indices' (see the
          sketch below)
        - Language model: adjust main.py to the new distiller.model_summary API
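      A sketch of the kind of check involved in the thinning fix (illustrative, not Distiller's exact code): numel() counts elements safely for both empty and zero-dimensional tensors, whereas len() raises on a zero-dimensional tensor.

```python
import torch

def num_indices(indices: torch.Tensor) -> int:
    # numel() is well-defined for empty and zero-dimensional tensors alike,
    # while len() raises a TypeError on a zero-dimensional tensor.
    return indices.numel()

print(num_indices(torch.tensor([4, 7, 9])))              # 3
print(num_indices(torch.tensor([], dtype=torch.long)))   # 0
print(num_indices(torch.tensor(3)))                      # 1 (len() would raise here)
```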
      718f777b
  11. Jun 21, 2018
  12. Jun 14, 2018
  13. Jun 13, 2018
      Language model: replace the optimizer and LR-decay scheduler · a9b28923
      Neta Zmora authored
      Replace the original "homebrew" optimizer and LR-decay schedule with
      PyTorch's SGD and ReduceLROnPlateau.
      SGD with momentum=0 and weight_decay=0, and ReduceLROnPlateau with
      patience=0 and factor=0.5 will give the same behavior as in the
      original PyTorch example.
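      A runnable sketch of the replacement (the stand-in model, initial LR, and the fake validation losses are placeholders; only the SGD + ReduceLROnPlateau choice and the momentum=0, weight_decay=0, patience=0, factor=0.5 settings come from this commit):

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 1)   # stand-in for the language model
optimizer = torch.optim.SGD(model.parameters(), lr=1.0, momentum=0, weight_decay=0)
lr_scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode='min', patience=0, factor=0.5)

# Report the validation loss after each epoch; with patience=0 and factor=0.5
# the learning rate is halved as soon as the loss stops improving.
for val_loss in [1.0, 0.8, 0.9, 0.9]:          # placeholder validation losses
    lr_scheduler.step(val_loss)
    print(optimizer.param_groups[0]['lr'])     # prints 1.0, 1.0, 0.5, 0.25
```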
      
      Having a standard optimizer and LR-decay schedule gives us the
      flexibility to experiment with these during the training process.
      a9b28923
  14. Jun 07, 2018