  1. Aug 04, 2019
  2. Jul 31, 2019
  3. Jul 29, 2019
    • Add missing import (from commit 69b1452a) · 84d63200
      Guy Jacob authored
      84d63200
    • DistillerModuleList conversion: Handle models w. duplicate modules (#338) · db531db8
      Guy Jacob authored
      * By duplicate modules we mean:
        self.relu1 = nn.ReLU()
        self.relu2 = self.relu1
      * The issue:
        The second module ('relu2') will not be returned by
        torch.nn.Module.named_modules/children()
      * When converting to DistillerModuleList, in order to maintain the
        original order of modules and in order to have a correct mapping
        of names before/after the conversion - we need to take the duplicates
        into account
      * Implemented an internal version of named_modules/children that includes
        duplicates
      * Added test case for this + refactored the module list conversion tests
      db531db8
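The duplicate-module issue described above can be illustrated without PyTorch. This is a minimal sketch: `Module`, `ReLU`, `named_children_unique` and `named_children_with_duplicates` are hypothetical stand-ins written for this example, not Distiller or PyTorch code. The first traversal skips objects it has already seen, which is the behaviour the commit attributes to `named_modules/children()`. The second keeps every registered name.

```python
class Module:
    """Tiny stand-in for a module container: Module-valued attributes become children."""

    def __init__(self):
        object.__setattr__(self, "_children", {})

    def __setattr__(self, name, value):
        if isinstance(value, Module):
            self._children[name] = value
        object.__setattr__(self, name, value)


def named_children_unique(module):
    """Yield each child object once, skipping names that alias an object already seen."""
    seen = set()
    for name, child in module._children.items():
        if id(child) not in seen:
            seen.add(id(child))
            yield name, child


def named_children_with_duplicates(module):
    """Yield every registered name, even when two names share one object."""
    yield from module._children.items()


class ReLU(Module):
    pass


class Net(Module):
    def __init__(self):
        super().__init__()
        self.relu1 = ReLU()
        self.relu2 = self.relu1  # same object registered under a second name


net = Net()
unique_names = [name for name, _ in named_children_unique(net)]
all_names = [name for name, _ in named_children_with_duplicates(net)]
print(unique_names)  # ['relu1']
print(all_names)     # ['relu1', 'relu2']
```

Only the second traversal preserves the original order and the full set of names, which is what a name mapping before and after conversion would need.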
    • Post-train quant: Special handling for bidirectional DistillerLSTM (#337) · 69b1452a
      Guy Jacob authored
      * For some reason, SummaryGraph generation is broken for DistillerLSTM
        modules with 'bidirectional' enabled. The ONNX graph optimization stage
        causes all the nodes from the bidirectional module to vanish from the
        graph (they're in the graph after the initial trace)
      * As a temporary workaround to enable stats fusion in post-train quant,
        if a bidirectional DistillerLSTM is detected, we just do a simple
        "hard-coded" fusion of the element-wise add op with the subsequent
        non-linearities and skip the automatic flow with SummaryGraph.
      69b1452a
  4. Jul 23, 2019
  5. Jul 22, 2019
    • Fix non 1:1 mapping between model w. ModuleList and SummaryGraph (#328) · b614330c
      Guy Jacob authored
      The PyTorch trace mechanism doesn't "see" torch.nn.ModuleList modules
      (since they don't have a forward function). As a result, the mapping
      from module names at the Python model definition level to the
      scope-names at the trace level is not 1:1. This makes it impossible for
      us to map back from SummaryGraph ops to their respective nn.Modules,
      which is required for flows like BatchNorm folding and stats fusion in
      post-training quantization.
      
      In #313 we handled this issue specifically in DistillerLSTM, but it
      makes much more sense to have a generic and automatic solution for this
      issue, which doesn't require the user to modify the model. This is such
      a solution.
          
      * Implemented DistillerModuleList, a replacement for nn.ModuleList
        which results in full and unique scope-names
      * See documentation for this class in summary_graph.py for extensive
        details on the issue and solution
      * When generating a SummaryGraph, the model is scanned and all instances
        of torch.nn.ModuleList are replaced with DistillerModuleList
      * Add tests for new functionality
      * Partially revert changes made to DistillerLSTM in commit 43548deb:
        Keep the refactored _create_cells_list function, but have it create
        a standard torch.nn.ModuleList (since we now handle the ModuleList
        issue automatically, there is no need to confuse users with ad-hoc
        list implementations)
      b614330c
  6. Jul 21, 2019
  7. Jul 17, 2019
    • Fix for issue #323 · 5c2e9f54
      Guy Jacob authored
      In execution_env.log_execution_env_state, use os.cpu_count
      if os.sched_getaffinity is not available
      5c2e9f54
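The fallback described in this commit can be sketched with the standard library alone. The helper name `usable_cpu_count` is hypothetical, and how Distiller detects availability is an assumption here. This version uses `hasattr`, since `os.sched_getaffinity` exists only on some platforms.

```python
import os


def usable_cpu_count():
    """Return the CPU count for this process, with a portable fallback."""
    if hasattr(os, "sched_getaffinity"):
        # Where available, this reflects the CPU affinity mask of the current process
        return len(os.sched_getaffinity(0))
    # Elsewhere, fall back to the system-wide count (may be None if undeterminable)
    return os.cpu_count()


num_cpus = usable_cpu_count()
print(num_cpus)
```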
  8. Jul 16, 2019
  9. Jul 10, 2019
    • Update post-train quant command line example · 112163eb
      Guy Jacob authored
      112163eb
    • Post-Train Quantization: BN folding and "net-aware quantization" (#313) · 43548deb
      Guy Jacob authored
      * "Net-aware quantization" - using the term coined in
        https://arxiv.org/abs/1811.09886 (section 3.2.2).
        Refers to considering sequences of modules when quantizing. This 
        isn't exactly layer fusion - we modify activation stats prior to
        setting quantization parameters, to make sure that when a module
        is followed by certain activation functions, only the relevant
        ranges are quantized. We do this for:
          * ReLU - Clip all negative values
          * Tanh / Sigmoid - Clip according to the (approximated) saturation
            values for these functions. We use [-4, 4] for tanh and [-6, 6]
            for sigmoid.
      
      * Perform batch-norm folding before post-training quantization.
        Batch-norm parameters are folded into the parameters of the previous
        layer and the BN layer is replaced with an identity module.
      
      * Both BN folding and "net-aware" are now automatically executed
        in PostTrainLinearQuantizer (details of this change below)
      
      * BN folding enabled by new generic mechanism to "fuse" module
        sequences (at the Python API level)
          * First module in sequence is replaced/modified by a user-provided
            function, rest of modules replaced with nn.Identity
      
      * Quantizer changes:
        * Optionally create adjacency map during prepare_model
        * Subclasses may enforce adjacency map creation
        * Refactoring: Replace _prepare_model_impl with pre and post
          override-able "callbacks", so core functionality is always executed
      
      * PostTrainLinearQuantizer Changes:
        * Enforce creation of adjacency map. This means users must now pass a
          dummy input to PostTrainLinearQuantizer.prepare_model
        * Before module replacement - Apply BN folding and stats updates according
          to net-aware quantization
      
      * Updated the language model quantization tutorial to reflect the new
        functionality
      
      * Updated the image classification post-train quantization samples
        (command line and YAML)
      
      * Other changes:
        * Distiller LSTM implementation:
          Replace the ModuleList for cells with a plain list. The PyTorch trace
          mechanism doesn't "see" ModuleList objects, it only sees the 
          contained modules. This means that the "scopeName" of these modules
          isn't complete, which makes it impossible to match op names in 
          SummaryGraph to modules in the Python model.
        * ActivationStatsCollector: Ignore nn.Identity modules
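The "net-aware" stats update described in this commit message can be sketched as a small pure-Python function. This is only an illustration: the function name `clip_stats_for_activation` and the string keys are hypothetical, and the real implementation works on collected activation statistics inside the quantizer. The clipping ranges are the ones stated above: negative values clipped for ReLU, [-4, 4] for tanh and [-6, 6] for sigmoid.

```python
# Saturation ranges taken from the commit message; the dict layout is an assumption
SATURATION_RANGES = {
    "relu": (0.0, float("inf")),
    "tanh": (-4.0, 4.0),
    "sigmoid": (-6.0, 6.0),
}


def clip_stats_for_activation(stat_min, stat_max, activation):
    """Clip a module's output range to what the following activation can distinguish."""
    if activation not in SATURATION_RANGES:
        return stat_min, stat_max  # no known follower: leave the stats untouched
    low, high = SATURATION_RANGES[activation]
    new_min = max(stat_min, low)
    # Keep max >= min even when the whole observed range lies outside [low, high]
    new_max = max(min(stat_max, high), new_min)
    return new_min, new_max


print(clip_stats_for_activation(-7.5, 9.0, "relu"))     # (0.0, 9.0)
print(clip_stats_for_activation(-7.5, 9.0, "tanh"))     # (-4.0, 4.0)
print(clip_stats_for_activation(-7.5, 9.0, "sigmoid"))  # (-6.0, 6.0)
```

Quantization parameters derived from the clipped range then spend their levels only on values the activation does not flatten.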
      43548deb
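The batch-norm folding mentioned in this commit relies on standard arithmetic: in inference mode, BN applies a per-channel scale and shift, which can be absorbed into the weights and bias of the preceding layer. The sketch below shows that arithmetic for a small fully-connected layer using plain Python lists. The helper names (`fold_batchnorm`, `linear`, `batchnorm_eval`) and the example numbers are made up for illustration and are not Distiller's implementation.

```python
import math


def linear(x, weight, bias):
    """y[j] = sum_i weight[j][i] * x[i] + bias[j]"""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weight, bias)]


def batchnorm_eval(y, mean, var, gamma, beta, eps=1e-5):
    """Inference-mode batch norm using fixed (running) statistics."""
    return [(yi - m) / math.sqrt(v + eps) * g + bt
            for yi, m, v, g, bt in zip(y, mean, var, gamma, beta)]


def fold_batchnorm(weight, bias, mean, var, gamma, beta, eps=1e-5):
    """Return weights and bias of a single layer equivalent to linear followed by BN."""
    folded_w, folded_b = [], []
    for row, b, m, v, g, bt in zip(weight, bias, mean, var, gamma, beta):
        scale = g / math.sqrt(v + eps)  # per-output-channel scale
        folded_w.append([w * scale for w in row])
        folded_b.append((b - m) * scale + bt)
    return folded_w, folded_b


x = [1.0, -2.0]
weight, bias = [[0.5, -1.0], [2.0, 0.25]], [0.1, -0.3]
mean, var, gamma, beta = [0.2, 1.0], [4.0, 0.25], [1.5, 0.8], [0.0, 0.5]

reference = batchnorm_eval(linear(x, weight, bias), mean, var, gamma, beta)
folded_w, folded_b = fold_batchnorm(weight, bias, mean, var, gamma, beta)
folded = linear(x, folded_w, folded_b)
print(all(math.isclose(a, b, abs_tol=1e-9) for a, b in zip(reference, folded)))
```

Once the parameters are folded, the BN layer contributes nothing further, which is why it can be replaced with an identity module.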
  10. Jul 08, 2019
  11. Jul 04, 2019
    • Bypass Torchvision MobileNet v2 ONNX-export bug · c0e45da2
      Neta Zmora authored
      A temporary monkey-patch to get past this Torchvision bug:
      https://github.com/pytorch/pytorch/issues/20516
      
      To trigger, try exporting mobilenet v2 to ONNX:
      time python3 compress_classifier.py --arch=mobilenet_v2 --pretrained ${IMAGENET_PATH} --export-onnx
      c0e45da2
    • Switch to PyTorch 1.1.0 (#306) · 032b1f74
      Guy Jacob authored
      * PyTorch 1.1.0 now required
        - Moved other dependencies to up-to-date versions as well
      * Adapt LR scheduler to PyTorch 1.1 API changes:
        - Change lr_scheduler.step() calls to come after validate calls
          during training
        - Pass to lr_scheduler.step() caller both loss and top1
          (Resolves issue #240)
      * Adapt thinning for PyTorch 1.1 semantic changes
        - **KNOWN ISSUE**: When a thinning recipe is applied, in certain
          cases PyTorch displays this warning:
          "UserWarning: non-inplace resize is deprecated".
          To be fixed later
      * SummaryGraph: Workaround for new scope name issue from PyTorch 1.1.0
      * Adapt to updated PyTest version:
        - Stop using deprecated 'message' parameter of pytest.raises(),
          use pytest.fail() instead
        - Make sure only a single test case per pytest.raises context
      * Move PyTorch version check to root __init__.py 
        - This means the version is checked when Distiller is first
          imported. A RuntimeError is raised if the version is wrong.
      * Updates to parameter_histograms notebook:
        - Replace deprecated normed argument with density
        - Add sparsity rate to plot title
        - Load model in CPU
      032b1f74
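An import-time version check like the one described in this commit might look roughly as follows. This is a sketch under assumptions: the function name `check_framework_version` is hypothetical, and the commit does not say whether the check demands an exact match or a minimum version. Exact match on the base version is assumed here.

```python
def check_framework_version(installed, required="1.1.0"):
    """Raise RuntimeError unless the installed base version equals the required one."""
    # Ignore any local build suffix after '+' when comparing (assumption)
    base = installed.split("+")[0]
    if base != required:
        raise RuntimeError(
            "Wrong framework version: found {}, required {}".format(installed, required)
        )
    return base


print(check_framework_version("1.1.0"))  # 1.1.0

try:
    check_framework_version("1.0.1")
except RuntimeError as err:
    print(err)  # Wrong framework version: found 1.0.1, required 1.1.0
```

Calling such a function from a package's root `__init__.py` makes the failure surface at first import rather than deep inside a training run.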
    • Model thinning bug fix · 947143a1
      Neta Zmora authored
      This bug is triggered (for example) when you execute this example code:
      python3 compress_classifier.py --arch resnet20_cifar  ../../../data.cifar10 -p=50 --lr=0.3 --epochs=180 --compress=../agp-pruning/resnet20_filters.schedule_agp_4.yaml --resume-from=../ssl/checkpoints/checkpoint_trained_dense.pth.tar --reset-optimizer --vs=0
      
      The root-cause is a (non-functional) code refactoring made in commit
      992291cf.
      
      The problem is in an `if` statement handling the Optimizer reshaping.
      The bad commit combined two `if` statements into one, and moved this
      combined statement to a place where the control flows through it when
      it shouldn't.
      The fix reverts back to the two original (and separate) `if` statements.
    • Guy Jacob's avatar
      a0436c26
  12. Jul 03, 2019
  13. Jul 02, 2019
  14. Jul 01, 2019
  15. Jun 25, 2019
    • Checkpoint loading: allow loading non-strict state-keys (#300) · 5c83a044
      Neta Zmora authored
      * Checkpoint loading: allow loading non-strict state-keys
      
      Change the default behavior of load_state_dict() so that the
      keys in the loaded checkpoint do not need to match exactly the
      keys in Distiller's model.
      
      However, we place one restriction on non-strict checkpoint loading:
      even when loading checkpoints non-strict, we raise an exception if some keys
      are missing (extra keys are accepted).
      This is because the time-wasting potential of loading (and using) a model
      which only contains part of the state-keys (while the user expects it to
      contain all of a model's state-keys) is too large.
      We want the user to be completely aware that not all of the state-keys are
      initialized from the loaded checkpoint.
      5c83a044
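The key-matching policy described in this commit (extra keys tolerated, missing keys rejected) can be sketched with plain sets. The function name `validate_checkpoint_keys` and the choice of `ValueError` are assumptions for illustration. The commit only says that "an exception" is raised.

```python
def validate_checkpoint_keys(model_keys, checkpoint_keys):
    """Accept extra checkpoint keys, but fail if the model expects keys that are absent."""
    model_keys, checkpoint_keys = set(model_keys), set(checkpoint_keys)
    missing = sorted(model_keys - checkpoint_keys)
    extra = sorted(checkpoint_keys - model_keys)
    if missing:
        # Part of the model would stay uninitialized, so refuse to continue
        raise ValueError("Checkpoint is missing state keys: {}".format(missing))
    return extra  # reported back so the caller can log what was ignored


model_keys = ["conv.weight", "conv.bias", "fc.weight"]

# Extra key in the checkpoint: accepted
ignored = validate_checkpoint_keys(model_keys, model_keys + ["fc.mask"])
print(ignored)  # ['fc.mask']

# Missing key in the checkpoint: rejected
try:
    validate_checkpoint_keys(model_keys, ["conv.weight", "conv.bias"])
except ValueError as err:
    print(err)  # Checkpoint is missing state keys: ['fc.weight']
```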
    • Fix citation links in README.md · 6cd22e7d
      Guy Jacob authored
      6cd22e7d
    • Added citations to the README · 135919ca
      Neta Zmora authored
      Added new paper and community citations and changed the section name to *Community*.
      Added Bar Elharar and Lev Zlotnik to the authors list.
      135919ca
  16. Jun 24, 2019
  17. Jun 23, 2019
  18. Jun 22, 2019
  19. Jun 19, 2019
    • Greedy filter pruning: add mobilenet_v1 greedy pruning · df4d39c9
      Neta Zmora authored
      This is discussed in issue #282, although there @Bowenwu1 was
      interested in mobilenet for CIFAR, not ImageNet.
      
      Note that the implementation of the Greedy filter pruning algorithm is
      not generic (but it is easily extensible) and supports only a subset
      of the models.
      
      An example invocation:
      time python3 compress_classifier.py --arch=mobilenet PATH-TO-IMAGENET_DS  --resume=mobilenet_sgd_68.848.pth.tar --greedy --greedy-target-density=0.5 --vs=0 -p=50 --lr=0.1 --gpu=0 --greedy-pruning-step=0.15 --effective-train-size=0.01
      df4d39c9
    • Add the value of PYTHONPATH to the log · 33419dcf
      Neta Zmora authored
      When we use alternative implementations of the libraries
      in our requirements.txt, we need to set PYTHONPATH to
      point to these libraries.  Because this affects our ability
      to reproduce experiments later, the least we can do is log
      where PYTHONPATH is pointing to.
      33419dcf
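Logging the value of PYTHONPATH needs only the standard library. The sketch below is one way to do it. The logger name and the `log_pythonpath` helper are hypothetical, and Distiller's actual log format may differ.

```python
import logging
import os

logger = logging.getLogger("execution_env")


def log_pythonpath(log=logger):
    """Record where PYTHONPATH points so an experiment can be reproduced later."""
    value = os.environ.get("PYTHONPATH", "")
    log.info("PYTHONPATH: %s", value if value else "<not set>")
    return value


logging.basicConfig(level=logging.INFO)
log_pythonpath()
```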
    • Model drawing: allow rendering models w/o specifying the dataset (#294) · 093a2e17
      Lev Zlotnik authored
      This commit allows us to draw diagrams of models, even if we don't support the specific dataset used by the model.  All you need is to specify the dimensions of the model inputs.
      * Added `input_shape` argument to `draw_img_classifier_to_file`
      * Updated docstring
      093a2e17
  20. Jun 18, 2019
    • Input shape attribute in models (#292) · 2cab7741
      Guy Jacob authored
      Setting this attribute will become a requirement for some upcoming features
      
      * Add utility function to set input_shape attribute in models
      * Set this attribute in our classification models factory
        (models.create_model())
      * Add shape parameter in get_dummy_input() (in addition to dataset)
      2cab7741
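Accepting either a dataset name or an explicit shape, as this commit and the model-drawing commit above describe, could be resolved along these lines. This is a hypothetical sketch: `resolve_input_shape`, the `KNOWN_INPUT_SHAPES` table and the rule that an explicit shape takes precedence are assumptions, not the actual `get_dummy_input()` logic. The shapes listed are the conventional (batch, channels, height, width) sizes for these datasets.

```python
# Conventional input sizes; which datasets Distiller knows about is an assumption
KNOWN_INPUT_SHAPES = {
    "imagenet": (1, 3, 224, 224),
    "cifar10": (1, 3, 32, 32),
    "mnist": (1, 1, 28, 28),
}


def resolve_input_shape(dataset=None, input_shape=None):
    """Pick the shape of a dummy input from an explicit shape or a dataset name."""
    if input_shape is not None:
        return tuple(input_shape)  # explicit shape wins (assumed precedence)
    if dataset in KNOWN_INPUT_SHAPES:
        return KNOWN_INPUT_SHAPES[dataset]
    raise ValueError("Specify input_shape or one of: {}".format(sorted(KNOWN_INPUT_SHAPES)))


print(resolve_input_shape(dataset="cifar10"))             # (1, 3, 32, 32)
print(resolve_input_shape(input_shape=[1, 3, 128, 128]))  # (1, 3, 128, 128)
```

With an explicit shape, a model trained on an unsupported dataset can still be traced or drawn, since only the input dimensions are needed.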
    • update requirements.txt · 2696542c
      Neta Zmora authored
      - Replace `sklearn` with `scikit-learn`.
      - Freeze the `gym` version
      2696542c