  1. Jun 30, 2018
    • Neta Zmora's avatar
      Bug fix: add support for thinning the optimizer · b21f449b
      Neta Zmora authored
      You no longer need to use --momentum=0 when removing structures
      dynamically.
      The SGD momentum update (velocity) depends on the weights, which
      PyTorch optimizers cache internally.  This caching is not a problem
      for filter/channel removal (thinning) because although we dynamically
      change the shapes of the weight tensors, we don't replace the weight
      tensor objects themselves.
      PyTorch's SGD creates tensors to store the momentum updates, and these
      tensors have the same shape as the weight tensors.  When we change the
      shapes of the weight tensors, we need to make the appropriate changes
      in the Optimizer, or disable the momentum.
      We added a new function - thinning.optimizer_thinning() - to do this.
      This function is brittle: it is tested only on optim.SGD, and it relies
      on the internal representation of the SGD optimizer, which can change
      without notice.  For example, optim.Adam uses state['exp_avg'] and
      state['exp_avg_sq'], which also depend on the shape of the weight
      tensors.
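      A rough sketch of the idea (a hypothetical helper, not the actual
      thinning.optimizer_thinning() implementation): after a weight tensor
      has been narrowed along some dimension, narrow the optimizer's cached
      buffers the same way so the shapes stay in sync:

          import torch

          def thin_optimizer_state(optimizer, param, keep_indices, dim):
              # param.data has already been thinned to keep_indices along
              # dim; slice the optimizer's cached buffers so their shapes
              # match the weight tensor again.
              state = optimizer.state.get(param, {})
              for key in ('momentum_buffer',          # optim.SGD
                          'exp_avg', 'exp_avg_sq'):   # optim.Adam
                  buf = state.get(key)
                  if buf is not None and buf.shape != param.data.shape:
                      state[key] = torch.index_select(buf, dim, keep_indices)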
      We needed to pass the Optimizer instance to Thinning policies
      (ChannelRemover, FilterRemover) via the callbacks, which required us
      to change the callback interface.
      In the future we plan a bigger change to the callback API, to allow
      passing of arbitrary context from the training environment to Distiller.
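      Schematically, the interface change looks something like this (the
      class name and signature here are hypothetical, not Distiller's exact
      callback API):

          class ScheduledTrainingPolicy:
              def on_epoch_end(self, model, zeros_mask_dict, optimizer=None):
                  # FilterRemover / ChannelRemover can now reach the
                  # optimizer and thin its cached state together with
                  # the weights, instead of requiring momentum=0.
                  pass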
      
      Also in this commit:
      * compress_classifier.py had special handling for resnet layer-removal,
      which is used in examples/ssl/ssl_4D-removal_training.yaml.
      This is a brittle and ugly hack.  Until we have a more elegant solution,
      I'm removing support for layer-removal.
      * Added to the tests an invocation of forward and backward passes over
      a model.  This tests more of the real flows, which use the optimizer
      and construct gradient tensors (a minimal example follows this list).
      * Added a test of a special case of convolution filter-pruning which
      occurs when the next layer is fully-connected (linear); the index
      arithmetic involved is sketched below.
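      A minimal version of such a forward/backward test might look like
      this (names are illustrative):

          import torch

          def test_forward_backward(model, optimizer,
                                    input_shape=(1, 3, 32, 32)):
              # Run a real forward/backward pass so that gradient tensors
              # are created and the optimizer's cached state is exercised.
              dummy_input = torch.randn(input_shape)
              output = model(dummy_input)
              loss = output.sum()
              optimizer.zero_grad()
              loss.backward()
              optimizer.step()

          # e.g.:
          #   model = torch.nn.Conv2d(3, 8, 3)
          #   test_forward_backward(model, torch.optim.SGD(
          #       model.parameters(), lr=0.1, momentum=0.9))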
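      The special case exists because, after flattening, each conv filter
      maps to a whole block of columns in the following Linear layer's
      weight - one column per spatial position of that filter's feature
      map.  A sketch of the index arithmetic, assuming PyTorch's
      channel-major flattening via view(batch, -1):

          import torch

          def linear_cols_to_keep(keep_filters, fmap_h, fmap_w):
              # Removing conv filter f removes Linear input columns
              # [f*H*W, (f+1)*H*W); keep the blocks of surviving filters.
              block = fmap_h * fmap_w
              cols = [f * block + i
                      for f in keep_filters for i in range(block)]
              return torch.tensor(cols, dtype=torch.long)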
  2. Jun 19, 2018
    • Guy Jacob's avatar
      Make PNG summary compatible with latest SummaryGraph class changes (#7) · 9e57219e
      Guy Jacob authored
      * Modify 'create_png' to use the correct data structures (dicts instead
        of lists, etc.)
      * Handle the case where an op was not called from a module.  This affects:
        * ONNX->"user-friendly" name conversion, which must account for ops
          that have no module name
        * Detection of an existing op with the same name
        In both cases, use the ONNX op type in addition to the op name
      * Return an "empty" shape instead of None when ONNX couldn't infer
        a parameter's shape (see the sketch after this list)
      * Expose the option of a PNG summary with parameters to the user
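      For the shape-inference point, the pattern is simply to normalize a
      missing shape to an empty tuple (an illustrative sketch, not the
      actual SummaryGraph code):

          def normalized_shape(shape):
              # Downstream rendering code can treat every parameter
              # uniformly if a missing shape is () rather than None.
              return tuple(shape) if shape is not None else ()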
  3. May 17, 2018
    • Neta Zmora's avatar
      Fix system tests failure · a7ed8cad
      Neta Zmora authored
      The latest changes to the logger caused the CI tests to fail,
      because the tests assume that the logging.conf file is present in the
      same directory as the sample application script.
      The sample application was looking in cwd() instead, and so did not
      find the log configuration file.
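      The fix follows the usual pattern of resolving the file relative to
      the script itself rather than the working directory, roughly:

          import os

          # Look for logging.conf next to this script, not in cwd().
          log_cfg_file = os.path.join(
              os.path.dirname(os.path.abspath(__file__)), 'logging.conf')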
  4. May 16, 2018
    • Neta Zmora's avatar
      refactoring: move config_pylogger out of the sample app · 792e9e39
      Neta Zmora authored
      Soon we will be reusing this function in other sample apps, so let's
      move it to app_utils.
    • Neta Zmora's avatar
      Check if correct version of PyTorch is installed. · ba653d9a
      Neta Zmora authored
      The 'master' branch now uses PyTorch 0.4, which has API changes that
      are not backward compatible with PyTorch 0.3.
      
      After upgrading Distiller's internal implementation to be
      compatible with PyTorch 0.4, we added a check that you are using
      the correct PyTorch version.
      
      Note that we only perform this check in the sample image classifier
      compression application.
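      Such a check can be as simple as the following (illustrative, not
      the exact code):

          import sys
          import torch

          # Distiller's master branch targets the PyTorch 0.4 API.
          version = tuple(int(x) for x in torch.__version__.split('.')[:2])
          if version < (0, 4):
              print("Error: Distiller requires PyTorch 0.4.0 or later "
                    "(found {})".format(torch.__version__))
              sys.exit(1)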
    • Neta Zmora's avatar
      refactoring: move the message logger setup out of main() · 6e8b0fd6
      Neta Zmora authored
      Eventually we will want to use this code in other sample applications,
      so let's move the logger configuration code to a separate function.
      
      There's a bit of ugly hacking in this current implementation because
      I've added variable members to logging.logger.  These are actually
      config-once variables that convey the logging directory and filename.
      I did not want to add more names to the global namespace, so I hacked
      a temporary solution in which logging.logger acts as a conveyor and
      private namespace.  We'll get that cleaned up as we do more refactoring.
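      In other words (a simplified sketch of the hack, with hypothetical
      function and attribute names):

          import logging
          import os

          def setup_logger(logdir):
              msglogger = logging.getLogger()
              # Config-once values ride along on the logger object itself,
              # acting as a private namespace instead of new globals.
              msglogger.logdir = logdir
              msglogger.log_filename = os.path.join(logdir, 'app.log')
              return msglogger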
    • Neta Zmora's avatar
      New summary option: print modules names · 6a940466
      Neta Zmora authored
      This is a niche feature which lets you print, from the command line,
      the names of the modules in a model.
      Non-leaf modules are excluded from this list.  Other caveats are
      documented in the code.
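      Conceptually, the summary walks the model and keeps only leaf
      modules, along these lines (a sketch, not the actual code):

          def leaf_module_names(model):
              # named_modules() yields every node; keep only modules with
              # no children (the root's empty name is skipped as well).
              return [name for name, m in model.named_modules()
                      if name and not list(m.children())]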
    • Neta Zmora's avatar
      PNG summary: default to non-parallel graphs · 53b74ca6
      Neta Zmora authored
      Data parallel models may execute faster on multiple GPUs, but rendering
      them creates visually complex and illegible graphs.
      Therefore, when creating models for a PNG summary, we opt to use
      non-parallel models.
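      The gist of it (a sketch, not the actual Distiller helper):

          import torch.nn as nn

          def non_parallel(model):
              # Draw the wrapped module, not the DataParallel wrapper,
              # to keep the rendered graph legible.
              return (model.module
                      if isinstance(model, nn.DataParallel) else model)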