    Load optimizer from checkpoint (BREAKING - see details) (#182)
    Bar authored
    
    * Fixes issues #70, #145 and replaces PR #74
    * checkpoint.py
      * save_checkpoint will now save the optimizer type in addition to
        its state
      * load_checkpoint will now instantiate an optimizer based on the
        saved type and load its state (see the sketch after this list)
    * config.py: file/dict_config now accept the resumed epoch to pass to
      LR schedulers
    * policy.py: LRPolicy now passes the current epoch to the LR scheduler
      (see the LR scheduling sketch below)
    * Classifier compression sample
      * New flag '--resume-from' for properly resuming a saved training
        session, including optimizer state and epoch number (usage
        sketched after this list)
      * Flag '--reset-optimizer' added to allow discarding a loaded
        optimizer.
      * BREAKING:
        * The previous flag '--resume' is deprecated and is mapped to
          '--resume-from' + '--reset-optimizer'.
        * However, the old resume behavior was inconsistent: the epoch
          count continued from the saved epoch, but the LR scheduler was
          set up as if training were starting from epoch 0.
        * Using '--resume-from' + '--reset-optimizer' now simply RESETS
          the epoch count to 0 for the whole environment.
        * This means that scheduling configurations (in YAML or code)
          which assumed use of '--resume' might need to be changed to
          reflect the fact that the epoch count now starts from 0.
        * All relevant YAML files under 'examples' modified to reflect
          this change
    * Initial support for ReduceLROnPlateau (#161):
      * Allow passing **kwargs to policies via the scheduler
      * Image classification now passes the validation loss to the
        scheduler, to be used by ReduceLROnPlateau (see the sketch after
        this list)
      * The current implementation is experimental and subject to change
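
The checkpoint.py items above amount to storing the optimizer class alongside its state_dict and reconstructing it on load. A minimal sketch, assuming plain PyTorch; the function names echo the commit message, but the bodies are illustrative, not the project's actual implementation:

    import torch

    def save_checkpoint(epoch, model, optimizer, path='checkpoint.pth.tar'):
        # Save the optimizer *type* alongside its state so that loading
        # can reconstruct the same optimizer class later.
        torch.save({'epoch': epoch,
                    'state_dict': model.state_dict(),
                    'optimizer_type': type(optimizer),
                    'optimizer_state_dict': optimizer.state_dict()}, path)

    def load_checkpoint(model, path='checkpoint.pth.tar'):
        chkpt = torch.load(path)
        model.load_state_dict(chkpt['state_dict'])
        # Re-instantiate an optimizer of the saved type, then restore its
        # state (param groups, learning rate, momentum buffers, ...).
        # The lr given here is a placeholder; load_state_dict overwrites
        # the param groups with the saved values.
        optimizer = chkpt['optimizer_type'](model.parameters(), lr=0.1)
        optimizer.load_state_dict(chkpt['optimizer_state_dict'])
        return model, optimizer, chkpt['epoch']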
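
The config.py and policy.py items serve one goal: the LR scheduler should be stepped with the resumed epoch rather than restarted from 0. A rough illustration using the older PyTorch scheduler API (step() accepting an epoch argument); the LRPolicy class here is a stand-in, not the real one:

    import torch

    class LRPolicy:
        # Stand-in for a policy object wrapping a torch LR scheduler.
        def __init__(self, lr_scheduler):
            self.lr_scheduler = lr_scheduler

        def on_epoch_begin(self, epoch, **kwargs):
            # Pass the current epoch so a resumed run continues the
            # schedule where it left off instead of restarting at epoch 0.
            self.lr_scheduler.step(epoch)

    model = torch.nn.Linear(10, 10)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.1)
    policy = LRPolicy(scheduler)

    resumed_epoch = 42   # e.g. the epoch restored from the checkpoint
    policy.on_epoch_begin(resumed_epoch)
    print(optimizer.param_groups[0]['lr'])  # 0.01: the schedule picks up past epoch 30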
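
One way the deprecated '--resume' flag could be mapped onto the new pair, as the BREAKING notes describe. The flag names come from the commit message; the argparse wiring and destination names below are hypothetical, not the sample script's actual code:

    import argparse

    parser = argparse.ArgumentParser()
    parser.add_argument('--resume-from', dest='resume_from', metavar='PATH',
                        help='resume a training session, restoring optimizer state and epoch')
    parser.add_argument('--reset-optimizer', action='store_true',
                        help='discard the loaded optimizer and restart the epoch count at 0')
    parser.add_argument('--resume', dest='deprecated_resume', metavar='PATH',
                        help='DEPRECATED: behaves like --resume-from plus --reset-optimizer')
    args = parser.parse_args()

    if args.deprecated_resume:
        # Backwards-compatible mapping: the old flag now implies resetting
        # the optimizer, which also resets the epoch count to 0.
        args.resume_from = args.deprecated_resume
        args.reset_optimizer = True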
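
For the ReduceLROnPlateau item, the practical effect is that the training loop forwards the validation loss to the scheduler's step() (the '**kwargs' route mentioned above). A self-contained sketch with a stand-in loss value:

    import torch

    model = torch.nn.Linear(10, 1)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode='min',
                                                           factor=0.1, patience=5)

    for epoch in range(20):
        val_loss = 1.0                 # stand-in for a real validation loss
        # ReduceLROnPlateau monitors a metric rather than the epoch number,
        # so the validation loss is passed to step(); after `patience`
        # epochs with no improvement the LR is multiplied by `factor`.
        scheduler.step(val_loss)

    print(optimizer.param_groups[0]['lr'])  # reduced below 0.1 once the plateau is detected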
resnet20_cifar10_checkpoint.pth.tar 2.12 MiB