Load optimizer from checkpoint (BREAKING - see details) (#182)
* Fixes issues #70 and #145, and replaces PR #74
* checkpoint.py:
  * save_checkpoint will now save the optimizer type in addition to
    its state
  * load_checkpoint will now instantiate an optimizer based on the
    saved type and load its state (see the first sketch after this
    list)
* config.py: file/dict_config now accept the resumed epoch to pass to
  LR schedulers
* policy.py: LRPolicy now passes the current epoch to the LR scheduler
  (second sketch below)
* Classifier compression sample:
  * New flag '--resume-from' for properly resuming a saved training
    session, including optimizer state and epoch number
  * Flag '--reset-optimizer' added to allow discarding a loaded
    optimizer (the flag handling is sketched below, after this list)
* BREAKING:
  * The previous flag '--resume' is deprecated and is mapped to
    '--resume-from' + '--reset-optimizer'.
  * However, the old resuming behavior had an inconsistency: the
    epoch count would continue from the saved epoch, but the LR
    scheduler was set up as if starting from epoch 0.
  * Using '--resume-from' + '--reset-optimizer' will now simply
    RESET the epoch count to 0 for the whole environment.
  * This means that scheduling configurations (in YAML or code)
    which assumed use of '--resume' might need to be changed to
    reflect the fact that the epoch count now starts from 0.
  * All relevant YAML files under 'examples' were modified to
    reflect this change.
* Initial support for ReduceLROnPlateau (#161):
  * Allow passing **kwargs to policies via the scheduler
  * Image classification now passes the validation loss to the
    scheduler, to be used by ReduceLROnPlateau (final sketch below)
  * The current implementation is experimental and subject to change
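First sketch: a minimal illustration of the checkpoint change, assuming a plain PyTorch setup. The function bodies here are illustrative (e.g. the dummy `lr` passed at re-instantiation), not Distiller's actual implementation:

```python
import torch

def save_checkpoint(path, model, optimizer, epoch):
    # Save the optimizer's concrete class alongside its state, so
    # loading can re-instantiate the same optimizer type.
    torch.save({
        'epoch': epoch,
        'state_dict': model.state_dict(),
        'optimizer_type': type(optimizer),
        'optimizer_state_dict': optimizer.state_dict(),
    }, path)

def load_checkpoint(path, model):
    checkpoint = torch.load(path)
    model.load_state_dict(checkpoint['state_dict'])
    # Re-instantiate from the saved type; the dummy lr is immediately
    # overwritten by load_state_dict, which restores all param groups.
    optimizer = checkpoint['optimizer_type'](model.parameters(), lr=0.1)
    optimizer.load_state_dict(checkpoint['optimizer_state_dict'])
    return model, optimizer, checkpoint['epoch']
```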
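Second sketch: the epoch handoff to the LR scheduler. This is a toy loop using the explicit `scheduler.step(epoch)` form that PyTorch supported at the time of this change (later releases deprecate the epoch argument); the model and epoch values are placeholders:

```python
import torch
from torch.optim import SGD
from torch.optim.lr_scheduler import MultiStepLR

model = torch.nn.Linear(4, 2)            # toy model
optimizer = SGD(model.parameters(), lr=0.1)
scheduler = MultiStepLR(optimizer, milestones=[30, 60], gamma=0.1)

resumed_epoch, total_epochs = 35, 90     # pretend we resumed at epoch 35
for epoch in range(resumed_epoch, total_epochs):
    # ... training step elided ...
    # Passing the epoch explicitly keeps the LR schedule aligned with
    # the resumed epoch count instead of restarting from epoch 0.
    scheduler.step(epoch)
```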
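Third sketch: the deprecated-flag mapping in the sample's argument handling could look like the following argparse snippet; the `dest` names and help strings are assumptions, not the sample's actual code:

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--resume-from', dest='resumed_checkpoint_path', default='',
                    help='path of checkpoint to resume from '
                         '(restores optimizer state and epoch)')
parser.add_argument('--reset-optimizer', action='store_true',
                    help='discard the optimizer loaded from the checkpoint')
parser.add_argument('--resume', dest='deprecated_resume', default='',
                    help='DEPRECATED: use --resume-from with --reset-optimizer')
args = parser.parse_args()

if args.deprecated_resume:
    # Map the old flag onto the new behavior: resume from the checkpoint,
    # but discard its optimizer and reset the epoch count to 0.
    args.resumed_checkpoint_path = args.deprecated_resume
    args.reset_optimizer = True
```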
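Final sketch: forwarding the validation loss to ReduceLROnPlateau, whose step() takes the monitored metric rather than an epoch number (toy values below, not the sample's actual loop):

```python
import torch
from torch.optim import SGD
from torch.optim.lr_scheduler import ReduceLROnPlateau

model = torch.nn.Linear(4, 2)            # toy model
optimizer = SGD(model.parameters(), lr=0.1)
scheduler = ReduceLROnPlateau(optimizer, mode='min', patience=10)

for epoch in range(90):
    # ... training and validation elided ...
    val_loss = 1.0 / (epoch + 1)         # stand-in for the real metric
    # ReduceLROnPlateau keys off a monitored quantity, hence the
    # **kwargs plumbing through the policies described above.
    scheduler.step(val_loss)
```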