compress_classifier.py: Fix best_epoch logic

Neta Zmora authored

Based on a commit and ideas from @barrh:
https://github.com/NervanaSystems/distiller/pull/150/commits/1623db3cdc3a95ab620e2dc6863cff23a91087bd
    
The sample application compress_classifier.py logs details about
the best performing epoch(s) and stores the best epoch in a checkpoint
file named ```best.pth.tar``` by default (if you use the ```--name```
application argument, the given name prefixes the ```best``` checkpoint
file name).
    
Until this fix, a model's performance was judged solely on its
Top1 accuracy.  This is a problem when performing gradual pruning of
a pre-trained model, because a model's Top1 accuracy often increases
under light pruning, and that early epoch is then registered as the
best performing training epoch.  However, we are really interested in
the best performing model _after_ the pruning phase is done.  Even
during training, we may want the checkpoint of the best performing
model with the highest sparsity.
This fix stores a list of the performance results from all the epochs
trained so far.  The list is sorted using the hierarchical key
(sparsity, top1, top5, epoch): first by sparsity, then by top1,
followed by top5 and epoch.
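
A minimal sketch of that selection logic (```PerfResult``` and
```update_scores_history``` are illustrative names, not the
application's actual identifiers):

```python
import operator
from collections import namedtuple

# Illustrative per-epoch record; the application tracks similar fields.
PerfResult = namedtuple('PerfResult', ['sparsity', 'top1', 'top5', 'epoch'])

def update_scores_history(history, sparsity, top1, top5, epoch):
    """Append this epoch's results and re-sort by the hierarchical key.

    Sorting in descending order means history[0] is the best epoch:
    highest sparsity first, with top1, top5 and epoch breaking ties.
    """
    history.append(PerfResult(sparsity, top1, top5, epoch))
    history.sort(key=operator.attrgetter('sparsity', 'top1', 'top5', 'epoch'),
                 reverse=True)
    return history[0]

history = []
update_scores_history(history, sparsity=0.0, top1=76.2, top5=92.9, epoch=0)
best = update_scores_history(history, sparsity=48.5, top1=75.8, top5=92.6, epoch=1)
assert best.epoch == 1  # the sparser model wins despite slightly lower top1
```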
    
But what if you want to sort using a different metric?  For example,
when quantizing you may want to score performance by the total number
of bits used to represent the model parameters and feature-maps.  In
that case you can replace ```sparsity``` with the new metric.  Because
this is a sample application, we don't load it with every possible
piece of control logic, and anyone can make local changes to this
logic.  To keep such changes separated from the main application
logic, we plan to refactor the application code sometime in the next
few months.
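
Continuing the sketch above, a hypothetical quantization variant could
rank by total bits.  Since fewer bits is better while higher accuracy
is better, the sort key mixes directions (```total_bits``` is an
assumed field, not one the application tracks):

```python
from collections import namedtuple

# Hypothetical record for quantization; total_bits is an assumed field.
QuantResult = namedtuple('QuantResult', ['total_bits', 'top1', 'top5', 'epoch'])

quant_history = [
    QuantResult(total_bits=32e6, top1=76.1, top5=92.8, epoch=3),
    QuantResult(total_bits=8e6, top1=75.4, top5=92.3, epoch=7),
]

# Sort ascending by total_bits (fewer is better), then descending by
# top1, top5 and epoch to break ties.
quant_history.sort(key=lambda r: (r.total_bits, -r.top1, -r.top5, -r.epoch))
assert quant_history[0].epoch == 7  # the smaller model ranks first
```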