- Mar 29, 2019
-
-
Songyi Blair Han authored
-
- Mar 28, 2019
-
-
Lev Zlotnik authored
* Added distiller.utils.convert_recursively_to and replaced _treetuple2device in SummaryGraph with it.
* Renamed to convert_tensors_recursively_to.
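For orientation, here is a minimal sketch of what such a recursive conversion helper might look like (the exact implementation and signature in distiller.utils may differ):

```python
import torch

def convert_tensors_recursively_to(val, *args, **kwargs):
    """Apply `.to(*args, **kwargs)` to every tensor found in a (possibly
    nested) tuple/list structure, leaving non-tensor values untouched."""
    if isinstance(val, torch.Tensor):
        return val.to(*args, **kwargs)
    if isinstance(val, (tuple, list)):
        return type(val)(convert_tensors_recursively_to(v, *args, **kwargs) for v in val)
    return val

# Example: move a nested structure of tensors to the chosen device.
device = 'cuda' if torch.cuda.is_available() else 'cpu'
nested = (torch.zeros(2), [torch.ones(3), 'not-a-tensor'])
nested_on_device = convert_tensors_recursively_to(nested, device)
```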
-
- Mar 27, 2019
- Mar 26, 2019
-
-
Neta Zmora authored
Line 291 was duplicated, so one instance was removed. There is no functional effect (perhaps a very small performance improvement).
-
- Mar 25, 2019
-
-
Neta Zmora authored
Rewrote the splicing logic with simpler code.
-
Neta Zmora authored
- Fix the invocation of resnet50_earlyexit (missing 'pretrained' parameter).
- Remove all ResNet depths other than 50, to prevent confusion (these are currently not supported).
-
- Mar 23, 2019
-
-
Neta Zmora authored
This schedule demonstrates high-rate element-wise pruning (84.6% sparsity) of ResNet50. Top1 is 75.66 vs. the published Top1 of 76.15, i.e. a drop of about 0.5%.
-
Neta Zmora authored
This is an improved AGP schedule which generates a ResNet50 network that is 80% element-wise sparse, with a statistically insignificant drop in Top1 accuracy (-0.13%).
-
- Mar 21, 2019
-
-
Neta Zmora authored
This is AGP (automatic gradual pruning) for a pruner which chooses filters-to-prune by sampling a Bernoulli probability distribution.
-
- Mar 17, 2019
-
-
Neta Zmora authored
In several places we hit an error state and exited using exit() instead of raising a ValueError; this is now fixed.
-
Neta Zmora authored
Fixed the return value of GroupThresholdMixin.group_threshold_mask so that it returns only the mask in all cases. This code is _not_ under test at the moment, and changes to the pruning code (which also uses the thresholding code) led to this bug. We need to add tests for group-lasso regularization.
-
Neta Zmora authored
Replaced numpy operations with PyTorch operations (so that we can leverage the GPU).
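One illustrative example of this kind of replacement (not a quote from the actual diff): counting non-zero elements with PyTorch instead of numpy keeps the data on the GPU.

```python
import numpy as np
import torch

device = 'cuda' if torch.cuda.is_available() else 'cpu'
w = torch.randn(64, 3, 3, 3, device=device)

# numpy version: forces a copy back to the CPU before counting.
nnz_np = np.count_nonzero(w.cpu().numpy())

# PyTorch version: the count happens on the same device as the tensor.
nnz_torch = int((w != 0).sum())
assert nnz_np == nnz_torch
```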
-
Bar authored
Modify LpRankedStructureParameterPruner to log as the correct class name for both L1 and L2 pruners.
-
Neta Zmora authored
BernoulliFilterPruner: assigns a Bernoulli probability distribution to each of the filters.
RandomLevelStructureParameterPruner: assigns a Uniform probability distribution to the pruning level used by an L1-norm structure pruner.
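A rough sketch (not the actual Distiller pruner code) of how a Bernoulli-sampled filter mask can be built:

```python
import torch

def bernoulli_filter_mask(weights, desired_sparsity):
    """Sample a keep/drop decision per filter: each filter of a 4-D conv
    weight tensor is dropped with probability `desired_sparsity`."""
    num_filters = weights.size(0)
    drop = torch.bernoulli(torch.full((num_filters,), desired_sparsity))
    keep = 1.0 - drop
    # Broadcast the per-filter decision across the filter's channels/kernel.
    return keep.view(-1, 1, 1, 1).expand_as(weights)

weights = torch.randn(64, 3, 3, 3)
mask = bernoulli_filter_mask(weights, desired_sparsity=0.5)
pruned_weights = weights * mask   # zero out the dropped filters
```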
-
Neta Zmora authored
A recent change requires us to return the binary_map from the ranking operation, and this was missing for the row-pruning case.
-
- Mar 14, 2019
-
-
Bar authored
-
- Mar 12, 2019
-
-
Bar authored
"Peformance" --> "Performance"
-
- Mar 11, 2019
-
-
Bar authored
Integrate the Cadene ```pretrainedmodels``` package. This PR integrates a large set of pre-trained PyTorch image-classification and object-detection models which originate from https://github.com/Cadene/pretrained-models.pytorch.

PLEASE NOTE: This PR adds a dependency on the ```pretrainedmodels``` package, and you will need to install it using ```pip3 install pretrainedmodels```. For new users, we have also updated the ```requirements.txt``` file.

Distiller does not currently support the compression of object-detectors (a sample application is required - and the community is invited to send us a PR). Compression of some of these models may not be fully supported by Distiller due to bugs and/or missing features. If you encounter any issues, please report them to us.

Whenever there is contention on the name of a model passed to the ```compress_classifier.py``` sample application, the Cadene models are used at the lowest priority (e.g. Torchvision models are used in favor of Cadene models when the same model is supported by both packages).

This PR also:
* Adds documentation to ```create_model```
* Adds tests for ```create_model```
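A hypothetical sketch of the name-resolution priority described above (Distiller's actual ```create_model``` logic is more involved):

```python
import torchvision.models as torchvision_models
import pretrainedmodels

def create_model_sketch(arch, pretrained=True):
    # Torchvision is consulted first; the Cadene package is the fallback.
    if arch in torchvision_models.__dict__:
        return torchvision_models.__dict__[arch](pretrained=pretrained)
    if arch in pretrainedmodels.model_names:
        return pretrainedmodels.__dict__[arch](pretrained='imagenet' if pretrained else None)
    raise ValueError("Unknown model architecture: {}".format(arch))

model = create_model_sketch('resnet50', pretrained=True)
```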
-
- Mar 10, 2019
-
-
Neta Zmora authored
-
- Mar 06, 2019
-
-
Neta Zmora authored
-
Neta Zmora authored
This is a utility function that returns some statistics about a model's parameters (model_sparsity, params_cnt, params_nnz_cnt). This file is required for the previous commit (and was accidentally left out).
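A minimal sketch of the kind of statistics described above (the function body below is an assumption based on this description, not the exact Distiller implementation):

```python
import torch

def model_params_stats(model):
    """Return overall sparsity (%), total parameter count, and non-zero count."""
    params_cnt = sum(p.numel() for p in model.parameters())
    params_nnz_cnt = sum(int((p != 0).sum()) for p in model.parameters())
    model_sparsity = 100.0 * (1.0 - params_nnz_cnt / params_cnt)
    return model_sparsity, params_cnt, params_nnz_cnt

# Example with a small model:
sparsity, cnt, nnz = model_params_stats(torch.nn.Linear(10, 10))
```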
-
Neta Zmora authored
A recent commit changed the sorting of the best performing training epochs to be based on the sparsity level of the model, then its Top1 and Top5 scores. When we create thinned models, the sparsity remains low (even zero), while the physical size of the network is smaller. This commit changes the sorting criteria to be based on the count of non-zero (NNZ) parameters. This captures both the sparsity and parameter-size objectives:
- When sparsity is high, the number of NNZ params is low (params_nnz_cnt = (1 - sparsity) * params_cnt).
- When we remove structures (thinning), the sparsity may remain constant, but the count of params (params_cnt) is lower, and therefore, once again, params_nnz_cnt is lower.
Therefore, params_nnz_cnt is a good proxy for capturing a sparsity objective and/or a thinning objective.
-
- Mar 05, 2019
-
-
Neta Zmora authored
-
Neta Zmora authored
-
Neta Zmora authored
-
Lev Zlotnik authored
-
Neta Zmora authored
amc-ft-frequency: Sometimes we may want to fine-tune the weights after every 'n' episode steps (action-steps). This new argument controls the frequency of this fine-tuning (FT), i.e. how many action-steps pass between fine-tuning sessions. By default, there is no fine-tuning between steps.
amc-reward-frequency: By default, we only provide a non-zero reward at the end of an episode. This argument allows us to provide rewards at a higher frequency.
This commit also reorders the ResNet layer names, so that layers are processed in near-topological order. This is simply to help interpret the data in the AMC Jupyter notebooks.
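A hypothetical sketch of how these two frequencies could gate an episode loop (the environment/agent API below is illustrative, not Distiller's actual AMC code):

```python
def run_episode(env, agent, amc_ft_frequency=None, amc_reward_frequency=None):
    observation = env.reset()
    for step, layer in enumerate(env.layers_to_prune(), start=1):
        action = agent.act(observation)
        observation = env.prune_layer(layer, action)
        if amc_ft_frequency and step % amc_ft_frequency == 0:
            env.fine_tune()                           # brief FT between action-steps
        if amc_reward_frequency and step % amc_reward_frequency == 0:
            agent.observe(env.intermediate_reward())  # optional mid-episode reward
    agent.observe(env.final_reward())                 # default: reward only at episode end
```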
-
- Mar 03, 2019
-
-
Neta Zmora authored
Based on a commit and ideas from @barrh: https://github.com/NervanaSystems/distiller/pull/150/commits/1623db3cdc3a95ab620e2dc6863cff23a91087bd

The sample application compress_classifier.py logs details about the best performing epoch(s) and stores the best epoch in a checkpoint file named ```best.pth.tar``` by default (if you use the ```--name``` application argument, the checkpoint name will be prefixed by ```best```).

Until this fix, the performance of a model was judged solely on its Top1 accuracy. This can be a problem when performing gradual pruning of a pre-trained model, because many times a model's Top1 accuracy increases with light pruning, and this is registered as the best performing training epoch. However, we are really interested in the best performing trained model _after_ the pruning phase is done. Even during training, we may be interested in the checkpoint of the best performing model with the highest sparsity.

This fix stores a list of the performance results from all the trained epochs so far. This list is sorted using a hierarchical key (sparsity, top1, top5, epoch), so that the list is first sorted by sparsity, then top1, followed by top5 and epoch.

But what if you want to sort using a different metric? For example, when quantizing you may want to score the best performance by the total number of bits used to represent the model parameters and feature-maps. In such a case you may want to replace ```sparsity``` with this new metric. Because this is a sample application, we don't load it with all possible control logic, and anyone can make local changes to this logic. To keep your code separated from the main application logic, we plan to refactor the application code sometime in the next few months.
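An illustrative sketch (not the application's exact code) of sorting epoch results with this hierarchical key; the numbers are made up for the example:

```python
import operator
from collections import namedtuple

PerfResult = namedtuple('PerfResult', ['sparsity', 'top1', 'top5', 'epoch'])

results = [PerfResult(0.0, 76.2, 93.0, 1),
           PerfResult(49.7, 75.9, 92.8, 40),
           PerfResult(80.0, 75.8, 92.7, 90)]

# Highest sparsity wins; ties are broken by top1, then top5, then epoch.
best = max(results, key=operator.attrgetter('sparsity', 'top1', 'top5', 'epoch'))
print(best)  # PerfResult(sparsity=80.0, top1=75.8, top5=92.7, epoch=90)
```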
-
Neta Zmora authored
Release 0.3 broke the exports to PNG and ONNX and this is the fix.
-
Neta Zmora authored
See issue #168. This is not a fix, but it warns the user of wrong MAC results until the issue is fixed.
-
- Mar 01, 2019
-
-
Yuma Hiramatsu authored
-
- Feb 28, 2019
-
-
Neta Zmora authored
-
Neta Zmora authored
-
Neta Zmora authored
-
- Feb 27, 2019
-
- Feb 26, 2019
-
-
Bar authored
Function ```log_execution_env_state``` copies a given configuration file to the logs directory, to save all of the details of an experiment. In some distributed environments a file copy may fail, so we wrap the copy of the configuration file with a try/except block.
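A minimal sketch of such a guarded copy (the function name and the caught exception type are assumptions, not the exact ```log_execution_env_state``` code):

```python
import logging
import shutil

msglogger = logging.getLogger()

def copy_config_to_logdir(config_path, logdir):
    try:
        shutil.copy(config_path, logdir)
    except OSError as e:
        # In some distributed environments the copy can fail; log and continue.
        msglogger.debug('Failed to copy %s to %s: %s', config_path, logdir, e)
```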
-
Lev Zlotnik authored
-