- Mar 17, 2019
-
-
Neta Zmora authored
In several places we hit an error state and exited using exit() instead of raising a ValueError; fixed this.
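For illustration only (the function and constant below are hypothetical, not taken from the commit), the change amounts to raising an exception instead of terminating the process:
```
SUPPORTED_FORMATS = ('png', 'onnx')

def check_export_format(fmt):
    # Previously the error path printed a message and called exit(1),
    # which kills the whole process; raising ValueError lets callers
    # catch, log, or test the error condition instead.
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError("Unsupported export format: {}".format(fmt))
```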
-
- Mar 12, 2019
-
-
Bar authored
"Peformance" --> "Performance"
-
- Mar 06, 2019
-
-
Neta Zmora authored
A recent commit changed the sorting of the best performing training epochs to be based on the sparsity level of the model, then its Top1 and Top5 scores. When we create thinned models, the sparsity remains low (even zero), while the physical size of the network is smaller. This commit changes the sorting criteria to be based on the count of non-zero (NNZ) parameters. This captures both the sparsity and parameter-size objectives:
- When sparsity is high, the number of NNZ params is low (params_nnz_cnt = (1 - sparsity) * params_cnt).
- When we remove structures (thinning), the sparsity may remain constant, but the count of params (params_cnt) is lower, and therefore, once again, params_nnz_cnt is lower.
Therefore, params_nnz_cnt is a good proxy for capturing a sparsity objective and/or a thinning objective.
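As a rough sketch (not the application's actual code), params_nnz_cnt can be computed directly from a PyTorch model:
```
import torch
import torch.nn as nn

def params_nnz_cnt(model):
    # Non-zero parameter count: low when sparsity is high, and lower still
    # when structures have been physically removed (thinning).
    return sum(int((p != 0).sum()) for p in model.parameters())

model = nn.Sequential(nn.Conv2d(3, 16, 3), nn.Linear(16, 10))
print(params_nnz_cnt(model))
```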
-
- Mar 03, 2019
-
-
Neta Zmora authored
Based on a commit and ideas from @barrh: https://github.com/NervanaSystems/distiller/pull/150/commits/1623db3cdc3a95ab620e2dc6863cff23a91087bd

The sample application compress_classifier.py logs details about the best performing epoch(s) and stores the best epoch in a checkpoint file named ```best.pth.tar``` by default (if you use the ```--name``` application argument, the checkpoint name will be prefixed by ```best```).

Until this fix, the performance of a model was judged solely on its Top1 accuracy. This can be a problem when performing gradual pruning of a pre-trained model, because a model's Top1 accuracy often increases with light pruning and this is registered as the best performing training epoch. However, we are really interested in the best performing trained model _after_ the pruning phase is done. Even during training, we may be interested in the checkpoint of the best performing model with the highest sparsity.

This fix stores a list of the performance results from all the trained epochs so far. This list is sorted using a hierarchical key: (sparsity, top1, top5, epoch), so that the list is first sorted by sparsity, then top1, followed by top5 and epoch.

But what if you want to sort using a different metric? For example, when quantizing you may want to score the best performance by the total number of bits used to represent the model parameters and feature-maps. In such a case you may want to replace ```sparsity``` with this new metric. Because this is a sample application, we don't load it with all possible control logic, and anyone can make local changes to this logic. To keep your code separated from the main application logic, we plan to refactor the application code sometime in the next few months.
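A hedged sketch of the sorting described above (the record type and values are illustrative, not the sample's actual data structures):
```
from collections import namedtuple

PerfResult = namedtuple('PerfResult', ['sparsity', 'top1', 'top5', 'epoch'])

results = [
    PerfResult(sparsity=0.0,  top1=91.2, top5=99.1, epoch=3),
    PerfResult(sparsity=52.4, top1=90.7, top5=98.9, epoch=40),
    PerfResult(sparsity=52.4, top1=90.9, top5=99.0, epoch=42),
]

# Hierarchical key: sparsity first, then Top1, Top5 and epoch.
best = max(results, key=lambda r: (r.sparsity, r.top1, r.top5, r.epoch))
print(best)
```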
-
Neta Zmora authored
Release 0.3 broke the exports to PNG and ONNX and this is the fix.
-
- Feb 28, 2019
-
-
Neta Zmora authored
-
Neta Zmora authored
-
- Feb 26, 2019
-
-
Lev Zlotnik authored
Not backward compatible - re-installation is required.
* Fixes for PyTorch==1.0.0
* Refactoring folder structure
* Update installation section in docs
-
- Feb 17, 2019
-
-
Neta Zmora authored
A small change to support ranking weight filters by the mean mean-value of the feature-map channels. Mean mean-value refers to computing the average value (across many input images) of the mean-value of each channel.
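In PyTorch terms, the criterion looks roughly like this (an illustrative sketch, not the committed code):
```
import torch

def channel_mean_means(feature_maps):
    # feature_maps: activations with shape (N, C, H, W).
    # Mean over H and W gives each image's per-channel mean (N, C);
    # averaging over the batch gives one "mean mean-value" per channel (C,).
    return feature_maps.mean(dim=(2, 3)).mean(dim=0)

fm = torch.randn(8, 16, 32, 32)          # 8 images, 16 channels
print(channel_mean_means(fm).shape)      # torch.Size([16])
```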
-
- Feb 14, 2019
-
-
Bar authored
Modified log_execution_env_state() to store the configuration file in the output directory, under a 'configs' sub-directory that it creates. At this time, the only configuration file is the one passed via args.compress.
-
Neta Zmora authored
To use automated compression you need to install several optional packages which are not required for other use-cases. This fix hides the import requirements for users who do not want to install the extra packages.
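The usual pattern for this kind of fix looks like the sketch below (package and function names are hypothetical):
```
try:
    # Only needed for automated compression (e.g. RL-based AMC).
    import hypothetical_rl_package as rl
except ImportError:
    rl = None

def do_automated_compression():
    if rl is None:
        raise ImportError("Automated compression requires optional packages; "
                          "install them to use this feature.")
    # ... use rl here ...
```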
-
- Feb 13, 2019
-
-
Neta Zmora authored
Merging the 'amc' branch with 'master'. This updates the automated compression code in 'master', and adds a greedy filter-pruning algorithm.
-
- Feb 11, 2019
-
-
Guy Jacob authored
Summary of changes:
(1) Post-train quantization based on pre-collected statistics
(2) Quantized concat, element-wise addition / multiplication and embeddings
(3) Move post-train quantization command line args out of sample code
(4) Configure post-train quantization from YAML for more fine-grained control
(See PR #136 for more detailed change descriptions)
-
- Feb 10, 2019
-
-
Guy Jacob authored
* For CIFAR-10 / ImageNet only
* Refactor data_loaders.py, reduce code duplication
* Implemented custom sampler
* Integrated in image classification sample
* Since we now shuffle the test set, had to update expected results in 2 full_flow_tests that do evaluation
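For context, this is the standard PyTorch way to carve a validation split out of a training set with a custom sampler (a sketch assuming a CIFAR-10-style dataset, not necessarily the exact data_loaders.py code):
```
import numpy as np
from torch.utils.data import DataLoader, SubsetRandomSampler
from torchvision import datasets, transforms

dataset = datasets.CIFAR10('./data', train=True, download=True,
                           transform=transforms.ToTensor())
indices = np.random.permutation(len(dataset))
split = int(0.1 * len(dataset))                  # e.g. 10% held out for validation

val_loader = DataLoader(dataset, batch_size=256,
                        sampler=SubsetRandomSampler(indices[:split]))
train_loader = DataLoader(dataset, batch_size=256,
                          sampler=SubsetRandomSampler(indices[split:]))
```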
-
- Jan 31, 2019
-
-
Neta Zmora authored
-
- Jan 16, 2019
-
-
Bar authored
* Support for multi-phase activations logging: enable logging activations both during training and validation in the same session.
* Refactoring: move parser to its own file
* Parser is moved from compress_classifier into its own file.
* Torch version check is moved to precede the main() call.
* Move main definition to the top of the file.
* Modify parser choices to be case-insensitive
-
- Jan 15, 2019
-
-
Neta Zmora authored
Fix a mismatch between the device on which the model is located and the device on which the computation is performed.
-
- Jan 13, 2019
-
-
Neta Zmora authored
-
- Jan 10, 2019
-
-
Gal Novik authored
In compress_classifier.py we added a new application argument, --cpu, which you can use to force compute (training/inference) to run on the CPU when you invoke compress_classifier.py on a machine that has Nvidia GPUs. If your machine lacks Nvidia GPUs, then compute will now run on the CPU (and you do not need the new flag). Caveat: we did not fully test CPU support for the code in the Jupyter notebooks. If you find a bug, we apologize and appreciate your feedback.
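A hedged sketch of how such a flag typically works (argument wiring only, not the sample's exact code):
```
import argparse
import torch

parser = argparse.ArgumentParser()
parser.add_argument('--cpu', action='store_true',
                    help='force training/inference to run on the CPU')
args = parser.parse_args()

# Fall back to the CPU if requested, or if no CUDA device is present.
use_cpu = args.cpu or not torch.cuda.is_available()
device = torch.device('cpu' if use_cpu else 'cuda')
# model.to(device); inputs.to(device); ...
```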
-
- Dec 19, 2018
-
-
Neta Zmora authored
If compression_scheduler is None, then we need to set the value of losses[OVERALL_LOSS_KEY] (so that it is the same as losses[OBJECTIVE_LOSS_KEY]). This was overlooked.
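In other words (a toy sketch with illustrative key values, not the application's training loop):
```
OVERALL_LOSS_KEY = 'Overall Loss'       # illustrative key strings
OBJECTIVE_LOSS_KEY = 'Objective Loss'

def collect_losses(objective_loss, compression_scheduler=None, regularizer_loss=0.0):
    losses = {OBJECTIVE_LOSS_KEY: objective_loss}
    if compression_scheduler is not None:
        # The scheduler may add regularization terms to the overall loss.
        losses[OVERALL_LOSS_KEY] = objective_loss + regularizer_loss
    else:
        # The previously overlooked case: no scheduler, so overall == objective.
        losses[OVERALL_LOSS_KEY] = losses[OBJECTIVE_LOSS_KEY]
    return losses

print(collect_losses(1.23))
```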
-
- Dec 16, 2018
-
-
Taras Sereda authored
-
- Dec 14, 2018
-
-
Neta Zmora authored
Added notebook for visualizing the discovery of compressed networks. Added one-epoch fine-tuning at the end of every episode, which is required for very sensitive models like Plain20.
-
- Dec 11, 2018
-
-
Haim Barad authored
Revert to PyTorch 0.4.0. Also fixed some numpy calls (for statistics) that needed to be moved back to the CPU.
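The numpy part of the fix boils down to the usual pattern (a generic example, not the exact lines changed):
```
import torch

t = torch.randn(4, 4)
if torch.cuda.is_available():
    t = t.cuda()

# .numpy() only works on CPU tensors, so move the tensor back first.
mean_val = t.detach().cpu().numpy().mean()
print(mean_val)
```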
-
Yi-Syuan Chen authored
-
- Dec 06, 2018
-
-
Guangli Li authored
Update the examples of the earlyexit arguments, which were not consistent with their descriptions.
-
- Dec 04, 2018
-
-
Guy Jacob authored
* Asymmetric post-training quantization (only symmetric was supported until now)
* Quantization-aware training for range-based (min-max) symmetric and asymmetric quantization
* Per-channel quantization support in both training and post-training
* Added tests and examples
* Updated documentation
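To illustrate the range-based asymmetric scheme in isolation, here is a textbook-style numpy sketch (not Distiller's implementation):
```
import numpy as np

def asymmetric_quantize(x, num_bits=8):
    # Map [x.min(), x.max()] onto the unsigned integer range [0, 2**num_bits - 1].
    qmin, qmax = 0, 2 ** num_bits - 1
    scale = (x.max() - x.min()) / (qmax - qmin)
    zero_point = np.round(qmin - x.min() / scale)
    q = np.clip(np.round(x / scale + zero_point), qmin, qmax)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    return scale * (q - zero_point)

x = np.random.randn(5).astype(np.float32)
q, scale, zp = asymmetric_quantize(x)
print(x)
print(dequantize(q, scale, zp))   # close to x, up to quantization error
```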
-
Neta Zmora authored
-
- Dec 01, 2018
-
-
Neta Zmora authored
This commit contains the main fix for issue #85. It contains a couple of changes to the YAML structure pruning API, with examples. I urge you to read the documentation in the Wiki (https://github.com/NervanaSystems/distiller/wiki/Pruning-Filters-&-Channels).

New syntax for defining Structured AGP. I tried to make the syntax similar to fine-grained (i.e. element-wise) pruning. All you need to do is add: ```group_type: Filters```.
```
low_pruner:
  class: L1RankedStructureParameterPruner_AGP
  initial_sparsity: 0.10
  final_sparsity: 0.50
  group_type: Filters
  weights: [module.layer3.0.conv2.weight, module.layer3.0.downsample.0.weight,
            module.layer3.1.conv2.weight, module.layer3.2.conv2.weight]
```
If you want to define “leader-based” pruning dependencies, add ```group_dependency: Leader```:
```
low_pruner:
  class: L1RankedStructureParameterPruner_AGP
  initial_sparsity: 0.10
  final_sparsity: 0.50
  group_type: Filters
  group_dependency: Leader
  weights: [module.layer3.0.conv2.weight, module.layer3.0.downsample.0.weight,
            module.layer3.1.conv2.weight, module.layer3.2.conv2.weight]
```
Retired the old ```reg_regims``` API for describing one-shot structured-pruning. The new YAML API is very similar to AGP structured-pruning, which is much better than before. The new API also allows us to describe data-dependencies when doing one-shot structure pruning, just like AGP structured-pruning. This commit also includes further code refactoring.

Old API:
```
filter_pruner:
  class: 'L1RankedStructureParameterPruner'
  reg_regims:
    'module.layer1.0.conv1.weight': [0.6, '3D']
    'module.layer1.1.conv1.weight': [0.6, '3D']
```
New API:
```
filter_pruner:
  class: 'L1RankedStructureParameterPruner'
  group_type: Filters
  desired_sparsity: 0.6
  weights: [module.layer1.0.conv1.weight,
            module.layer1.1.conv1.weight]
```
thresholding.py – separate the generation of the binary_map from the pruning_mask so that we can cache the binary map and share it between several modules.

pruning/automated_gradual_pruner.py – major refactoring to support “leader-based” sub-graph pruning dependencies. The concept is explained in issue #85.

agp-pruning/resnet20_filters.schedule_agp.yaml
agp-pruning/resnet20_filters.schedule_agp_2.yaml
agp-pruning/resnet20_filters.schedule_agp_3.yaml
network_trimming/resnet56_cifar_activation_apoz.yaml
network_trimming/resnet56_cifar_activation_apoz_v2.yaml
-
- Nov 24, 2018
-
-
Guy Jacob authored
-
- Nov 22, 2018
-
-
Neta Zmora authored
* Fix issue #79: change the default values so that the following scheduler meta-data keys are always defined: 'starting_epoch', 'ending_epoch', 'frequency'
* compress_classifier.py: add a new argument to allow the specification, from the command line, of the range of pruning levels scanned when doing sensitivity analysis
* Add a regression test for issue #79
-
- Nov 21, 2018
-
-
Neta Zmora authored
Trying to simplify the code.
-
- Nov 20, 2018
-
-
Neta Zmora authored
* Bug fix: the value of best_top1 stored in the checkpoint may be wrong. If you invoke compress_classifier.py with --num-best-scores=n with n>1, then the value of best_top1 stored in checkpoints is wrong.
-
Neta Zmora authored
When we resume from a checkpoint, we usually want to continue using the checkpoint’s masks. I say “usually” because I can see a situation where we want to prune a model and checkpoint it, and then resume with the intention of fine-tuning w/o keeping the masks. This is what’s done in Song Han’s Dense-Sparse-Dense (DSD) training (https://arxiv.org/abs/1607.04381). But I didn’t want to add another argument to ```compress_classifier.py``` for the time being – so we ignore DSD.

There are two possible situations when we resume a checkpoint that has a serialized ```CompressionScheduler``` with pruning masks:
1. We are planning on using a new ```CompressionScheduler``` that is defined in a schedule YAML file. In this case, we want to copy the masks from the serialized ```CompressionScheduler``` to the new ```CompressionScheduler``` that we are constructing from the YAML file. This is one fix.
2. We are resuming a checkpoint, but without using a YAML schedule file. In this case we want to use the ```CompressionScheduler``` that we loaded from the checkpoint file. All this ```CompressionScheduler``` does is keep applying the masks as we train, so that we don’t lose them. This is the second fix.

For DSD, we would need a new flag that would override using the ```CompressionScheduler``` that we load from the checkpoint.
-
- Nov 08, 2018
-
-
Haim Barad authored
* Updated stats computation - fixes issues with validation stats
* Clarification of output (docs)
* Update
* Moved validation stats to a separate function
-
Guy Jacob authored
-
- Nov 06, 2018
-
-
Neta Zmora authored
By default, when we create a model we wrap it with DataParallel to benefit from data-parallelism across GPUs (mainly for convolution layers). But sometimes we don't want the sample application to do this: for example when we receive a model that was trained serially. This commit adds a new argument to the application to prevent the use of DataParallel.
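A minimal sketch of the conditional wrapping (the parameter name below is an assumption, not necessarily the argument that was added):
```
import torch
import torch.nn as nn
from torchvision import models

def create_model(parallel=True):
    model = models.resnet18()
    if parallel and torch.cuda.device_count() > 1:
        # Wrap only when data-parallelism across GPUs is actually wanted,
        # e.g. not for a model that was trained serially.
        model = nn.DataParallel(model)
    return model

serial_model = create_model(parallel=False)
```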
-
Haim Barad authored
* Fixed validation stats and added new summary stats
* Trimmed some comments
* Improved figure for documentation
* Minor updates
-
- Nov 05, 2018
-
-
Neta Zmora authored
Added an implementation of: Dynamic Network Surgery for Efficient DNNs, Yiwen Guo, Anbang Yao, Yurong Chen. NIPS 2016, https://arxiv.org/abs/1608.04493.
- Added SplicingPruner: a pruner that both prunes and splices connections.
- Included an example schedule on ResNet20 CIFAR.
- New features for compress_classifier.py:
  1. Added the "--masks-sparsity" argument which, when enabled, logs the sparsity of the weight masks during training.
  2. Added a new command-line argument to report the top N best accuracy scores, instead of just the highest score. This is sometimes useful when pruning a pre-trained model that has its best Top1 accuracy in the first few pruning epochs.
- New features for PruningPolicy:
  1. The pruning policy can use two copies of the weights: one is used during the forward pass, the other during the backward pass. This is controlled by the “mask_on_forward_only” argument.
  2. If we enable “mask_on_forward_only”, we probably want to permanently apply the mask at some point (usually once the pruning phase is done). This is controlled by the “keep_mask” argument.
  3. We introduce a first implementation of scheduling at the training-iteration granularity (i.e. at the mini-batch granularity). Until now we could only schedule pruning at the epoch granularity. This is controlled by “mini_batch_pruning_frequency” (disable by setting it to zero).

Some of the abstractions may have leaked from PruningPolicy to CompressionScheduler. Need to re-examine this in the future.
-
- Nov 01, 2018
-
-
Guy Jacob authored
* Added command line arguments for this and other post-training quantization settings in image classification sample.
-
- Oct 22, 2018
-
-
Neta Zmora authored
Activation statistics can be leveraged to make pruning and quantization decisions, so we added support for collecting these data.

Two types of activation statistics are supported: summary statistics, and detailed records per activation. Currently we support the following summaries:
- Average activation sparsity, per layer
- Average L1-norm for each activation channel, per layer
- Average sparsity for each activation channel, per layer

For the detailed records we collect some statistics per activation and store them in a record. Using this collection method generates more detailed data, but consumes more time, so beware.
* You can collect activation data for the different training phases: training/validation/test.
* You can access the data directly from each module that you chose to collect stats for.
* You can also create an Excel workbook with the stats.

To demonstrate the use of activation collection, we added a sample schedule which prunes weight filters by their activation APoZ according to: "Network Trimming: A Data-Driven Neuron Pruning Approach towards Efficient Deep Architectures", Hengyuan Hu, Rui Peng, Yu-Wing Tai, Chi-Keung Tang, ICLR 2016, https://arxiv.org/abs/1607.03250

We also refactored the AGP code (AutomatedGradualPruner) to support structure pruning; specifically, we separated the AGP schedule from the filter-pruning criterion. We added examples of ranking filter importance based on activation APoZ (ActivationAPoZRankedFilterPruner), random ranking (RandomRankedFilterPruner), filter gradients (GradientRankedFilterPruner), and filter L1-norm (L1RankedStructureParameterPruner).
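The underlying mechanism is PyTorch forward hooks; a self-contained sketch of collecting average activation sparsity per layer might look like this (illustrative only, not Distiller's collector classes):
```
import torch
import torch.nn as nn

sparsity = {}   # layer name -> list of per-batch sparsity values

def make_hook(name):
    def hook(module, inputs, output):
        # Fraction of zero elements in this layer's output feature-map.
        sparsity.setdefault(name, []).append(float((output == 0).float().mean()))
    return hook

model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.ReLU(), nn.Conv2d(8, 8, 3), nn.ReLU())
for name, module in model.named_modules():
    if isinstance(module, nn.ReLU):
        module.register_forward_hook(make_hook(name))

with torch.no_grad():
    model(torch.randn(4, 3, 32, 32))

print({name: sum(v) / len(v) for name, v in sparsity.items()})
```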
-