Commit 0c175c94 authored by Neta Zmora

Add README.md files for some pruning examples

parent 70e26735
## Hybrid-Pruning Schedules
The examples in this directory show hybrid pruning schedules in which we combine several different pruning strategies.
1. [alexnet.schedule_agp_2Dreg.yaml](https://github.com/NervanaSystems/distiller/blob/master/examples/hybrid/alexnet.schedule_agp_2Dreg.yaml)
<br>
This example presents a pruning schedule that performs element-wise (fine-grained) pruning,
combined with 2D group (kernel) regularization. The regularization "pushes" 2D kernels towards zero, while
the pruning attends to individual weight coefficients. The pruning schedule is driven by AGP (see the first sketch below the results table).
2. [alexnet.schedule_sensitivity_2D-reg.yaml](https://github.com/NervanaSystems/distiller/blob/master/examples/hybrid/alexnet.schedule_sensitivity_2D-reg.yaml)
<br>
This example also presents a pruning schedule that performs element-wise (fine-grained) pruning,
combined with 2D group (kernel) regularization. However, the pruner is a `distiller.pruning.SensitivityPruner`, which is
driven by the tensors' [sensitivity](https://nervanasystems.github.io/distiller/algo_pruning.html#sensitivity-pruner) instead of AGP (see the second sketch below).
| Experiment | Model | Sparsity (%) | Top1 | Baseline Top1 |
| :---: | --- | :---: | ---: | ---: |
| 1 | AlexNet | 88.31 | 56.40 | 56.55 |
| 2 | AlexNet | 88.31 | 56.24 | 56.55 |
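
To make the structure of these hybrid schedules concrete, below is a minimal sketch of how an element-wise AGP pruner and a 2D group-Lasso regularizer can be combined in a single Distiller schedule. The layer name, sparsity targets, regularization strength, and epoch ranges are illustrative placeholders, not values taken from the actual example files:

```yaml
version: 1

pruners:
  fine_pruner:
    class: AutomatedGradualPruner        # AGP: sparsity grows gradually from initial to final
    initial_sparsity: 0.05               # placeholder value
    final_sparsity: 0.85                 # placeholder value
    weights: [module.features.0.weight]  # illustrative layer name

regularizers:
  kernel_regularizer:
    class: GroupLassoRegularizer
    reg_regims:
      # weight tensor: [regularization strength, group structure]
      module.features.0.weight: [0.00002, '2D']

policies:
  # Element-wise (fine-grained) pruning, driven by AGP
  - pruner:
      instance_name: fine_pruner
    starting_epoch: 0
    ending_epoch: 30
    frequency: 2

  # Group-Lasso regularization that pushes whole 2D kernels towards zero
  - regularizer:
      instance_name: kernel_regularizer
    starting_epoch: 0
    ending_epoch: 30
    frequency: 1
```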
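
The sensitivity-driven variant (experiment 2) differs only in the pruner definition: each tensor's pruning threshold is set as a multiple of the standard deviation of its weights. Again a sketch with placeholder values:

```yaml
pruners:
  fine_pruner:
    class: SensitivityPruner
    sensitivities:
      # prune weights whose magnitude is below sensitivity * std(weights)
      module.features.0.weight: 0.25    # placeholder sensitivity value
```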
examples/hybrid/alexnet.schedule_sensitivity_2D-reg.yaml:

@@ -3,7 +3,7 @@
 # with 2D structure regularization for the Convolution weights.
 #
 # time python3 compress_classifier.py -a alexnet --lr 0.005 -p 50 ../../../data.imagenet -j 24 --epochs 90 --pretrained --compress=../hybrid/alexnet.schedule_sensitivity_2D-reg.yaml
-# time python3 compress_classifier.py -a alexnet --lr 0.005 -p 50 ../../../data.imagenet -j 24 --epochs 90 --pretrained --compress=../hybrid/alexnet.schedule_sensitivity_2D-reg.yaml
+#
 # Parameters:
 #
 # +----+---------------------------+------------------+---------------+----------------+------------+------------+----------+----------+----------+------------+---------+----------+------------+
...
## Network Surgery Pruning
### Examples
| Model | Granularity | Sparsity (%) | Top1 | Baseline Top1 |
| --- | :--- | ---: | ---: | ---: |
| ResNet-50 | Fine | 80.0 | 75.49 | 76.15 |
| ResNet-50 | Fine | 82.6 | 75.52 | 76.15 |
| ResNet-20 | Fine | 69.1 | 91.43 | 91.78 |
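
These schedules use Distiller's `SplicingPruner`, which implements dynamic network surgery: masked (pruned) weights keep receiving gradient updates and are "spliced" back into the network if their magnitude recovers. The sketch below shows the general shape of such a pruner definition; the threshold multipliers, layer name, and sensitivity are illustrative placeholders, not values from the actual schedules:

```yaml
pruners:
  surgery_pruner:
    class: SplicingPruner
    low_thresh_mult: 0.9      # weights below the low threshold are masked
    hi_thresh_mult: 1.1       # masked weights above the high threshold are spliced back in
    sensitivity_multiplier: 0.015
    sensitivities:
      module.layer1.0.conv1.weight: 0.2   # placeholder layer name and sensitivity
```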
## Network Trimming Pruning
### Examples
In these example schedules, after pruning the filters we remove them entirely ("thinning") and continue fine-tuning. Because the pruned filters are physically removed, the remaining weight tensors are dense again, which is why the sparsity column below reads 0.0; the savings show up as fewer parameters and less compute. A sketch of such a schedule follows the table.
| Model | Granularity | Sparsity (%) | Parameters Kept (%) | Compute Kept (%) | Top1 | Baseline Top1 |
| --- | :--- | ---: | ---: | ---: | ---: | ---: |
| ResNet-50 | Filters | 0.0 | 43.37 | 44.56 | 73.93 | 76.15 |
| ResNet-56 | Filters | 0.0 | 74.53 | 62.71 | 93.03 | 92.85 |
| ResNet-56 | Filters | 0.0 | 67.02 | 53.92 | 92.59 | 92.85 |
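
Below is a hedged sketch of how filter pruning followed by thinning can be expressed in a Distiller schedule. The pruner class shown (APoZ activation ranking) corresponds to the "activation_apoz" schedules in the diffs that follow, but the layer name, filter fraction, and epoch numbers are placeholders:

```yaml
pruners:
  filter_pruner:
    class: ActivationAPoZRankedFilterPruner  # ranks filters by Average Percentage of Zero activations
    group_type: Filters
    desired_sparsity: 0.4                    # placeholder: prune 40% of the filters
    weights: [module.layer1.0.conv1.weight]  # illustrative layer name

extensions:
  net_thinner:
    class: FilterRemover          # physically removes the pruned filters ("thinning")
    thinning_func_str: remove_filters
    arch: resnet56_cifar
    dataset: cifar10

policies:
  - pruner:
      instance_name: filter_pruner
    epochs: [0]                   # prune once, at the first epoch

  - extension:
      instance_name: net_thinner
    epochs: [0]                   # then thin the network and fine-tune for the remaining epochs
```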
examples/network_trimming/resnet56_cifar_activation_apoz.yaml:

@@ -8,10 +8,11 @@
 # Baseline results:
 # Top1: 92.850 Top5: 99.780 Loss: 0.364
 # Total MACs: 125,747,840
-#
+# Total parameters: 851504
 # Results:
 # Top1: 93.030 Top5: 99.650 Loss: 1.533
 # Total MACs: 78,856,832
+# Total parameters: 634640 (74.53%)
 #
 #
 # time python3 compress_classifier.py -a=resnet56_cifar -p=50 ../../../data.cifar10 --epochs=70 --lr=0.1 --compress=../network_trimming/resnet56_cifar_activation_apoz.yaml --resume-from=checkpoint.resnet56_cifar_baseline.pth.tar --reset-optimizer --act-stats=valid
...
examples/network_trimming/resnet56_cifar_activation_apoz_v2.yaml:

@@ -3,16 +3,16 @@
 # Compare this to examples/pruning_filters_for_efficient_convnets/resnet56_cifar_filter_rank.yaml - the pruning time is
 # much longer due to the callbacks required for collecting the activation statistics (this can be improved by disabling
 # the detailed records collection, for example).
-# This provides 62.7% compute compression (x1.6) while increasing the Top1.
+# This provides 53.92% compute compression (x1.85).
 #
 # Baseline results:
 # Top1: 92.850 Top5: 99.780 Loss: 0.364
 # Total MACs: 125,747,840
-#
+# Total parameters: 851504
 # Results:
 # Top1: 92.590 Top5: 99.630 Loss: 1.537
 # Total MACs: 67,797,632
-#
+# Total parameters: 570704 (67.02%)
 #
 # time python3 compress_classifier.py -a=resnet56_cifar -p=50 ../../../data.cifar10 --epochs=70 --lr=0.1 --compress=../network_trimming/resnet56_cifar_activation_apoz_v2.yaml --resume-from=checkpoint.resnet56_cifar_baseline.pth.tar --reset-optimizer --act-stats=valid
 #
...
# We used this schedule to train CIFAR10-ResNet56 from scratch
#
# time python3 compress_classifier.py --arch resnet56_cifar ../../../data.cifar10 -p=50 --lr=0.3 --epochs=180 --compress=../pruning_filters_for_efficient_convnets/resnet56_cifar_baseline_training.yaml -j=1 --deterministic
#
# Target: 6.96% error, as reported in "Pruning Filters for Efficient ConvNets"
#
# Parameters:
# +----+-------------------------------------+----------------+---------------+----------------+------------+------------+----------+----------+----------+------------+---------+----------+------------+
# | | Name | Shape | NNZ (dense) | NNZ (sparse) | Cols (%) | Rows (%) | Ch (%) | 2D (%) | 3D (%) | Fine (%) | Std | Mean | Abs-Mean |
# |----+-------------------------------------+----------------+---------------+----------------+------------+------------+----------+----------+----------+------------+---------+----------+------------|
# | 0 | module.conv1.weight | (16, 3, 3, 3) | 432 | 432 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.39191 | 0.00826 | 0.18757 |
# | 1 | module.layer1.0.conv1.weight | (16, 16, 3, 3) | 2304 | 2304 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.08334 | -0.00180 | 0.03892 |
# | 2 | module.layer1.0.conv2.weight | (16, 16, 3, 3) | 2304 | 2304 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.08565 | -0.00033 | 0.05106 |
# | 3 | module.layer1.1.conv1.weight | (16, 16, 3, 3) | 2304 | 2304 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.08190 | 0.00082 | 0.04765 |
# | 4 | module.layer1.1.conv2.weight | (16, 16, 3, 3) | 2304 | 2304 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.08365 | -0.00600 | 0.05459 |
# | 5 | module.layer1.2.conv1.weight | (16, 16, 3, 3) | 2304 | 2304 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.09640 | -0.00182 | 0.06337 |
# | 6 | module.layer1.2.conv2.weight | (16, 16, 3, 3) | 2304 | 2304 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.09881 | -0.00400 | 0.07056 |
# | 7 | module.layer1.3.conv1.weight | (16, 16, 3, 3) | 2304 | 2304 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.13412 | -0.00416 | 0.08827 |
# | 8 | module.layer1.3.conv2.weight | (16, 16, 3, 3) | 2304 | 2304 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.12693 | -0.00271 | 0.09395 |
# | 9 | module.layer1.4.conv1.weight | (16, 16, 3, 3) | 2304 | 2304 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.12149 | -0.01105 | 0.09064 |
# | 10 | module.layer1.4.conv2.weight | (16, 16, 3, 3) | 2304 | 2304 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.11322 | 0.00333 | 0.08556 |
# | 11 | module.layer1.5.conv1.weight | (16, 16, 3, 3) | 2304 | 2304 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.12076 | -0.01164 | 0.09311 |
# | 12 | module.layer1.5.conv2.weight | (16, 16, 3, 3) | 2304 | 2304 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.11627 | -0.00355 | 0.08882 |
# | 13 | module.layer1.6.conv1.weight | (16, 16, 3, 3) | 2304 | 2304 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.12492 | -0.00637 | 0.09493 |
# | 14 | module.layer1.6.conv2.weight | (16, 16, 3, 3) | 2304 | 2304 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.11240 | -0.00837 | 0.08710 |
# | 15 | module.layer1.7.conv1.weight | (16, 16, 3, 3) | 2304 | 2304 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.13819 | -0.00735 | 0.10096 |
# | 16 | module.layer1.7.conv2.weight | (16, 16, 3, 3) | 2304 | 2304 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.11107 | -0.00293 | 0.08613 |
# | 17 | module.layer1.8.conv1.weight | (16, 16, 3, 3) | 2304 | 2304 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.12269 | -0.01133 | 0.09511 |
# | 18 | module.layer1.8.conv2.weight | (16, 16, 3, 3) | 2304 | 2304 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.09276 | 0.00240 | 0.07117 |
# | 19 | module.layer2.0.conv1.weight | (32, 16, 3, 3) | 4608 | 4608 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.13876 | -0.01190 | 0.11061 |
# | 20 | module.layer2.0.conv2.weight | (32, 32, 3, 3) | 9216 | 9216 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.12728 | -0.00499 | 0.10012 |
# | 21 | module.layer2.0.downsample.0.weight | (32, 16, 1, 1) | 512 | 512 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.24306 | -0.01255 | 0.19073 |
# | 22 | module.layer2.1.conv1.weight | (32, 32, 3, 3) | 9216 | 9216 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.11474 | -0.00995 | 0.09044 |
# | 23 | module.layer2.1.conv2.weight | (32, 32, 3, 3) | 9216 | 9216 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.10452 | -0.00440 | 0.08196 |
# | 24 | module.layer2.2.conv1.weight | (32, 32, 3, 3) | 9216 | 9216 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.09873 | -0.00629 | 0.07833 |
# | 25 | module.layer2.2.conv2.weight | (32, 32, 3, 3) | 9216 | 9216 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.08747 | -0.00393 | 0.06891 |
# | 26 | module.layer2.3.conv1.weight | (32, 32, 3, 3) | 9216 | 9216 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.09434 | -0.00762 | 0.07469 |
# | 27 | module.layer2.3.conv2.weight | (32, 32, 3, 3) | 9216 | 9216 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.07984 | -0.00449 | 0.06271 |
# | 28 | module.layer2.4.conv1.weight | (32, 32, 3, 3) | 9216 | 9216 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.08767 | -0.00733 | 0.06852 |
# | 29 | module.layer2.4.conv2.weight | (32, 32, 3, 3) | 9216 | 9216 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.06642 | -0.00396 | 0.05196 |
# | 30 | module.layer2.5.conv1.weight | (32, 32, 3, 3) | 9216 | 9216 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.07521 | -0.00699 | 0.05799 |
# | 31 | module.layer2.5.conv2.weight | (32, 32, 3, 3) | 9216 | 9216 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.05739 | -0.00351 | 0.04334 |
# | 32 | module.layer2.6.conv1.weight | (32, 32, 3, 3) | 9216 | 9216 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.06130 | -0.00595 | 0.04791 |
# | 33 | module.layer2.6.conv2.weight | (32, 32, 3, 3) | 9216 | 9216 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.04703 | -0.00519 | 0.03527 |
# | 34 | module.layer2.7.conv1.weight | (32, 32, 3, 3) | 9216 | 9216 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.06366 | -0.00734 | 0.04806 |
# | 35 | module.layer2.7.conv2.weight | (32, 32, 3, 3) | 9216 | 9216 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.04591 | -0.00131 | 0.03282 |
# | 36 | module.layer2.8.conv1.weight | (32, 32, 3, 3) | 9216 | 9216 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.05903 | -0.00606 | 0.04555 |
# | 37 | module.layer2.8.conv2.weight | (32, 32, 3, 3) | 9216 | 9216 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.04344 | -0.00566 | 0.03290 |
# | 38 | module.layer3.0.conv1.weight | (64, 32, 3, 3) | 18432 | 18432 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.08262 | 0.00251 | 0.06520 |
# | 39 | module.layer3.0.conv2.weight | (64, 64, 3, 3) | 36864 | 36864 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.06248 | 0.00073 | 0.04578 |
# | 40 | module.layer3.0.downsample.0.weight | (64, 32, 1, 1) | 2048 | 2048 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.12275 | 0.01139 | 0.08651 |
# | 41 | module.layer3.1.conv1.weight | (64, 64, 3, 3) | 36864 | 36864 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.03438 | -0.00186 | 0.02419 |
# | 42 | module.layer3.1.conv2.weight | (64, 64, 3, 3) | 36864 | 36864 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.03091 | -0.00368 | 0.02203 |
# | 43 | module.layer3.2.conv1.weight | (64, 64, 3, 3) | 36864 | 36864 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.03477 | -0.00226 | 0.02499 |
# | 44 | module.layer3.2.conv2.weight | (64, 64, 3, 3) | 36864 | 36864 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.03012 | -0.00350 | 0.02159 |
# | 45 | module.layer3.3.conv1.weight | (64, 64, 3, 3) | 36864 | 36864 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.03577 | -0.00166 | 0.02608 |
# | 46 | module.layer3.3.conv2.weight | (64, 64, 3, 3) | 36864 | 36864 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.02962 | -0.00124 | 0.02115 |
# | 47 | module.layer3.4.conv1.weight | (64, 64, 3, 3) | 36864 | 36864 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.03694 | -0.00285 | 0.02677 |
# | 48 | module.layer3.4.conv2.weight | (64, 64, 3, 3) | 36864 | 36864 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.02916 | -0.00165 | 0.02024 |
# | 49 | module.layer3.5.conv1.weight | (64, 64, 3, 3) | 36864 | 36864 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.03158 | -0.00180 | 0.02342 |
# | 50 | module.layer3.5.conv2.weight | (64, 64, 3, 3) | 36864 | 36864 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.02527 | -0.00177 | 0.01787 |
# | 51 | module.layer3.6.conv1.weight | (64, 64, 3, 3) | 36864 | 36864 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.03074 | -0.00169 | 0.02256 |
# | 52 | module.layer3.6.conv2.weight | (64, 64, 3, 3) | 36864 | 36864 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.02406 | -0.00006 | 0.01658 |
# | 53 | module.layer3.7.conv1.weight | (64, 64, 3, 3) | 36864 | 36864 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.03160 | -0.00249 | 0.02294 |
# | 54 | module.layer3.7.conv2.weight | (64, 64, 3, 3) | 36864 | 36864 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.02298 | -0.00083 | 0.01553 |
# | 55 | module.layer3.8.conv1.weight | (64, 64, 3, 3) | 36864 | 36864 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.02594 | -0.00219 | 0.01890 |
# | 56 | module.layer3.8.conv2.weight | (64, 64, 3, 3) | 36864 | 36864 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.01986 | -0.00061 | 0.01318 |
# | 57 | module.fc.weight | (10, 64) | 640 | 640 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.52562 | -0.00003 | 0.39168 |
# | 58 | Total sparsity: | - | 851504 | 851504 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 |
# +----+-------------------------------------+----------------+---------------+----------------+------------+------------+----------+----------+----------+------------+---------+----------+------------+
# 2018-07-02 16:36:31,555 - Total sparsity: 0.00
#
# 2018-07-02 16:36:31,555 - --- validate (epoch=179)-----------
# 2018-07-02 16:36:31,555 - 5000 samples (256 per mini-batch)
# 2018-07-02 16:36:33,121 - ==> Top1: 91.520 Top5: 99.680 Loss: 0.387
#
# 2018-07-02 16:36:33,123 - Saving checkpoint to: logs/2018.07.02-152746/checkpoint.pth.tar
# 2018-07-02 16:36:33,159 - --- test ---------------------
# 2018-07-02 16:36:33,159 - 10000 samples (256 per mini-batch)
# 2018-07-02 16:36:36,194 - ==> Top1: 92.850 Top5: 99.780 Loss: 0.364
lr_schedulers:
  training_lr:
    class: StepLR        # decay the learning rate by gamma every step_size epochs
    step_size: 45
    gamma: 0.10

policies:
  - lr_scheduler:
      instance_name: training_lr
    starting_epoch: 35   # the LR-scheduling policy is active from epoch 35 to epoch 200
    ending_epoch: 200
    frequency: 1