Commit 0c175c94 authored by Neta Zmora

Add README.md files for some pruning examples

parent 70e26735
## Hybrid-Pruning Schedules
The examples in this directory show hybrid pruning schedules in which we combine several different pruning strategies.
1. [alexnet.schedule_agp_2Dreg.yaml](https://github.com/NervanaSystems/distiller/blob/master/examples/hybrid/alexnet.schedule_agp_2Dreg.yaml)
<br>
This example presents a pruning schedule that performs element-wise (fine-grained) pruning,
combined with 2D group (kernel) regularization. The regularization "pushes" 2D kernels towards zero, while
the pruning attends to individual weight coefficients. The pruning schedule is driven by AGP (see the first sketch below the results table).
2. [alexnet.schedule_sensitivity_2D-reg.yaml](https://github.com/NervanaSystems/distiller/blob/master/examples/hybrid/alexnet.schedule_sensitivity_2D-reg.yaml)
<br>
This example also presents a pruning schedule that performs element-wise (fine-grained) pruning,
combined with 2D group (kernel) regularization. However, the pruner is a `distiller.pruning.SensitivityPruner`, which is
driven by the tensors' [sensitivity](https://nervanasystems.github.io/distiller/algo_pruning.html#sensitivity-pruner) instead of AGP (see the second sketch below).
| Experiment | Model | Sparsity (%) | Top1 | Baseline Top1 |
| :---: | --- | :---: | ---: | ---: |
| 1 | AlexNet | 88.31 | 56.40 | 56.55 |
| 2 | AlexNet | 88.31 | 56.24 | 56.55 |
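
To make the structure of these hybrid schedules concrete, below is a minimal sketch of how an element-wise AGP pruner and a 2D group-Lasso regularizer can be combined in a single Distiller schedule. The layer name, sparsity targets, regularization strength, and epoch ranges are illustrative placeholders, not values taken from the actual example files:

```yaml
version: 1

pruners:
  fine_pruner:
    class: AutomatedGradualPruner        # AGP: sparsity grows gradually from initial to final
    initial_sparsity: 0.05               # placeholder value
    final_sparsity: 0.85                 # placeholder value
    weights: [module.features.0.weight]  # illustrative layer name

regularizers:
  kernel_regularizer:
    class: GroupLassoRegularizer
    reg_regims:
      # weight tensor: [regularization strength, group structure]
      module.features.0.weight: [0.00002, '2D']

policies:
  # Element-wise (fine-grained) pruning, driven by AGP
  - pruner:
      instance_name: fine_pruner
    starting_epoch: 0
    ending_epoch: 30
    frequency: 2

  # Group-Lasso regularization that pushes whole 2D kernels towards zero
  - regularizer:
      instance_name: kernel_regularizer
    starting_epoch: 0
    ending_epoch: 30
    frequency: 1
```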
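
The sensitivity-driven variant (experiment 2) differs only in the pruner definition: each tensor's pruning threshold is set as a multiple of the standard deviation of its weights. Again a sketch with placeholder values:

```yaml
pruners:
  fine_pruner:
    class: SensitivityPruner
    sensitivities:
      # prune weights whose magnitude is below sensitivity * std(weights)
      module.features.0.weight: 0.25    # placeholder sensitivity value
```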
examples/hybrid/alexnet.schedule_sensitivity_2D-reg.yaml:

@@ -3,7 +3,7 @@
 # with 2D structure regularization for the Convolution weights.
 #
 # time python3 compress_classifier.py -a alexnet --lr 0.005 -p 50 ../../../data.imagenet -j 24 --epochs 90 --pretrained --compress=../hybrid/alexnet.schedule_sensitivity_2D-reg.yaml
-# time python3 compress_classifier.py -a alexnet --lr 0.005 -p 50 ../../../data.imagenet -j 24 --epochs 90 --pretrained --compress=../hybrid/alexnet.schedule_sensitivity_2D-reg.yaml
+#
 # Parameters:
 #
 # +----+---------------------------+------------------+---------------+----------------+------------+------------+----------+----------+----------+------------+---------+----------+------------+
...
## Network Surgery Pruning
### Examples
| Model | Granularity | Sparsity (%) | Top1 | Baseline Top1 |
| --- | :--- | ---: | ---: | ---: |
| ResNet-50 | Fine | 80.0 | 75.49 | 76.15 |
| ResNet-50 | Fine | 82.6 | 75.52 | 76.15 |
| ResNet-20 | Fine | 69.1 | 91.43 | 91.78 |
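
These schedules use Distiller's `SplicingPruner`, which implements dynamic network surgery: masked (pruned) weights keep receiving gradient updates and are "spliced" back into the network if their magnitude recovers. The sketch below shows the general shape of such a pruner definition; the threshold multipliers, layer name, and sensitivity are illustrative placeholders, not values from the actual schedules:

```yaml
pruners:
  surgery_pruner:
    class: SplicingPruner
    low_thresh_mult: 0.9      # weights below the low threshold are masked
    hi_thresh_mult: 1.1       # masked weights above the high threshold are spliced back in
    sensitivity_multiplier: 0.015
    sensitivities:
      module.layer1.0.conv1.weight: 0.2   # placeholder layer name and sensitivity
```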
## Network Trimming Pruning
### Examples
In these example schedules, after pruning the filters we remove them entirely ("thinning") and continue fine-tuning. Because the pruned filters are physically removed, the remaining weight tensors are dense again, which is why the sparsity column below reads 0.0; the savings show up as fewer parameters and less compute. A sketch of such a schedule follows the table.
| Model | Granularity | Sparsity (%) | Parameters Kept (%) | Compute Kept (%) | Top1 | Baseline Top1 |
| --- | :--- | ---: | ---: | ---: | ---: | ---: |
| ResNet-50 | Filters | 0.0 | 43.37 | 44.56 | 73.93 | 76.15 |
| ResNet-56 | Filters | 0.0 | 74.53 | 62.71 | 93.03 | 92.85 |
| ResNet-56 | Filters | 0.0 | 67.02 | 53.92 | 92.59 | 92.85 |
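
Below is a hedged sketch of how filter pruning followed by thinning can be expressed in a Distiller schedule. The pruner class shown (APoZ activation ranking) corresponds to the "activation_apoz" schedules in the diffs that follow, but the layer name, filter fraction, and epoch numbers are placeholders:

```yaml
pruners:
  filter_pruner:
    class: ActivationAPoZRankedFilterPruner  # ranks filters by Average Percentage of Zero activations
    group_type: Filters
    desired_sparsity: 0.4                    # placeholder: prune 40% of the filters
    weights: [module.layer1.0.conv1.weight]  # illustrative layer name

extensions:
  net_thinner:
    class: FilterRemover          # physically removes the pruned filters ("thinning")
    thinning_func_str: remove_filters
    arch: resnet56_cifar
    dataset: cifar10

policies:
  - pruner:
      instance_name: filter_pruner
    epochs: [0]                   # prune once, at the first epoch

  - extension:
      instance_name: net_thinner
    epochs: [0]                   # then thin the network and fine-tune for the remaining epochs
```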
examples/network_trimming/resnet56_cifar_activation_apoz.yaml:

@@ -8,10 +8,11 @@
 # Baseline results:
 # Top1: 92.850 Top5: 99.780 Loss: 0.364
 # Total MACs: 125,747,840
-#
+# Total parameters: 851504
 # Results:
 # Top1: 93.030 Top5: 99.650 Loss: 1.533
 # Total MACs: 78,856,832
+# Total parameters: 634640 (74.53%)
 #
 #
 # time python3 compress_classifier.py -a=resnet56_cifar -p=50 ../../../data.cifar10 --epochs=70 --lr=0.1 --compress=../network_trimming/resnet56_cifar_activation_apoz.yaml --resume-from=checkpoint.resnet56_cifar_baseline.pth.tar --reset-optimizer --act-stats=valid
...
examples/network_trimming/resnet56_cifar_activation_apoz_v2.yaml:

@@ -3,16 +3,16 @@
 # Compare this to examples/pruning_filters_for_efficient_convnets/resnet56_cifar_filter_rank.yaml - the pruning time is
 # much longer due to the callbacks required for collecting the activation statistics (this can be improved by disabling
 # the detailed records collection, for example).
-# This provides 62.7% compute compression (x1.6) while increasing the Top1.
+# This provides 53.92% compute compression (x1.85).
 #
 # Baseline results:
 # Top1: 92.850 Top5: 99.780 Loss: 0.364
 # Total MACs: 125,747,840
-#
+# Total parameters: 851504
 # Results:
 # Top1: 92.590 Top5: 99.630 Loss: 1.537
 # Total MACs: 67,797,632
-#
+# Total parameters: 570704 (67.02%)
 #
 # time python3 compress_classifier.py -a=resnet56_cifar -p=50 ../../../data.cifar10 --epochs=70 --lr=0.1 --compress=../network_trimming/resnet56_cifar_activation_apoz_v2.yaml --resume-from=checkpoint.resnet56_cifar_baseline.pth.tar --reset-optimizer --act-stats=valid
 #
...
# We used this schedule to train CIFAR10-ResNet56 from scratch
#
# time python3 compress_classifier.py --arch resnet56_cifar ../../../data.cifar10 -p=50 --lr=0.3 --epochs=180 --compress=../pruning_filters_for_efficient_convnets/resnet56_cifar_baseline_training.yaml -j=1 --deterministic
#
# Target: 6.96% error, as reported in "Pruning Filters for Efficient ConvNets"
#
# Parameters:
# +----+-------------------------------------+----------------+---------------+----------------+------------+------------+----------+----------+----------+------------+---------+----------+------------+
# | | Name | Shape | NNZ (dense) | NNZ (sparse) | Cols (%) | Rows (%) | Ch (%) | 2D (%) | 3D (%) | Fine (%) | Std | Mean | Abs-Mean |
# |----+-------------------------------------+----------------+---------------+----------------+------------+------------+----------+----------+----------+------------+---------+----------+------------|
# | 0 | module.conv1.weight | (16, 3, 3, 3) | 432 | 432 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.39191 | 0.00826 | 0.18757 |
# | 1 | module.layer1.0.conv1.weight | (16, 16, 3, 3) | 2304 | 2304 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.08334 | -0.00180 | 0.03892 |
# | 2 | module.layer1.0.conv2.weight | (16, 16, 3, 3) | 2304 | 2304 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.08565 | -0.00033 | 0.05106 |
# | 3 | module.layer1.1.conv1.weight | (16, 16, 3, 3) | 2304 | 2304 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.08190 | 0.00082 | 0.04765 |
# | 4 | module.layer1.1.conv2.weight | (16, 16, 3, 3) | 2304 | 2304 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.08365 | -0.00600 | 0.05459 |
# | 5 | module.layer1.2.conv1.weight | (16, 16, 3, 3) | 2304 | 2304 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.09640 | -0.00182 | 0.06337 |
# | 6 | module.layer1.2.conv2.weight | (16, 16, 3, 3) | 2304 | 2304 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.09881 | -0.00400 | 0.07056 |
# | 7 | module.layer1.3.conv1.weight | (16, 16, 3, 3) | 2304 | 2304 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.13412 | -0.00416 | 0.08827 |
# | 8 | module.layer1.3.conv2.weight | (16, 16, 3, 3) | 2304 | 2304 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.12693 | -0.00271 | 0.09395 |
# | 9 | module.layer1.4.conv1.weight | (16, 16, 3, 3) | 2304 | 2304 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.12149 | -0.01105 | 0.09064 |
# | 10 | module.layer1.4.conv2.weight | (16, 16, 3, 3) | 2304 | 2304 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.11322 | 0.00333 | 0.08556 |
# | 11 | module.layer1.5.conv1.weight | (16, 16, 3, 3) | 2304 | 2304 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.12076 | -0.01164 | 0.09311 |
# | 12 | module.layer1.5.conv2.weight | (16, 16, 3, 3) | 2304 | 2304 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.11627 | -0.00355 | 0.08882 |
# | 13 | module.layer1.6.conv1.weight | (16, 16, 3, 3) | 2304 | 2304 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.12492 | -0.00637 | 0.09493 |
# | 14 | module.layer1.6.conv2.weight | (16, 16, 3, 3) | 2304 | 2304 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.11240 | -0.00837 | 0.08710 |
# | 15 | module.layer1.7.conv1.weight | (16, 16, 3, 3) | 2304 | 2304 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.13819 | -0.00735 | 0.10096 |
# | 16 | module.layer1.7.conv2.weight | (16, 16, 3, 3) | 2304 | 2304 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.11107 | -0.00293 | 0.08613 |
# | 17 | module.layer1.8.conv1.weight | (16, 16, 3, 3) | 2304 | 2304 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.12269 | -0.01133 | 0.09511 |
# | 18 | module.layer1.8.conv2.weight | (16, 16, 3, 3) | 2304 | 2304 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.09276 | 0.00240 | 0.07117 |
# | 19 | module.layer2.0.conv1.weight | (32, 16, 3, 3) | 4608 | 4608 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.13876 | -0.01190 | 0.11061 |
# | 20 | module.layer2.0.conv2.weight | (32, 32, 3, 3) | 9216 | 9216 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.12728 | -0.00499 | 0.10012 |
# | 21 | module.layer2.0.downsample.0.weight | (32, 16, 1, 1) | 512 | 512 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.24306 | -0.01255 | 0.19073 |
# | 22 | module.layer2.1.conv1.weight | (32, 32, 3, 3) | 9216 | 9216 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.11474 | -0.00995 | 0.09044 |
# | 23 | module.layer2.1.conv2.weight | (32, 32, 3, 3) | 9216 | 9216 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.10452 | -0.00440 | 0.08196 |
# | 24 | module.layer2.2.conv1.weight | (32, 32, 3, 3) | 9216 | 9216 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.09873 | -0.00629 | 0.07833 |
# | 25 | module.layer2.2.conv2.weight | (32, 32, 3, 3) | 9216 | 9216 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.08747 | -0.00393 | 0.06891 |
# | 26 | module.layer2.3.conv1.weight | (32, 32, 3, 3) | 9216 | 9216 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.09434 | -0.00762 | 0.07469 |
# | 27 | module.layer2.3.conv2.weight | (32, 32, 3, 3) | 9216 | 9216 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.07984 | -0.00449 | 0.06271 |
# | 28 | module.layer2.4.conv1.weight | (32, 32, 3, 3) | 9216 | 9216 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.08767 | -0.00733 | 0.06852 |
# | 29 | module.layer2.4.conv2.weight | (32, 32, 3, 3) | 9216 | 9216 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.06642 | -0.00396 | 0.05196 |
# | 30 | module.layer2.5.conv1.weight | (32, 32, 3, 3) | 9216 | 9216 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.07521 | -0.00699 | 0.05799 |
# | 31 | module.layer2.5.conv2.weight | (32, 32, 3, 3) | 9216 | 9216 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.05739 | -0.00351 | 0.04334 |
# | 32 | module.layer2.6.conv1.weight | (32, 32, 3, 3) | 9216 | 9216 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.06130 | -0.00595 | 0.04791 |
# | 33 | module.layer2.6.conv2.weight | (32, 32, 3, 3) | 9216 | 9216 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.04703 | -0.00519 | 0.03527 |
# | 34 | module.layer2.7.conv1.weight | (32, 32, 3, 3) | 9216 | 9216 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.06366 | -0.00734 | 0.04806 |
# | 35 | module.layer2.7.conv2.weight | (32, 32, 3, 3) | 9216 | 9216 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.04591 | -0.00131 | 0.03282 |
# | 36 | module.layer2.8.conv1.weight | (32, 32, 3, 3) | 9216 | 9216 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.05903 | -0.00606 | 0.04555 |
# | 37 | module.layer2.8.conv2.weight | (32, 32, 3, 3) | 9216 | 9216 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.04344 | -0.00566 | 0.03290 |
# | 38 | module.layer3.0.conv1.weight | (64, 32, 3, 3) | 18432 | 18432 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.08262 | 0.00251 | 0.06520 |
# | 39 | module.layer3.0.conv2.weight | (64, 64, 3, 3) | 36864 | 36864 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.06248 | 0.00073 | 0.04578 |
# | 40 | module.layer3.0.downsample.0.weight | (64, 32, 1, 1) | 2048 | 2048 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.12275 | 0.01139 | 0.08651 |
# | 41 | module.layer3.1.conv1.weight | (64, 64, 3, 3) | 36864 | 36864 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.03438 | -0.00186 | 0.02419 |
# | 42 | module.layer3.1.conv2.weight | (64, 64, 3, 3) | 36864 | 36864 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.03091 | -0.00368 | 0.02203 |
# | 43 | module.layer3.2.conv1.weight | (64, 64, 3, 3) | 36864 | 36864 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.03477 | -0.00226 | 0.02499 |
# | 44 | module.layer3.2.conv2.weight | (64, 64, 3, 3) | 36864 | 36864 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.03012 | -0.00350 | 0.02159 |
# | 45 | module.layer3.3.conv1.weight | (64, 64, 3, 3) | 36864 | 36864 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.03577 | -0.00166 | 0.02608 |
# | 46 | module.layer3.3.conv2.weight | (64, 64, 3, 3) | 36864 | 36864 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.02962 | -0.00124 | 0.02115 |
# | 47 | module.layer3.4.conv1.weight | (64, 64, 3, 3) | 36864 | 36864 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.03694 | -0.00285 | 0.02677 |
# | 48 | module.layer3.4.conv2.weight | (64, 64, 3, 3) | 36864 | 36864 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.02916 | -0.00165 | 0.02024 |
# | 49 | module.layer3.5.conv1.weight | (64, 64, 3, 3) | 36864 | 36864 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.03158 | -0.00180 | 0.02342 |
# | 50 | module.layer3.5.conv2.weight | (64, 64, 3, 3) | 36864 | 36864 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.02527 | -0.00177 | 0.01787 |
# | 51 | module.layer3.6.conv1.weight | (64, 64, 3, 3) | 36864 | 36864 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.03074 | -0.00169 | 0.02256 |
# | 52 | module.layer3.6.conv2.weight | (64, 64, 3, 3) | 36864 | 36864 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.02406 | -0.00006 | 0.01658 |
# | 53 | module.layer3.7.conv1.weight | (64, 64, 3, 3) | 36864 | 36864 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.03160 | -0.00249 | 0.02294 |
# | 54 | module.layer3.7.conv2.weight | (64, 64, 3, 3) | 36864 | 36864 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.02298 | -0.00083 | 0.01553 |
# | 55 | module.layer3.8.conv1.weight | (64, 64, 3, 3) | 36864 | 36864 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.02594 | -0.00219 | 0.01890 |
# | 56 | module.layer3.8.conv2.weight | (64, 64, 3, 3) | 36864 | 36864 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.01986 | -0.00061 | 0.01318 |
# | 57 | module.fc.weight | (10, 64) | 640 | 640 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.52562 | -0.00003 | 0.39168 |
# | 58 | Total sparsity: | - | 851504 | 851504 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 | 0.00000 |
# +----+-------------------------------------+----------------+---------------+----------------+------------+------------+----------+----------+----------+------------+---------+----------+------------+
# 2018-07-02 16:36:31,555 - Total sparsity: 0.00
#
# 2018-07-02 16:36:31,555 - --- validate (epoch=179)-----------
# 2018-07-02 16:36:31,555 - 5000 samples (256 per mini-batch)
# 2018-07-02 16:36:33,121 - ==> Top1: 91.520 Top5: 99.680 Loss: 0.387
#
# 2018-07-02 16:36:33,123 - Saving checkpoint to: logs/2018.07.02-152746/checkpoint.pth.tar
# 2018-07-02 16:36:33,159 - --- test ---------------------
# 2018-07-02 16:36:33,159 - 10000 samples (256 per mini-batch)
# 2018-07-02 16:36:36,194 - ==> Top1: 92.850 Top5: 99.780 Loss: 0.364
lr_schedulers:
  training_lr:
    class: StepLR        # decay the learning rate by gamma every step_size epochs
    step_size: 45
    gamma: 0.10

policies:
  - lr_scheduler:
      instance_name: training_lr
    starting_epoch: 35   # the LR-scheduling policy is active from epoch 35 to epoch 200
    ending_epoch: 200
    frequency: 1