  1. Apr 20, 2020
    • Add example code showing schedule specification using code. · 5b01a40c
      Neta Zmora authored
      This script shows how to specify a compression schedule directly
      using Distiller's API instead of through a YAML specification;
      a sketch of the idea follows below.
      
      examples/scheduling_api/direct_api_pruning.py
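For orientation, here is a minimal sketch of what such direct scheduling looks like. The class names (CompressionScheduler, AutomatedGradualPruner, PruningPolicy) are Distiller's, but the exact arguments are assumptions and may differ from the referenced script; model stands for a PyTorch model defined elsewhere.

```python
# Hedged sketch: build a pruning schedule in code instead of YAML.
# Exact arguments are assumptions; the referenced script is authoritative.
import distiller
from distiller.pruning import AutomatedGradualPruner
from distiller.policy import PruningPolicy

scheduler = distiller.CompressionScheduler(model)

# AGP pruner that ramps one layer's sparsity from 5% to 85%
pruner = AutomatedGradualPruner(name="agp_conv1",
                                initial_sparsity=0.05,
                                final_sparsity=0.85,
                                weights=["module.conv1.weight"])

# Apply the pruner once per epoch between epochs 0 and 30
scheduler.add_policy(PruningPolicy(pruner, pruner_args=None),
                     starting_epoch=0, ending_epoch=30, frequency=1)
```

The training loop then invokes the scheduler's hooks (on_epoch_begin, on_minibatch_begin, and so on) exactly as it would with a YAML-built schedule.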
    • small tensor masking API refactoring (#499) · 68514d17
      Neta Zmora authored
      Added masking primitives:
        - mask_tensor
        - create_mask_threshold_criterion
        - create_mask_level_criterion
        - create_mask_sensitivity_criterion
      
       These APIs have clearer names and communicate their
       responsibility better: create a tensor mask, based on
       some criterion.  Previously,
       distiller.pruning.create_mask_threshold_criterion was
       named distiller.threshold_mask, which did not communicate
       well what the function did.
       Masking functionality is no longer hidden
       inside the Pruner instances, so it can be used directly
       by an application or composed into new Pruner classes
       (see the sketch below).
      
      Removed file distiller.pruning.pruner:
        - The base class _ParameterPruner is useless and adds
        needless detail to the implementation.
      
      AGP: Separated the pruning-rate schedule from the
       rest of the logic.  This allows us to mix-and-match different
       pruning-rate schedules (just like LR schedulers).
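To illustrate, a short sketch of using these primitives directly, outside any Pruner. The function names are from this commit; the exact signatures (threshold, desired_sparsity) are assumptions.

```python
# Hedged sketch of the masking primitives named in the commit above.
import torch
from distiller.pruning import (mask_tensor,
                               create_mask_threshold_criterion,
                               create_mask_level_criterion)

weights = torch.randn(64, 32)

# Mask all elements whose magnitude falls below an absolute threshold
mask = create_mask_threshold_criterion(weights, threshold=0.5)

# ...or mask the smallest-magnitude elements to reach 60% sparsity
mask = create_mask_level_criterion(weights, desired_sparsity=0.6)

# Apply the mask to the tensor
mask_tensor(weights, mask)
```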
  2. Feb 17, 2020
    • PyTorch PTQ convert updates/fixes + Raw activations collector · ccd11ddb
      Guy Jacob authored
      * BUGFIX: Fixed wrong attribute name for zero-point in conversion
        of eltwise add/mult and concat
      * Add PyTorch PTQ convert for embedding (converted to FP32
        embedding + quant op)
      * Fix conversion function to work with tuple/list model inputs
    • Post-Train Quant LAPQ Refactoring (#473) · 394e3bc6
      Guy Jacob authored
       * Move image-classification-specific setup code to a separate script
         at examples/classifier_compression/ptq_lapq.py
      * Make ptq_coordinate_search function completely independent of
        command line arguments
       * Change LAPQ command-line args function to update a
         pre-existing parser (changed the CLA prefix to 'lapq' for more
         clarity)
       * Enable LAPQ from compress_classifier.py (trigger with --qe-lapq);
         a usage sketch follows below
      * Add pointers in documentation
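A hypothetical sketch of what the decoupled call might look like. The function ptq_coordinate_search and its module path are from the commit, but the argument names, and the helpers evaluate_loss, calib_loader, dummy_input, and model, are illustrative assumptions.

```python
# Hedged sketch: calling the refactored ptq_coordinate_search
# programmatically, now that it no longer depends on command-line args.
# Argument names are assumptions, not the verified signature.
from distiller.quantization import PostTrainLinearQuantizer
from distiller.quantization.ptq_coordinate_search import ptq_coordinate_search

# 'model', 'dummy_input', 'calib_loader', 'evaluate_loss' are user-supplied
quantizer = PostTrainLinearQuantizer(model)

def eval_fn(model):
    # Task loss on a small calibration set; LAPQ minimizes this value
    return evaluate_loss(model, calib_loader)

ptq_coordinate_search(quantizer, dummy_input, eval_fn)
```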
  3. Feb 06, 2020
    • Convert Distiller PTQ models to "native" PyTorch PTQ (#458) · cdc1775f
      Guy Jacob authored
      * New API: distiller.quantization.convert_distiller_ptq_model_to_pytorch()
      * Can also be called from PostTrainLinearQuantizer instance:
          quantizer.convert_to_pytorch()
       * Can also be triggered from the command line in the image
         classification sample (a conversion sketch follows below)
      * Can save/load converted modules via apputils.load/save_checkpoint
      * Added Jupyter notebook tutorial
      
      * Converted modules have only the absolutely necessary quant-dequant
        operations. For a fully quantized model, this means just quantization
        of model input and de-quantization of model output. If a user keeps
        specific internal layers in FP32, quant-dequant operations are added
        as needed
      * Can configure either 'fbgemm' or 'qnnpack' backend. For 'fbgemm' we
        take care of preventing overflows (aka "reduce_range" in the PyTorch
        API)
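A sketch of the conversion flow, assuming the API names given in the commit; the exact signatures (e.g. whether dummy_input and backend are passed this way) are assumptions.

```python
# Hedged sketch of converting a Distiller PTQ model to native PyTorch PTQ.
import distiller.quantization as dq

# 'model' (a trained FP32 model) and 'dummy_input' are defined elsewhere
quantizer = dq.PostTrainLinearQuantizer(model)
quantizer.prepare_model(dummy_input)   # Distiller PTQ as usual
# ... collect stats / calibrate / evaluate ...

# Option 1: convert via the quantizer instance
pytorch_model = quantizer.convert_to_pytorch(dummy_input)

# Option 2: the module-level API named in the commit; the 'backend'
# keyword ('fbgemm' or 'qnnpack') is assumed from the commit text
pytorch_model = dq.convert_distiller_ptq_model_to_pytorch(quantizer.model,
                                                          dummy_input,
                                                          backend='fbgemm')
```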
  4. Feb 02, 2020
    • Loss Aware Post Train Quantization search (#432) · 0b493fd3
      Lev Zlotnik authored
      "Loss Aware Post-Training Quantization" (Nahshan et al., 2019)
      
      Paper: https://arxiv.org/abs/1911.07190 
      Reference implementation:
        https://github.com/ynahshan/nn-quantization-pytorch/tree/master/lapq
      
      Proper documentation is still TODO, for now see the example YAML file
      at 'examples/quantization/post_train_quant/resnet18_imagenet_post_train_lapq.yaml'
      
      * Implemented in distiller/quantization/ptq_coordinate_search.py
       * At the moment that file contains both the model-independent
         algorithm implementation and an image-classification-specific
         sample script. Still TODO: refactor that. (A sketch of the
         underlying coordinate-search idea follows below.)
      
      * Post train quantization changes (range_linear):
        * Added getters/setters for quantization parameters (scale/zero_point)
          and clipping values
         * Add option to save a backup of the FP32 weights to allow
           re-quantization after the quantizer was created.
        * Add option to clip weights in addition to activations
         * Fix fusions so that they do not occur when activations aren't
           quantized
         * RangeLinearFakeQuantWrapper:
           * Make input quantization optional
           * In case of ReLU + ACIQ, clip according to input stats
      
      * Data loaders:
         * Add option to skip loading the train set from disk entirely
           (to speed up loading time in post-training runs)
        * Modified "image_classifier.py" accordingly
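For intuition, here is an illustrative sketch of the paper's core idea: search per-tensor clipping values to minimize a loss, rather than fixing them analytically. This is not Distiller's implementation; all names are illustrative, and a simple L2 proxy stands in for the network's task loss.

```python
# Illustrative sketch of the loss-aware search idea from Nahshan et al.
# (2019). NOT Distiller's implementation; a proxy loss replaces the
# task loss evaluated on calibration data.
import numpy as np
from scipy.optimize import minimize

def fake_quant(x, clip_val, num_bits=8):
    # Symmetric fake-quantization, clipping at +/- clip_val
    clip_val = max(abs(clip_val), 1e-6)  # guard: optimizer is unconstrained
    q_max = 2 ** (num_bits - 1) - 1
    scale = q_max / clip_val
    return np.clip(np.round(x * scale), -q_max, q_max) / scale

def proxy_loss(clip_vals, tensors):
    # Proxy objective: L2 distance to the FP32 tensors. The real method
    # minimizes the network's task loss instead.
    return sum(np.linalg.norm(t - fake_quant(t, c))
               for t, c in zip(tensors, clip_vals))

tensors = [np.random.randn(256) for _ in range(3)]
init = np.array([3.0, 3.0, 3.0])  # e.g. from an Lp-norm initialization
result = minimize(proxy_loss, init, args=(tensors,), method='Powell')
print(result.x)  # loss-aware clipping values
```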
  5. Jan 15, 2020
    • Fix scale factor calculation in symmetric quantization (#463) · 78255ee0
      Guy Jacob authored
      (we use 8-bit values below, but this applies to any bit-width)
      * We use the notion of "full" and "restricted" quantized range for
        symmetric quantization (see section 2.2 in https://arxiv.org/abs/1806.08342)
      * "Full" quantized range ==> [-128, 127], "restircted" ==> [-127, 127]
      * Until now, when doing symmetric quantization we assumed a "full"
        range when saturating after quantization, but calculated the scale
        factor as if the range was restricted. This means we weren't making
        full utilization of the quantized range.
      * On the other hand, in some other implementations of quantization (e.g.
        TensorFlow), the "restricted" range is used.
      * So, we make it an option to use either the proper "full" range
        (q_min = -128) or "restricted" range (q_min = -127).
       * LinearQuantMode.SYMMETRIC now means the "full" range is used, and
         LinearQuantMode.SYMMETRIC_RESTRICTED was added for the "restricted"
         range. (A numeric illustration of the two options follows below.)
      * Updated tests and documentation.
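A numeric illustration of the two options, following the description above (8 bits, saturation value sat_val):

```python
# Scale factors for symmetric quantization: "full" vs. "restricted" range.
num_bits = 8
sat_val = 6.0  # example: absolute-max of the tensor being quantized

# LinearQuantMode.SYMMETRIC ("full" range, q_min = -128):
scale_full = 2 ** (num_bits - 1) / sat_val              # 128 / 6.0 ~= 21.33

# LinearQuantMode.SYMMETRIC_RESTRICTED (q_min = -127):
scale_restricted = (2 ** (num_bits - 1) - 1) / sat_val  # 127 / 6.0 ~= 21.17

# The pre-fix behavior mixed the two: it saturated to the full range
# [-128, 127] but computed the scale as 127 / sat_val, leaving part of
# the quantized range unused.
```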