  1. Feb 17, 2020
    • Guy Jacob's avatar
      PyTorch PTQ convert updates/fixes + Raw activations collector · ccd11ddb
      Guy Jacob authored
      * BUGFIX: Fixed wrong attribute name for zero-point in conversion
        of eltwise add/mult and concat
      * Add PyTorch PTQ convert for embedding (converted to FP32
        embedding + quant op)
      * Fix conversion function to work with tuple/list model inputs
    • Guy Jacob's avatar
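      A hedged sketch of how the tuple/list input fix above might be exercised.
      The argument order and the dummy-input parameter are assumptions for
      illustration, and `quantized_model` stands for a model already quantized
      with PostTrainLinearQuantizer:

      ```python
      import torch
      import distiller.quantization as quantization

      # Assumed to already exist: a Distiller-quantized model whose forward()
      # takes two inputs (e.g. images plus token IDs feeding an embedding).
      dummy_input = (torch.randn(1, 3, 224, 224),
                     torch.randint(0, 1000, (1, 16)))

      # With this commit the conversion accepts tuple/list model inputs; the
      # exact parameter name/order here is an assumption, not verbatim API.
      pytorch_model = quantization.convert_distiller_ptq_model_to_pytorch(
          quantized_model, dummy_input)
      ```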
      Post-Train Quant LAPQ Refactoring (#473) · 394e3bc6
      Guy Jacob authored
      * Move image classification specific setup code to separate script at
        examples/classifier_compression/ptq_lapq.py
      * Make ptq_coordinate_search function completely independent of
        command line arguments
      * Change LAPQ command line args function to update a pre-existing
        parser (changed the CLAs prefix to 'lapq' for more clarity)
      * Enable LAPQ from compress_classifier.py (trigger with --qe-lapq)
      * Add pointers in documentation
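      A minimal sketch of the command-line wiring described above. The helper
      name `add_lapq_args` and the specific option names are made up for
      illustration; only the 'lapq' prefix and the --qe-lapq trigger come from
      the commit:

      ```python
      import argparse

      def add_lapq_args(parser):
          # Hypothetical helper mirroring the described behavior: extend a
          # pre-existing parser with 'lapq'-prefixed options.
          group = parser.add_argument_group('LAPQ post-training quantization')
          group.add_argument('--lapq-maxiter', type=int, default=None,
                             help='Max iterations of the coordinate search')
          group.add_argument('--lapq-init-mode', default='NONE',
                             help='Initial clipping mode for the search')
          return parser

      parser = argparse.ArgumentParser()
      # The image-classification sample triggers LAPQ with --qe-lapq:
      parser.add_argument('--qe-lapq', action='store_true',
                          help='Run loss-aware post-train quantization search')
      add_lapq_args(parser)
      args = parser.parse_args(['--qe-lapq', '--lapq-maxiter', '3'])
      ```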
  2. Feb 13, 2020
  3. Feb 09, 2020
  4. Feb 06, 2020
      Convert Distiller PTQ models to "native" PyTorch PTQ (#458) · cdc1775f
      Guy Jacob authored
      * New API: distiller.quantization.convert_distiller_ptq_model_to_pytorch()
      * Can also be called from PostTrainLinearQuantizer instance:
          quantizer.convert_to_pytorch()
      * Can also trigger from command line in image classification sample
      * Can save/load converted modules via apputils.load/save_checkpoint
      * Added Jupyter notebook tutorial
      
      * Converted modules have only the absolutely necessary quant-dequant
        operations. For a fully quantized model, this means just quantization
        of model input and de-quantization of model output. If a user keeps
        specific internal layers in FP32, quant-dequant operations are added
        as needed
      * Can configure either 'fbgemm' or 'qnnpack' backend. For 'fbgemm' we
        take care of preventing overflows (aka "reduce_range" in the PyTorch
        API)
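      A hedged usage sketch based on the two entry points named above. The
      `quantizer` and `quantized_model` objects are assumed to exist already,
      and the argument-free call and `backend` keyword are assumptions about
      the signatures, not verbatim Distiller API:

      ```python
      import distiller.quantization as quantization

      # From a PostTrainLinearQuantizer instance whose model is already quantized:
      pytorch_model = quantizer.convert_to_pytorch()

      # Or via the standalone API on an already-quantized Distiller model,
      # choosing the PyTorch quantization backend ('fbgemm' or 'qnnpack'):
      pytorch_model = quantization.convert_distiller_ptq_model_to_pytorch(
          quantized_model, backend='fbgemm')
      ```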
  5. Feb 03, 2020
  6. Feb 02, 2020
      Loss Aware Post Train Quantization search (#432) · 0b493fd3
      Lev Zlotnik authored
      "Loss Aware Post-Training Quantization" (Nahshan et al., 2019)
      
      Paper: https://arxiv.org/abs/1911.07190 
      Reference implementation:
        https://github.com/ynahshan/nn-quantization-pytorch/tree/master/lapq
      
      Proper documentation is still TODO, for now see the example YAML file
      at 'examples/quantization/post_train_quant/resnet18_imagenet_post_train_lapq.yaml'
      
      * Implemented in distiller/quantization/ptq_coordinate_search.py
      * At the moment that file contains both the model-independent algorithm
        implementation and an image-classification-specific sample script.
        Still TODO: refactor that
      
      * Post train quantization changes (range_linear):
        * Added getters/setters for quantization parameters (scale/zero_point)
          and clipping values
        * Add option to save backup of FP32 weights to allow re-quantization
          after quantizer was created.
        * Add option to clip weights in addition to activations
        * Fix fusions so they don't occur when activations aren't quantized
        * RangeLinearFakeQuantWrapper:
          * Make inputs quantization optional
          * In case of ReLU + ACIQ, clip according to input stats
      
      * Data loaders:
        * Add option to not load train set at all from disk (to speed up
          loading time in post-training runs)
        * Modified "image_classifier.py" accordingly
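      The file name above is from the commit; the snippet below is only a toy
      coordinate-descent sketch of the general idea (adjust one clipping value
      at a time, keep the change if the loss improves), not the actual
      implementation in ptq_coordinate_search.py:

      ```python
      def coordinate_search(eval_loss, clip_values, num_passes=3, shrink=0.9):
          """Toy illustration: greedily shrink one layer's clipping value at a
          time, keeping the change only if the evaluation loss improves."""
          best = eval_loss(clip_values)
          for _ in range(num_passes):
              for name in list(clip_values):
                  candidate = dict(clip_values, **{name: clip_values[name] * shrink})
                  loss = eval_loss(candidate)
                  if loss < best:
                      best, clip_values = loss, candidate
          return clip_values, best

      # Usage with a stand-in loss (the real search evaluates the quantized model):
      clips, loss = coordinate_search(
          lambda c: sum((v - 1.5) ** 2 for v in c.values()),
          {'conv1': 4.0, 'conv2': 6.0})
      ```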
  7. Jan 19, 2020
  8. Jan 18, 2020
  9. Jan 15, 2020
      Fix scale factor calculation in symmetric quantization (#463) · 78255ee0
      Guy Jacob authored
      (we use 8-bit values below, but this applies to any bit-width)
      * We use the notion of "full" and "restricted" quantized range for
        symmetric quantization (see section 2.2 in https://arxiv.org/abs/1806.08342)
      * "Full" quantized range ==> [-128, 127], "restircted" ==> [-127, 127]
      * Until now, when doing symmetric quantization we assumed a "full"
        range when saturating after quantization, but calculated the scale
        factor as if the range was restricted. This means we weren't making
        full utilization of the quantized range.
      * On the other hand, in some other implementations of quantization (e.g.
        TensorFlow), the "restricted" range is used.
      * So, we make it an option to use either the proper "full" range
        (q_min = -128) or "restricted" range (q_min = -127).
      * LinearQuantMode.SYMMETRIC now means the "full" range is used, and
        added LinearQuantMode.SYMMETRIC_RESTRICTED for using the "restricted"
        range.
      * Updated tests and documentation.
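      An illustrative sketch (not Distiller's actual code) of the two
      conventions for an 8-bit symmetric quantizer:

      ```python
      import torch

      def symmetric_quantize(x, num_bits=8, restricted=False):
          # 'Full' range:       q in [-128, 127], scale = 128 / max|x|
          # 'Restricted' range: q in [-127, 127], scale = 127 / max|x|
          sat_val = x.abs().max()
          n = 2 ** (num_bits - 1) - (1 if restricted else 0)
          scale = n / sat_val
          q = torch.clamp(torch.round(x * scale), -n, 2 ** (num_bits - 1) - 1)
          return q, scale

      x = torch.randn(1000)
      _, scale_full = symmetric_quantize(x)                    # SYMMETRIC
      _, scale_restr = symmetric_quantize(x, restricted=True)  # SYMMETRIC_RESTRICTED
      # scale_full > scale_restr: the full range has one extra negative level,
      # so the quantized range is utilized slightly better.
      ```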
  10. Jan 06, 2020
  11. Dec 30, 2019
  12. Dec 29, 2019
  13. Dec 26, 2019
  14. Dec 18, 2019
      IFM sparsity collector (#443) · cc50035e
      Bar authored
      Add directionality to SummaryActivationStatsCollector to allow collection of statistics on incoming and outgoing activations/feature-maps, instead of just outgoing activations.
      
      Also includes some code refactoring.
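      Not the SummaryActivationStatsCollector API itself, but a plain-PyTorch
      illustration of the direction distinction: a forward hook sees both the
      incoming (IFM) and outgoing (OFM) feature maps of a module:

      ```python
      import torch
      import torch.nn as nn

      def sparsity(t):
          return float((t == 0).sum()) / t.numel()

      stats = {}

      def hook(module, inputs, output):
          # Incoming vs. outgoing statistics for the same module
          stats[module] = {'ifm_sparsity': sparsity(inputs[0]),
                           'ofm_sparsity': sparsity(output)}

      model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.ReLU(), nn.Conv2d(8, 8, 3))
      for m in model:
          if isinstance(m, nn.Conv2d):
              m.register_forward_hook(hook)
      model(torch.randn(1, 3, 32, 32))
      ```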
  15. Dec 12, 2019
  16. Dec 11, 2019
  17. Dec 09, 2019
  18. Dec 08, 2019
      Enable weights/activations-only PTQ for conv/linear modules (#439) · 952028d0
      Guy Jacob authored
      * Weights-only PTQ:
        * Allow RangeLinearQuantWrapper to accept num_bits_acts = None, in
          which case it'll act as a simple pass-through during forward
        * In RangeLinearQuantParamLayerWrapper, if bits_activations is None
          and num_bits_params > 0, perform quant and de-quant of the
          parameters instead of just quant.
      * Activations-only PTQ:
        * Enable activations-only quantization for conv/linear modules. When
          PostTrainLinearQuantizer detects # bits != None for activations
          and # bits == None for weights, a fake-quantization wrapper will
          be used.
      * Allow passing 0 in the `--qe-bits-acts` and `--qe-bits-wts` command
        line arguments to invoke weights/activations-only quantization,
        respectively.
      * Minor refactoring for clarity in PostTrainLinearQuantizer's replace_*
        functions
    • Guy Jacob's avatar
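      A hedged sketch of the weights-only behavior described above (quantize,
      then immediately de-quantize the parameters, so downstream computation
      stays in FP32). Asymmetric quantization is used here purely for
      illustration; this is not Distiller's exact code:

      ```python
      import torch

      def fake_quant_weights(w, num_bits=8):
          # Quant followed by de-quant: returns FP32 weights carrying the
          # quantization error, as described for the weights-only case.
          w_min, w_max = w.min(), w.max()
          scale = (2 ** num_bits - 1) / (w_max - w_min)
          zero_point = torch.round(-w_min * scale)
          q = torch.clamp(torch.round(w * scale + zero_point), 0, 2 ** num_bits - 1)
          return (q - zero_point) / scale

      w = torch.randn(64, 32)
      w_fq = fake_quant_weights(w)
      print((w - w_fq).abs().max())  # small quantization error only
      ```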
      Update PTQ ResNet-50 command line results · 326d172f
      Guy Jacob authored
      Results changed following commit 9e7ef987 (#402)
  19. Dec 03, 2019
  20. Dec 02, 2019
  21. Nov 28, 2019
  22. Nov 27, 2019
  23. Nov 25, 2019
      Resnet50 early-exit update · 8b341593
      Neta Zmora authored
      Update the definition of the exits using info from Haim.
      
      This is still very unsatisfactory because we don't have working
      examples to show users :-(
  24. Nov 17, 2019
  25. Nov 16, 2019