- Feb 23, 2020
levzlotnik authored
- Feb 09, 2020
levzlotnik authored
- Feb 06, 2020
Guy Jacob authored
Convert Distiller PTQ models to "native" PyTorch PTQ (#458)
* New API: distiller.quantization.convert_distiller_ptq_model_to_pytorch()
* Can also be called from a PostTrainLinearQuantizer instance: quantizer.convert_to_pytorch()
* Can also be triggered from the command line in the image classification sample
* Can save/load converted modules via apputils.load/save_checkpoint
* Added a Jupyter notebook tutorial
* Converted modules contain only the absolutely necessary quant-dequant operations. For a fully quantized model, this means just quantization of the model input and de-quantization of the model output. If a user keeps specific internal layers in FP32, quant-dequant operations are added as needed.
* Can configure either the 'fbgemm' or 'qnnpack' backend. For 'fbgemm' we take care of preventing overflows (aka "reduce_range" in the PyTorch API).
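A minimal usage sketch of the conversion flow described in this commit. The exact constructor arguments, the need for a dummy input, and the model-creation helper are assumptions based on the Distiller post-training quantization API and may differ from the actual signatures.

```python
import torch
from distiller.models import create_model
from distiller.quantization import PostTrainLinearQuantizer

# Assumed setup: a pretrained FP32 ImageNet classifier and a matching dummy input.
model = create_model(pretrained=True, dataset='imagenet', arch='resnet18')
dummy_input = torch.randn(1, 3, 224, 224)

# Prepare a Distiller post-training quantizer (stats collection / config omitted).
quantizer = PostTrainLinearQuantizer(model)
quantizer.prepare_model(dummy_input)

# Convert to a "native" PyTorch post-training quantized model.
# The commit also describes a standalone call (argument names assumed):
#   pyt_model = distiller.quantization.convert_distiller_ptq_model_to_pytorch(model, dummy_input)
pyt_model = quantizer.convert_to_pytorch(dummy_input)
```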
- Feb 03, 2020
Guy Jacob authored
- Feb 02, 2020
Lev Zlotnik authored
"Loss Aware Post-Training Quantization" (Nahshan et al., 2019) Paper: https://arxiv.org/abs/1911.07190 Reference implementation: https://github.com/ynahshan/nn-quantization-pytorch/tree/master/lapq Proper documentation is still TODO, for now see the example YAML file at 'examples/quantization/post_train_quant/resnet18_imagenet_post_train_lapq.yaml' * Implemented in distiller/quantization/ptq_coordinate_search.py * At the moment that file both the model-independent algorithm implementation and image-classification specific sample script. Still TODO: Refactor that * Post train quantization changes (range_linear): * Added getters/setters for quantization parameters (scale/zero_point) and clipping values * Add option to save backup of FP32 weights to allow re-quantization after quantizer was created. * Add option to clip weights in addition to activations * Fix fusions to not occur only when activations aren't quantized * RangeLinearFakeQuantWrapper: * Make inputs quantization optional * In case of ReLU + ACIQ, clip according to input stats * Data loaders: * Add option to not load train set at all from disk (to speed up loading time in post-training runs) * Modified "image_classifier.py" accordingly
- Jan 19, 2020
Neta Zmora authored
Temp patch until moving to torchvision 0.5. See https://github.com/pytorch/vision/issues/1712#issuecomment-575036523
- Jan 18, 2020
Neta Zmora authored
Fix the formatting of the ValueError raised when a module is missing an attribute while collecting activation statistics.
- Jan 15, 2020
Guy Jacob authored
(We use 8-bit values below, but this applies to any bit-width.)
* We use the notion of "full" and "restricted" quantized range for symmetric quantization (see section 2.2 in https://arxiv.org/abs/1806.08342)
* "Full" quantized range ==> [-128, 127], "restricted" ==> [-127, 127]
* Until now, when doing symmetric quantization we assumed a "full" range when saturating after quantization, but calculated the scale factor as if the range was restricted. This means we weren't making full utilization of the quantized range.
* On the other hand, some other implementations of quantization (e.g. TensorFlow) use the "restricted" range.
* So, we make it an option to use either the proper "full" range (q_min = -128) or the "restricted" range (q_min = -127).
* LinearQuantMode.SYMMETRIC now means the "full" range is used, and LinearQuantMode.SYMMETRIC_RESTRICTED was added for using the "restricted" range.
* Updated tests and documentation.
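A small numeric sketch of the full vs. restricted distinction described above. The exact scale formula Distiller uses may differ; this uses one common convention, mapping the saturation value to 2^(b-1) for the full range and to 2^(b-1) - 1 for the restricted range.

```python
import torch

def symmetric_quant_dequant(x: torch.Tensor, sat_val: float, num_bits: int = 8,
                            restricted: bool = False) -> torch.Tensor:
    # Illustrative convention only; a given library's exact scale formula may differ.
    # restricted: range [-(2^(b-1) - 1), 2^(b-1) - 1], e.g. [-127, 127] for 8 bits
    # full:       range [-2^(b-1), 2^(b-1) - 1],       e.g. [-128, 127] for 8 bits
    n = 2 ** (num_bits - 1)                      # 128 for 8 bits
    q_min = -(n - 1) if restricted else -n
    q_max = n - 1
    scale = ((n - 1) if restricted else n) / sat_val
    q = torch.clamp(torch.round(x * scale), q_min, q_max)
    return q / scale

# Example: with sat_val=1.0, restricted=True maps +/-1.0 onto +/-127, while
# restricted=False also uses the extra negative level, so -1.0 maps to -128.
```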
- Jan 06, 2020
Guy Jacob authored
* Fake quant wrapper now also works on (fake) quantized inputs
* Remove 'requires_quantized_inputs' flag
* Unrelated: Moved LinearQuantMode enum to q_utils
- Dec 30, 2019
Guy Jacob authored
In PostTrainLinearQuantizer and QuantAwareTrainRangeLinearQuantizer
- Dec 29, 2019
Neta Zmora authored
- Dec 26, 2019
Guy Jacob authored
- Dec 18, 2019
Bar authored
Add directionality to SummaryActivationStatsCollector to allow collecting statistics on incoming as well as outgoing activations/feature-maps, instead of just outgoing activations. Also includes some code refactoring.
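A conceptual sketch of collecting a statistic on either the incoming or the outgoing activations of each module, using plain PyTorch forward hooks. This only illustrates the idea of directionality; it is not the SummaryActivationStatsCollector API.

```python
import torch.nn as nn

def collect_mean_abs_stats(model: nn.Module, direction: str = "output"):
    """Record the mean absolute activation per Conv2d/Linear module.

    direction="input" collects stats on incoming activations,
    direction="output" on outgoing ones.
    """
    stats, handles = {}, []

    def make_hook(name):
        def hook(module, inputs, output):
            tensor = inputs[0] if direction == "input" else output
            stats[name] = tensor.detach().abs().mean().item()
        return hook

    for name, module in model.named_modules():
        if isinstance(module, (nn.Conv2d, nn.Linear)):
            handles.append(module.register_forward_hook(make_hook(name)))
    return stats, handles

# Usage: attach the hooks, run a forward pass over some data, read `stats`,
# then call handle.remove() on each returned handle.
```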
- Dec 12, 2019
Guy Jacob authored
- Dec 11, 2019
- Dec 09, 2019
Guy Jacob authored
Guy Jacob authored
Guy Jacob authored
* Make it easier to find sample apps for different workload types
* Add READMEs for sample apps that didn't have any
* Update READMEs with experiment results where applicable
Lev Zlotnik authored
Added tables of results for 85% sparsity
- Dec 08, 2019
Guy Jacob authored
* Weights-only PTQ:
  * Allow RangeLinearQuantWrapper to accept num_bits_acts = None, in which case it acts as a simple pass-through during forward
  * In RangeLinearQuantParamLayerWrapper, if bits_activations is None and num_bits_params > 0, perform quant and de-quant of the parameters instead of just quant
* Activations-only PTQ:
  * Enable activations-only quantization for conv/linear modules. When PostTrainLinearQuantizer detects # bits != None for activations and # bits == None for weights, a fake-quantization wrapper is used.
* Allow passing 0 in the `--qe-bits-acts` and `--qe-bits-wts` command line arguments to invoke weights-only / activations-only quantization, respectively
* Minor refactoring for clarity in PostTrainLinearQuantizer's replace_* functions
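A hedged sketch of the "quant and de-quant of the parameters" idea mentioned above, applied to a single weight tensor with asymmetric linear quantization. This is illustrative only, not the RangeLinearQuantParamLayerWrapper implementation.

```python
import torch

def fake_quantize_weights(w: torch.Tensor, num_bits: int = 8) -> torch.Tensor:
    # Asymmetric linear quantization followed immediately by de-quantization,
    # so downstream compute stays in FP32 but "sees" quantized weight values.
    qmin, qmax = 0, 2 ** num_bits - 1
    w_min, w_max = w.min(), w.max()
    scale = (w_max - w_min).clamp(min=1e-8) / (qmax - qmin)
    zero_point = qmin - torch.round(w_min / scale)
    q = torch.clamp(torch.round(w / scale) + zero_point, qmin, qmax)
    return (q - zero_point) * scale
```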
- Dec 03, 2019
SunYiran authored
- Dec 02, 2019
Neta Zmora authored
compute-summary and png-summary currently work with image classifiers only.
Neta Zmora authored
When multi-processing, we want only one process to generate the summary, while the other processes do nothing (lazy bums!)
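A minimal sketch of the "only one process generates the summary" pattern described above, assuming torch.distributed is used for multi-processing; `generate_summary` is a hypothetical placeholder for whatever produces the summary.

```python
import torch.distributed as dist

def maybe_generate_summary(model, generate_summary):
    # `generate_summary` is a placeholder callable (assumption, not a Distiller API).
    is_main_process = (not dist.is_available() or not dist.is_initialized()
                       or dist.get_rank() == 0)
    if is_main_process:
        generate_summary(model)          # only rank 0 does the work
    if dist.is_available() and dist.is_initialized():
        dist.barrier()                   # keep the other ("lazy") processes in step
```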
levzlotnik authored
Lev Zlotnik authored
Add an example of compressing object-detection (OD) PyTorch models. In this example we compress torchvision's object detection models - FasterRCNN / MaskRCNN / KeypointRCNN. We've modified the reference object detection code to allow easy compression scheduling with YAML configuration.
levzlotnik authored
of a model by name relative to the root of the model.
- Nov 28, 2019
Neta Zmora authored
- define ALMOST_ONE
- define op_type
- remove sanity assert (need to understand what tolerance value to use in the assert)
Co-authored-by: csc12138
Co-authored-by: wangyidong3
Neta Zmora authored
- define ALMOST_ONE
- define op_type
- remove sanity assert (need to understand what tolerance value to use in the assert)
- Nov 27, 2019
Neta Zmora authored
This will help define and use different performance sorting schemes, e.g. it will address the problem raised in issue #411.
Neta Zmora authored
Small variances can occur when using different cuDNN versions, even when the environment and Distiller version are the same.
Neta Zmora authored
Said commit was wrong: the default initializations in PyTorch are not the same as in our code. For example, the default convolution weight initialization uses Kaiming-uniform, while we used Kaiming-normal. For backward compatibility of the model behavior, we need to revert to the old behavior. This reverts commit 6913687f.
- Nov 25, 2019
Neta Zmora authored
Update the definition of the exits using info from Haim. This is still very unsatisfactory because we don't have working examples to show users :-(
- Nov 17, 2019
Neta Zmora authored
- Nov 16, 2019
Neta Zmora authored
Neta Zmora authored
Neta Zmora authored
Except for the case of VGG, our parameter initialization code matched the default PyTorch initialization (per torch.nn operation), so writing the initialization code ourselves only leads to more code and maintenance; we also would not benefit from improvements made at the PyTorch level (e.g. if FB finds a better initialization for nn.conv2d than today's Kaiming init). The VGG initialization we had was "suspicious", so reverting to the default seems reasonable.
- Nov 14, 2019
Guy Jacob authored