Fix activation stats for Linear layers
Thanks to Dan Alistarh for bringing this issue to my attention. The activations of Linear layers have shape (batch_size, output_size), while those of Convolution layers have shape (batch_size, num_channels, height, width), and this distinction in shape was not handled correctly. This commit also fixes the sparsity computation for very large activations, such as those of VGG16, which previously exhausted memory. One workaround is to use smaller batch sizes, but this commit takes a different approach: it counts zeros “manually”, using less space (see the sketch below).

Also in this commit:
- Added a “caveats” section to the documentation.
- Added more tests.
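For illustration, here is a minimal sketch of the two fixes described above: shape-aware flattening (2-D Linear activations vs. 4-D Convolution activations) and per-sample zero counting to bound peak memory. The function name `activation_sparsity` and its structure are hypothetical, not the actual code in distiller/utils.py.

```python
import torch

def activation_sparsity(activation):
    """Fraction of zero elements in an activation tensor.

    Handles both Linear activations, shaped (batch_size, output_size),
    and Convolution activations, shaped (batch_size, num_channels,
    height, width). Hypothetical helper, for illustration only.
    """
    if activation.dim() == 2:
        # Linear layer: already (batch_size, output_size), one row per sample.
        flattened = activation
    elif activation.dim() == 4:
        # Convolution layer: flatten channels and spatial dims per sample.
        flattened = activation.reshape(activation.size(0), -1)
    else:
        raise ValueError("unexpected activation shape: %s"
                         % (tuple(activation.shape),))

    # Count zeros one sample at a time instead of building a boolean
    # mask over the entire activation, so the temporary mask is never
    # larger than a single sample.
    zeros = 0
    for sample in flattened:
        zeros += sample.numel() - (sample != 0).sum().item()
    return zeros / flattened.numel()

# Example: a conv-style activation after ReLU is roughly half zeros.
acts = torch.relu(torch.randn(32, 64, 28, 28))
print(activation_sparsity(acts))
```

The per-sample loop trades a little speed for bounded memory: a whole-tensor expression like `(activation == 0).sum()` allocates a mask as large as the activation itself, which is exactly what caused the exhaustion on very large activations.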
Showing 8 changed files:
- distiller/data_loggers/collector.py (12 additions, 8 deletions)
- distiller/utils.py (33 additions, 19 deletions)
- docs-src/docs/usage.md (91 additions, 1 deletion)
- docs/index.html (1 addition, 1 deletion)
- docs/search/search_index.json (8 additions, 3 deletions)
- docs/sitemap.xml (15 additions, 15 deletions)
- docs/usage/index.html (92 additions, 2 deletions)
- tests/test_basic.py (39 additions, 4 deletions)