    Fix activation stats for Linear layers
    Neta Zmora authored
    Thanks to Dan Alistarh for bringing this issue to my attention.
    The activations of Linear layers have shape (batch_size, output_size), while
    those of Convolution layers have shape (batch_size, num_channels, height, width);
    this difference in shape was not handled correctly.
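
    As a minimal sketch of the uniform handling (the function name and signature
    are illustrative, not the repository's actual API), flattening every dimension
    except the batch dimension treats both shapes the same way:

        import torch

        def activation_sparsity(activation: torch.Tensor) -> torch.Tensor:
            # Linear activations arrive as (batch_size, output_size);
            # Convolution activations as (batch_size, num_channels, height, width).
            # Flattening all non-batch dimensions treats both shapes uniformly.
            flat = activation.reshape(activation.size(0), -1)  # (batch, features)
            zeros_per_sample = (flat == 0).sum(dim=1).float()
            return zeros_per_sample / flat.size(1)             # fraction of zeros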
    
    This commit also fixes sparsity computation for very large activations, such as
    those in VGG16, which previously exhausted memory.  One workaround is to use
    smaller batch sizes, but this commit instead counts zeros “manually”, which
    uses less memory.
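
    A sketch of the low-memory counting, assuming a per-sample loop is what
    “manually” refers to (the helper name is hypothetical):

        import torch

        def count_zeros_lowmem(activation: torch.Tensor) -> int:
            # Instead of materializing one boolean mask for the whole batch,
            # process one sample at a time and accumulate the count.  For very
            # large activations (e.g. VGG16's early conv layers) the temporary
            # (sample == 0) mask stays small, at the cost of a Python loop.
            zeros = 0
            for sample in activation:   # iterate over the batch dimension
                zeros += int((sample == 0).sum().item())
            return zeros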
    
    Also in this commit:
    - Added a “caveats” section to the documentation.
    - Added more tests.