<h2 id="rnn-pruner">RNN pruner</h2>
<p>The authors of <a href="https://arxiv.org/abs/1704.05119">Exploring Sparsity in Recurrent Neural Networks</a>, Sharan Narang, Erich Elsen, Gregory Diamos, and Shubho Sengupta, "propose a technique to reduce the parameters of a network by pruning weights during the initial training of the network." They use a gradual pruning schedule, reminiscent of the schedule used in AGP, for element-wise pruning of RNNs during training. They show pruning of RNN, GRU, LSTM and embedding layers.</p>
<p>Distiller's <code>distiller.pruning.BaiduRNNPruner</code> class implements this pruning algorithm.</p>
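<p>For concreteness, here is a minimal sketch of the gradual-threshold idea described in the paper, applied to a PyTorch LSTM. It is illustrative only and does not use Distiller's <code>BaiduRNNPruner</code> API; the function names and the hyper-parameter values (<code>theta</code>, <code>phi</code>, <code>start_itr</code>, <code>ramp_itr</code>, <code>end_itr</code>, <code>freq</code>) are assumptions chosen for this example.</p>
<pre><code class="python">
# Illustrative sketch of Baidu-style gradual pruning (not Distiller's implementation).
# A magnitude threshold ramps up over training iterations; weights whose absolute
# value falls below the current threshold are zeroed.
import torch
import torch.nn as nn


def baidu_threshold(itr, theta, phi, start_itr, ramp_itr, end_itr, freq):
    """Piecewise-linear threshold: slope theta until ramp_itr, then steeper slope phi."""
    itr = min(itr, end_itr)                            # threshold stops growing after end_itr
    start_steps = max(0, min(itr, ramp_itr) - start_itr + 1)
    ramp_steps = max(0, itr - ramp_itr + 1)
    return (theta * start_steps + phi * ramp_steps) / freq


def prune_rnn_weights(rnn, epsilon):
    """Element-wise pruning: zero every RNN weight whose magnitude is below epsilon."""
    with torch.no_grad():
        for name, param in rnn.named_parameters():
            if "weight" in name:                       # prune weights only, not biases
                mask = param.abs() >= epsilon
                param.mul_(mask.to(param.dtype))


rnn = nn.LSTM(input_size=128, hidden_size=256, num_layers=2)
for itr in range(0, 10000, 100):                       # prune every `freq` iterations
    eps = baidu_threshold(itr, theta=1e-4, phi=1.5e-3,
                          start_itr=2000, ramp_itr=5000, end_itr=9000, freq=100)
    if eps > 0:
        prune_rnn_weights(rnn, eps)
        # A full implementation keeps a persistent mask so pruned weights stay zero
        # between pruning steps; this sketch simply re-applies the threshold.
</code></pre>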
<h2 id="structure-pruners">Structure pruners</h2>
<p>Element-wise pruning can create very sparse models, which can be compressed to reduce their memory footprint and bandwidth requirements, but without specialized hardware that can compute directly on the sparse representation of the tensors, it yields no speedup of the computation. Structure pruners remove entire "structures", such as kernels, filters, and even entire feature-maps.</p>
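<p>The difference is easy to see in code. The sketch below (illustrative names and ratios, not Distiller code) first masks individual weights of a convolution, which leaves the tensor shape, and therefore the amount of dense compute, unchanged; it then removes whole filters ranked by L1 norm, yielding a smaller dense layer that runs faster on any hardware. In a real network, the input channels of the following layer would also have to be sliced to match.</p>
<pre><code class="python">
# Illustrative contrast between element-wise and filter (structure) pruning.
import torch
import torch.nn as nn

conv = nn.Conv2d(in_channels=64, out_channels=128, kernel_size=3)

# Element-wise pruning: zero the 80% smallest-magnitude weights.
# The weight tensor keeps its shape, so dense hardware does the same work as before.
with torch.no_grad():
    w = conv.weight
    threshold = w.abs().flatten().kthvalue(int(0.8 * w.numel())).values
    w.mul_((w.abs() > threshold).to(w.dtype))

# Filter pruning: keep only the half of the filters with the largest L1 norm.
# The result is a physically smaller, dense layer.
with torch.no_grad():
    l1_per_filter = conv.weight.abs().sum(dim=(1, 2, 3))       # one score per output filter
    keep = torch.topk(l1_per_filter, k=conv.out_channels // 2).indices
    smaller = nn.Conv2d(conv.in_channels, len(keep), kernel_size=3)
    smaller.weight.copy_(conv.weight[keep])
    smaller.bias.copy_(conv.bias[keep])
</code></pre>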