Commit ce60fc14 authored by Neta Zmora

Documentation: added a bit of info regarding Baidu's RNN pruning algorithm

parent be97de23
@@ -103,9 +103,11 @@ The authors describe AGP:
 - Does not make any assumptions about the structure of the network or its constituent layers, and is therefore more generally applicable.

 ## RNN pruner
-- <b>Reference:</b> [Exploring Sparsity in Recurrent Neural Networks](https://arxiv.org/abs/1704.05119)
-- <b>Authors:</b> Sharan Narang, Erich Elsen, Gregory Diamos, Shubho Sengupta
-- <b>Status: not implemented</b><br>
+The authors of [Exploring Sparsity in Recurrent Neural Networks](https://arxiv.org/abs/1704.05119), Sharan Narang, Erich Elsen, Gregory Diamos, and Shubho Sengupta, "propose a technique to reduce the parameters of a network by pruning weights during the initial training of the network." For element-wise pruning of RNNs they use a gradual pruning schedule, applied during training, that is reminiscent of the schedule used in AGP. They show pruning of RNN, GRU, LSTM and embedding layers.
+
+Distiller's distiller.pruning.BaiduRNNPruner class implements this pruning algorithm.
+
+<center>![Gradual Pruning](imgs/baidu_rnn_pruning.png)</center>

 # Structure pruners
 Element-wise pruning can create very sparse models which can be compressed to consume a smaller memory footprint and less bandwidth, but without specialized hardware that can compute using the sparse representation of the tensors, we don't gain any speedup of the computation. Structure pruners remove entire "structures", such as kernels, filters, and even entire feature-maps.
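The added text names Distiller's distiller.pruning.BaiduRNNPruner but does not show what gradual, element-wise RNN pruning looks like during training. The sketch below is only an illustration, not Distiller's implementation: the cubic (AGP-like) ramp, the pruning frequency, the magnitude threshold, and every name in it (`target_sparsity`, `prune_rnn_weights`, `model.lstm`) are assumptions made for this example.

```python
# Illustrative sketch only; not distiller.pruning.BaiduRNNPruner.
# Element-wise magnitude pruning of an LSTM's weight matrices during training,
# with the sparsity target ramped by an AGP-like cubic schedule (an assumption).
import torch
import torch.nn as nn

def target_sparsity(step, start_step, end_step, s_initial=0.0, s_final=0.90):
    """Ramp the sparsity level from s_initial to s_final between start_step and end_step."""
    if step < start_step:
        return s_initial
    if step >= end_step:
        return s_final
    progress = (step - start_step) / float(end_step - start_step)
    return s_final + (s_initial - s_final) * (1.0 - progress) ** 3

def prune_rnn_weights(rnn: nn.LSTM, sparsity: float):
    """Zero the smallest-magnitude elements of every weight matrix (biases are skipped)."""
    with torch.no_grad():
        for name, param in rnn.named_parameters():
            if not name.startswith("weight"):
                continue
            k = int(sparsity * param.numel())
            if k == 0:
                continue
            threshold = param.abs().flatten().kthvalue(k).values
            param.mul_((param.abs() > threshold).float())

# Hypothetical use inside a training loop (model, optimizer, loss_fn, data are assumed):
# for step, (x, y) in enumerate(data):
#     optimizer.zero_grad()
#     loss_fn(model(x), y).backward()
#     optimizer.step()
#     if step % 100 == 0:   # pruning frequency is an assumption
#         prune_rnn_weights(model.lstm, target_sparsity(step, 1000, 20000))
```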
@@ -272,11 +272,9 @@ abundant and gradually reduce the number of weights being pruned each time as th
 </ul>
 </blockquote>
 <h2 id="rnn-pruner">RNN pruner</h2>
-<ul>
-<li><b>Reference:</b> <a href="https://arxiv.org/abs/1704.05119">Exploring Sparsity in Recurrent Neural Networks</a></li>
-<li><b>Authors:</b> Sharan Narang, Erich Elsen, Gregory Diamos, Shubho Sengupta</li>
-<li><b>Status: not implemented</b><br></li>
-</ul>
+<p>The authors of <a href="https://arxiv.org/abs/1704.05119">Exploring Sparsity in Recurrent Neural Networks</a>, Sharan Narang, Erich Elsen, Gregory Diamos, and Shubho Sengupta, "propose a technique to reduce the parameters of a network by pruning weights during the initial training of the network." For element-wise pruning of RNNs they use a gradual pruning schedule, applied during training, that is reminiscent of the schedule used in AGP. They show pruning of RNN, GRU, LSTM and embedding layers.</p>
+<p>Distiller's distiller.pruning.BaiduRNNPruner class implements this pruning algorithm.</p>
+<p><center><img alt="Gradual Pruning" src="../imgs/baidu_rnn_pruning.png" /></center></p>
 <h1 id="structure-pruners">Structure pruners</h1>
 <p>Element-wise pruning can create very sparse models which can be compressed to consume a smaller memory footprint and less bandwidth, but without specialized hardware that can compute using the sparse representation of the tensors, we don't gain any speedup of the computation. Structure pruners remove entire "structures", such as kernels, filters, and even entire feature-maps.</p>
 <h2 id="ranked-structure-pruner">Ranked structure pruner</h2>
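To make the structure-pruning contrast concrete, here is a rough, hypothetical sketch (not code from this commit, and not Distiller's ranked structure pruners): dropping whole convolution filters produces a genuinely smaller dense layer, whereas element-wise zeros leave the tensor shape, and therefore the compute cost, unchanged. The `prune_filters` helper and its L1 ranking criterion are assumptions made for the example.

```python
# Hypothetical sketch: filter-level (structure) pruning of a Conv2d layer.
import torch
import torch.nn as nn

def prune_filters(conv: nn.Conv2d, keep_ratio: float = 0.5) -> nn.Conv2d:
    """Keep the filters with the largest L1 norm and rebuild a smaller, dense Conv2d."""
    with torch.no_grad():
        scores = conv.weight.abs().sum(dim=(1, 2, 3))   # one L1 score per output filter
        n_keep = max(1, int(keep_ratio * conv.out_channels))
        keep = torch.topk(scores, n_keep).indices
        new_conv = nn.Conv2d(conv.in_channels, n_keep, conv.kernel_size,
                             stride=conv.stride, padding=conv.padding,
                             bias=conv.bias is not None)
        new_conv.weight.copy_(conv.weight[keep])
        if conv.bias is not None:
            new_conv.bias.copy_(conv.bias[keep])
    return new_conv

conv = nn.Conv2d(64, 128, kernel_size=3, padding=1)
smaller = prune_filters(conv, keep_ratio=0.5)
print(tuple(conv.weight.shape), "->", tuple(smaller.weight.shape))  # (128, 64, 3, 3) -> (64, 64, 3, 3)
```

In a real network the next layer's input channels (and any batch-norm parameters) would also have to shrink to match; that bookkeeping is omitted here.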
@@ -246,5 +246,5 @@ And of course, if we used a sparse or compressed representation, then we are red
 <!--
 MkDocs version : 0.17.2
-Build Date UTC : 2018-05-22 09:40:34
+Build Date UTC : 2018-06-14 10:51:56
 -->
@@ -4,7 +4,7 @@
 <url>
  <loc>/index.html</loc>
- <lastmod>2018-05-22</lastmod>
+ <lastmod>2018-06-14</lastmod>
  <changefreq>daily</changefreq>
 </url>
@@ -12,7 +12,7 @@
 <url>
  <loc>/install/index.html</loc>
- <lastmod>2018-05-22</lastmod>
+ <lastmod>2018-06-14</lastmod>
  <changefreq>daily</changefreq>
 </url>
@@ -20,7 +20,7 @@
 <url>
  <loc>/usage/index.html</loc>
- <lastmod>2018-05-22</lastmod>
+ <lastmod>2018-06-14</lastmod>
  <changefreq>daily</changefreq>
 </url>
@@ -28,7 +28,7 @@
 <url>
  <loc>/schedule/index.html</loc>
- <lastmod>2018-05-22</lastmod>
+ <lastmod>2018-06-14</lastmod>
  <changefreq>daily</changefreq>
 </url>
@@ -37,19 +37,19 @@
 <url>
  <loc>/pruning/index.html</loc>
- <lastmod>2018-05-22</lastmod>
+ <lastmod>2018-06-14</lastmod>
  <changefreq>daily</changefreq>
 </url>
 <url>
  <loc>/regularization/index.html</loc>
- <lastmod>2018-05-22</lastmod>
+ <lastmod>2018-06-14</lastmod>
  <changefreq>daily</changefreq>
 </url>
 <url>
  <loc>/quantization/index.html</loc>
- <lastmod>2018-05-22</lastmod>
+ <lastmod>2018-06-14</lastmod>
  <changefreq>daily</changefreq>
 </url>
@@ -59,13 +59,13 @@
 <url>
  <loc>/algo_pruning/index.html</loc>
- <lastmod>2018-05-22</lastmod>
+ <lastmod>2018-06-14</lastmod>
  <changefreq>daily</changefreq>
 </url>
 <url>
  <loc>/algo_quantization/index.html</loc>
- <lastmod>2018-05-22</lastmod>
+ <lastmod>2018-06-14</lastmod>
  <changefreq>daily</changefreq>
 </url>
@@ -74,7 +74,7 @@
 <url>
  <loc>/model_zoo/index.html</loc>
- <lastmod>2018-05-22</lastmod>
+ <lastmod>2018-06-14</lastmod>
  <changefreq>daily</changefreq>
 </url>
@@ -82,7 +82,7 @@
 <url>
  <loc>/jupyter/index.html</loc>
- <lastmod>2018-05-22</lastmod>
+ <lastmod>2018-06-14</lastmod>
  <changefreq>daily</changefreq>
 </url>
@@ -90,7 +90,7 @@
 <url>
  <loc>/design/index.html</loc>
- <lastmod>2018-05-22</lastmod>
+ <lastmod>2018-06-14</lastmod>
  <changefreq>daily</changefreq>
 </url>