diff --git a/docs-src/docs/algo_quantization.md b/docs-src/docs/algo_quantization.md
index 964d90fd9409a7d29ff847e7b7d211eef0c32bd3..00bf2fdef1133b855e3e90ce08bc8bc199930261 100644
--- a/docs-src/docs/algo_quantization.md
+++ b/docs-src/docs/algo_quantization.md
@@ -60,8 +60,8 @@ Let us denote the original floating-point tensor by \(x_f\), the quantized tenso
 (The \(round\) operation is round-to-nearest-integer)  
   
 Let's see how a **convolution** or **fully-connected (FC)** layer is quantized using this method: (we denote input, output, weights and bias with  \(x, y, w\) and \(b\) respectively)
-\[y_f = \sum{x_f w_f} + b_f = \sum{\frac{x_q}{q_x} \frac{w_q}{q_w}} + \frac{b_q}{q_b} = \frac{1}{q_x q_w} \sum{ \left( x_q w_q + \frac{q_b}{q_x q_w}b_q \right) }\]
-\[y_q = round(q_y y_f) = round\left(\frac{q_y}{q_x q_w} \sum{ \left( x_q w_q + \frac{q_b}{q_x q_w}b_q \right) } \right) \]
+\[y_f = \sum{x_f w_f} + b_f = \sum{\frac{x_q}{q_x} \frac{w_q}{q_w}} + \frac{b_q}{q_b} = \frac{1}{q_x q_w} \left( \sum { x_q w_q + \frac{q_x q_w}{q_b}b_q } \right)\]
+\[y_q = round(q_y y_f) = round\left(\frac{q_y}{q_x q_w} \left( \sum { x_q w_q + \frac{q_x q_w}{q_b}b_q } \right) \right) \]
 Note how the bias has to be re-scaled to match the scale of the summation.
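+
+To make the bias re-scaling concrete, here is a minimal NumPy sketch of the computation above. It is for illustration only and is not Distiller's implementation (that lives in the `RangeLinearQuantParamLayerWrapper` class described below):
+
+```python
+import numpy as np
+
+
+def symmetric_scale(x, num_bits):
+    """Scale factor q = (2^(n-1) - 1) / max|x|, as defined above."""
+    return (2 ** (num_bits - 1) - 1) / np.abs(x).max()
+
+
+def quantize(x, scale):
+    """x_q = round(q * x_f)."""
+    return np.round(scale * x)
+
+
+# Toy fully-connected layer y_f = x_f . w_f + b_f, with random data
+rng = np.random.RandomState(0)
+x_f, w_f, b_f = rng.randn(4), rng.randn(4, 3), rng.randn(3)
+
+n = 8
+q_x, q_w, q_b = (symmetric_scale(t, n) for t in (x_f, w_f, b_f))
+x_q, w_q, b_q = quantize(x_f, q_x), quantize(w_f, q_w), quantize(b_f, q_b)
+
+# Accumulate in the integer domain; the bias is re-scaled by q_x*q_w/q_b so
+# that it matches the scale of the x_q*w_q products, as noted above.
+acc = x_q @ w_q + (q_x * q_w / q_b) * b_q
+
+# De-quantizing the accumulator recovers an approximation of y_f; the output
+# is then re-quantized with its own scale factor q_y.
+y_f_approx = acc / (q_x * q_w)
+q_y = symmetric_scale(y_f_approx, n)
+y_q = np.round(q_y / (q_x * q_w) * acc)
+
+print("float output:       ", x_f @ w_f + b_f)
+print("de-quantized output:", y_f_approx)
+```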
 
 ### Implementation
diff --git a/docs/algo_quantization/index.html b/docs/algo_quantization/index.html
index 238d0537f965df6389563c1fb7ca93660140cb45..4888886cf16052e5bc0c194804e99aab71a0d424 100644
--- a/docs/algo_quantization/index.html
+++ b/docs/algo_quantization/index.html
@@ -220,8 +220,8 @@ Let us denote the original floating-point tensor by <script type="math/tex">x_f<
 <script type="math/tex; mode=display">x_q = round(q_x x_f)</script>
 (The <script type="math/tex">round</script> operation is round-to-nearest-integer)  </p>
 <p>Let's see how a <strong>convolution</strong> or <strong>fully-connected (FC)</strong> layer is quantized using this method: (we denote input, output, weights and bias with  <script type="math/tex">x, y, w</script> and <script type="math/tex">b</script> respectively)
-<script type="math/tex; mode=display">y_f = \sum{x_f w_f} + b_f = \sum{\frac{x_q}{q_x} \frac{w_q}{q_w}} + \frac{b_q}{q_b} = \frac{1}{q_x q_w} \sum{ \left( x_q w_q + \frac{q_b}{q_x q_w}b_q \right) }</script>
-<script type="math/tex; mode=display">y_q = round(q_y y_f) = round\left(\frac{q_y}{q_x q_w} \sum{ \left( x_q w_q + \frac{q_b}{q_x q_w}b_q \right) } \right) </script>
+<script type="math/tex; mode=display">y_f = \sum{x_f w_f} + b_f = \sum{\frac{x_q}{q_x} \frac{w_q}{q_w}} + \frac{b_q}{q_b} = \frac{1}{q_x q_w} \left( \sum { x_q w_q + \frac{q_x q_w}{q_b}b_q } \right)</script>
+<script type="math/tex; mode=display">y_q = round(q_y y_f) = round\left(\frac{q_y}{q_x q_w} \left( \sum { x_q w_q + \frac{q_x q_w}{q_b}b_q } \right) \right) </script>
 Note how the bias has to be re-scaled to match the scale of the summation.</p>
 <h3 id="implementation">Implementation</h3>
 <p>We've implemented <strong>convolution</strong> and <strong>FC</strong> using this method.  </p>
diff --git a/docs/index.html b/docs/index.html
index 0a1d63d4969af6bf3d60a31af95364d8ea4afad4..5c5cd23814c22c168e8fa67827cbdb93cf817711 100644
--- a/docs/index.html
+++ b/docs/index.html
@@ -246,5 +246,5 @@ And of course, if we used a sparse or compressed representation, then we are red
 
 <!--
 MkDocs version : 0.17.2
-Build Date UTC : 2018-06-22 00:04:22
+Build Date UTC : 2018-07-01 07:53:34
 -->
diff --git a/docs/jupyter/index.html b/docs/jupyter/index.html
index a2d501774546ec84e00bf3c855cb4dbb94d32a3d..76374b08b89f04e95a080fbaa380c0cb29c8332a 100644
--- a/docs/jupyter/index.html
+++ b/docs/jupyter/index.html
@@ -192,20 +192,20 @@ We welcome new ideas and implementations of Jupyter.</p>
 <p>Roughly, the notebooks can be divided into three categories.</p>
 <h3 id="theory">Theory</h3>
 <ul>
-<li><a href="localhost:8888/notebooks/jupyter/L1-regularization.ipynb">jupyter/L1-regularization.ipynb</a>: Experience hands-on how L1 and L2 regularization affect the solution of a toy loss-minimization problem, to get a better grasp on the interaction between regularization and sparsity.</li>
-<li><a href="localhost:8888/notebooks/alexnet_insights.ipynb">jupyter/alexnet_insights.ipynb</a>: This notebook reviews and compares a couple of pruning sessions on Alexnet.  We compare distributions, performance, statistics and show some visualizations of the weights tensors.</li>
+<li><a href="https://github.com/NervanaSystems/distiller/blob/master/jupyter/L1-regularization.ipynb">jupyter/L1-regularization.ipynb</a>: Experience hands-on how L1 and L2 regularization affect the solution of a toy loss-minimization problem, to get a better grasp on the interaction between regularization and sparsity.</li>
+<li><a href="https://github.com/NervanaSystems/distiller/blob/master/jupyter/alexnet_insights.ipynb">jupyter/alexnet_insights.ipynb</a>: This notebook reviews and compares a couple of pruning sessions on Alexnet.  We compare distributions, performance, statistics and show some visualizations of the weights tensors.</li>
 </ul>
 <h3 id="preparation-for-compression">Preparation for compression</h3>
 <ul>
-<li><a href="localhost:8888/notebooks/jupyter/model_summary.ipynb">jupyter/model_summary.ipynb</a>: Begin by getting familiar with your model.  Examine the sizes and properties of layers and connections.  Study which layers are compute-bound, and which are bandwidth-bound, and decide how to prune or regularize the model.</li>
-<li><a href="localhost:8888/notebooks/jupyter/sensitivity_analysis.ipynb">jupyter/sensitivity_analysis.ipynb</a>: If you performed pruning sensitivity analysis on your model, this notebook can help you load the results and graphically study how the layers behave.</li>
-<li><a href="localhost:8888/notebooks/jupyter/interactive_lr_scheduler.ipynb">jupyter/interactive_lr_scheduler.ipynb</a>: The learning rate decay policy affects pruning results, perhaps as much as it affects training results.  Graph a few LR-decay policies to see how they behave.</li>
-<li><a href="localhost:8888/notebooks/jupyter/agp_schedule.ipynb">jupyter/jupyter/agp_schedule.ipynb</a>: If you are using the Automated Gradual Pruner, this notebook can help you tune the schedule.</li>
+<li><a href="https://github.com/NervanaSystems/distiller/blob/master/jupyter/model_summary.ipynb">jupyter/model_summary.ipynb</a>: Begin by getting familiar with your model.  Examine the sizes and properties of layers and connections.  Study which layers are compute-bound, and which are bandwidth-bound, and decide how to prune or regularize the model.</li>
+<li><a href="https://github.com/NervanaSystems/distiller/blob/master/jupyter/sensitivity_analysis.ipynb">jupyter/sensitivity_analysis.ipynb</a>: If you performed pruning sensitivity analysis on your model, this notebook can help you load the results and graphically study how the layers behave.</li>
+<li><a href="https://github.com/NervanaSystems/distiller/blob/master/jupyter/interactive_lr_scheduler.ipynb">jupyter/interactive_lr_scheduler.ipynb</a>: The learning rate decay policy affects pruning results, perhaps as much as it affects training results.  Graph a few LR-decay policies to see how they behave.</li>
+<li><a href="https://github.com/NervanaSystems/distiller/blob/master/jupyter/agp_schedule.ipynb">jupyter/jupyter/agp_schedule.ipynb</a>: If you are using the Automated Gradual Pruner, this notebook can help you tune the schedule.</li>
 </ul>
 <h3 id="reviewing-experiment-results">Reviewing experiment results</h3>
 <ul>
-<li><a href="localhost:8888/notebooks/jupyter/compare_executions.ipynb">jupyter/compare_executions.ipynb</a>: This is a simple notebook to help you graphically compare the results of executions of several experiments.</li>
-<li><a href="localhost:8888/notebooks/compression_insights.ipynb">jupyter/compression_insights.ipynb</a>: This notebook is packed with code, tables and graphs to us understand the results of a compression session.  Distiller provides <em>summaries</em>, which are Pandas dataframes, which contain statistical information about you model.  We chose to use Pandas dataframes because they can be sliced, queried, summarized and graphed with a few lines of code.  </li>
+<li><a href="https://github.com/NervanaSystems/distiller/blob/master/jupyter/compare_executions.ipynb">jupyter/compare_executions.ipynb</a>: This is a simple notebook to help you graphically compare the results of executions of several experiments.</li>
+<li><a href="https://github.com/NervanaSystems/distiller/blob/master/jupyter/compression_insights.ipynb">jupyter/compression_insights.ipynb</a>: This notebook is packed with code, tables and graphs to us understand the results of a compression session.  Distiller provides <em>summaries</em>, which are Pandas dataframes, which contain statistical information about you model.  We chose to use Pandas dataframes because they can be sliced, queried, summarized and graphed with a few lines of code.</li>
 </ul>
               
             </div>
diff --git a/docs/search/search_index.json b/docs/search/search_index.json
index 6b9865efa3773f04f1749dcb77c2dfcf4f1c0a78..e6dba17341b51d1e00796b573a15f125f51c763f 100644
--- a/docs/search/search_index.json
+++ b/docs/search/search_index.json
@@ -362,7 +362,7 @@
         }, 
         {
             "location": "/algo_quantization/index.html", 
-            "text": "Quantization Algorithms\n\n\nThe following quantization methods are currently implemented in Distiller:\n\n\nDoReFa\n\n\n(As proposed in \nDoReFa-Net: Training Low Bitwidth Convolutional Neural Networks with Low Bitwidth Gradients\n)  \n\n\nIn this method, we first define the quantization function \nquantize_k\n, which takes a real value \na_f \\in [0, 1]\n and outputs a discrete-valued \na_q \\in \\left\\{ \\frac{0}{2^k-1}, \\frac{1}{2^k-1}, ... , \\frac{2^k-1}{2^k-1} \\right\\}\n, where \nk\n is the number of bits used for quantization.\n\n\n\n\na_q = quantize_k(a_f) = \\frac{1}{2^k-1} round \\left( \\left(2^k - 1 \\right) a_f \\right)\n\n\n\n\nActivations are clipped to the \n[0, 1]\n range and then quantized as follows:\n\n\n\n\nx_q = quantize_k(x_f)\n\n\n\n\nFor weights, we define the following function \nf\n, which takes an unbounded real valued input and outputs a real value in \n[0, 1]\n:\n\n\n\n\nf(w) = \\frac{tanh(w)}{2 max(|tanh(w)|)} + \\frac{1}{2} \n\n\n\n\nNow we can use \nquantize_k\n to get quantized weight values, as follows:\n\n\n\n\nw_q = 2 quantize_k \\left( f(w_f) \\right) - 1\n\n\n\n\nThis method requires training the model with quantization, as discussed \nhere\n. Use the \nDorefaQuantizer\n class to transform an existing model to a model suitable for training with quantization using DoReFa.\n\n\nNotes:\n\n\n\n\nGradients quantization as proposed in the paper is not supported yet.\n\n\nThe paper defines special handling for binary weights which isn't supported in Distiller yet.\n\n\n\n\nWRPN\n\n\n(As proposed in \nWRPN: Wide Reduced-Precision Networks\n)  \n\n\nIn this method, activations are clipped to \n[0, 1]\n and quantized as follows (\nk\n is the number of bits used for quantization):\n\n\n\n\nx_q = \\frac{1}{2^k-1} round \\left( \\left(2^k - 1 \\right) x_f \\right)\n\n\n\n\nWeights are clipped to \n[-1, 1]\n and quantized as follows:\n\n\n\n\nw_q = \\frac{1}{2^{k-1}-1} round \\left( \\left(2^{k-1} - 1 \\right)w_f \\right)\n\n\n\n\nNote that \nk-1\n bits are used to quantize weights, leaving one bit for sign.\n\n\nThis method requires training the model with quantization, as discussed \nhere\n. Use the \nWRPNQuantizer\n class to transform an existing model to a model suitable for training with quantization using WRPN.\n\n\nNotes:\n\n\n\n\nThe paper proposed widening of layers as a means to reduce accuracy loss. This isn't implemented as part of \nWRPNQuantizer\n at the moment. To experiment with this, modify your model implementation to have wider layers.\n\n\nThe paper defines special handling for binary weights which isn't supported in Distiller yet.\n\n\n\n\nSymmetric Linear Quantization\n\n\nIn this method, a float value is quantized by multiplying with a numeric constant (the \nscale factor\n), hence it is \nLinear\n. We use a signed integer to represent the quantized range, with no quantization bias (or \"offset\") used. As a result, the floating-point range considered for quantization is \nsymmetric\n with respect to zero.\n\nIn the current implementation the scale factor is chosen so that the entire range of the floating-point tensor is quantized (we do not attempt to remove outliers).\n\nLet us denote the original floating-point tensor by \nx_f\n, the quantized tensor by \nx_q\n, the scale factor by \nq_x\n and the number of bits used for quantization by \nn\n. 
Then, we get:\n\nq_x = \\frac{2^{n-1}-1}{\\max|x|}\n\n\nx_q = round(q_x x_f)\n\n(The \nround\n operation is round-to-nearest-integer)  \n\n\nLet's see how a \nconvolution\n or \nfully-connected (FC)\n layer is quantized using this method: (we denote input, output, weights and bias with  \nx, y, w\n and \nb\n respectively)\n\ny_f = \\sum{x_f w_f} + b_f = \\sum{\\frac{x_q}{q_x} \\frac{w_q}{q_w}} + \\frac{b_q}{q_b} = \\frac{1}{q_x q_w} \\sum{ \\left( x_q w_q + \\frac{q_b}{q_x q_w}b_q \\right) }\n\n\ny_q = round(q_y y_f) = round\\left(\\frac{q_y}{q_x q_w} \\sum{ \\left( x_q w_q + \\frac{q_b}{q_x q_w}b_q \\right) } \\right) \n\nNote how the bias has to be re-scaled to match the scale of the summation.\n\n\nImplementation\n\n\nWe've implemented \nconvolution\n and \nFC\n using this method.  \n\n\n\n\nThey are implemented by wrapping the existing PyTorch layers with quantization and de-quantization operations. That is - the computation is done on floating-point tensors, but the values themselves are restricted to integer values. The wrapper is implemented in the \nRangeLinearQuantParamLayerWrapper\n class.  \n\n\nAll other layers are unaffected and are executed using their original FP32 implementation.  \n\n\nTo automatically transform an existing model to a quantized model using this method, use the \nSymmetricLinearQuantizer\n class.\n\n\nFor weights and bias the scale factor is determined once at quantization setup (\"offline\"), and for activations it is determined dynamically at runtime (\"online\").  \n\n\nImportant note:\n Currently, this method is implemented as \ninference only\n, with no back-propagation functionality. Hence, it can only be used to quantize a pre-trained FP32 model, with no re-training. As such, using it with \nn < 8\n is likely to lead to severe accuracy degradation for any non-trivial workload.", 
+            "text": "Quantization Algorithms\n\n\nThe following quantization methods are currently implemented in Distiller:\n\n\nDoReFa\n\n\n(As proposed in \nDoReFa-Net: Training Low Bitwidth Convolutional Neural Networks with Low Bitwidth Gradients\n)  \n\n\nIn this method, we first define the quantization function \nquantize_k\n, which takes a real value \na_f \\in [0, 1]\n and outputs a discrete-valued \na_q \\in \\left\\{ \\frac{0}{2^k-1}, \\frac{1}{2^k-1}, ... , \\frac{2^k-1}{2^k-1} \\right\\}\n, where \nk\n is the number of bits used for quantization.\n\n\n\n\na_q = quantize_k(a_f) = \\frac{1}{2^k-1} round \\left( \\left(2^k - 1 \\right) a_f \\right)\n\n\n\n\nActivations are clipped to the \n[0, 1]\n range and then quantized as follows:\n\n\n\n\nx_q = quantize_k(x_f)\n\n\n\n\nFor weights, we define the following function \nf\n, which takes an unbounded real valued input and outputs a real value in \n[0, 1]\n:\n\n\n\n\nf(w) = \\frac{tanh(w)}{2 max(|tanh(w)|)} + \\frac{1}{2} \n\n\n\n\nNow we can use \nquantize_k\n to get quantized weight values, as follows:\n\n\n\n\nw_q = 2 quantize_k \\left( f(w_f) \\right) - 1\n\n\n\n\nThis method requires training the model with quantization, as discussed \nhere\n. Use the \nDorefaQuantizer\n class to transform an existing model to a model suitable for training with quantization using DoReFa.\n\n\nNotes:\n\n\n\n\nGradients quantization as proposed in the paper is not supported yet.\n\n\nThe paper defines special handling for binary weights which isn't supported in Distiller yet.\n\n\n\n\nWRPN\n\n\n(As proposed in \nWRPN: Wide Reduced-Precision Networks\n)  \n\n\nIn this method, activations are clipped to \n[0, 1]\n and quantized as follows (\nk\n is the number of bits used for quantization):\n\n\n\n\nx_q = \\frac{1}{2^k-1} round \\left( \\left(2^k - 1 \\right) x_f \\right)\n\n\n\n\nWeights are clipped to \n[-1, 1]\n and quantized as follows:\n\n\n\n\nw_q = \\frac{1}{2^{k-1}-1} round \\left( \\left(2^{k-1} - 1 \\right)w_f \\right)\n\n\n\n\nNote that \nk-1\n bits are used to quantize weights, leaving one bit for sign.\n\n\nThis method requires training the model with quantization, as discussed \nhere\n. Use the \nWRPNQuantizer\n class to transform an existing model to a model suitable for training with quantization using WRPN.\n\n\nNotes:\n\n\n\n\nThe paper proposed widening of layers as a means to reduce accuracy loss. This isn't implemented as part of \nWRPNQuantizer\n at the moment. To experiment with this, modify your model implementation to have wider layers.\n\n\nThe paper defines special handling for binary weights which isn't supported in Distiller yet.\n\n\n\n\nSymmetric Linear Quantization\n\n\nIn this method, a float value is quantized by multiplying with a numeric constant (the \nscale factor\n), hence it is \nLinear\n. We use a signed integer to represent the quantized range, with no quantization bias (or \"offset\") used. As a result, the floating-point range considered for quantization is \nsymmetric\n with respect to zero.\n\nIn the current implementation the scale factor is chosen so that the entire range of the floating-point tensor is quantized (we do not attempt to remove outliers).\n\nLet us denote the original floating-point tensor by \nx_f\n, the quantized tensor by \nx_q\n, the scale factor by \nq_x\n and the number of bits used for quantization by \nn\n. 
Then, we get:\n\nq_x = \\frac{2^{n-1}-1}{\\max|x|}\n\n\nx_q = round(q_x x_f)\n\n(The \nround\n operation is round-to-nearest-integer)  \n\n\nLet's see how a \nconvolution\n or \nfully-connected (FC)\n layer is quantized using this method: (we denote input, output, weights and bias with  \nx, y, w\n and \nb\n respectively)\n\ny_f = \\sum{x_f w_f} + b_f = \\sum{\\frac{x_q}{q_x} \\frac{w_q}{q_w}} + \\frac{b_q}{q_b} = \\frac{1}{q_x q_w} \\left( \\sum { x_q w_q + \\frac{q_x q_w}{q_b}b_q } \\right)\n\n\ny_q = round(q_y y_f) = round\\left(\\frac{q_y}{q_x q_w} \\left( \\sum { x_q w_q + \\frac{q_x q_w}{q_b}b_q } \\right) \\right) \n\nNote how the bias has to be re-scaled to match the scale of the summation.\n\n\nImplementation\n\n\nWe've implemented \nconvolution\n and \nFC\n using this method.  \n\n\n\n\nThey are implemented by wrapping the existing PyTorch layers with quantization and de-quantization operations. That is - the computation is done on floating-point tensors, but the values themselves are restricted to integer values. The wrapper is implemented in the \nRangeLinearQuantParamLayerWrapper\n class.  \n\n\nAll other layers are unaffected and are executed using their original FP32 implementation.  \n\n\nTo automatically transform an existing model to a quantized model using this method, use the \nSymmetricLinearQuantizer\n class.\n\n\nFor weights and bias the scale factor is determined once at quantization setup (\"offline\"), and for activations it is determined dynamically at runtime (\"online\").  \n\n\nImportant note:\n Currently, this method is implemented as \ninference only\n, with no back-propagation functionality. Hence, it can only be used to quantize a pre-trained FP32 model, with no re-training. As such, using it with \nn < 8\n is likely to lead to severe accuracy degradation for any non-trivial workload.", 
             "title": "Quantization"
         }, 
         {
@@ -392,7 +392,7 @@
         }, 
         {
             "location": "/algo_quantization/index.html#symmetric-linear-quantization", 
-            "text": "In this method, a float value is quantized by multiplying with a numeric constant (the  scale factor ), hence it is  Linear . We use a signed integer to represent the quantized range, with no quantization bias (or \"offset\") used. As a result, the floating-point range considered for quantization is  symmetric  with respect to zero. \nIn the current implementation the scale factor is chosen so that the entire range of the floating-point tensor is quantized (we do not attempt to remove outliers). \nLet us denote the original floating-point tensor by  x_f , the quantized tensor by  x_q , the scale factor by  q_x  and the number of bits used for quantization by  n . Then, we get: q_x = \\frac{2^{n-1}-1}{\\max|x|}  x_q = round(q_x x_f) \n(The  round  operation is round-to-nearest-integer)    Let's see how a  convolution  or  fully-connected (FC)  layer is quantized using this method: (we denote input, output, weights and bias with   x, y, w  and  b  respectively) y_f = \\sum{x_f w_f} + b_f = \\sum{\\frac{x_q}{q_x} \\frac{w_q}{q_w}} + \\frac{b_q}{q_b} = \\frac{1}{q_x q_w} \\sum{ \\left( x_q w_q + \\frac{q_b}{q_x q_w}b_q \\right) }  y_q = round(q_y y_f) = round\\left(\\frac{q_y}{q_x q_w} \\sum{ \\left( x_q w_q + \\frac{q_b}{q_x q_w}b_q \\right) } \\right)  \nNote how the bias has to be re-scaled to match the scale of the summation.", 
+            "text": "In this method, a float value is quantized by multiplying with a numeric constant (the  scale factor ), hence it is  Linear . We use a signed integer to represent the quantized range, with no quantization bias (or \"offset\") used. As a result, the floating-point range considered for quantization is  symmetric  with respect to zero. \nIn the current implementation the scale factor is chosen so that the entire range of the floating-point tensor is quantized (we do not attempt to remove outliers). \nLet us denote the original floating-point tensor by  x_f , the quantized tensor by  x_q , the scale factor by  q_x  and the number of bits used for quantization by  n . Then, we get: q_x = \\frac{2^{n-1}-1}{\\max|x|}  x_q = round(q_x x_f) \n(The  round  operation is round-to-nearest-integer)    Let's see how a  convolution  or  fully-connected (FC)  layer is quantized using this method: (we denote input, output, weights and bias with   x, y, w  and  b  respectively) y_f = \\sum{x_f w_f} + b_f = \\sum{\\frac{x_q}{q_x} \\frac{w_q}{q_w}} + \\frac{b_q}{q_b} = \\frac{1}{q_x q_w} \\left( \\sum { x_q w_q + \\frac{q_x q_w}{q_b}b_q } \\right)  y_q = round(q_y y_f) = round\\left(\\frac{q_y}{q_x q_w} \\left( \\sum { x_q w_q + \\frac{q_x q_w}{q_b}b_q } \\right) \\right)  \nNote how the bias has to be re-scaled to match the scale of the summation.", 
             "title": "Symmetric Linear Quantization"
         }, 
         {
diff --git a/docs/sitemap.xml b/docs/sitemap.xml
index 2b0ecb38ecdc225b313d58ea361475e38e6a07a6..ea7fb0bb656c30e94cede047a663e86718085c9c 100644
--- a/docs/sitemap.xml
+++ b/docs/sitemap.xml
@@ -4,7 +4,7 @@
     
     <url>
      <loc>/index.html</loc>
-     <lastmod>2018-06-22</lastmod>
+     <lastmod>2018-07-01</lastmod>
      <changefreq>daily</changefreq>
     </url>
     
@@ -12,7 +12,7 @@
     
     <url>
      <loc>/install/index.html</loc>
-     <lastmod>2018-06-22</lastmod>
+     <lastmod>2018-07-01</lastmod>
      <changefreq>daily</changefreq>
     </url>
     
@@ -20,7 +20,7 @@
     
     <url>
      <loc>/usage/index.html</loc>
-     <lastmod>2018-06-22</lastmod>
+     <lastmod>2018-07-01</lastmod>
      <changefreq>daily</changefreq>
     </url>
     
@@ -28,7 +28,7 @@
     
     <url>
      <loc>/schedule/index.html</loc>
-     <lastmod>2018-06-22</lastmod>
+     <lastmod>2018-07-01</lastmod>
      <changefreq>daily</changefreq>
     </url>
     
@@ -37,19 +37,19 @@
         
     <url>
      <loc>/pruning/index.html</loc>
-     <lastmod>2018-06-22</lastmod>
+     <lastmod>2018-07-01</lastmod>
      <changefreq>daily</changefreq>
     </url>
         
     <url>
      <loc>/regularization/index.html</loc>
-     <lastmod>2018-06-22</lastmod>
+     <lastmod>2018-07-01</lastmod>
      <changefreq>daily</changefreq>
     </url>
         
     <url>
      <loc>/quantization/index.html</loc>
-     <lastmod>2018-06-22</lastmod>
+     <lastmod>2018-07-01</lastmod>
      <changefreq>daily</changefreq>
     </url>
         
@@ -59,13 +59,13 @@
         
     <url>
      <loc>/algo_pruning/index.html</loc>
-     <lastmod>2018-06-22</lastmod>
+     <lastmod>2018-07-01</lastmod>
      <changefreq>daily</changefreq>
     </url>
         
     <url>
      <loc>/algo_quantization/index.html</loc>
-     <lastmod>2018-06-22</lastmod>
+     <lastmod>2018-07-01</lastmod>
      <changefreq>daily</changefreq>
     </url>
         
@@ -74,7 +74,7 @@
     
     <url>
      <loc>/model_zoo/index.html</loc>
-     <lastmod>2018-06-22</lastmod>
+     <lastmod>2018-07-01</lastmod>
      <changefreq>daily</changefreq>
     </url>
     
@@ -82,7 +82,7 @@
     
     <url>
      <loc>/jupyter/index.html</loc>
-     <lastmod>2018-06-22</lastmod>
+     <lastmod>2018-07-01</lastmod>
      <changefreq>daily</changefreq>
     </url>
     
@@ -90,7 +90,7 @@
     
     <url>
      <loc>/design/index.html</loc>
-     <lastmod>2018-06-22</lastmod>
+     <lastmod>2018-07-01</lastmod>
      <changefreq>daily</changefreq>
     </url>