diff --git a/docs-src/docs/algo_quantization.md b/docs-src/docs/algo_quantization.md
index 2c22aaffbfa20e1da62d8d1ce15e778f62122911..aad12f40ef61878ad0fc0fe81da997708c6dd5a4
--- a/docs-src/docs/algo_quantization.md
+++ b/docs-src/docs/algo_quantization.md
@@ -8,7 +8,7 @@ For any of the methods below that require quantization-aware training, please se
 Let's break down the terminology we use here:
 
 - **Linear:** Means a float value is quantized by multiplying with a numeric constant (the **scale factor**).
-- **Range-Based:**: Means that in order to calculate the scale factor, we look at the actual range of the tensor's values. In the most naive implementation, we use the actual min/max values of the tensor. Alternatively, we use some derivation based on the tensor's range / distribution to come up with a narrower min/max range, in order to remove possible outliers. This is in contrast to the other methods described here, which we could call **clipping-based**, as they impose an explicit clipping function on the tensors (using either a hard-coded value or a learned value).
+- **Range-Based:** Means that in order to calculate the scale factor, we look at the actual range of the tensor's values. In the most naive implementation, we use the actual min/max values of the tensor. Alternatively, we use some derivation based on the tensor's range / distribution to come up with a narrower min/max range, in order to remove possible outliers. This is in contrast to the other methods described here, which we could call **clipping-based**, as they impose an explicit clipping function on the tensors (using either a hard-coded value or a learned value).
 
 ### Asymmetric vs. Symmetric
 
@@ -154,4 +154,4 @@ This method requires training the model with quantization-aware training, as dis
 ### Notes:
 
 - The paper proposed widening of layers as a means to reduce accuracy loss. This isn't implemented as part of `WRPNQuantizer` at the moment. To experiment with this, modify your model implementation to have wider layers.
-- The paper defines special handling for binary weights which isn't supported in Distiller yet.
\ No newline at end of file
+- The paper defines special handling for binary weights which isn't supported in Distiller yet.
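
As context for the "Range-Based" definition touched above, here is a minimal sketch of naive range-based, asymmetric linear quantization, where the scale factor is derived directly from the tensor's observed min/max values. This is an illustration only, not Distiller's implementation; the helper `linear_quantize_asym` and its signature are hypothetical.

```python
import torch

def linear_quantize_asym(x: torch.Tensor, num_bits: int = 8):
    """Naive range-based linear quantization: scale derived from the actual min/max."""
    qmin, qmax = 0, 2 ** num_bits - 1
    x_min, x_max = x.min(), x.max()
    scale = (x_max - x_min) / (qmax - qmin)          # scale factor from the tensor's range
    zero_point = qmin - torch.round(x_min / scale)   # offset so that x_min maps to qmin
    q = torch.clamp(torch.round(x / scale) + zero_point, qmin, qmax)
    x_hat = (q - zero_point) * scale                 # de-quantize to inspect the rounding error
    return q, x_hat

x = torch.randn(4, 4)
q, x_hat = linear_quantize_asym(x)
print("max abs error:", (x - x_hat).abs().max().item())
```

A narrower range (e.g. one that discards outliers, as the doc mentions) would simply replace `x_min`/`x_max` above before the scale factor is computed.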