#. **Is Python 3.6 a strict requirement for installation?**
Yes, our HPVM Python packages require exactly Python 3.6.
If you don't have a Python3.6 on your system, we encourage using the provided ``env.yaml`` conda environment.
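If you take the conda route, the setup is a short command sequence. This is a sketch: the environment name below is a placeholder, so check the ``name:`` field that ``env.yaml`` actually declares.

```shell
# Create the environment described in HPVM's env.yaml
# (assumes conda is already installed).
conda env create -f env.yaml

# Activate it; "hpvm" is a placeholder -- use the name
# listed under "name:" in env.yaml.
conda activate hpvm

# Verify the interpreter is the required version.
python --version   # expect: Python 3.6.x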
#. **What is a "target device" or the "profiling stage"? Why does the tutorial seem to suggest building HPVM** :ref:`on a second device<target-profiling>`?
HPVM is capable of *predictive approximation tuning* which, due to its computational cost,
is often done on a powerful computer, like a server,
but the selected approximations are usually used to speed up your application
on a less powerful device (the *target device*, such as an edge device).
The profiling stage (using ``hpvm-profiler``) is necessary so that the real speedups of approximations are measured,
and this is also done on the target device.
See our `ApproxTuner paper <https://dl.acm.org/doi/10.1145/3437801.3446108>`_ for more details on this.
Currently, HPVM must be built on both the server and the target device for this purpose.
We will achieve better server/edge separation of HPVM in future releases,
so that only the necessary parts of the code are built on each device.
#. **What are the expected speedups with approximations on my target device?**

The approximation implementations in HPVM are currently only optimized for
the Nvidia Tegra TX2 edge device.
The routines may not provide the same speedup on other hardware devices --
though systems with similar hardware specifications may exhibit similar performance.
We are working on providing speedups across a wider range of devices.
#. **Why doesn't the conda environment / Python packages installation work on Jetson boards?**
You may be seeing errors like

.. code-block:: text

   ResolvePackageNotFound:
     pytorch==1.6.0

or other errors indicating ``pytorch``, ``torchvision``, or other packages cannot be installed,
because these packages are not prebuilt for ARM CPUs on `PyPI <https://pypi.org/>`_.
The simplest solution is not to install the HPVM frontends and autotuner;
see :ref:`this <skip-pypkg>` for how to do so.
The job of these packages is best left to a server machine.

#. **How can I extend HPVM to include new custom approximations?**

Users can update ``hpvm-tensor-rt`` in HPVM to include new custom approximations that are targeted by the compiler.
Alternatively, developers can update the HPVM backends to compile to external libraries that support custom approximations.
The HPVM backends are documented in detail in [TODO: Add link to Backends Doc].
The ``predtuner`` package in HPVM is flexible enough to include more approximation knobs.
[TODO: Yifan should add more details on how to add more knobs]
#. **What to do when running into "CUDA out of memory" errors?**
When the Keras/PyTorch frontends generate code, they accept a "batch size" parameter,
which decides the batch size at which the DNN inference runs.
You may need to reduce batch size when encountering out of memory errors.
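As a back-of-the-envelope illustration (plain Python, not an HPVM API): a smaller batch size lowers the per-step activation memory in exchange for running more inference steps. The helper below is hypothetical.

```python
def batch_sizes(n_samples, batch_size):
    """Partition n_samples into per-step batch sizes, the way a
    frontend's batch-size parameter splits the inference inputs."""
    full, rem = divmod(n_samples, batch_size)
    return [batch_size] * full + ([rem] if rem else [])

# Peak memory per step scales with the batch size, so a smaller
# batch trades memory for extra inference steps:
print(len(batch_sizes(10000, 128)))  # 79 inference steps
print(len(batch_sizes(10000, 32)))   # 313 inference steps
```
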
#. **How many autotuning iterations should I use with PredTuner package in HPVM?**
The number of tuning iterations required to achieve good results varies across benchmarks
and should be figured out on a per-benchmark basis.
For the included 10 CNNs, we recommend using at least 10K iterations.
#. **Does this release support combining HPVM tensor and non-tensor operations in a single program?**
Currently we do not support tensor and non-tensor code in the same application.
We will support this feature in the next release.
#. **Does this release support object detection models?**
Currently, HPVM doesn't support object detection models,
due to the limited number of operators supported in the tensor library ``hpvm-tensor-rt``.
We will add support for more operators in the next release.