Commit f6e1bf5c authored by Yifan Zhao

Small improvements to the doc

parent 1d52149a
Dependencies
------------
* GNU Make (>=3.79) or Ninja (>=1.10)
* Python (==3.6) with pip (>=20)

  * Python must be strictly 3.6 (any subversion from 3.6.0 to 3.6.13).
    This is required by some Python packages in HPVM.
  * If you choose not to install these packages, any Python >= 3.6 will work.
    See :ref:`how to skip installing Python packages in the installer <skip-pypkg>`.
* OpenCL (>=1.0.0) is required for compiling HPVM-C code on GPU; otherwise, only CPU is available.
* OpenMP (>= 4.0)

  * GCC comes with OpenMP support; OpenMP 4.0 is supported by GCC 4.9 onward.
    See `here <https://gcc.gnu.org/wiki/openmp>`__ for the OpenMP version supported by each GCC version.
  * In addition, each version of CUDA ``nvcc`` requires GCC to be no newer than a certain version.
    See `here <https://gist.github.com/ax3l/9489132>`__ for the support matrix.
Python Environment
------------------
If the submodules were not cloned, the directory ``hpvm/projects/predtuner`` will be empty;
this can be fixed with ``git submodule update --recursive --init``.
HPVM needs to be able to find CUDA.
If CUDA is installed in your system's ``$PATH`` (e.g. if it was installed at the default location),
HPVM can find CUDA automatically.
Use the HPVM installer script to download extra components, then configure and build HPVM:
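
A typical invocation might look like the following. This is only a sketch: the
script name ``install.sh`` and the ``-j`` flag are assumptions based on common
HPVM setups, so consult your checkout for the actual installer location and options.

.. code-block:: bash

   # From the HPVM source root (script name is an assumption --
   # check your checkout for the actual installer).
   bash install.sh -j4

The installer prompts for the components to download and the build options
before it configures and builds HPVM.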
The HPVM installer performs the following tasks:

* While running tests is recommended, it is not turned on by default as it is very time-consuming.
.. _skip-pypkg:
Skipping Python Package installation
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
If you are installing HPVM on a "target" device that is only used for
:ref:`profiling <target-profiling>`,
you may not need to install the frontend and the tuner packages.
These packages also have a Python version requirement and package dependencies
that may be hard to meet on some devices, especially edge computing devices with ARM CPUs.
You can skip their installation by either passing the ``--no-pypkg`` flag to
the installer, or answering yes ("y") when it prompts the following:
.. code-block:: text

   Install HPVM Python Packages (recommended)? [y/n]
In this case, any Python >= 3.6 will work.
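
For example, on a target-only device (a sketch; the installer script name is an
assumption, so adjust it to wherever the installer lives in your checkout):

.. code-block:: bash

   # Skip the frontend/tuner Python packages; any Python >= 3.6 then works
   bash install.sh --no-pypkg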
Troubleshooting
^^^^^^^^^^^^^^^
.. _hpvm-comp-process:
HPVM Compilation Process
========================
Compilation of an HPVM program involves the following steps:
#. ``clang`` takes an HPVM-C/C++ program (e.g. ``main.c``) and produces an LLVM IR file (``main.ll``) that contains the HPVM-C function calls. These functions are declared in ``test/benchmark/include/hpvm.h``, which must be included in the program.
#. ``opt`` takes ``main.ll`` and invokes the GenHPVM pass on it, which converts the HPVM-C function calls to HPVM intrinsics. This generates the HPVM textual representation (``main.hpvm.ll``).
#. ``opt`` takes the HPVM textual representation (``main.hpvm.ll``) and invokes the following passes in sequence:
* BuildDFG: Converts the textual representation to the internal HPVM representation.
* LocalMem and DFG2LLVM_OpenCL: Invoked only when GPU target is selected. Generates the kernel module (``main.kernels.ll``) and the portion of the host code that invokes the kernel into the host module (``main.host.ll``).
* DFG2LLVM_CPU: Generates either all, or the remainder of the host module (``main.host.ll``) depending on the chosen target.
* ClearDFG: Deletes the internal HPVM representation from memory.
#. ``clang`` is used to compile any remaining project files that will later be linked with the host module.
#. ``llvm-link`` takes the host module and all the other generated ``.ll`` files, and links them with the HPVM runtime module (``hpvm-rt.bc``) to generate the linked host module (``main.host.linked.ll``).
#. Generate the executable code from the generated ``.ll`` files for all parts of the program:
* GPU target: ``llvm-cbe`` takes the kernel module (``main.kernels.ll``) and generates an OpenCL representation of the kernels that will be invoked by the host.
* CPU target: ``clang`` takes the linked host module (``main.host.linked.ll``) and generates the CPU binary.
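
The steps above can be sketched as a command sequence. This is illustrative
only: the pass plugin names, library paths, and flags are assumptions and vary
with the HPVM build, so treat it as a rough outline of the pipeline rather than
exact commands.

.. code-block:: bash

   # 1. HPVM-C source -> LLVM IR containing HPVM-C function calls
   clang -S -emit-llvm -I test/benchmark/include main.c -o main.ll

   # 2. HPVM-C calls -> HPVM intrinsics (plugin/pass names are assumptions)
   opt -load LLVMGenHPVM.so -genhpvm main.ll -S -o main.hpvm.ll

   # 3. BuildDFG, backend code generation, and ClearDFG (GPU target shown)
   opt -load LLVMBuildDFG.so -load LLVMDFG2LLVM_OpenCL.so -load LLVMDFG2LLVM_CPU.so \
       -buildDFG -localmem -dfg2llvm-opencl -dfg2llvm-cpu -clearDFG \
       main.hpvm.ll -S -o main.host.ll

   # 5. Link the host module with the HPVM runtime module
   llvm-link main.host.ll hpvm-rt.bc -S -o main.host.linked.ll

   # 6. Produce the CPU binary from the linked host module
   clang main.host.linked.ll -o main -lpthread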
Developer Documents
===================

.. toctree::
   :maxdepth: 1

   approximation-implementation
   backend-passes
   cnn-models
   compilation-process
   configuration-format
   dynamic-approximation
   port-to-hpvm-c
It is also possible to save the configuration in other formats
(see the `predtuner documentation <https://predtuner.readthedocs.io/en/latest/index.html>`_).
.. _target-profiling:

Profiling the Configurations on Target Device
---------------------------------------------
We will use ``hpvm_profiler`` (a Python package) for profiling the ``./hpvm_confs.txt``
we obtained in the tuning step.
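
A minimal profiling script might look like the following. This is a sketch:
the function names ``profile_config_file`` and ``plot_hpvm_configs``, their
signatures, and the binary path are assumptions, so check the ``hpvm_profiler``
package for the actual API.

.. code-block:: python

   # Sketch only: function names and signatures are assumptions --
   # consult the hpvm_profiler package documentation for the actual API.
   from hpvm_profiler import profile_config_file, plot_hpvm_configs

   # Run each configuration in ./hpvm_confs.txt against the compiled
   # binary and write back the measured performance numbers.
   profile_config_file("./hpvm_binary", "./hpvm_confs.txt", "./hpvm_confs.profiled.txt")

   # Visualize the tradeoff of the profiled configurations.
   plot_hpvm_configs("./hpvm_confs.profiled.txt", "configs.png")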
where the performance gain is desired.
As the compiled binary is usually not compatible across architectures,
you need to install HPVM on the edge device and recompile the model.
You may also want to :ref:`skip Python packages in the installation <skip-pypkg>`
to reduce some constraints on Python version and Python packages.
* **Also note** that currently,
the approximation implementations in the tensor runtime are tuned for Jetson TX2,
Please refer to :doc:`getting-started` for how to build and use HPVM.
.. toctree::
   :maxdepth: 1

   getting-started
   build-hpvm
   components/index
   specifications/index
   developerdocs/index
   gallery
   FAQs <faqs>
Indices and tables
------------------
Alternatively, it is possible to build just one DNN benchmark.
The output of CMake shows a list of these benchmarks as target names, starting with
.. code-block:: text

   List of test dnn benchmarks: alexnet2_cifar10;alexnet2_cifar10...
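
For example, to build a single benchmark by its target name (a sketch; the
exact target name comes from the CMake listing above, and the build tool
depends on your configuration):

.. code-block:: bash

   # In the build directory; use ninja instead of make if so configured
   make -j alexnet2_cifar10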