diff --git a/hpvm/docs/build-hpvm.rst b/hpvm/docs/build-hpvm.rst
index 0701784a4e5a3e6180d072dfbc545ddc1665e0ed..4e31ad8c7f4e7f43f998c17493d80f9c9cfeecd1 100644
--- a/hpvm/docs/build-hpvm.rst
+++ b/hpvm/docs/build-hpvm.rst
@@ -12,9 +12,13 @@ Dependencies
 
    * GNU Make (>=3.79) or Ninja (>=1.10)
 
-   * Python (>=3.6) with pip (>=20)
+   * Python (==3.6) with pip (>=20)
 
       * Python must be strictly 3.6 (any subversion from 3.6.0 to 3.6.13).
+        This is required by some Python packages in HPVM.
+
+      * If you choose not to install these packages, any Python >= 3.6 will work.
+        See :ref:`how to skip installing Python packages in the installer <skip-pypkg>`.
 
 * OpenCL (>=1.0.0) is required for compiling HPVM-C code on GPU; otherwise, only CPU is available.
 
@@ -28,10 +32,10 @@ Dependencies
    * OpenMP (>= 4.0)
 
       * GCC comes with OpenMP support; OpenMP-4.0 is supported by GCC-4.9 onward.
-        see `here <https://gcc.gnu.org/wiki/openmp>`_ for the OpenMP version supported by each GCC version.
+        see `here <https://gcc.gnu.org/wiki/openmp>`__ for the OpenMP version supported by each GCC version.
  
    * In addition, each version of CUDA-nvcc requires GCC to be not newer than a certain version.
-     See `here <https://gist.github.com/ax3l/9489132>`_ for the support matrix.
+     See `here <https://gist.github.com/ax3l/9489132>`__ for the support matrix.
 
 
 Python Environment
@@ -100,7 +104,7 @@ the directory ``hpvm/projects/predtuner`` should be empty,
 which can be fixed with ``git submodule update --recursive --init``.
 
 HPVM needs to be able to find CUDA.
-If CUDA is installed in your system's `$PATH` (e.g. if it was installed at the default location),
+If CUDA is installed in your system's ``$PATH`` (e.g. if it was installed at the default location),
 HPVM can find CUDA automatically.
 
 Use HPVM installer script to download extra components, configure and build HPVM:
@@ -159,6 +163,26 @@ The HPVM installer performs the following tasks:
 
   * While running tests is recommended, it is not turned on by default as it is very time-consuming.
 
+.. _skip-pypkg:
+
+Skipping Python Package Installation
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+If you are installing HPVM on a "target" device that is used only for
+:ref:`profiling <target-profiling>`,
+you may not need to install the frontend and the tuner packages.
+These packages also have a strict Python version requirement and package dependencies
+that may be hard to meet on some devices, especially edge-computing devices with ARM CPUs.
+
+You can skip their installation either by passing the ``--no-pypkg`` flag to
+the installer, or by answering no ("n") when it prompts the following:
+
+.. code-block:: text
+
+   Install HPVM Python Packages (recommended)? [y/n]
+
+In this case, any Python >= 3.6 will work.
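+For example, a minimal sketch of such an invocation (assuming the installer
+script is run as ``./install.sh`` from the HPVM source directory; the exact
+script name and location may differ in your checkout):
+
+.. code-block:: shell
+
+   ./install.sh --no-pypkg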
+
 TroubleShooting
 ^^^^^^^^^^^^^^^
 
diff --git a/hpvm/docs/developerdocs/compilation-process.rst b/hpvm/docs/developerdocs/compilation-process.rst
deleted file mode 100644
index 1115de935f4adcb06e1946bf97983641877fa5ee..0000000000000000000000000000000000000000
--- a/hpvm/docs/developerdocs/compilation-process.rst
+++ /dev/null
@@ -1,23 +0,0 @@
-.. _hpvm-comp-process:
-
-HPVM Compilation Process
-========================
-
-Compilation of an HPVM program involves the following steps:
-
-
-#. ``clang`` takes an HPVM-C/C++ program (e.g. ``main.c``) and produces an LLVM IR (``main.ll``) file that contains the HPVM-C function calls. The declarations of these functions are defined in ``test/benchmark/include/hpvm.h``, which must be included in the program.
-#. ``opt`` takes (``main.ll``) and invoke the GenHPVM pass on it, which converts the HPVM-C function calls to HPVM intrinsics. This generates the HPVM textual representation (``main.hpvm.ll``).
-#. ``opt`` takes the HPVM textual representation (``main.hpvm.ll``) and invokes the following passes in sequence: 
-
-   * BuildDFG: Converts the textual representation to the internal HPVM representation.
-   * LocalMem and DFG2LLVM_OpenCL: Invoked only when GPU target is selected. Generates the kernel module (``main.kernels.ll``) and the portion of the host code that invokes the kernel into the host module (``main.host.ll``).
-   * DFG2LLVM_CPU: Generates either all, or the remainder of the host module (``main.host.ll``) depending on the chosen target.
-   * ClearDFG: Deletes the internal HPVM representation from memory.
-
-#. ``clang`` is used to to compile any remaining project files that would be later linked with the host module.
-#. ``llvm-link`` takes the host module and all the other generate ``ll`` files, and links them with the HPVM runtime module (``hpvm-rt.bc``), to generate the linked host module (``main.host.linked.ll``). 
-#. Generate the executable code from the generated ``ll`` files for all parts of the program:
-
-   * GPU target: ``llvm-cbe`` takes the kernel module (``main.kernels.ll``) and generates an OpenCL representation of the kernels that will be invoked by the host.
-   * CPU target: ``clang`` takes the linked  host module (``main.host.linked.ll``) and generates the CPU binary.
diff --git a/hpvm/docs/developerdocs/index.rst b/hpvm/docs/developerdocs/index.rst
index c9889681f82a353b5806f4af51d81ea7b50460b9..4e5508909bcd9e4e790752b60b763f4efd0e845f 100644
--- a/hpvm/docs/developerdocs/index.rst
+++ b/hpvm/docs/developerdocs/index.rst
@@ -5,8 +5,8 @@ Developer Documents
    :maxdepth: 1
 
    approximation-implementation
+   backend-passes
    cnn-models
-   compilation-process
    configuration-format
    dynamic-approximation
    port-to-hpvm-c
diff --git a/hpvm/docs/getting-started.rst b/hpvm/docs/getting-started.rst
index 3fd229086dbb1cdb006b11530a5a68155ba28fd1..396e407439c2cf1b7648ca665948775dd77b2582 100644
--- a/hpvm/docs/getting-started.rst
+++ b/hpvm/docs/getting-started.rst
@@ -190,8 +190,10 @@ After the tuning finishes, the tuner will
 It is also possible to save the configuration in other formats
 (see the `predtuner documentation <https://predtuner.readthedocs.io/en/latest/index.html>`_).
 
-Profiling the Configurations
-----------------------------
+.. _target-profiling:
+
+Profiling the Configurations on the Target Device
+-------------------------------------------------
 
 We will use `hpvm_profiler` (a Python package) for profiling the ``./hpvm_confs.txt``
 we obtained in the tuning step.
@@ -203,6 +205,8 @@ we obtained in the tuning step.
   where the performance gain is desired.
   As the compiled binary is usually not compatible across architectures,
   you need to install HPVM on the edge device and recompile the model.
+  You may also want to :ref:`skip Python packages in the installation <skip-pypkg>`
+  to relax the constraints on the Python version and required packages.
 
 * **Also note** that currently,
   the approximation implementations in the tensor runtime are tuned for Jetson TX2,
diff --git a/hpvm/docs/index.rst b/hpvm/docs/index.rst
index cd6323bcdd969b3b5f5c1343d9d93b4154301f4c..92b32cdfb06c7cbf9d18adf71ce5fd752453eaf7 100644
--- a/hpvm/docs/index.rst
+++ b/hpvm/docs/index.rst
@@ -52,11 +52,11 @@ Please refer to :doc:`getting-started` for how to build and use HPVM.
 
    getting-started
    build-hpvm
-   FAQs<faqs>
    components/index
    specifications/index
    developerdocs/index
    gallery
+   FAQs<faqs>
 
 Indices and tables
 ------------------
diff --git a/hpvm/test/README.rst b/hpvm/test/README.rst
index efd37dccf16468cf3cc2e224ba8503ed496032c8..796699c068b512b1029ffd21aef2a7d1f27d25fe 100644
--- a/hpvm/test/README.rst
+++ b/hpvm/test/README.rst
@@ -99,7 +99,7 @@ For each benchmark ``${bench_name}``, the binary is generated at
 Alternatively, it's possible to build just 1 DNN benchmark.
 The output of CMake shows a list of these benchmarks as target names, starting with
 
-..
+.. code-block:: text
 
    List of test dnn benchmarks: alexnet2_cifar10;alexnet2_cifar10...