Commit 8b49370a authored by Hashim Sharif, committed by Yifan Zhao

Merging in remote changes

parents 62b63ab7 d907e910
Showing
with 76 additions and 48 deletions
hpvm/docs/_static/alexnet2_cifar10.png

72.3 KiB

hpvm/docs/_static/alexnet_cifar10.png

76.8 KiB

hpvm/docs/_static/alexnet_imagenet.png

69.8 KiB

hpvm/docs/_static/lenet_mnist.png

60.3 KiB

hpvm/docs/_static/mobilenet_cifar10.png

66 KiB

hpvm/docs/_static/resnet18_cifar10.png

71.2 KiB

hpvm/docs/_static/resnet50_imagenet.png

73.5 KiB

hpvm/docs/_static/vgg16_cifar10.png

77.2 KiB

hpvm/docs/_static/vgg16_cifar100.png

74.6 KiB

......@@ -7,7 +7,7 @@ HPVM consists of a few relatively independent key components.
* HPVM code generator: a few ``opt`` passes that lower HPVM IR to LLVM IR,
which is then compiled into object code and binary.
-:doc:`Compilation process of HPVM </references/hpvm-specification>`
+:doc:`Compilation process of HPVM </specifications/hpvm-spec>`
shows how these 2 components work together.
In addition, there are:
......
......@@ -98,7 +98,7 @@ html_theme_options = {
# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
-html_static_path = []
+html_static_path = ["_static"]
# If not '', a 'Last updated on:' timestamp is inserted at every page bottom,
# using the given strftime format.
......
......@@ -6,4 +6,6 @@ Developer Documents
approximation-implementation
cnn-models
compilation-process
configuration-format
port-to-hpvm-c
Porting a Program from C to HPVM-C
==================================
The following steps are required to port a regular C program into an HPVM program with HPVM-C. These steps are described at a high level; for more detail, please see `hpvm-cava </hpvm/test/benchmarks/hpvm-cava>`_ provided in `benchmarks </hpvm/test/benchmarks>`_.
* Separate the computation that will become a kernel into its own (leaf node) function and add the attributes and target hint.
* Create a level 1 wrapper node function that will describe the thread-level parallelism (for the GPU). The node will:

  * Use the ``createNode[ND]()`` method to create a kernel node and specify how many threads will execute it.
  * Bind its arguments to the kernel arguments.

* If desired, create a level 2 wrapper node function that will describe the threadblock-level parallelism (for the GPU). This node will:

  * Use the ``createNode[ND]()`` method to create a level 1 wrapper node and specify how many threadblocks will execute it.
  * Bind its arguments to its child node's arguments.

* Create a root node function that creates all the top-level wrapper nodes, binds their arguments, and connects their edges.

  * Each root node represents a DFG.

* All the above node functions have the combined arguments of all the kernels that are nested at each level.
* The host code must do the following:

  * Initialize the HPVM runtime using the ``init()`` method.
  * Create an argument struct for each DFG and assign its member variables.
  * Add all the memory that is required by the kernel into the memory tracker.
  * Launch the DFG by calling the ``launch()`` method on the root node function, and passing the corresponding argument struct.
  * Wait for the DFG to complete execution.
  * Read out any generated memory using the ``request_mem()`` method.
  * Remove all the tracked memory from the memory tracker.
Gallery
=======
This gallery contains example tradeoff curves for 9 of the 10 DNN benchmarks in HPVM.
The performance shown was measured on a Jetson TX2;
these numbers are close to those reported in `ApproxTuner <https://dl.acm.org/doi/10.1145/3437801.3446108>`_,
with small differences due to variation in autotuning and profiling.
.. list-table:: Tradeoff curves of 9 benchmark DNNs.
   :widths: 30 30
   :header-rows: 0

   * - .. image:: _static/lenet_mnist.png
          :target: _static/lenet_mnist.png
     - .. image:: _static/alexnet2_cifar10.png
          :target: _static/alexnet2_cifar10.png
   * - .. image:: _static/alexnet_cifar10.png
          :target: _static/alexnet_cifar10.png
     - .. image:: _static/alexnet_imagenet.png
          :target: _static/alexnet_imagenet.png
   * - .. image:: _static/vgg16_cifar10.png
          :target: _static/vgg16_cifar10.png
     - .. image:: _static/vgg16_cifar100.png
          :target: _static/vgg16_cifar100.png
   * - .. image:: _static/resnet18_cifar10.png
          :target: _static/resnet18_cifar10.png
     - .. image:: _static/resnet50_imagenet.png
          :target: _static/resnet50_imagenet.png
   * - .. image:: _static/mobilenet_cifar10.png
          :target: _static/mobilenet_cifar10.png
     -
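
Because the gallery depends on the image files under ``_static`` being present, a quick check before building the docs can catch a missing file. The following is a minimal sketch, not part of HPVM: the helper name ``missing_images`` is hypothetical, and it assumes image paths are written relative to the docs directory, as they are in the table above.

```python
import re
from pathlib import Path

# Matches the path argument of an RST image directive, e.g.
# ".. image:: _static/lenet_mnist.png". Option lines such as
# ":target: ..." are deliberately not matched.
IMAGE_DIRECTIVE = re.compile(r"\.\.\s+image::\s+(\S+)")


def missing_images(rst_text, docs_dir):
    """Return the image paths referenced in rst_text that are not files under docs_dir.

    Hypothetical helper: assumes every path is relative to docs_dir.
    """
    docs_dir = Path(docs_dir)
    referenced = IMAGE_DIRECTIVE.findall(rst_text)
    return [path for path in referenced if not (docs_dir / path).is_file()]
```

Calling ``missing_images(Path("gallery.rst").read_text(), ".")`` from the docs directory would return an empty list when every referenced image exists; the file name ``gallery.rst`` is inferred from the ``gallery`` toctree entry and may differ in your checkout.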
......@@ -222,7 +222,7 @@ while ``configs_profiled.png`` shows the final performance-accuracy tradeoff cur
An example of ``configs_profiled.png`` looks like this (the proportions of your image may differ):
-.. image:: tradeoff-curves/alexnet2_cifar10.png
+.. image:: _static/alexnet2_cifar10.png
-----------------------
......
......@@ -52,8 +52,9 @@ Documentation
getting-started
tests
components/index
-references/index
+specifications/index
developerdocs/index
gallery
Indices and tables
------------------
......
......@@ -19,12 +19,19 @@ The following components are required to be installed on your machine to build H
Python must be strictly 3.6 (any patch release from 3.6.0 to 3.6.13).
Alternatively, if you use Anaconda for package management,
-we provide a conda environment file that covers all Python and package requirements:
+we provide a conda environment file that covers all Python and package requirements
+(``hpvm/env.yaml`` can be found in the repository):
.. code-block:: bash

   conda env create -n hpvm -f hpvm/env.yaml

This creates the conda environment ``hpvm``.
If you use this method, remember to activate the environment each time you enter a bash shell:

.. code-block:: bash

   conda activate hpvm
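
Since the Python requirement is strict, it can help to confirm which interpreter is active before building. The following is a minimal sketch, not something shipped with HPVM: the function name ``is_supported_python`` is hypothetical, and it only encodes the "3.6.x" rule stated above.

```python
import sys

# Major and minor version required by the text above (any 3.6.x release).
SUPPORTED = (3, 6)


def is_supported_python(version_info=None):
    """Return True when the given (or running) interpreter is a 3.6.x release."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) == SUPPORTED


if __name__ == "__main__":
    if not is_supported_python():
        found = ".".join(str(part) for part in sys.version_info[:3])
        print("HPVM expects Python 3.6.x, found " + found)
```

Running the script inside the activated ``hpvm`` environment should print nothing if the environment was created from ``hpvm/env.yaml`` as described.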
Supported Architectures
-----------------------
......
References
============
Below are some technical details of the HPVM system and the HPVM-C language.
.. toctree::
:maxdepth: 1
hpvm-c
hpvm-specification
compilation-process
......@@ -117,35 +117,3 @@ Atomically computes the bitwise XOR of ``v`` and the value stored at memory loca
``void __hpvm__barrier()``:raw-html-m2r:`<br>`
Local synchronization barrier across dynamic instances of current leaf node.
Porting a Program from C to HPVM-C
==================================
The following represents the required steps to port a regular C program into an HPVM program with HPVM-C. These steps are described at a high level; for more detail, please see `hpvm-cava </hpvm/test/benchmarks/hpvm-cava>`_ provided in `benchmarks </hpvm/test/benchmarks>`_.
* Separate the computation that will become a kernel into its own (leaf node) function and add the attributes and target hint.
* Create a level 1 wrapper node function that will describe the thread-level parallelism (for the GPU). The node will:
* Use the ``createNode[ND]()`` method to create a kernel node and specify how many threads will execute it.
* Bind its arguments to the kernel arguments.
* If desired, create a level 2 wrapper node function which will describe the threadblock-level parallelism (for the GPU). This node will:
* Use the ``createNode[ND]()`` method to create a level 1 wrapper node and specify how many threadblocks will execute it.
* Bind its arguments to its child node's arguments.
* A root node function that creates all the top-level wrapper nodes, binds their arguments, and connects their edges.
* Each root node represents a DFG.
* All the above node functions have the combined arguments of all the kernels that are nested at each level.
* The host code will have to include the following:
* Initialize the HPVM runtime using the ``init()`` method.
* Create an argument struct for each DFG and assign its member variables.
* Add all the memory that is required by the kernel into the memory tracker.
* Launch the DFG by calling the ``launch()`` method on the root node function, and passing the corresponding argument struct.
* Wait for the DFG to complete execution.
* Read out any generated memory using the ``request_mem()`` method.
* Remove all the tracked memory from the memory tracker.