Getting Started
===============
This tutorial covers the basic usage of all components in HPVM
(components listed :doc:`here </components/index>`).
We will compile a DNN model, AlexNet2 (for the CIFAR-10 dataset), into HPVM code,
perform autotuning on the compiled binary to find approximation choices (configurations),
and profile the selected configurations to get real performance on device.
The result will be a figure showing the accuracy-performance tradeoff of AlexNet2 over the
(pre-defined) approximations, along with the selected configurations in a few formats.
Please check that ``test/dnn_benchmarks/model_params/`` exists and contains
``alexnet2_cifar10/`` and ``pytorch/alexnet2_cifar10.pth.tar``,
which may not be the case if you opted out of model parameter download in the installer.
In that case, you may run the installer again to download the parameters;
it will not rebuild everything from scratch.
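If you want to verify this programmatically, a quick sanity check could look like
the sketch below (purely illustrative; it assumes you run it from the HPVM source root):

.. code-block:: python

   from pathlib import Path

   # Hypothetical check: confirm the model parameters were downloaded.
   params = Path("test/dnn_benchmarks/model_params")
   for required in ["alexnet2_cifar10", "pytorch/alexnet2_cifar10.pth.tar"]:
       assert (params / required).exists(), f"{required} missing; re-run the installer"
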
Generating and Compiling a DNN Model
------------------------------------
Below we will use `torch2hpvm`, the PyTorch frontend, as an example.
This package lives at ``projects/torch2hpvm`` and should have been installed by the installer.
The Keras frontend serves a similar purpose, and its usage can be found in the
:doc:`documentation </components/keras-frontend>`.
*Note* that below we'll be working under the directory ``test/dnn_benchmarks``
for easier access to ``test/dnn_benchmarks/model_params/``.
You can also symlink it to other locations (don't move it: it's used in test cases)
and adjust the paths below accordingly.
First, prepare 2 datasets for AlexNet2: one for autotuning and one for testing.
These datasets are provided as ``model_params/alexnet2_cifar10/{tune|test}_{input|labels}.bin``,
where the ``tune`` and ``test`` prefixes signify the tuning and testing sets respectively.

.. code-block:: python

   from torch2hpvm import BinDataset
   from pathlib import Path

   data_dir = Path("model_params/alexnet2_cifar10")
   dataset_shape = 5000, 3, 32, 32  # NCHW format.
   tuneset = BinDataset(data_dir / "tune_input.bin", data_dir / "tune_labels.bin", dataset_shape)
   testset = BinDataset(data_dir / "test_input.bin", data_dir / "test_labels.bin", dataset_shape)
`BinDataset` is a utility `torch2hpvm` provides for creating datasets over binary files.
Any instance of `torch.utils.data.Dataset` can be used here.
*Note* that each `module` is bound to 2 datasets: a "tune" and a "test" set.
The generated binary accepts a single argument, either the string "tune" or "test",
and performs inference over the corresponding dataset.
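If your data is not in this binary format, a hand-rolled dataset works as well.
Below is a minimal sketch (the class name and random tensors are hypothetical;
it only assumes, like `BinDataset`, that the dataset yields input/label pairs):

.. code-block:: python

   import torch
   from torch.utils.data import Dataset

   class InMemoryDataset(Dataset):
       """A hypothetical in-memory dataset yielding (input, label) pairs."""

       def __init__(self, inputs: torch.Tensor, labels: torch.Tensor):
           assert len(inputs) == len(labels)
           self.inputs, self.labels = inputs, labels

       def __len__(self):
           return len(self.inputs)

       def __getitem__(self, idx):
           return self.inputs[idx], self.labels[idx]

   # For example: 5000 CIFAR-10-shaped images (NCHW) with random content.
   random_tuneset = InMemoryDataset(
       torch.rand(5000, 3, 32, 32), torch.randint(0, 10, (5000,))
   )
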
Create a DNN `module` and load the checkpoint:

.. code-block:: python

   import torch
   from torch.nn import Module

   import dnn  # Defined at `hpvm/test/dnn_benchmarks/pytorch`

   model: Module = dnn.AlexNet2()
   checkpoint = "model_params/pytorch/alexnet2_cifar10.pth.tar"
   model.load_state_dict(torch.load(checkpoint))
Any `torch.nn.Module` can be used similarly,
as long as it only contains the tensor operators supported in HPVM.
See "Supported Operators" in the :doc:`PyTorch frontend <components/torch2hpvm>`
and :doc:`Keras frontend <components/keras-frontend>` documentation.
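For instance, a small model built only from common operators
(convolution, ReLU, pooling, and a fully-connected layer) might look like the
hypothetical sketch below; confirm each operator against the lists linked above:

.. code-block:: python

   from torch.nn import Conv2d, Flatten, Linear, MaxPool2d, Module, ReLU, Sequential

   class TinyCNN(Module):
       """A hypothetical CIFAR-10-sized model; not part of the HPVM test suite."""

       def __init__(self):
           super().__init__()
           self.layers = Sequential(
               Conv2d(3, 16, kernel_size=3, padding=1),  # 3x32x32 -> 16x32x32
               ReLU(),
               MaxPool2d(2),  # 16x32x32 -> 16x16x16
               Flatten(),  # -> 4096
               Linear(16 * 16 * 16, 10),  # -> 10 classes
           )

       def forward(self, x):
           return self.layers(x)
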
Now we are ready to export the model. The main class of `torch2hpvm` is `ModelExporter`:

.. code-block:: python

   from torch2hpvm import ModelExporter

   output_dir = Path("./alexnet2_cifar10")
   build_dir = output_dir / "build"
   target_binary = build_dir / "alexnet2_cifar10"
   batch_size = 500
   conf_file = "hpvm-c/benchmarks/alexnet2_cifar10/data/tuner_confs.txt"
   exporter = ModelExporter(model, tuneset, testset, output_dir, config_file=conf_file)
   exporter.generate(batch_size=batch_size).compile(target_binary, build_dir)
* `output_dir`, `build_dir`, and `target_binary` define the folder for code generation,
  the folder for compilation, and the path to the compiled binary, respectively.
* `batch_size` is the batch size the binary uses during inference.
* *Note* that `conf_file` is the path to an HPVM approximation configuration file.
  This file decides what approximations the binary will use during inference.
  The path is hardcoded into the binary and is only read when the binary starts,
  so it's fine to have `conf_file` point to a non-existing path.
  An example can be found at ``hpvm-c/benchmarks/alexnet2_cifar10/data/tuner_confs.txt``.
* `exporter.generate` generates the HPVM-C code, while `exporter.compile` is
  a helper that invokes the HPVM compiler for you.
Now there should be a binary at ``./alexnet2_cifar10/build/alexnet2_cifar10``.
Try running ``./alexnet2_cifar10/build/alexnet2_cifar10 test`` for inference over the test set.
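If you'd rather drive the binary from a script, a simple wrapper could look like
this (purely illustrative; the binary path and arguments are as described above):

.. code-block:: python

   import subprocess

   binary = "./alexnet2_cifar10/build/alexnet2_cifar10"
   # The binary takes a single argument, "tune" or "test",
   # selecting which dataset to run inference over.
   for split in ["tune", "test"]:
       subprocess.run([binary, split], check=True)
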
Compiling a Tuner Binary
------------------------
The previous binary is meant for inference.
To use the autotuner, we need a slightly different binary that can communicate with the tuner.
The following code is almost identical to the last code block,
but it passes `target="hpvm_tensor_inspect"` to `ModelExporter`
to request an autotuner binary.
It also doesn't define a `conf_file`.

.. code-block:: python

   from torch2hpvm import ModelExporter

   tuner_output_dir = Path("./alexnet2_cifar10_tuner")
   tuner_build_dir = tuner_output_dir / "build"
   tuner_binary = tuner_build_dir / "alexnet2_cifar10"
   exporter = ModelExporter(model, tuneset, testset, tuner_output_dir, target="hpvm_tensor_inspect")
   exporter.generate(batch_size=500).compile(tuner_binary, tuner_build_dir)
This binary is generated at ``alexnet2_cifar10_tuner/build/alexnet2_cifar10``.
It waits for the autotuner's signal and doesn't run on its own, so don't run it yourself.
Instead, import and use the autotuner, `predtuner`:

.. code-block:: python

   from predtuner import PipedBinaryApp, config_pylogger

   # Set up logger to put the log file in /tmp.
   msg_logger = config_pylogger(output_dir="/tmp", verbose=True)
   metadata_file = tuner_output_dir / exporter.metadata_file_name
   # Create a `PipedBinaryApp` that communicates with the HPVM binary.
   # "TestHPVMApp" is an identifier of this app (used in logging, etc.) and can be anything.
   # Other arguments:
   #   base_dir: which directory to run the binary in (default: the dir the binary is in)
   #   qos_relpath: the name of the accuracy file generated by the binary.
   #     Defaults to "final_accuracy". For HPVM apps this shouldn't change.
   #   model_storage_folder: where to put saved P1/P2 models.
   app = PipedBinaryApp(
       "TestHPVMApp",
       tuner_binary,
       metadata_file,
       # Where to serialize prediction models if they are used.
       # For example, if you use p1 (see below), this will leave you a
       # tuner_results/alexnet2_cifar10/p1.pkl,
       # which can be quickly reloaded the next time you tune the same app.
       model_storage_folder="tuner_results/alexnet2_cifar10",
   )
   tuner = app.get_tuner()
   tuner.tune(
       max_iter=1000,  # Number of tuning iterations. In practice, use at least 5000 or 10000.
       qos_tuner_threshold=3.0,  # QoS threshold to guide the tuner's search
       qos_keep_threshold=3.0,  # QoS threshold for which we actually keep the configurations
       is_threshold_relative=True,  # Thresholds are relative to the baseline (baseline_acc - 3.0)
       take_best_n=50,  # Take the best 50 configurations
       cost_model="cost_linear",  # Use the linear performance predictor
       qos_model="qos_p1",  # Use the P1 QoS predictor
   )
   fig = tuner.plot_configs(show_qos_loss=True)
   fig.savefig("configs.png", dpi=300)
   app.dump_hpvm_configs(tuner.best_configs, "hpvm_confs.txt")
fig = tuner.plot_configs(show_qos_loss=True)
fig.savefig("configs.png", dpi=300)
app.dump_hpvm_configs(tuner.best_configs, "hpvm_confs.txt")
* *Note* that the performance shown here is estimated.
  The next section talks about profiling (on a different machine),
  which shows the real performance.
* The arguments `cost_model` and `qos_model` control the models used in tuning.
  No model is used when the corresponding argument is omitted.
  For example, you can do an empirical tuning run by removing `qos_model="qos_p1"`.
* `cost_model="cost_linear"` estimates the performance of a configuration
  using the FLOPs of each operator and the FLOPs reduction of each approximation.
  If you are tuning on the end device that you wish to run inference on (which is rare),
  removing this argument makes the tuner measure real performance instead.
  In that case, you may skip the profiling step.
This tuning process should take a few minutes to half an hour,
depending on your GPU performance.
After the tuning finishes, the tuner will

* generate a figure showing the performance-accuracy tradeoff at ``./configs.png``, and
* save the configurations in the HPVM config format (write-only) at ``./hpvm_confs.txt``.

It is also possible to save the configurations in other formats
(see the :doc:`predtuner documentation <components/predtuner>`).
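For instance, since the best configurations are exposed as a plain Python object
(`tuner.best_configs`, as used above), one generic approach is to serialize them
yourself. This is a sketch, not a predtuner API, and it assumes the configuration
objects are picklable:

.. code-block:: python

   import pickle

   # Hypothetical: snapshot the tuned configurations for later reuse.
   with open("best_configs.pkl", "wb") as f:
       pickle.dump(tuner.best_configs, f)
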
Profiling the Configurations
----------------------------
We will use `hpvm_profiler`, another Python package, to profile the ``./hpvm_confs.txt``
we obtained in the tuning step.

* The profiler uses the *plain* binary generated in the beginning (its path is `target_binary`)
  instead of the tuner binary.
* *Note* that you may want to run this profiling step on the edge device
  where the performance gain is desired.
  As the compiled binary is usually not compatible across architectures,
  you need to install HPVM on the edge device and recompile the model.
* *Also note* that currently
  the approximation implementations in the tensor runtime are tuned for the Jetson TX2,
  so the speedup may be smaller on other architectures.

.. code-block:: python

   from hpvm_profiler import profile_configs, plot_hpvm_configs

   # Set `target_binary` to the path of the plain binary.
   target_binary = "./alexnet2_cifar10/build/alexnet2_cifar10"
   # Set `config_file` to the config file produced in tuning, such as "hpvm_confs.txt".
   config_file = "hpvm_confs.txt"
   out_config_file = "hpvm_confs_profiled.txt"
   profile_configs(target_binary, config_file, out_config_file)
   plot_hpvm_configs(out_config_file, "configs_profiled.png")
``hpvm_confs_profiled.txt`` contains the profiled configurations in HPVM format,
while ``configs_profiled.png`` shows the final performance-accuracy tradeoff curve.
An example of ``configs_profiled.png`` looks like this (the proportions of your image may differ):

.. image:: tradeoff-curves/alexnet2_cifar10.png
-----------------------
This concludes the whole workflow of HPVM.
For more detailed usage, check out the documentation of each component listed
:doc:`here <components/index>`.