# ApproxHPVM Tensor Runtime

## Getting Started
### Dependencies

- CUDA-9.1 or above
  - Your device must have a CUDA-enabled nVidia GPU.
  - CUBLAS-9.1 or above (included with CUDA by default).
- cuDNN-7.0 or above
- cmake >= 3.18
- make >= 4
- gcc < 8, or 3.2 <= clang < 9
  - We place an upper bound on the compiler version because CUDA does not support very recent compilers.
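Before building, it can be useful to confirm that the installed CUDA and cuDNN versions meet these requirements. The snippet below is only an illustrative sanity check (it is not part of this repository); it relies on the standard `CUDART_VERSION` and `CUDNN_MAJOR`/`CUDNN_MINOR`/`CUDNN_PATCHLEVEL` macros and `cudaGetDeviceCount`. Include and library paths are assumptions and may differ on your system.

```cpp
// check_versions.cc -- illustrative sanity check, not part of this repository.
// Build (paths may differ): g++ check_versions.cc -I/usr/local/cuda/include \
//     -L/usr/local/cuda/lib64 -lcudart -lcudnn -o check_versions
#include <cstdio>
#include <cuda_runtime.h>
#include <cudnn.h>

int main() {
  // CUDART_VERSION encodes the CUDA version, e.g. 9010 for CUDA 9.1 (we need >= 9.1).
  std::printf("CUDA runtime version: %d\n", CUDART_VERSION);
  // cuDNN exposes its version through these macros (we need >= 7.0).
  std::printf("cuDNN version: %d.%d.%d\n", CUDNN_MAJOR, CUDNN_MINOR, CUDNN_PATCHLEVEL);
  // A CUDA-enabled GPU must be visible for the tensor runtime to work.
  int num_gpus = 0;
  cudaError_t err = cudaGetDeviceCount(&num_gpus);
  std::printf("CUDA-enabled GPUs visible: %d\n", err == cudaSuccess ? num_gpus : 0);
  return 0;
}
```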
### Building the Tensor Runtime

The following commands will compile the tensor runtime library (`build/libtensor_runtime.a`)
as well as a number of exemplary benchmarks (DNN models):

    mkdir build && cd build
    cmake ../
    make -j
### Tensor Runtime APIs

- `tensor_runtime/include/tensor_runtime.h` declares all the functions available in the runtime.
  - TODO: the tensor runtime is generally under-documented at the time of writing. More documentation will be added in the first public release.
- For examples of using `tensor_runtime` functions, see `dnn_sources/src/alexnet_cifar10.cc`; a minimal sketch of the call pattern is also given after this list.
- Also, try running `build/alexnet_cifar10`, which is compiled from that file and runnable out of the box.
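For orientation, the typical call pattern in the benchmark sources looks roughly like the sketch below. This is a sketch under assumptions: the function names (`llvm_hpvm_initTensorRt`, `create4DTensor`, `tensorConvolution`, `tensorRelu`, `llvm_hpvm_cleanupTensorRt`) appear in the benchmark sources, but the exact signatures, argument order, and constant values used here are not guaranteed; consult `tensor_runtime.h` and `alexnet_cifar10.cc` for the authoritative API.

```cpp
// sketch.cc -- illustrative sketch only, not part of this repository.
// Assumed API: names are taken from dnn_sources/src/alexnet_cifar10.cc and
// tensor_runtime/include/tensor_runtime.h; verify signatures before use.
#include "tensor_runtime.h"

int main() {
  // Bring up the GPU runtime (cuDNN/cuBLAS handles, etc.) on device 0.
  llvm_hpvm_initTensorRt(0);

  // Create 4D tensors for a batch of inputs and one convolution filter.
  // The data_type / data_format constants below are assumptions; see tensor_runtime.h.
  void *input  = create4DTensor(/*data_type=*/0, /*data_format=*/0, 100, 3, 32, 32);
  void *filter = create4DTensor(/*data_type=*/0, /*data_format=*/0, 64, 3, 5, 5);

  // One convolution layer followed by ReLU, in the style of the benchmark sources.
  // Padding/stride/mode/group values here are placeholders, not prescribed settings.
  void *conv = tensorConvolution(input, filter,
                                 /*vertical_pad=*/2, /*horizontal_pad=*/2,
                                 /*vertical_stride=*/1, /*horizontal_stride=*/1,
                                 /*conv_mode=*/1, /*conv_groups=*/1);
  void *relu = tensorRelu(conv);
  (void)relu;

  // Tear down the runtime when finished.
  llvm_hpvm_cleanupTensorRt();
  return 0;
}
```

The full pipeline in `alexnet_cifar10.cc` chains calls of this kind (convolution, add, activation, pooling, GEMM, softmax) layer by layer.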
## Developer Notes

### Directory Structure

- `./tensor_runtime`:
  - `./tensor_runtime/include/`: Include files for the Tensor Runtime
  - `./tensor_runtime/include/tensor_signatures.cc`: Include file with Tensor RT signatures
    - NOTE: UPDATE this with updated API
  - `./tensor_runtime/src/`: HPVM Tensor RT sources
- `./dnn_sources`:
  - `./dnn_sources/src/${BENCH}.cc`: Per-bench FULL-precision source
  - `./dnn_sources/src/half/${BENCH}.cc`: Per-bench HALF-precision source
  - `./dnn_sources/src/promise/${BENCH}.cc`: Per-bench Layer-API source
- `./bin`:
  - `./bin/install_runtime.sh`: Script for moving Tensor RT files to `./lib`
  - `./bin/run_autotuner.py`: Python script for running autotuner experiments
  - `./bin/setup_tyler_paths.sh`: Tyler-specific path setup for Tensor RT
  - `./bin/setup_jetson.sh`: Jetson-board-specific path setup for Tensor RT
  - `./bin/setup_cuda_paths.sh`: Placeholder script for setting CUDA paths
  - `./bin/swing_selection.py`: Script for hardware mapping
    - NOTE: includes the L2, L1 norm mapping to hardware knobs