Commit 599b8a6d authored by Vikram Adve's avatar Vikram Adve
Revise to have less motivation, get to the point a bit quicker,

say explicitly how we are better than existing virtual ISAs, and
say our compiler prototype (not our compilation strategy) is preliminary.
parent 0cfc4b4b
\begin{abstract}
%
Heterogeneous computing is widely used in the System-on-chip (SoC) processors
that power modern mobile devices in order to
reduce power consumption through specialization.
However, programming such systems can be extremely complex because a single
SoC combines multiple
parallelism models, instruction sets, and memory hierarchies, and different
SoCs use \emph{different combinations} of these features.
We propose a new Virtual Instruction Set Architecture (ISA) that aims to
address both functional portability and performance portability across
mobile heterogeneous SoCs by capturing the wide range of different
parallelism models expected to be available on future SoCs.
Our virtual ISA design uses only two parallelism models to achieve this goal:
\emph{a hierarchical dataflow graph with side effects} and
\emph{parametric vector instructions}.
Our virtual ISA is more general than existing ones, such as PTX, HSAIL and
SPIR, which focus heavily on GPUs; for example, it can capture both streaming
pipelined parallelism and the general dataflow parallelism found in many custom
and semi-custom (programmable) accelerators.
We present a compilation strategy to generate code for a diverse range
of target hardware components from the common virtual ISA.
As a first prototype, we have implemented backends for
GPUs using NVIDIA's PTX,
for vector hardware using Intel's AVX, and
for host code running on x86 processors.
Experimental results show that code generated for vector hardware and GPUs
from a single virtual ISA representation achieves
performance within about 2x of separately hand-tuned code,
and much closer in most cases.
We further demonstrate qualitatively using a realistic example
that our virtual ISA abstractions are also suited for capturing pipelining and
streaming parallelism.
%
\end{abstract}