Commit 893ff539 authored by Maria Kotsifakou's avatar Maria Kotsifakou

Edited Abstract.

parent 301bd344
@@ -6,19 +6,21 @@ compute elements on a single chip. These computing elements use different
parallelism models, instruction sets, and memory hierarchies, making it difficult
to achieve performance and code portability on heterogeneous systems.
Application programming for such systems would be greatly simplified if a single
object code representation could be used to generate code for the different
compute units in a heterogeneous system. Previous efforts aiming to address the
source and object code portability challenges arising in such systems, such as
OpenCL, CUDA, SPIR, PTX and HSAIL, focus heavily on GPUs, which makes them
insufficient for today's SoCs.
We propose VISC, a framework for programming heterogeneous systems. In this
paper we focus on the crux of VISC, a novel virtual ISA design that adds
dataflow graph abstractions to LLVM IR to capture the diverse parallelism
models exposed by today's SoCs. We also present a compilation strategy to
generate code for AVX, PTX and X86 backends from a single virtual ISA
representation of a program. Through a set of experiments we show that code
generated for CPUs and GPUs from a single virtual ISA representation achieves
performance on par (within 1 to 1.6x) with hand-tuned code
\todo{What numbers to quote here?}. We further demonstrate that these virtual
ISA abstractions are also suited for capturing pipelining and streaming
parallelism.
\end{abstract}