diff --git a/hpvm/docs/build-hpvm.rst b/hpvm/docs/build-hpvm.rst
index 43f6e0f67fc58ba7e7b2e7386e8c55191eb801eb..f01548d8393210d4f5d0916a55cec881e703246c 100644
--- a/hpvm/docs/build-hpvm.rst
+++ b/hpvm/docs/build-hpvm.rst
@@ -44,7 +44,7 @@ Python Environment
 It is strongly recommended to use some Python virtual environment,
 as HPVM will install a few Python packages during this installation process.
 
-* Some HPVM Python packages contains executables. If you don't use a virtual environment,
+* Some HPVM Python packages contain executables. If you don't use a virtual environment,
   these executables are installed to your local ``bin`` directory, usually ``$HOME/.local/bin``.
   Please ensure this directory is in your `$PATH` variable.
   Below it is assumed that these executables are visible through `$PATH`.
diff --git a/hpvm/docs/developerdocs/backend-passes.rst b/hpvm/docs/developerdocs/backend-passes.rst
index ceea1be303310b6e6640c4dac12c82aafaabd421..b751688d18e27a802a8068d2e0a2d29a31188483 100644
--- a/hpvm/docs/developerdocs/backend-passes.rst
+++ b/hpvm/docs/developerdocs/backend-passes.rst
@@ -33,7 +33,7 @@ Consider a 3 dimensional Leaf node function with the following body:
 
        p[index] = p[index] + q[index];
 
-       __hpvm_return(4, p, pSize, q, qSize);
+       __hpvm__return(4, p, pSize, q, qSize);
    }
 
 
@@ -62,7 +62,7 @@ The above example will illustrate the steps in transforming the leaf node in thi
 
        p[index] = p[index] + q[index];
 
-       __hpvm_return(4, p, pSize, q, qSize);
+       __hpvm__return(4, p, pSize, q, qSize);
    }
 
 
@@ -89,7 +89,7 @@ The above example will illustrate the steps in transforming the leaf node in thi
 
        p[index] = p[index] + q[index];
 
-       __hpvm_return(4, p, pSize, q, qSize);
+       __hpvm__return(4, p, pSize, q, qSize);
    }
 
 
diff --git a/hpvm/docs/specifications/hpvm-spec.rst b/hpvm/docs/specifications/hpvm-spec.rst
index 80cec0bd8f9e593b82c77ae68ff42e870088b01e..010e3c10b95bc80ef5b1865a5797f0d5f24f45d7 100644
--- a/hpvm/docs/specifications/hpvm-spec.rst
+++ b/hpvm/docs/specifications/hpvm-spec.rst
@@ -1,5 +1,6 @@
-.. role:: raw-html-m2r(raw)
-   :format: html
+.. |br| raw:: html
+
+   <br/>
 
 HPVM Language Reference
 =======================
@@ -105,25 +106,25 @@ Intrinsics for Describing Graphs
 
 The intrinsics for describing graphs can only be used by internal nodes. Also, internal nodes are only allowed to have these intrinsics as part of their node function, with the exception of a return statement of the appropriate type, in order to return the result of the outgoing dataflow edges.
 
-``i8* llvm.hpvm.createNode(i8* F)``:raw-html-m2r:`<br>`
+``i8* llvm.hpvm.createNode(i8* F)`` |br|
 Create a static dataflow node with one dynamic instance executing node function ``F``. Return a handle to the created node.
 
-``i8* llvm.hpvm.createNode1D(i8* F, i64 n1)``:raw-html-m2r:`<br>`
+``i8* llvm.hpvm.createNode1D(i8* F, i64 n1)`` |br|
 Create a static dataflow node replicated in one dimension, namely ``x``, with ``n1`` dynamic instances executing node function ``F``. Return a handle to the created node.
 
-``i8* llvm.hpvm.createNode2D(i8* F, i64 n1, i64 n2)``:raw-html-m2r:`<br>`
+``i8* llvm.hpvm.createNode2D(i8* F, i64 n1, i64 n2)`` |br|
 Create a static dataflow node replicated in two dimensions, namely ``x`` and ``y``, with ``n1`` and ``n2`` dynamic instances in each dimension respectively, executing node function ``F``. Return a handle to the created node.
 
-``i8* llvm.hpvm.createNode3D(i8* F, i64 n1, i64 n2, i64 n3)``:raw-html-m2r:`<br>`
+``i8* llvm.hpvm.createNode3D(i8* F, i64 n1, i64 n2, i64 n3)`` |br|
 Create a static dataflow node replicated in three dimensions, namely ``x``, ``y`` and ``z``, with ``n1``, ``n2`` and ``n3`` dynamic instances in each dimension respectively, executing node function ``F``. Return a handle to the created node.
 
-``i8* llvm.hpvm.createEdge(i8* Src, i8* Dst, i1 ReplType, i32 sp, i32 dp, i1 isStream)``:raw-html-m2r:`<br>`
+``i8* llvm.hpvm.createEdge(i8* Src, i8* Dst, i1 ReplType, i32 sp, i32 dp, i1 isStream)`` |br|
 Create edge from output ``sp`` of node ``Src`` to input ``dp`` of node ``Dst``. Argument ``dp`` of ``Dst``'s node function and field ``sp`` of the return struct in ``Src``'s node function must have matching types. ``ReplType`` chooses between a one-to-one (0) or all-to-all (1) edge. ``isStream`` chooses a streaming (1) or non streaming (0) edge. Return a handle to the created edge.
 
-``void llvm.hpvm.bind.input(i8* N, i32 ip, i32 ic, i1 isStream)``:raw-html-m2r:`<br>`
+``void llvm.hpvm.bind.input(i8* N, i32 ip, i32 ic, i1 isStream)`` |br|
 Bind input ``ip`` of current node to input ``ic`` of child node ``N``. Argument ``ic`` of ``N``'s node function and argument ``ip`` of the current node function must have matching types. ``isStream`` chooses a streaming (1) or non streaming (0) bind.
 
-``void llvm.hpvm.bind.output(i8* N, i32 oc, i32 op, i1 isStream)``:raw-html-m2r:`<br>`
+``void llvm.hpvm.bind.output(i8* N, i32 oc, i32 op, i1 isStream)`` |br|
 Bind output ``oc`` of child node ``N`` to output ``op`` of current node. Field ``oc`` of the return struct in ``N``'s node function and field ``op`` of the return struct in the current node function must have matching types. ``isStream`` chooses a streaming (1) or non streaming (0) bind.
 
 Intrinsics for Querying Graphs
@@ -131,19 +132,19 @@ Intrinsics for Querying Graphs
 
 The following intrinsics are used to query the structure of the DFG. They can only be used by leaf nodes.
 
-``i8* llvm.hpvm.getNode()``:raw-html-m2r:`<br>`
+``i8* llvm.hpvm.getNode()`` |br|
 Return a handle to the current leaf node.
 
-``i8* llvm.hpvm.getParentNode(i8* N)``:raw-html-m2r:`<br>`
+``i8* llvm.hpvm.getParentNode(i8* N)`` |br|
 Return a handle to the parent in the hierarchy of node ``N``.
 
-``i32 llvm.hpvm.getNumDims(i8* N)``:raw-html-m2r:`<br>`
+``i32 llvm.hpvm.getNumDims(i8* N)`` |br|
 Get the number of dimensions of node ``N``.
 
-``i64 llvm.hpvm.getNodeInstanceID.{x,y,z}(i8* N)``:raw-html-m2r:`<br>`
+``i64 llvm.hpvm.getNodeInstanceID.{x,y,z}(i8* N)`` |br|
 Get index of current dynamic node instance of node ``N`` in dimension x, y or z respectively. The dimension must be one of the dimensions in which the node is replicated.
 
-``i64 llvm.hpvm.getNumNodeInstances.{x,y,z}(i8* N)``:raw-html-m2r:`<br>`
+``i64 llvm.hpvm.getNumNodeInstances.{x,y,z}(i8* N)`` |br|
 Get number of dynamic instances of node ``N`` in dimension x, y or z respectively. The dimension must be one of the dimensions in which the node is replicated.
 
 Intrinsics for Memory Allocation and Synchronization
@@ -151,35 +152,35 @@ Intrinsics for Memory Allocation and Synchronization
 
 The following intrinsics are used for memory allocation and synchronization. They can only be used by leaf nodes.
 
-``i8* llvm.hpvm.malloc(i64 nBytes)``:raw-html-m2r:`<br>`
-Allocate a block of memory of size ``nBytes`` and return pointer to it. The allocated object can be shared by all nodes.:raw-html-m2r:`<br>`
+``i8* llvm.hpvm.malloc(i64 nBytes)`` |br|
+Allocate a block of memory of size ``nBytes`` and return pointer to it. The allocated object can be shared by all nodes. |br|
 *Note that the returned pointer must somehow be communicated explicitly for use by other nodes.*
 
-``i32 llvm.hpvm.atomic.add(i8* m, i32 v)``:raw-html-m2r:`<br>`
+``i32 llvm.hpvm.atomic.add(i8* m, i32 v)`` |br|
 Atomically computes the bitwise ADD of ``v`` and the value stored at memory location ``[m]`` w.r.t. the dynamic instances of the current leaf node and stores the result back into ``[m]``. Returns the value previously stored at ``[m]``.
 
-``i32 llvm.hpvm.atomic.sub(i8* m, i32 v)``:raw-html-m2r:`<br>`
+``i32 llvm.hpvm.atomic.sub(i8* m, i32 v)`` |br|
 Atomically computes the bitwise SUB of ``v`` and the value stored at memory location ``[m]`` w.r.t. the dynamic instances of the current leaf node and stores the result back into ``[m]``. Returns the value previously stored at ``[m]``.
 
-``i32 llvm.hpvm.atomic.min(i8* m, i32 v)``:raw-html-m2r:`<br>`
+``i32 llvm.hpvm.atomic.min(i8* m, i32 v)`` |br|
 Atomically computes the bitwise MIN of ``v`` and the value stored at memory location ``[m]`` w.r.t. the dynamic instances of the current leaf node and stores the result back into ``[m]``. Returns the value previously stored at ``[m]``.
 
-``i32 llvm.hpvm.atomic.max(i8* m, i32 v)``:raw-html-m2r:`<br>`
+``i32 llvm.hpvm.atomic.max(i8* m, i32 v)`` |br|
 Atomically computes the bitwise MAX of ``v`` and the value stored at memory location ``[m]`` w.r.t. the dynamic instances of the current leaf node and stores the result back into ``[m]``. Returns the value previously stored at ``[m]``.
 
-``i32 llvm.hpvm.atomic.xchg(i8* m, i32 v)``:raw-html-m2r:`<br>`
+``i32 llvm.hpvm.atomic.xchg(i8* m, i32 v)`` |br|
 Atomically computes the bitwise XCHG of ``v`` and the value stored at memory location ``[m]`` w.r.t. the dynamic instances of the current leaf node and stores the result back into ``[m]``. Returns the value previously stored at ``[m]``.
 
-``i32 llvm.hpvm.atomic.and(i8* m, i32 v)``:raw-html-m2r:`<br>`
+``i32 llvm.hpvm.atomic.and(i8* m, i32 v)`` |br|
 Atomically computes the bitwise AND of ``v`` and the value stored at memory location ``[m]`` w.r.t. the dynamic instances of the current leaf node and stores the result back into ``[m]``. Returns the value previously stored at ``[m]``.
 
-``i32 llvm.hpvm.atomic.or(i8* m, i32 v)``:raw-html-m2r:`<br>`
+``i32 llvm.hpvm.atomic.or(i8* m, i32 v)`` |br|
 Atomically computes the bitwise OR of ``v`` and the value stored at memory location ``[m]`` w.r.t. the dynamic instances of the current leaf node and stores the result back into ``[m]``. Returns the value previously stored at ``[m]``.
 
-``i32 llvm.hpvm.atomic.xor(i8* m, i32 v)``:raw-html-m2r:`<br>`
+``i32 llvm.hpvm.atomic.xor(i8* m, i32 v)`` |br|
 Atomically computes the bitwise XOR of ``v`` and the value stored at memory location ``[m]`` w.r.t. the dynamic instances of the current leaf node and stores the result back into ``[m]``. Returns the value previously stored at ``[m]``.
 
-``void llvm.hpvm.barrier()``:raw-html-m2r:`<br>`
+``void llvm.hpvm.barrier()`` |br|
 Local synchronization barrier across dynamic instances of current leaf node.
 
 Intrinsics for Graph Interaction
@@ -187,31 +188,31 @@ Intrinsics for Graph Interaction
 
 The following intrinsics are for graph initialization/termination and interaction with the host code, and can be used only by the host code.
 
-``void llvm.hpvm.init()``:raw-html-m2r:`<br>`
+``void llvm.hpvm.init()`` |br|
 Initialization of HPVM runtime.
 
-``void llvm.hpvm.cleanup()``:raw-html-m2r:`<br>`
+``void llvm.hpvm.cleanup()`` |br|
 Cleanup of HPVM runtime created objects.
 
-``void llvm.hpvm.trackMemory(i8* ptr, i64 sz)``:raw-html-m2r:`<br>`
+``void llvm.hpvm.trackMemory(i8* ptr, i64 sz)`` |br|
 Insert memory starting at ``ptr`` of size ``sz`` in the memory tracker. ``ptr`` becomes the key for identifying this memory object. As soon as a memory object is inserted in the memory tracker it starts being tracked, and can be passed as a data item to a DFG.
 
-``void llvm.hpvm.untrackMemory(i8* ptr)``:raw-html-m2r:`<br>`
+``void llvm.hpvm.untrackMemory(i8* ptr)`` |br|
 Stop tracking memory object with key ``ptr``, and remove it from memory tracker.
 
-``void llvm.hpvm.requestMemory(i8* ptr, i64 sz)``:raw-html-m2r:`<br>`
+``void llvm.hpvm.requestMemory(i8* ptr, i64 sz)`` |br|
 If memory object with key ``ptr`` is not located in host memory, copy it to host memory.
 
-``i8* llvm.hpvm.launch(i8* RootGraph, i8* Args, i1 isStream)``:raw-html-m2r:`<br>`
+``i8* llvm.hpvm.launch(i8* RootGraph, i8* Args, i1 isStream)`` |br|
 Launch the execution of a top-level DFG with root node function ``RootGraph``. ``Args`` is a pointer to a packed struct, containing one field per argument of the ``RootGraph`` function, consecutively. For non-streaming DFGs with a non empty result type, ``Args`` must contain an additional field of the type ``RootGraph.returnTy``, where the result of the graph will be returned. ``isStream`` chooses between a non streaming (0) or streaming (1) graph execution. Return a handle to the invoked DFG.
 
-``void llvm.hpvm.wait(i8* GraphID)``:raw-html-m2r:`<br>`
+``void llvm.hpvm.wait(i8* GraphID)`` |br|
 Wait for completion of execution of DFG with handle ``GraphID``.
 
-``void llvm.hpvm.push(i8* GraphID, i8* args)``:raw-html-m2r:`<br>`
+``void llvm.hpvm.push(i8* GraphID, i8* args)`` |br|
 Push set of input data ``args`` (same as type included in launch) to streaming DFG with handle ``GraphID``.
 
-``i8* llvm.hpvm.pop(i8* GraphID)``:raw-html-m2r:`<br>`
+``i8* llvm.hpvm.pop(i8* GraphID)`` |br|
 Pop and return data from streaming DFG with handle ``GraphID``. The return type is a struct containing a field for every output of DFG.
 
 Implementation Limitations
diff --git a/hpvm/projects/torch2hpvm/README.rst b/hpvm/projects/torch2hpvm/README.rst
index e6ac559e7df8fa12b891e5930478d627ed46cb55..8c70769869e949b8a25270c47cd974742a39b967 100644
--- a/hpvm/projects/torch2hpvm/README.rst
+++ b/hpvm/projects/torch2hpvm/README.rst
@@ -128,17 +128,5 @@ when the Module is exported into ONNX:
 
 - Add
 
-This choice of operators is largely constrained by backend (tensor_runtime) supports.
+This choice of operators is largely constrained by the operators supported by the current backends (tensor_runtime).
 
-TODOs
------
-
-#. Optionally insert a Python-C interface in the generated binary to
-   call back into a Dataset class and read the data.
-
-   * Needs pybind11, hardcoding of Python environment, and some fiddling with import mechanism.
-
-#. Expand the list of operators supported in the frontend.
-
-   * Most ideally, create a high-level description of operators that can tie
-     HPVM-C intrinsics and the frontend list of operators together.