Multi return
Adds multi-return: we can now properly return multiple collections and scalars, and the host Rust code gets products out, as we wanted. Backprop is updated to use multi-returns.
Notes
- Rewrote IP SROA; it now converts product return values into multi-returns. IP SROA now takes a boolean indicating whether it should SROA products that contain arrays (as SROA does). The selection used in the schedule should specify which functions to run IP SROA on. The pass manager passes fully mutable editors for all functions to the pass, and IP SROA only changes the return type of the specified functions (but updates all call sites to them).
- The interface functions for CPU and GPU return the value directly if the function is single-return; otherwise they return their values via a struct passed in.
- For GPU functions, we still only require an output from the kernel if some return value must be returned from the GPU. If there are values that have to be returned, we pass the full return struct (with all returned fields) to the GPU, but the kernel only sets the values that must be returned from it; the remainder are set on the host. For single-return functions, the interface function copies the GPU result to a stack allocation and then extracts the value, while for multi-return functions it copies directly into the return struct pointer to avoid an extra copy on the host.
- The RT backend generates wrappers around multi-return CPU/GPU functions (where it used to just declare the `extern` function), with this wrapper handling the allocation of the output struct, passing it in to the device function, and then collecting the results into a Rust product like the multi-return Rust Async functions return.
- Updated the runner's `run` function lifetimes significantly: it now generates a separate lifetime for the runner object itself, each return, and each parameter, and uses lifetime bounds to ensure the runner and arguments are borrowed appropriately based on the lifetimes of the returned references. (Rust's syntax is actually really nice for code generation; it accepts things like `<'a:, 'b>` and `<'a: 'b +, 'b>`.)
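As a rough illustration of the last two bullets, here is a minimal sketch of what such a wrapper and `run` signature might look like. It is not the generated code: `SumMaxReturns`, `sum_max_device`, `sum_max`, and the `Runner` with its `scratch` field are all hypothetical names, and the device function is an ordinary Rust function standing in for what would really be an `extern` declaration.

```rust
use std::mem::MaybeUninit;

// Hypothetical output struct; a real one would be generated per function.
#[repr(C)]
struct SumMaxReturns {
    sum: i32,
    max: i32,
}

// Stand-in for the device function (an `extern` declaration in practice).
// It returns its values by writing through the struct pointer passed in.
unsafe fn sum_max_device(data: *const i32, len: usize, out: *mut SumMaxReturns) {
    unsafe {
        let slice = std::slice::from_raw_parts(data, len);
        let sum: i32 = slice.iter().sum();
        let max = slice.iter().copied().max().unwrap_or(i32::MIN);
        out.write(SumMaxReturns { sum, max });
    }
}

// Wrapper: allocate the output struct, pass it to the device function,
// then collect the fields into a Rust product (a tuple).
fn sum_max(data: &[i32]) -> (i32, i32) {
    let mut out = MaybeUninit::<SumMaxReturns>::uninit();
    // SAFETY: `data` is a valid slice, and the callee fully initializes `out`.
    let out = unsafe {
        sum_max_device(data.as_ptr(), data.len(), out.as_mut_ptr());
        out.assume_init()
    };
    (out.sum, out.max)
}

// Hypothetical runner that owns a buffer backing the first return value.
struct Runner {
    scratch: Vec<i32>,
}

impl Runner {
    // One lifetime for the runner, one per parameter, one per return.
    // The first return borrows the runner, the second borrows the argument.
    // The empty bounds and trailing `+` mimic mechanically generated code.
    fn run<'runner: 'r0 +, 'p0: 'r1 +, 'r0:, 'r1:>(
        &'runner mut self,
        input: &'p0 [i32],
    ) -> (&'r0 [i32], &'r1 [i32]) {
        self.scratch.clear();
        self.scratch.extend(input.iter().map(|x| x * 2));
        (self.scratch.as_slice(), input)
    }
}
```

Because the second return is tied only to the parameter's lifetime, a caller can keep it alive while calling `run` again, whereas the first return must be dropped before the runner is mutably borrowed a second time.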
Activity
requested review from @rarbore2
assigned to @aaronjc4
mentioned in issue #29 (closed)
- Resolved by rarbore2
Very nice! For bullet 3, does "if some return value must be returned from the GPU" refer to not needing to return a pointer from the kernel if the pointer to be returned is known (is the root of some collection from a parameter or a constant)?
Main reason is to be able to give call nodes some type. We might be able to get rid of it if we just don't assign types to call nodes (only their data projections); it is also used in the RT back-end to generate appropriate types for the value returned by call nodes but this could be addressed in some other way.
`MultiReturn` is not a control type because it doesn't represent control tokens or anything relating to control flow; it's a data type, but one that can only be used by data projections.
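To make the typing rule concrete, here is a small sketch of how a type checker might treat such a type. The `Type` and `Node` enums and the `type_of` function are simplified, hypothetical stand-ins rather than the project's actual IR definitions: a call node gets a `MultiReturn` type, a data projection selects one component of it, and any other user of a `MultiReturn` value is rejected.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Type {
    Integer32,
    Float32,
    // Type of a call node: one entry per returned value.
    MultiReturn(Vec<Type>),
}

#[derive(Debug)]
enum Node {
    // Call to a function with the given return types (arguments omitted).
    Call { return_types: Vec<Type> },
    // Selects return value `selection` of the call node `data`.
    DataProjection { data: usize, selection: usize },
    // An ordinary data node, used to show that it cannot consume a call directly.
    Add { left: usize, right: usize },
}

// Computes the type of node `id`, where node operands are indices into `nodes`.
fn type_of(nodes: &[Node], id: usize) -> Result<Type, String> {
    let node = nodes.get(id).ok_or_else(|| format!("no node {id}"))?;
    match node {
        Node::Call { return_types } => Ok(Type::MultiReturn(return_types.clone())),
        Node::DataProjection { data, selection } => match type_of(nodes, *data)? {
            // Only data projections may look inside a MultiReturn.
            Type::MultiReturn(tys) => tys
                .get(*selection)
                .cloned()
                .ok_or_else(|| format!("call has no return value {selection}")),
            other => Err(format!("cannot project from {other:?}")),
        },
        Node::Add { left, right } => {
            let (l, r) = (type_of(nodes, *left)?, type_of(nodes, *right)?);
            if matches!(l, Type::MultiReturn(_)) || l != r {
                Err(format!("cannot add {l:?} and {r:?}"))
            } else {
                Ok(l)
            }
        }
    }
}
```

In this sketch the call node still has a type, which is the stated reason for having `MultiReturn` at all, but the only way to reach a usable value is through a projection.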
mentioned in commit 6cb6c003