Data-parallel multi-core RT codegen
- Use `prettyplease` to auto-format generated RT Rust code.
- Refactor RT backend to allow for nested async closure environments.
- Lower fork, join, thread ID, and reduce nodes in RT backend (only `ParallelFork` and `ParallelReduce` nodes, though!).
- QOL: make sure there's an entry for the start node in fork-join nests.
- Add `__RawPtrSendSync` to `hercules_rt` and to generated RT code to move pointers across async boundaries.
- Re-schedule `test6` in `fork_join_tests` to generate a tiled fork-join, half in RT code and half in CPU code, to test multicore.
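
For reviewers unfamiliar with the pointer-wrapper pattern, here is a rough sketch of the idea behind a type like `__RawPtrSendSync` combined with a tiled fork-join. It is not the code in this PR: the `RawPtrSendSync` struct, its `get` method, and `tiled_double_and_sum` are hypothetical names made up for illustration, and it uses scoped OS threads from the standard library in place of the async tasks the RT backend generates, so that it needs no external runtime.

```rust
use std::thread;

// Hypothetical stand-in for a wrapper like `__RawPtrSendSync`; the real
// definition in `hercules_rt` may look different.
#[derive(Clone, Copy)]
struct RawPtrSendSync(*mut u64);

// SAFETY: raw pointers are neither `Send` nor `Sync` by default. These impls
// are a promise by the programmer; they are sound below only because each
// worker touches a disjoint tile and the buffer outlives every worker.
unsafe impl Send for RawPtrSendSync {}
unsafe impl Sync for RawPtrSendSync {}

impl RawPtrSendSync {
    // Going through a method makes closures capture the whole wrapper
    // instead of just the (non-`Send`) raw pointer field.
    fn get(self) -> *mut u64 {
        self.0
    }
}

// Fork: each of `tiles` workers doubles its tile in place and computes a
// partial sum. Join + reduce: the partial sums are added together.
fn tiled_double_and_sum(data: &mut [u64], tiles: usize) -> u64 {
    assert!(tiles > 0, "need at least one tile");
    let len = data.len();
    let tile_len = (len + tiles - 1) / tiles; // ceiling division
    let ptr = RawPtrSendSync(data.as_mut_ptr());

    thread::scope(|s| {
        let handles: Vec<_> = (0..tiles)
            .map(|tid| {
                s.spawn(move || {
                    // The thread ID selects which tile this worker owns.
                    let start = (tid * tile_len).min(len);
                    let end = (start + tile_len).min(len);
                    let base = ptr.get();
                    let mut partial = 0u64;
                    for i in start..end {
                        // SAFETY: `i < len`, and tiles do not overlap.
                        unsafe {
                            let elem = base.add(i);
                            *elem *= 2;
                            partial += *elem;
                        }
                    }
                    partial
                })
            })
            .collect();

        // Join every worker and reduce the partial results.
        handles
            .into_iter()
            .map(|h| h.join().expect("worker panicked"))
            .sum::<u64>()
    })
}
```

Calling `tiled_double_and_sum(&mut data, 2)` on a buffer holding `1..=7` doubles every element in place and returns 56. In safe Rust the same split could be written with `chunks_mut`; the wrapper is only needed when the generated code has nothing but raw pointers to work with, which is the situation the bullet on `__RawPtrSendSync` describes.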