Skip to content
Snippets Groups Projects

Fork Guard Elimination

Merged Aaron Councilman requested to merge fork_guard_elim into main
1 unresolved thread

Implements the fork-guard elimination pass. The optimization is divided into an analysis step which we run on each node in the function which determines whether it is a guarded fork and if it is identifies the nodes that need to be removed and the nodes that need to be replaced (this is mostly the phi nodes which are replaced by the reduce nodes and the region which joins the guard's branches with the join node of the fork).

Merge request reports

Loading
Loading

Activity

Filter activity
  • Approvals
  • Assignees & reviewers
  • Comments (from bots)
  • Comments (from users)
  • Commits & branches
  • Edits
  • Labels
  • Lock status
  • Mentions
  • Merge request status
  • Tracking
  • requested review from @rarbore2

    • This looks good - I thought about it some more, currently we codegen fork/joins to not have loop guards (i.e. the replication factor can't be zero). This becomes problematic, since this post-hoc imposes a restriction on dynamic constants. Imagine the following scenario:

      1. The programmer writes a loop that they fully intend should be allowed to execute 0 times.
      2. This loop is compiled to a loop-guard + end of block jump loop by the Juno frontend.
      3. The inner end of block jump loop is converted to a fork-join, so that there is now a loop guard guarding a fork-join.
      4. This pass runs, removing the loop guard. Implicitly the loop now will always run at least once.

      There's a couple ways we could address this:

      1. Make dynamic constants used in fork-joins always non-zero. This means we can only forkify a loop when the programmer uses an appropriate schedule (a "forkify" schedule on the loop header?). An effect of this schedule is to add the non-zero rule to the corresponding dynamic constant.
      2. Make the fork guard elimination pass run only when dynamic constants are known to not be zero. This means that, implicitly, many fork-joins must be wrapped in a guard, because it would be undefined behavior for a fork-join to run 0 times.
      3. Modify the codegen to generate fork-joins which can run 0 times - effectively, generate the loop guard in LLVM IR.

      I'm the most partial to solutions 1 or 3.

    • I'm most partial to solution 3, since I think the semantics of fork-join still make sense with a zero replication factor (control just immediately passes to the join and all of the reduce nodes are just set to their initial value). I don't think an additional guard in the LLVM IR should be too hard and I don't think we should be so worried about performance right now that we need to eliminate every guard; obviously it's more of a performance issue for nested loops and we may eventually want to generate code that checks all the guards outside the fork nest (at least for perfectly nested ones) though it's also possible that LLVM will be able to optimize and move the guards outside on its own, though I'm not sure.

    • Sounds good - in that event, this pass is valid as is, and adding the loop guard is something we need to do in the backend

    • Please register or sign in to reply
  • rarbore2 approved this merge request

    approved this merge request

  • mentioned in commit 12b31ed6

Please register or sign in to reply
Loading