GPU backend
Adds a GPU backend and a "CUDA" feature to all the tests. When this MR was created, none of the tested IRs contained forks, so only vanilla non-parallel (single block, single thread) codegen has been tested.
requested review from @rarbore2
- Resolved by rarbore2
- hercules_cg/src/gpu.rs 0 → 100644
}
}

/*
 * This analysis determines the parallelization strategy within threadblocks.
 * We run post-order traversal on the fork tree to get the thread quota per
 * subtree. In particular, each fork starts with a base factor as the
 * maximum over its descendants (leafs have base 1). We traverse up (details
 * in helper) and pass the factor and a map from fork node to a tuple of
 * (max quota of its siblings (including itself), its quota, its fork factor)
 * from each node to its parents. The parent then compares
 * - all three are needed for codegen. A node is in the map IFF it will be parallelized.
 * If not, the fork will use the parent's quota and serialize over the Fork's
 * ThreadIDs. Nodes may be removed from the map when traversing up the tree
 * due to an ancestor having a larger factor that conflicts.
 */
- Resolved by rarbore2
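A rough sketch of the post-order idea in the comment above, not the code under review. The `Fork` struct, the `quota` function, and the `max_threads` parameter are hypothetical names introduced for illustration. The rule used here (parallelize a fork only if its factor times its base fits within `max_threads`) is a simplifying assumption. The sibling-quota entry of the tuple and the removal of entries that conflict with an ancestor are left out.

```rust
use std::collections::HashMap;

/// Hypothetical fork-tree node; `factor` stands in for the fork's iteration count.
struct Fork {
    factor: usize,
    children: Vec<usize>, // indices of the forks nested directly inside this one
}

/// Post-order traversal returning the thread quota of the subtree rooted at `id`.
/// Forks that end up parallelized are recorded in `parallel` as (quota, fork factor).
/// Assumption: a fork is parallelized only if factor * base <= max_threads.
fn quota(
    forks: &[Fork],
    id: usize,
    max_threads: usize,
    parallel: &mut HashMap<usize, (usize, usize)>,
) -> usize {
    // Base factor: maximum quota over the children; a leaf has base 1.
    let mut base = 1;
    for &child in &forks[id].children {
        base = base.max(quota(forks, child, max_threads, parallel));
    }
    let total = base * forks[id].factor;
    if total <= max_threads {
        // The fork fits, so it is parallelized and contributes its factor.
        parallel.insert(id, (total, forks[id].factor));
        total
    } else {
        // The fork does not fit: it is absent from the map and would be
        // serialized, so the subtree only needs its children's quota.
        base
    }
}
```

In this sketch, absence from `parallel` plays the role of "serialize over the Fork's ThreadIDs" in the comment.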
added 23 commits
- 70c06a3b...de78461b - 22 commits from branch main
- 86f2e5b8 - not work yet
- hercules_cg/src/fork_tree.rs 0 → 100644
use std::collections::{HashMap, HashSet};

use crate::*;

/*
 * Construct a map from fork node to all control nodes (including itself) satisfying:
 * a) domination by F
 * b) no domination by F's join
 * c) no domination by any other fork that's also dominated by F, where we do count self-domination

Ah, I see that this is actually the wrong condition. The condition should be post-dominated by the join, and I see that this needs to be fixed in the fork-join nesting as well. I'll fix this later. My bad.
Edited by rarbore2
- Resolved by rarbore2
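To make conditions a) to c) in the quoted comment concrete, here is a minimal sketch that applies them literally. It is not the code in fork_tree.rs: the `idom` encoding, `dominates`, and `fork_control_map` are hypothetical names, and the quoted comment is cut off, so the real function may impose further conditions. It keeps the wording of b) as quoted, even though the review note says that condition should be expressed through post-domination by the join.

```rust
use std::collections::{HashMap, HashSet};

/// `idom[n]` is the immediate dominator of node `n`, or `None` for the entry
/// node (a hypothetical encoding of the dominator tree).
/// Returns true if `a` dominates `b`; every node dominates itself.
fn dominates(idom: &[Option<usize>], a: usize, b: usize) -> bool {
    let mut cur = Some(b);
    while let Some(n) = cur {
        if n == a {
            return true;
        }
        cur = idom[n];
    }
    false
}

/// For each (fork, join) pair, collect the control nodes that satisfy
/// conditions a) to c) exactly as worded in the quoted comment.
fn fork_control_map(
    idom: &[Option<usize>],
    fork_join: &[(usize, usize)],
) -> HashMap<usize, HashSet<usize>> {
    let mut map = HashMap::new();
    for &(fork, join) in fork_join {
        let nodes: HashSet<usize> = (0..idom.len())
            .filter(|&n| {
                // a) dominated by the fork F
                dominates(idom, fork, n)
                    // b) not dominated by F's join
                    && !dominates(idom, join, n)
                    // c) not dominated by another fork that F dominates
                    && !fork_join.iter().any(|&(other, _)| {
                        other != fork
                            && dominates(idom, fork, other)
                            && dominates(idom, other, n)
                    })
            })
            .collect();
        map.insert(fork, nodes);
    }
    map
}
```

Under this literal reading, a nested fork and everything it dominates are excluded from the outer fork's set.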