[llvm] [AMDGPU] Optimize block count calculations to the new ABI (PR #174112)
Matt Arsenault via llvm-commits
llvm-commits at lists.llvm.org
Wed Dec 31 14:14:17 PST 2025
================
@@ -322,6 +325,48 @@ static bool processUse(CallInst *CI, bool IsV5OrAbove) {
}
}
+ // Upgrade the old method of calculating the block size using the grid size.
+ // We pattern match any case where the implicit argument group size is the
+ // divisor to a dispatch packet grid size read of the same dimension.
+ if (IsV5OrAbove && llvm::any_of(GroupSizes, [](Value *V) { return V; })) {
+ for (int I = 0; I < 3; I++) {
+ Value *GroupSize = GroupSizes[I];
+ if (!GroupSize)
+ continue;
+
+ for (User *U : GroupSize->users()) {
+ Instruction *Inst = dyn_cast<Instruction>(U);
+ if (isa<ZExtInst>(Inst))
+ Inst = Inst->getNextNode();
----------------
arsenm wrote:
Don't rely on instruction ordering. getOneUser? user_begin?
https://github.com/llvm/llvm-project/pull/174112
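
To illustrate the concern in the comment above: `getNextNode()` returns whatever instruction follows in the basic block. That is the consumer of the `zext` only if nothing sits between the two. The sketch below is a toy model, not the LLVM API. `Inst`, `consumerByOrder`, and `consumerByUse` are hypothetical names made up for this example. In real LLVM code the equivalent walk would presumably go through `Value::users()`, `hasOneUse()`, or `user_begin()`.

```cpp
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for an SSA instruction. Hypothetical type, not llvm::Instruction.
struct Inst {
  std::string Name;
  std::vector<Inst *> Users; // instructions that take this one as an operand
  Inst *Next = nullptr;      // next instruction in block order

  explicit Inst(std::string N) : Name(std::move(N)) {}

  // Record that this instruction uses Op as an operand.
  void addOperand(Inst *Op) { Op->Users.push_back(this); }
};

// Fragile: assumes the consumer is whatever comes next in the block.
Inst *consumerByOrder(Inst *I) { return I->Next; }

// Robust: follow the use list, and give up unless there is exactly one user.
Inst *consumerByUse(Inst *I) {
  return I->Users.size() == 1 ? I->Users.front() : nullptr;
}
```

If the block order is `zext`, then an unrelated instruction, then a `udiv` that takes the `zext` as an operand, `consumerByOrder` returns the unrelated instruction while `consumerByUse` returns the `udiv`. Returning null on zero or multiple users keeps the pattern match conservative.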