[llvm] [AMDGPU]: Rewrite mbcnt_lo/mbcnt_hi to work item ID where applicable (PR #160496)
Matt Arsenault via llvm-commits
llvm-commits at lists.llvm.org
Thu Oct 2 23:21:56 PDT 2025
================
@@ -2113,6 +2119,181 @@ INITIALIZE_PASS_DEPENDENCY(UniformityInfoWrapperPass)
INITIALIZE_PASS_END(AMDGPUCodeGenPrepare, DEBUG_TYPE, "AMDGPU IR optimizations",
false, false)
+bool AMDGPUCodeGenPrepareImpl::visitMbcntLo(IntrinsicInst &I) {
+ // On wave32 targets, mbcnt.lo(~0, 0) can be replaced with workitem.id.x.
+ if (!ST.isWave32())
+ return false;
+
+ // Check for pattern mbcnt.lo(~0, 0).
+ auto *Arg0C = dyn_cast<ConstantInt>(I.getArgOperand(0));
+ auto *Arg1C = dyn_cast<ConstantInt>(I.getArgOperand(1));
+ if (!Arg0C || !Arg1C || !Arg0C->isAllOnesValue() || !Arg1C->isZero())
+ return false;
+
+ // Check reqd_work_group_size similar to mbcnt_hi case.
+ Function *F = I.getFunction();
+ if (!F)
+ return false;
+
+ unsigned Wave = 0;
+ if (ST.isWaveSizeKnown())
----------------
arsenm wrote:
This shouldn't really be possible, but the transform should also just abort if the wave size is not known.
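
As a rough illustration of the suggested early exit, here is a minimal standalone sketch. It does not use the real LLVM classes: `MockSubtarget`, `getKnownWaveSize`, and `wouldRewriteMbcntLo` are hypothetical names invented for this example, and the work-group-size check from the patch is left out. The idea is only that an unknown wave size ends the transform immediately, rather than leaving `Wave` at a default of 0 and continuing.

```cpp
#include <optional>

// Hypothetical stand-in for the subtarget queries used in the hunk;
// this is NOT the real subtarget class.
struct MockSubtarget {
  bool WaveSizeKnown;
  unsigned WavefrontSize; // only meaningful when WaveSizeKnown is true

  bool isWaveSizeKnown() const { return WaveSizeKnown; }
  unsigned getWavefrontSize() const { return WavefrontSize; }
};

// Returns the wave size, or std::nullopt when it is not known.
std::optional<unsigned> getKnownWaveSize(const MockSubtarget &ST) {
  if (!ST.isWaveSizeKnown())
    return std::nullopt;
  return ST.getWavefrontSize();
}

// Mirrors the shape of a visit hook: returns true only when the
// rewrite would be attempted (assumed here: wave32 only).
bool wouldRewriteMbcntLo(const MockSubtarget &ST) {
  std::optional<unsigned> Wave = getKnownWaveSize(ST);
  if (!Wave)
    return false; // Abort the transform: wave size is not known.
  return *Wave == 32;
}
```

With this shape there is no path on which a placeholder wave size of 0 reaches the later checks.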
https://github.com/llvm/llvm-project/pull/160496