[llvm] [AMDGPU]: Rewrite mbcnt_lo/mbcnt_hi to work item ID where applicable (PR #160496)
Teja Alaghari via llvm-commits
llvm-commits at lists.llvm.org
Tue Sep 30 00:42:07 PDT 2025
================
@@ -2113,6 +2119,147 @@ INITIALIZE_PASS_DEPENDENCY(UniformityInfoWrapperPass)
INITIALIZE_PASS_END(AMDGPUCodeGenPrepare, DEBUG_TYPE, "AMDGPU IR optimizations",
false, false)
+bool AMDGPUCodeGenPrepareImpl::visitMbcntLo(IntrinsicInst &I) {
+  // On wave32 targets, mbcnt.lo(~0, 0) can be replaced with workitem.id.x
+  if (!ST.isWave32())
+    return false;
+
+  // Check for pattern mbcnt.lo(~0, 0)
+  auto *Arg0C = dyn_cast<ConstantInt>(I.getArgOperand(0));
+  auto *Arg1C = dyn_cast<ConstantInt>(I.getArgOperand(1));
+  if (!Arg0C || !Arg1C || !Arg0C->isAllOnesValue() || !Arg1C->isZero())
+    return false;
+
+  // Check reqd_work_group_size similar to mbcnt_hi case
+  Function *F = I.getFunction();
+  if (!F)
+    return false;
+
+  unsigned Wave = 0;
+  if (ST.isWaveSizeKnown())
+    Wave = ST.getWavefrontSize();
+
+  if (auto MaybeX = ST.getReqdWorkGroupSize(*F, 0)) {
----------------
TejaX-Alaghari wrote:
Done
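
For readers following the thread, here is a minimal host-side sketch of why mbcnt.lo(~0, 0) yields the lane index on wave32. It is not code from the patch: popcount32, mbcntLoModel and checkLaneIdentity are hypothetical names used only for illustration, and the model assumes mbcnt.lo counts the mask bits below the current lane and adds the base operand. Whether the lane index also equals workitem.id.x depends on the workgroup shape, which is what the reqd_work_group_size check in the hunk above guards; the sketch does not model that part.

```cpp
#include <cassert>
#include <cstdint>

// Portable population count (hypothetical helper, not an LLVM API).
static unsigned popcount32(std::uint32_t V) {
  unsigned N = 0;
  for (; V != 0; V &= V - 1) // clear the lowest set bit each iteration
    ++N;
  return N;
}

// Scalar model of mbcnt.lo for a single lane (assumption: the intrinsic
// counts the bits of Mask that sit below LaneId within the low 32 lanes,
// then adds Base).
static std::uint32_t mbcntLoModel(std::uint32_t Mask, std::uint32_t Base,
                                  unsigned LaneId) {
  // Lanes 32 and above see the whole low half of the mask.
  std::uint32_t Below = LaneId >= 32 ? 0xFFFFFFFFu : ((1u << LaneId) - 1u);
  return popcount32(Mask & Below) + Base;
}

// With Mask = ~0 and Base = 0, every wave32 lane gets its own index back.
static void checkLaneIdentity() {
  for (unsigned Lane = 0; Lane < 32; ++Lane)
    assert(mbcntLoModel(0xFFFFFFFFu, 0, Lane) == Lane);
}
```

Calling checkLaneIdentity() walks all 32 lanes of a wave32 wavefront; the same model returns 32 for any lane index of 32 or more, which is why the wave64 case also needs mbcnt.hi.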
https://github.com/llvm/llvm-project/pull/160496