[llvm] [AMDGPU]: Rewrite mbcnt_lo/mbcnt_hi to work item ID where applicable (PR #160496)

Thu Oct 2 23:21:56 PDT 2025

================
@@ -2113,6 +2119,181 @@ INITIALIZE_PASS_DEPENDENCY(UniformityInfoWrapperPass)
 INITIALIZE_PASS_END(AMDGPUCodeGenPrepare, DEBUG_TYPE, "AMDGPU IR optimizations",
                     false, false)
 
+bool AMDGPUCodeGenPrepareImpl::visitMbcntLo(IntrinsicInst &I) {
+  // On wave32 targets, mbcnt.lo(~0, 0) can be replaced with workitem.id.x.
+  if (!ST.isWave32())
+    return false;
+
+  // Check for pattern mbcnt.lo(~0, 0).
+  auto *Arg0C = dyn_cast<ConstantInt>(I.getArgOperand(0));
+  auto *Arg1C = dyn_cast<ConstantInt>(I.getArgOperand(1));
+  if (!Arg0C || !Arg1C || !Arg0C->isAllOnesValue() || !Arg1C->isZero())
+    return false;
+
+  // Check reqd_work_group_size similar to mbcnt_hi case.
+  Function *F = I.getFunction();
+  if (!F)
+    return false;
+
+  unsigned Wave = 0;
+  if (ST.isWaveSizeKnown())
----------------
arsenm wrote:

This kind of shouldn't be possible, but should also just abort the transform if it's not known 

https://github.com/llvm/llvm-project/pull/160496