[llvm] [AMDGPU] Avoid setting GLC for image atomics when result is unused (PR #150742)

Sun Jul 27 02:09:48 PDT 2025

================
@@ -8780,8 +8780,10 @@ SDValue SITargetLowering::lowerImage(SDValue Op,
   }
 
   unsigned CPol = Op.getConstantOperandVal(ArgOffset + Intr->CachePolicyIndex);
-  if (BaseOpcode->Atomic)
-    CPol |= AMDGPU::CPol::GLC; // TODO no-return optimization
+  // Keep GLC only when the atomic's result is actually used.
+  if (BaseOpcode->Atomic && !Op.getValue(0).use_empty())
+    CPol |= AMDGPU::CPol::GLC;
+
----------------
harrisonGPU wrote:

Do you mean I need to change the opcode to an image atomic no-return type?
I haven’t seen such an instruction; I searched the programming guide and didn’t find one.
I’ve updated the patch to not drop the output register.
I believe we only need to set GLC to 0 when the result of the image atomic is unused, as the programming guide says:
> Group Level Coherent - controls behavior of L0 cache. Atomics: 1 = return the memory value before the
> atomic operation is performed. 0 = do not return anything.
I also noticed that flat atomics support GLC=0 without requiring an opcode change, so I assumed the same applies here.
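
To make the intended rule concrete, here is a minimal standalone sketch of the condition in the patch. It is not the actual LLVM code: `kGLC` is a stand-in for `AMDGPU::CPol::GLC` with an illustrative bit value, and `computeCachePolicy` is a hypothetical helper that takes the "result has uses" check as a plain boolean.

```cpp
#include <cstdint>

// Stand-in for AMDGPU::CPol::GLC; the bit value here is illustrative only.
constexpr std::uint32_t kGLC = 1u << 0;

// Hypothetical helper mirroring the patched condition in lowerImage:
// request the pre-op memory value (GLC=1) only for atomics whose result
// has at least one use; otherwise leave the cache policy bits untouched.
constexpr std::uint32_t computeCachePolicy(std::uint32_t CPol, bool IsAtomic,
                                           bool ResultUsed) {
  if (IsAtomic && ResultUsed)
    CPol |= kGLC;
  return CPol;
}
```

With this rule, an atomic whose result is unused keeps whatever policy bits came from the intrinsic operand, and non-atomic image operations are unaffected.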
What do you think? :-)

https://github.com/llvm/llvm-project/pull/150742