[llvm] [AMDGPU] Avoid setting GLC for image atomics when result is unused (PR #150742)

Harrison Hao via llvm-commits llvm-commits at lists.llvm.org
Tue Aug 19 02:40:54 PDT 2025


harrisonGPU wrote:

> Agree with Matt. Although the ISA opcode is the same, you should define separate MachineInstrs for the no-return forms which do not have the def operand for the returned value.
> 
> The current patch has two problems:
> 
> 1. The register allocator will still allocate a register for the unused result value, which is a bit wasteful, and could actually increase vgpr usage in some cases.
> 2. These MachineInstrs still satisfy SIInstrInfo::isAtomicRet so SIInsertWaitcnts thinks that they increment VMCNT (aka LOADCNT) which will cause it to insert incorrect waitcnts in some cases.

Thanks, Jay. I agree with your point. I'm working on implementing no-return type image atomic intrinsics and instructions.

Even though the current image atomics always return a value, we usually avoid setting GLC when the result is unused, to avoid unnecessary cache traffic. Maybe we should still consider merging this patch?

I'm happy to work on adding proper no-return variants. This is my first time writing TableGen. :-)


https://github.com/llvm/llvm-project/pull/150742

