[llvm] [AMDGPU] Avoid setting GLC for image atomics when result is unused (PR #150742)
Harrison Hao via llvm-commits
llvm-commits at lists.llvm.org
Tue Aug 19 02:40:54 PDT 2025
harrisonGPU wrote:
> Agree with Matt. Although the ISA opcode is the same, you should define separate MachineInstrs for the no-return forms which do not have the def operand for the returned value.
>
> The current patch has two problems:
>
> 1. The register allocator will still allocate a register for the unused result value, which is a bit wasteful, and could actually increase vgpr usage in some cases.
> 2. These MachineInstrs still satisfy SIInstrInfo::isAtomicRet so SIInsertWaitcnts thinks that they increment VMCNT (aka LOADCNT) which will cause it to insert incorrect waitcnts in some cases.
Thanks, Jay. I agree with your point. I'm working on implementing no-return type image atomic intrinsics and instructions.
Even though the current image atomics return a value, we usually avoid setting GLC when the result is unused, to avoid unnecessary cache overhead. Maybe we should consider merging this patch in the meantime?
I'm happy to work on adding proper no-return variants, although this is my first time writing TableGen. :-)
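
To make the two problems quoted above concrete, here is a rough toy model in C++. It is only a sketch and not LLVM code: `ToyAtomic`, `Counter`, `isReturningForm`, `counterFor`, and `resultRegsNeeded` are hypothetical names invented for illustration, and the mapping of no-return atomics to a store-type counter is an assumption of the model.

```cpp
#include <cassert>
#include <vector>

// Toy model only: hypothetical types, not LLVM's MachineInstr classes.
enum class Counter { Load, Store };

struct ToyAtomic {
  bool HasResultDef; // true for the returning form, which defines a result register
};

// Stands in for the idea behind a predicate like SIInstrInfo::isAtomicRet:
// an atomic whose instruction form carries a result def counts as "returning".
constexpr bool isReturningForm(const ToyAtomic &I) { return I.HasResultDef; }

// Assumption of this model: returning atomics are tracked on the load-type
// counter, no-return atomics on the store-type counter.
constexpr Counter counterFor(const ToyAtomic &I) {
  return isReturningForm(I) ? Counter::Load : Counter::Store;
}

// Number of result registers an allocator would have to assign,
// whether or not anything reads them afterwards.
int resultRegsNeeded(const std::vector<ToyAtomic> &Insts) {
  int N = 0;
  for (const ToyAtomic &I : Insts)
    if (I.HasResultDef)
      ++N;
  return N;
}

// Small self-check contrasting the two forms.
void checkToyModel() {
  ToyAtomic WithDef{true};     // returning form with an unused result
  ToyAtomic WithoutDef{false}; // separate no-return form
  assert(counterFor(WithDef) == Counter::Load);
  assert(counterFor(WithoutDef) == Counter::Store);
  assert(resultRegsNeeded({WithDef, WithDef}) == 2);
  assert(resultRegsNeeded({WithoutDef, WithoutDef}) == 0);
}
```

In this model, keeping the def operand on an instruction whose result is unused still costs a register and still classifies the instruction as a returning atomic; only a separate form without the def avoids both.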
https://github.com/llvm/llvm-project/pull/150742