[llvm] [AMDGPU] Allow merging unordered and monotonic atomic loads in SILoadStoreOptimizer (PR #189932)
Harrison Hao via llvm-commits
llvm-commits at lists.llvm.org
Wed Apr 1 21:16:26 PDT 2026
harrisonGPU wrote:
> > I think this optimization is not suitable for IR. Combining atomic operations at the IR level changes the number of atomic events and can violate the memory model. This is why the transformation is implemented at the MachineIR level.
>
> That doesn't change when you perform it in machine IR. This is valid or it's not, it doesn't become legal just by doing it in MIR
Thanks, Matt, you're right that the legality of the transformation doesn't depend on the IR level.
However, the reason for doing this in MachineIR rather than LLVM IR is practical. The LLVM Atomics guide states:
`"atomic instructions are guaranteed to be lock-free, and therefore an instruction which is wider than the target natively supports can be impossible to generate."`
Merging two 32-bit atomic loads into a single 64-bit atomic load at the IR level would require every backend to support a lock-free 64-bit atomic load. The b128 case would require a 128-bit atomic load, which could cause other backends to fail during codegen. At the MachineIR level, we already know the target supports the wider load natively, so there is no risk of codegen failure. Do you have any suggestions?
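To make the width concern concrete, here is a minimal C++ sketch of the check a target-independent pass would need before widening. It is not code from the PR: `TargetSketch`, `MaxLockFreeAtomicBits`, and `widthAllowsMerge` are hypothetical names, and the sketch only models the width question, not memory-model legality.

```cpp
// Hypothetical description of a target: the widest atomic load, in bits,
// that it can perform natively (lock-free). This is not an LLVM API.
struct TargetSketch {
  unsigned MaxLockFreeAtomicBits;
};

// Returns true if Count adjacent atomic loads of ElemBits each fit into
// one merged load that the target can still perform lock-free.
// This is a width check only; whether merging is legal under the memory
// model is a separate question.
constexpr bool widthAllowsMerge(const TargetSketch &T, unsigned ElemBits,
                                unsigned Count) {
  return ElemBits * Count <= T.MaxLockFreeAtomicBits;
}

// Two made-up targets: one with 128-bit native atomic loads, one with 32-bit.
constexpr TargetSketch WideTarget{128};
constexpr TargetSketch NarrowTarget{32};

static_assert(widthAllowsMerge(WideTarget, 32, 2), "2 x b32 -> b64 fits");
static_assert(widthAllowsMerge(WideTarget, 32, 4), "4 x b32 -> b128 fits");
static_assert(!widthAllowsMerge(NarrowTarget, 32, 2), "b64 is too wide here");
```

In MachineIR the answer to this check is already known for the selected target, whereas a generic IR pass would have to ask each backend.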
Reference: https://llvm.org/docs/Atomics.html#atomic-instructions
https://github.com/llvm/llvm-project/pull/189932