[PATCH] D144759: [AMDGPU] Implement idempotent atomic lowering
Matt Arsenault via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Mar 3 04:43:48 PST 2023
arsenm added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/SIISelLowering.cpp:13371-13377
+ // We do not need to insert a fence here, memory legalizer will do.
+ LoadInst *LI = Builder.CreateAlignedLoad(
+ AI->getType(), AI->getPointerOperand(), AI->getAlign());
+ LI->setAtomic(Order, SSID);
+ LI->copyMetadata(*AI);
+ LI->takeName(AI);
+ AI->replaceAllUsesWith(LI);
----------------
rampitec wrote:
> rampitec wrote:
> > arsenm wrote:
> > > I don't understand why this is a target hook. Why can't this unconditionally happen in the generic code?
> > Probably not. The only target implementing it is x86 and it issues target specific intrinsics. It also skips 'or' with zero as it claims to have a better lowering.
> Also note that I am skipping the fence on the grounds that memory legalizer will fence it. Otherwise with our address spaces and scopes this would be quite non-trivial and target specific code.
I also don't understand that a note. The original atomicrmw wouldn't have implied a fence to begin with?
================
Comment at: llvm/test/CodeGen/AMDGPU/idemponent-atomics.ll:30
+entry:
+ store i32 1, ptr addrspace(1) undef
+ %val = atomicrmw or ptr addrspace(1) %in, i32 0 syncscope("agent-one-as") acquire, align 4
----------------
avoid store to undef in new tests
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D144759/new/
https://reviews.llvm.org/D144759
More information about the llvm-commits
mailing list