[PATCH] D145524: [AMDGPU] Skip buffer_wbl2 before atomic fence acquire

Tue Mar 7 15:33:35 PST 2023

rampitec added inline comments.

================
Comment at: llvm/lib/Target/AMDGPU/SIMemoryLegalizer.cpp:2212
   if (MOI.isAtomic()) {
-    if (MOI.getOrdering() == AtomicOrdering::Acquire ||
-        MOI.getOrdering() == AtomicOrdering::Release ||
+    if (MOI.getOrdering() == AtomicOrdering::Acquire)
+      Changed |= CC->insertWait(MI, MOI.getScope(), MOI.getOrderingAddrSpace(),
----------------
t-tye wrote:
> Why is this waitcnt being added? I do not believe the memory model requires this. An acquire does not require previous memory operations to be complete. It only requires that the location being loaded has completed, followed by invalidating the caches consistent with the scope.
The memory model gives this sequence for the fence:

```
     fence        acquire      - agent        *none*     1. s_waitcnt lgkmcnt(0) &
                                                            vmcnt(0)
```
And then the same for the rest of the fence cases. The code produced looks consistent with the memory model to me.
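
To make the intended change concrete, here is a minimal sketch of the pre-fence steps as I read them. It is a toy model, not the SIMemoryLegalizer code: `Ordering` and `stepsBeforeFence` are hypothetical names, the step strings are just labels, and the release path (writeback followed by the wait) is an assumption inferred from the patch title rather than something shown in this hunk.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the atomic ordering; not the LLVM type.
enum class Ordering { Acquire, Release };

// Toy model: returns labels for the steps emitted before the fence.
std::vector<std::string> stepsBeforeFence(Ordering O) {
  std::vector<std::string> Steps;
  if (O == Ordering::Acquire) {
    // Acquire fence: only the wait from the memory model table,
    // with no L2 writeback in front of it.
    Steps.push_back("s_waitcnt lgkmcnt(0) & vmcnt(0)");
    return Steps;
  }
  // Release fence (assumed): writeback first, then the same wait.
  Steps.push_back("buffer_wbl2");
  Steps.push_back("s_waitcnt lgkmcnt(0) & vmcnt(0)");
  return Steps;
}
```

In this model the acquire case still produces the s_waitcnt the table lists; only the writeback is dropped.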

https://reviews.llvm.org/D145524