[Mlir-commits] [mlir] [mlir][gpu] Add address space modifier to Barrier (PR #110527)
Jakub Kuderski
llvmlistbot at llvm.org
Mon Sep 30 09:22:25 PDT 2024
================
@@ -1371,11 +1372,25 @@ def GPU_BarrierOp : GPU_Op<"barrier"> {
accessing the same memory can be avoided by synchronizing work items
in-between these accesses.
+ The address spaces whose memory accesses are made visible can be restricted
+ by adding a list of address spaces required to be visible. By default, all
+ address spaces are included.
----------------
kuhar wrote:
Some good points there, @krzysz00. I only know the amdgpu implementation and that's why I'd like to make sure the semantics make sense for everyone in a portable way.
The way I understand it, it's the backend compiler's job to track any memory dependencies / synchronization within the same subgroup, and we only need the plain `gpu.barrier` when the dependency can be across subgroups.
> (This would allow us to not necessarily have amdgpu.lds_barrier - since we'd be able to express that as gpu.barrier [#gpu.address_space<workgroup>]
I don't see how this aligns with the stated goal in the RFC:
> This could reduce the cost of synchronization.
> All work-items in the workgroup are still required to reach the barrier, but the address space visibility can be reduced.
lds_barrier is strictly more work than just gpu.barrier, no? It's `s_barrier` plus "please flush the shared memory fifo". At least that's what C++ libraries do: https://github.com/ROCm/composable_kernel/blob/de3e3b642402eac5b4a466f6a2fa5e9f022ba680/include/ck/utility/synchronization.hpp#L20-L25.
Maybe I'm missing something.
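For concreteness, here is my reading of the two forms being compared. The bracketed address-space syntax is taken from the quote above; the comments on what each form implies are my assumption, not settled semantics:

```mlir
// Plain barrier: all work-items in the workgroup synchronize, and by
// default accesses to all address spaces must be made visible.
gpu.barrier

// Proposed restricted form (hypothetical syntax from the quoted RFC):
// the same execution barrier, but only workgroup (LDS) accesses need to
// be made visible -- the case amdgpu.lds_barrier covers today.
gpu.barrier [#gpu.address_space<workgroup>]
```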
https://github.com/llvm/llvm-project/pull/110527
More information about the Mlir-commits
mailing list