[Mlir-commits] [mlir] [mlir][gpu] Add address space modifier to Barrier (PR #110527)
Jakub Kuderski
llvmlistbot at llvm.org
Mon Sep 30 09:22:25 PDT 2024
================
@@ -1371,11 +1372,25 @@ def GPU_BarrierOp : GPU_Op<"barrier"> {
accessing the same memory can be avoided by synchronizing work items
in-between these accesses.
+ The address spaces whose memory accesses are made visible can be restricted
+ by adding a list of address spaces required to be visible. By default, all
+ address spaces are included.
----------------
kuhar wrote:
Some good points there, @krzysz00. I only know the amdgpu implementation and that's why I'd like to make sure the semantics make sense for everyone in a portable way.
The way I understand it, it's the backend compiler's job to track any memory dependencies / synchronization within the same subgroup, and we only need the plain `gpu.barrier` when the dependency can be across subgroups.
> (This would allow us to not necessarily have amdgpu.lds_barrier - since we'd be able to express that as gpu.barrier [#gpu.address_space<workgroup>]
I don't see how this aligns with the stated goal in the RFC:
> This could reduce the cost of synchronization.
> All work-items in the workgroup are still required to reach the barrier, but the address space visibility can be reduced.
lds_barrier is strictly more work than just gpu.barrier, no? It's `s_barrier` plus "please flush the shared memory fifo". At least that's what C++ libraries do: https://github.com/ROCm/composable_kernel/blob/de3e3b642402eac5b4a466f6a2fa5e9f022ba680/include/ck/utility/synchronization.hpp#L20-L25.
Maybe I'm missing something.
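For concreteness, here is my reading of the two forms being compared. The bracketed address-space syntax is taken from the quote above; the comments on what each form implies are my assumption, not settled semantics:

```mlir
// Plain barrier: all work-items in the workgroup synchronize, and by
// default accesses to all address spaces must be made visible.
gpu.barrier

// Proposed restricted form (hypothetical syntax from the quoted RFC):
// the same execution barrier, but only workgroup (LDS) accesses need to
// be made visible -- the case amdgpu.lds_barrier covers today.
gpu.barrier [#gpu.address_space<workgroup>]
```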
https://github.com/llvm/llvm-project/pull/110527
More information about the Mlir-commits
mailing list