[Mlir-commits] [mlir] [mlir][GPU] Extend gpu.barrier with scope and named-barrier support (PR #195692)
Guray Ozen
llvmlistbot at llvm.org
Mon May 4 12:56:17 PDT 2026
================
@@ -1443,35 +1451,92 @@ def GPU_BarrierOp : GPU_Op<"barrier">,
accessing the same memory can be avoided by synchronizing work items
in-between these accesses.
- If the `memfence` attribute is specified, the set of memory accesses that must
- by completed after the barrier resolves is limited to only those accesses that
- read from or write to the specified address spaces (though accesses to other
- address spaces may be completed as well, especially if a particular combination
- of address spaces is not supported on a given backend). In particular,
- specifying `memfence []` creates a barrier that is not required to affect
- the visibility of any memory operations and is purely used for synchronizing
- work items.
+ The `scope` attribute controls the execution scope of the barrier:
```mlir
- // Only workgroup address spaces accesses required to be visible.
+ // Synchronize within a subgroup (warp/wavefront).
+ gpu.barrier scope <subgroup>
+ // Synchronize across the entire device.
+ gpu.barrier scope <device>
+ ```
+
+ A `named` barrier allows synchronizing a specific subset of subgroups
+ that have been associated with a named barrier handle. Named barriers
+ require workgroup scope.
+
+ ```mlir
+ // Initialize a named barrier for 4 participating members.
+ %nb = gpu.initialize_named_barrier %c4 : i32 -> !gpu.named_barrier
+ // Wait on the named barrier.
+ gpu.barrier named(%nb : !gpu.named_barrier)
+ ```
+
+ If the `memfence` attribute is specified, the set of memory accesses that
+ must be completed after the barrier resolves is limited to only those
+ accesses that read from or write to the specified address spaces. In
+ particular, specifying `memfence []` creates a barrier that is not required
+ to affect the visibility of any memory operations and is purely used for
+ synchronizing work items.
+
+ ```mlir
+ // Only workgroup address space accesses required to be visible.
gpu.barrier memfence [#gpu.address_space<workgroup>]
// No memory accesses required to be visible.
gpu.barrier memfence []
// All memory accesses required to be visible.
gpu.barrier
```
- Either none or all work items of a workgroup need to execute this op
- in convergence.
+ The three clauses can be combined in any order, but not all combinations may
+ be supported on a given target:
+
+ ```mlir
+ // Named barrier with a workgroup-only memory fence.
+ gpu.barrier named(%nb : !gpu.named_barrier) memfence [#gpu.address_space<workgroup>]
+ // Subgroup barrier with a global fence.
+ gpu.barrier memfence [#gpu.address_space<global>] scope <subgroup>
+ ```
+
+ Once one thread of execution in a given scope (say, thread in a workgroup)
+ has executed a particular dynamic instance of `gpu.barrier`, all other threads
+ in that scope must execute the same dynamic instance of `gpu.barrier` before
+ executing any other instance of it.
+ }];
+ let assemblyFormat = [{
+ oilist(
+ `named` `(` $named_barrier `:` type($named_barrier) `)`
+ | `memfence` $address_spaces
+ | `scope` $scope
+ ) attr-dict
}];
- let assemblyFormat = "(`memfence` $address_spaces^)? attr-dict";
let hasCanonicalizer = 1;
+ let hasVerifier = 1;
let builders = [OpBuilder<(
ins CArg<"std::optional<::mlir::gpu::AddressSpace>",
"std::nullopt">:$addressSpace)>,
OpBuilder<(ins "Value":$memrefToFence)>];
}
+def GPU_InitializeNamedBarrierOp
----------------
grypp wrote:
PTX has supported named barriers through `bar.sync a, b` for a long time, and the `nvvm.barrier` op already exposes them through its `barrierId` and `numberOfThreads` operands. Has lowering `gpu.barrier named(%nb)` to `nvvm.barrier` been considered?
What would `gpu.initialize_named_barrier` lower to on NVVM, given that PTX has no barrier initialization instruction?
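For illustration only, a rough sketch of what such a lowering might look like. It assumes the `nvvm.barrier id = ... number_of_threads = ...` assembly syntax found in recent LLVM releases (it may differ at this PR's base), a hypothetical statically assigned barrier ID of 1, and a subgroup size of 32, so that the 4 participating subgroups from the example in the op description become 128 threads. None of these choices come from the PR itself.

```mlir
// Form proposed in this PR.
%nb = gpu.initialize_named_barrier %c4 : i32 -> !gpu.named_barrier
gpu.barrier named(%nb : !gpu.named_barrier)

// Hypothetical NVVM lowering (a sketch, not an implemented pattern).
// Assumption: barrier ID 1 is assigned statically; ID 0 is left for the
// default workgroup barrier.
%id = llvm.mlir.constant(1 : i32) : i32
// Assumption: 4 subgroups * 32 threads per warp = 128 participating threads.
%nthreads = llvm.mlir.constant(128 : i32) : i32
nvvm.barrier id = %id number_of_threads = %nthreads
```

Under these assumptions, `gpu.initialize_named_barrier` would emit no instruction of its own and would only carry the ID and thread count to the barrier site.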
https://github.com/llvm/llvm-project/pull/195692