[Mlir-commits] [mlir] [mlir][AMDGPU] Add wrappers for in-memory barriers on gfx1250 (PR #180112)

llvmlistbot at llvm.org llvmlistbot at llvm.org
Fri Feb 6 08:20:19 PST 2026


================
@@ -2602,6 +2602,257 @@ struct AMDGPUPermlaneLowering : public ConvertOpToLLVMPattern<PermlaneSwapOp> {
   }
 };
 
+//===----------------------------------------------------------------------===//
+// In-LDS Barrier Operations
+//===----------------------------------------------------------------------===//
+
+// Bit layout of ds_barrier_state (as i64):
+// [63:32] init count (32 bits)
+// [31:28] phase (4 bits)
+// [27:0] pending count (28 bits)
+constexpr int32_t kDsBarrierPendingCountBitWidth = 28;
+constexpr int32_t kDsBarrierPhasePos = kDsBarrierPendingCountBitWidth;
+constexpr int32_t kDsBarrierInitCountPos = 32;
+constexpr int32_t kDsBarrierPendingCountMask =
+    (1 << kDsBarrierPendingCountBitWidth) - 1;
+
+struct DsBarrierInitOpLowering
+    : public ConvertOpToLLVMPattern<DsBarrierInitOp> {
+  Chipset chipset;
+
+  DsBarrierInitOpLowering(const LLVMTypeConverter &converter, Chipset chipset)
+      : ConvertOpToLLVMPattern<DsBarrierInitOp>(converter), chipset(chipset) {}
+
+  LogicalResult
+  matchAndRewrite(DsBarrierInitOp op, OpAdaptor adaptor,
+                  ConversionPatternRewriter &rewriter) const override {
+    if (chipset < kGfx1250)
+      return op->emitOpError("only supported on gfx1250+");
+
+    Location loc = op.getLoc();
+    Type i64 = rewriter.getI64Type();
+
+    MemRefType memrefType = cast<MemRefType>(op.getBase().getType());
+    Value ptr = getStridedElementPtr(rewriter, loc, memrefType,
+                                     adaptor.getBase(), adaptor.getIndices());
+
+    // Note: We give participants as the number of arrivals that have to occur
+    // before the phase changes. Hardware changes the phase when the count
+    // actually wraps around, so we subtract 1 to get the behavior we're looking
----------------
PMylon wrote:

nit: it might be better to state explicitly that the hardware changes the phase when it detects underflow (the pending count becomes negative). As written, "Hardware changes the phase when the count actually wraps around" could be read as "when the count becomes zero".

We can then state that we make the adjustment here because the provided number of arrivals (participants) assumes that the phase changes when the pending count reaches zero.
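
To make the off-by-one concrete, here is a toy host-side model in C++. It is only a sketch: the field positions come from the layout comment in the patch, but the `arrive` semantics (decrement on arrival, phase flip and reload from the init count on underflow) are an assumption based on the behavior described in this comment, not confirmed hardware documentation. All names (`makeState`, `phaseOf`, `arrive`, `arrivalsUntilPhaseChange`) are hypothetical and do not appear in the PR.

```cpp
#include <cstdint>

// Field positions taken from the layout comment in the patch:
// [63:32] init count, [31:28] phase, [27:0] pending count.
constexpr int kPendingBits = 28;
constexpr int kPhasePos = kPendingBits;
constexpr int kInitCountPos = 32;
constexpr uint64_t kPendingMask = (uint64_t{1} << kPendingBits) - 1;
constexpr uint64_t kPhaseMask = 0xF;

// Hypothetical helper: pack a raw count into both count fields, phase 0.
constexpr uint64_t makeState(uint32_t rawCount) {
  return (uint64_t{rawCount} << kInitCountPos) |
         (uint64_t{rawCount} & kPendingMask);
}

constexpr uint64_t phaseOf(uint64_t state) {
  return (state >> kPhasePos) & kPhaseMask;
}

// ASSUMED semantics: each arrival decrements the pending count. Only an
// arrival that would take the count below zero (underflow) flips the phase
// and reloads the pending count from the init count.
constexpr uint64_t arrive(uint64_t state) {
  uint64_t pending = state & kPendingMask;
  uint64_t phase = phaseOf(state);
  uint64_t init = state >> kInitCountPos;
  if (pending == 0) {
    phase = (phase + 1) & kPhaseMask;
    pending = init & kPendingMask;
  } else {
    --pending;
  }
  return (init << kInitCountPos) | (phase << kPhasePos) | pending;
}

// Count how many arrivals it takes for the phase to change.
inline int arrivalsUntilPhaseChange(uint32_t rawCount) {
  uint64_t state = makeState(rawCount);
  const uint64_t startPhase = phaseOf(state);
  int arrivals = 0;
  do {
    state = arrive(state);
    ++arrivals;
  } while (phaseOf(state) == startPhase);
  return arrivals;
}
```

Under these assumptions, a raw count of N changes the phase on arrival N + 1, so storing `participants - 1` makes the phase change on exactly the `participants`-th arrival, which matches what the op's `participants` operand promises.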

https://github.com/llvm/llvm-project/pull/180112

