[Mlir-commits] [mlir] [mlir][gpu] Warp execute terminator getter (PR #154729)

Adam Siemieniuk llvmlistbot at llvm.org
Fri Aug 22 04:10:21 PDT 2025


================
@@ -2511,6 +2510,10 @@ bool WarpExecuteOnLane0Op::areTypesCompatible(Type lhs, Type rhs) {
       verifyDistributedType(lhs, rhs, getWarpSize(), getOperation()));
 }
 
+gpu::YieldOp WarpExecuteOnLane0Op::getTerminator() {
+  return cast<gpu::YieldOp>(getBody()->getTerminator());
----------------
adam-smnk wrote:

The warp execute op's `SingleBlockImplicitTerminator<"gpu::YieldOp">` trait ensures that the region has exactly one block and that this block is terminated by the specified terminator op.
In a textual representation like:
```mlir
gpu.warp_execute_on_lane_0(%laneid)[32] {
    %c0 = arith.constant 0 : index
    %v = "test.dummy_op"() : () -> (vector<4xf32>)
    %v1 = "test.dummy_op"() : () -> (vector<4x1xf32>)
    vector.transfer_write %v1, %arg1[%c0, %c0] : vector<4x1xf32>, memref<1024x1024xf32>
    vector.transfer_write %v, %arg1[%c0, %c0] : vector<4xf32>, memref<1024x1024xf32>
  }
```
the terminator still exists, but the warp op's printer omits it (see `WarpExecuteOnLane0Op::print`).

`getBody()` is an API provided by the `SingleBlock` trait. Under the hood, it gets the block the same way as the more verbose version: op -> region -> block. The two should be identical.
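
To illustrate the op -> region -> block equivalence, here is a minimal toy model in plain C++. It does not use the MLIR headers. All `Toy*` types and the helpers `makeToyWarpOp` and `pathsAgree` are hypothetical stand-ins invented for this sketch, so treat it as a rough picture of the idea rather than the actual trait implementation:

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy stand-ins for illustration only; these are NOT the real MLIR classes.
struct ToyOperation {
  std::string name;
};

struct ToyBlock {
  std::vector<ToyOperation> ops;
  // Returns the last op in the block; assumes the block is non-empty.
  ToyOperation &getTerminator() {
    assert(!ops.empty() && "block has no operations");
    return ops.back();
  }
};

struct ToyRegion {
  std::vector<std::unique_ptr<ToyBlock>> blocks;
  // Returns the first block; assumes the region is non-empty.
  ToyBlock &front() {
    assert(!blocks.empty() && "region has no blocks");
    return *blocks.front();
  }
};

struct ToyWarpOp {
  ToyRegion region;
  ToyRegion &getRegion() { return region; }
  // Shorthand playing the role of getBody(): op -> region -> first block.
  ToyBlock *getBody() { return &getRegion().front(); }
};

// Builds a toy op with a single block that ends in a "gpu.yield" op.
ToyWarpOp makeToyWarpOp() {
  ToyWarpOp op;
  auto block = std::make_unique<ToyBlock>();
  block->ops.push_back({"arith.constant"});
  block->ops.push_back({"gpu.yield"});
  op.region.blocks.push_back(std::move(block));
  return op;
}

// Checks that the verbose path and the shorthand reach the same block,
// and that this block's last op is the yield terminator.
bool pathsAgree() {
  ToyWarpOp op = makeToyWarpOp();
  ToyBlock *verbose = &op.getRegion().front();
  ToyBlock *shorthand = op.getBody();
  return verbose == shorthand &&
         shorthand->getTerminator().name == "gpu.yield";
}
```

In this toy setup, `pathsAgree()` returns `true` because both paths return a pointer to the same single block, which is the property the comment above relies on.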

https://github.com/llvm/llvm-project/pull/154729

