[Mlir-commits] [mlir] [mlir][gpu] Add `subgroup_broadcast` op (PR #152808)

Jianhui Li llvmlistbot at llvm.org
Fri Aug 15 08:51:57 PDT 2025


================
@@ -160,6 +160,27 @@ struct GPUSubgroupSizeOpToROCDL : ConvertOpToLLVMPattern<gpu::SubgroupSizeOp> {
   const amdgpu::Chipset chipset;
 };
 
+struct GPUBroadcastLaneOpToROCDL
+    : public ConvertOpToLLVMPattern<gpu::BroadcastLaneOp> {
+  using ConvertOpToLLVMPattern::ConvertOpToLLVMPattern;
+
+  LogicalResult
+  matchAndRewrite(gpu::BroadcastLaneOp op, OpAdaptor adaptor,
+                  ConversionPatternRewriter &rewriter) const override {
+    Value src = adaptor.getSrc();
+    if (adaptor.getBroadcastType() == gpu::BroadcastType::lane) {
+      rewriter.replaceOpWithNewOp<ROCDL::ReadlaneOp>(op, src.getType(), src,
+                                                     adaptor.getLane());
+    } else { // first_lane or any_lane
+      // any_lane is lowered to readfirstlane too, to force value into scalar
+      // register.
+      rewriter.replaceOpWithNewOp<ROCDL::ReadfirstlaneOp>(op, src.getType(),
----------------
Jianhui-Li wrote:

The test shows that LICM can take advantage of the speculatability property.
But I can't figure out how high-level dialects would use the any_lane broadcast op. In particular, does the user need to know the target-specific implementation to use this operation effectively? Is the usage really about broadcasting, or is it just a hint?
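For concreteness, here is the kind of use I have in mind. This is only a sketch: the textual form "gpu.subgroup_broadcast %v any_lane : f32" is my guess at the syntax this PR introduces, and "test.use" / @hoist_example are placeholders.

// Sketch only; assumes the op's assembly looks roughly like
// "gpu.subgroup_broadcast %v any_lane : f32".
func.func @hoist_example(%v: f32, %lb: index, %ub: index, %step: index) {
  scf.for %i = %lb to %ub step %step {
    // %v is subgroup-uniform; because any_lane is speculatable, LICM may
    // hoist this out of the loop, and the ROCDL lowering turns it into a
    // readfirstlane that pins the value into a scalar register.
    %u = gpu.subgroup_broadcast %v any_lane : f32
    "test.use"(%u, %i) : (f32, index) -> ()
  }
  return
}

If that reading is right, the question stands: must the author of such IR already know about the scalar-register effect on AMDGPU to pick any_lane?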


https://github.com/llvm/llvm-project/pull/152808

