[Mlir-commits] [mlir] [MLIR][XeGPU] Extend propagation and sg_to_lane distribution pass support broadcast with low rank and scalar source input (PR #170409)

Sun Dec 7 02:43:12 PST 2025

================
@@ -1424,6 +1428,145 @@ struct VectorMultiReductionDistribution : public gpu::WarpDistributionPattern {
   }
 };
 
+/// This pattern distributes the `vector.broadcast` operation across lanes in a
+/// warp. The pattern supports three use cases:
+///
+/// 1) Broadcast a low-rank vector to high-rank vector: The low-rank input
+/// vector
+///    must have a slice layout of the result. If the distributed source and
+///    target vector types are identical, this lowers to a no-op; otherwise, it
+///    remains a broadcast but operates on distributed vectors.
+///
+/// 2) Broadcast a same-rank vector with identical layouts for source and
+/// target:
+///    The source vector must have unit dimensions, and lane_layout must be unit
----------------
akroviakov wrote:

This description is somewhat confusing.
Does a unit dim in the source vector also have to be unit in `lane_layout`?
So would a `<8x1>` source mean `lane_layout` is `[N, 1]`?
But in the example:

> ///   %1 = vector.shape_cast %0
> ///     {layout_result_0 = #xegpu.layout<lane_layout = [1, 32], lane_data = [1,
> ///      1]>}: vector<8xf32> to vector<8x1xf32>
> ///   %2 = vector.broadcast %1
> ///     {layout_result_0 = #xegpu.layout<lane_layout = [1, 32], lane_data = [1,
> ///     1]>}: vector<8x1xf32> to vector<8x32xf32>

The source vector of the broadcast is unit in dim 1, but the lane layout dim 1 is `32` instead of unit.
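
The reading being questioned can be written as a small standalone check. This is only a sketch of one interpretation of the doc comment, not code from the PR: the helper name `unitDimsAreUnitInLaneLayout` is hypothetical, and shapes and layouts are modelled as plain integer vectors rather than MLIR types.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of the PR): returns true when every unit
// dimension of the same-rank broadcast source is also a unit entry in
// lane_layout, which is one possible reading of the doc comment.
bool unitDimsAreUnitInLaneLayout(const std::vector<int64_t> &srcShape,
                                 const std::vector<int64_t> &laneLayout) {
  // Only the same-rank case is modelled here.
  assert(srcShape.size() == laneLayout.size() && "expected same rank");
  for (std::size_t i = 0; i < srcShape.size(); ++i) {
    // A unit source dim paired with a non-unit lane_layout dim violates
    // this reading of the rule.
    if (srcShape[i] == 1 && laneLayout[i] != 1)
      return false;
  }
  return true;
}
```

Under this reading, the quoted example (source `vector<8x1xf32>`, `lane_layout = [1, 32]`) would be rejected, whereas a layout of the form `[N, 1]` would be accepted, which is the mismatch the question is about.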


https://github.com/llvm/llvm-project/pull/170409