[Mlir-commits] [mlir] [MLIR][XeGPU] Add 2D `vector.multi_reduction` optimization (PR #171154)
Jianhui Li
llvmlistbot at llvm.org
Tue Dec 9 08:27:29 PST 2025
================
@@ -416,12 +416,131 @@ class VectorExtractOpPattern final
}
};
+class MultiRed2dOp : public OpConversionPattern<vector::MultiDimReductionOp> {
+ using OpConversionPattern::OpConversionPattern;
+ LogicalResult
+ matchAndRewrite(vector::MultiDimReductionOp reductionOp, OpAdaptor adaptor,
+ ConversionPatternRewriter &rewriter) const override {
+ if (reductionOp.getReductionDims().size() != 2)
+ return rewriter.notifyMatchFailure(reductionOp,
+ "Expected 2D multi reduction");
+
+ auto layout = xegpu::getDistributeLayoutAttr(reductionOp.getResult());
+
+ auto dims = llvm::to_vector(reductionOp.getReductionDims());
+ auto [intraLaneDim, crossLaneDim] = getReductionDimOrder(dims, layout);
+ // Order does not matter
+ if (intraLaneDim == -1 || crossLaneDim == -1) {
+ intraLaneDim = dims[0];
+ crossLaneDim = dims[1];
+ }
+ auto loc = reductionOp.getLoc();
+ // XeGPU transforms expect vector types
+ auto sourceVecType = reductionOp.getSourceVectorType();
+ auto acc = reductionOp.getAcc();
+ bool scalarAcc = !isa<VectorType>(acc.getType());
+ if (scalarAcc)
+ acc = vector::FromElementsOp::create(
+ rewriter, loc, VectorType::get({1}, sourceVecType.getElementType()),
+ acc);
+
+ // Preserve layout in the intermediate reduction (apart from the reduced
+ // dim)
+ auto sourceSliceLayoutAttr = cast<xegpu::SliceAttr>(layout);
+ SmallVector<int64_t> sliceDims{
+ sourceSliceLayoutAttr.getDims().asArrayRef()};
+ auto foundIt = std::find(sliceDims.begin(), sliceDims.end(), crossLaneDim);
+ assert(foundIt != sliceDims.end() &&
+ "Expected to find reduction dim in slice dims");
+ sliceDims.erase(foundIt);
+ auto intraLaneLayout = xegpu::SliceAttr::get(
+ reductionOp.getContext(), sourceSliceLayoutAttr.getParent(),
+ DenseI64ArrayAttr::get(getContext(), sliceDims));
+
+ // First we do intra-lane reduction
----------------
Jianhui-Li wrote:
I think that this PR should handle the wg level, so that we can functionally support mutli-dim reduction case first.
We could use a heuristic to separate the reducion following the order: intra-lane, inter-lane, inter-sg.
Say, the intra-lane is -2 dim, and inter-lane is -1 dim, the inter-sg depends on sg layout but should dims to the left of -2, like (-2, -3, ....). if the reduction is cross multiple sg layout dims, then we may pick the order to let the rightmost dims being reduced first (smaller strides), so -2 first then -3.
The dimension order needs to be adjusted using the order attributes. Say if the order attribute transpose dims (-2, -1) to (-1, -2), then the lane ownership also transposed, so the above heuristic should apply to adjusted dimension order.
https://github.com/llvm/llvm-project/pull/171154
More information about the Mlir-commits
mailing list