[Mlir-commits] [mlir] [MLIR][XeGPU] Add unroll patterns for XeGPU (1/N) (PR #137010)

Tue May 6 15:48:36 PDT 2025

================
@@ -14,11 +14,53 @@ class RewritePatternSet;
 
 namespace xegpu {
 
+/// Options to control the XeGPU unrolling. Its main purpose is to
+/// provide a way to customize the native shape of the operation.
+struct UnrollOptions {
+  using FilterConstraintFnType = std::function<LogicalResult(Operation *op)>;
+  /// Callback function that indicates whether vector unrolling should be
+  /// attempted on the operation.
+  FilterConstraintFnType filterConstraint = nullptr;
+  UnrollOptions &setFilterConstraint(FilterConstraintFnType constraint) {
+    filterConstraint = std::move(constraint);
+    return *this;
+  }
+
+  using NativeShapeFnType =
+      std::function<std::optional<SmallVector<int64_t>>(Operation *op)>;
+  /// Function that returns the shape to unroll to for a given operation.
+  /// The unrolling is aborted if the function returns `std::nullopt`.
+  NativeShapeFnType nativeShape = nullptr;
+  UnrollOptions &setNativeShapeFn(NativeShapeFnType fn) {
+    nativeShape = std::move(fn);
+    return *this;
+  }
+};
+
 /// Appends patterns for folding aliasing ops into XeGPU ops into `patterns`.
 void populateXeGPUFoldAliasOpsPatterns(RewritePatternSet &patterns);
 /// Appends patterns for XeGPU SIMT distribution into `patterns`.
 void populateXeGPUSubgroupDistributePatterns(RewritePatternSet &patterns);
 
+/// Collect a set of pattern to unroll xegpu operations to a smaller shapes.
+/// Users can control whether an operation to be unrolled or not, as well as
+/// the its target shape via `options` structure. (via setting filterConstraint
+/// and nativeShape respectively, both of them are function refs taking `op` as
+/// the input).
+/// An `op` is unrolled to the `targetShape` as follows, for each of its
+/// operands:
+///   1. the unrolled type `unrolledType` and number of unrolled instances
+///   `numUnrolledInstances` are computed from the `targetShape`.
+///   2. ExtractStridedSlice are created to break-up the vector operands. And
+///   BuildinUnrealizedCastop are created to break-up the TensorDesc operands.
+///   3. the original op is cloned `numUnrolledInstances` times, once for each
+///   result.
+///   4. InsertStridedSlice are inserted for VectorType result, and
----------------
Garra1980 wrote:

in 2. and 4. you mean functionality of functions named 'pack' and 'unpack'?

https://github.com/llvm/llvm-project/pull/137010