[Mlir-commits] [mlir] [mlir][Vector] add vector.insert canonicalization pattern to convert a chain of insertions to vector.from_elements (PR #142944)

Mon Aug 4 06:08:39 PDT 2025

================
@@ -3250,6 +3262,130 @@ class InsertSplatToSplat final : public OpRewritePattern<InsertOp> {
     return success();
   }
 };
+
+/// Pattern to optimize a chain of insertions.
+///
+/// This pattern identifies chains of vector.insert operations that:
+/// 1. Only insert values at static positions.
+/// 2. Completely initialize all elements in the resulting vector.
+/// 3. All intermediate insert operations have only one use.
+///
+/// When these conditions are met, the entire chain can be replaced with a
+/// single vector.from_elements operation.
+///
+/// Example transformation:
+///   %poison = ub.poison : vector<2xi32>
+///   %0 = vector.insert %c1, %poison[0] : i32 into vector<2xi32>
+///   %1 = vector.insert %c2, %0[1] : i32 into vector<2xi32>
+/// ->
+///   %result = vector.from_elements %c1, %c2 : vector<2xi32>
+class InsertChainFullyInitialized final : public OpRewritePattern<InsertOp> {
+public:
+  using OpRewritePattern::OpRewritePattern;
+  LogicalResult matchAndRewrite(InsertOp op,
+                                PatternRewriter &rewriter) const override {
+
+    VectorType destTy = op.getDestVectorType();
+    if (destTy.isScalable())
+      return failure();
+    // This pattern has linear time complexity with respect to the length of the
+    // insert chain. So we only care about the last insert op which has the
+    // highest probability of success.
+    for (Operation *user : op.getResult().getUsers())
+      if (auto insertOp = dyn_cast<InsertOp>(user))
+        if (insertOp.getDest() == op.getResult())
+          return failure();
----------------
banach-space wrote:

Initially, I found this block a bit confusing. Below is my current understanding; please confirm whether it is accurate.

---

To identify a valid chain of `vector.insert` operations, the pattern first needs to locate the trailing `vector.insert` op. Consider the following two example chains:

1. insert -> insert -> insert -> **insert**
2. insert -> insert -> **insert** -> insert

Only Option 1 qualifies under the current design. That makes sense to me - it’s a conscious design choice that simplifies the pattern logic. Supporting Option 2 would be possible, but would likely add significant complexity.

This code block ensures that the given op is the trailing `vector.insert` in a chain that matches this pattern.
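
To make that reading concrete, here is a minimal, self-contained sketch of the "trailing insert" check. It is a toy model, not the MLIR API: `ToyInsert`, `isTrailingInsert`, and `collapseChain` are hypothetical names made up for illustration, scalars are plain `int`s, and the single-use condition on intermediate inserts is deliberately left out to keep it short.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Toy stand-in for one vector.insert op (hypothetical; not the MLIR API).
struct ToyInsert {
  int dest;     // index of the insert producing the destination, or -1 for poison
  int position; // static position written by this insert
  int value;    // scalar being inserted
};

// Returns true if no other insert uses ops[index] as its destination,
// i.e. ops[index] is the trailing insert of its chain.
bool isTrailingInsert(const std::vector<ToyInsert> &ops, std::size_t index) {
  assert(index < ops.size());
  for (const ToyInsert &other : ops)
    if (other.dest == static_cast<int>(index))
      return false;
  return true;
}

// Starting from a trailing insert, walks back towards poison and returns the
// full element list if every position was written; otherwise returns nullopt.
std::optional<std::vector<int>> collapseChain(const std::vector<ToyInsert> &ops,
                                              std::size_t index,
                                              std::size_t numElements) {
  if (!isTrailingInsert(ops, index))
    return std::nullopt;

  std::vector<std::optional<int>> elements(numElements);
  for (int cur = static_cast<int>(index); cur != -1; cur = ops[cur].dest) {
    const ToyInsert &ins = ops[cur];
    assert(ins.position >= 0 &&
           static_cast<std::size_t>(ins.position) < numElements);
    // We walk backwards, so the first write seen for a position is the latest.
    if (!elements[ins.position])
      elements[ins.position] = ins.value;
  }

  std::vector<int> result;
  for (const std::optional<int> &element : elements) {
    if (!element)
      return std::nullopt; // some position was never initialized
    result.push_back(*element);
  }
  return result;
}
```

In this model, starting from any insert other than the last one simply bails out, which mirrors the early `return failure()` in the hunk above; only the trailing insert triggers the backward walk over the chain.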

If that's accurate, I’d suggest updating the in-code comment to something like:

> Ensure this is the trailing vector.insert op in a chain of inserts.

I'd also recommend adding a note about this constraint in the high-level comment for the pattern.

---

As for this comment ...

>    // This pattern has linear time complexity with respect to the length of the
>    // insert chain. So we only care about the last insert op which has the
>    // highest probability of success.

IMHO, the goal is to avoid matching overly complex or fragmented insert chains, and anchoring on the last op is a clean and efficient way to do that. That's the design, and that's fine. To me, everything else (time complexity, probability of success) is secondary and can be left out of the comment.

I mostly want to avoid our "future selves" getting into a discussion on "probability" and "cost" 😅

https://github.com/llvm/llvm-project/pull/142944