[Mlir-commits] [mlir] [mlir][Vector] Tighten up application conditions in TransferReadAfter… (PR #143869)

Thu Jun 12 05:00:12 PDT 2025

================
@@ -4684,15 +4687,27 @@ struct TransferReadAfterWriteToBroadcast
     if (getUnusedDimsBitVector({readOp.getPermutationMap()}) !=
         getUnusedDimsBitVector({defWrite.getPermutationMap()}))
       return failure();
-    if (readOp.getIndices() != defWrite.getIndices() ||
-        readOp.getMask() != defWrite.getMask())
+    // This pattern should only catch the broadcast case, the non-broadcast case
+    // should be done separately to keep application conditions clean and
+    // separate.
----------------
nicolasvasilache wrote:

The whole notion of creating transposes as part of canonicalization is very iffy.
There are other patterns that move permutation logic:
- either into the transfer, or
- outside of the transfer
Ideally, this would be target-dependent (i.e., whether your DMA hardware supports transposing on the fly).
I'd be in favor of splitting permutation handling out of the canonicalization, but this will likely be too intrusive.

Not handling the untested (and previously wrong) no-broadcast, yes-transpose case in this pattern seems reasonable to me, although it may affect downstream users who silently rely on it.
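
To make the broadcast vs. transpose distinction concrete, here is a minimal standalone sketch. It does not use the MLIR API: `PermutationMap`, `hasBroadcast`, `hasTranspose`, and `patternApplies` are hypothetical names introduced only for illustration. It assumes each map result is either a source dimension index or a broadcast.

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Hypothetical stand-in for a transfer permutation map (NOT mlir::AffineMap):
// each result is either a source dimension index, or std::nullopt when the
// result is a broadcast.
using PermutationMap = std::vector<std::optional<unsigned>>;

// True if at least one result of the map is a broadcast.
bool hasBroadcast(const PermutationMap &map) {
  for (const auto &result : map)
    if (!result)
      return true;
  return false;
}

// True if the non-broadcast results are not in strictly increasing order,
// i.e. the map reorders source dimensions.
bool hasTranspose(const PermutationMap &map) {
  std::optional<unsigned> prev;
  for (const auto &result : map) {
    if (!result)
      continue; // Broadcast results do not take part in the ordering.
    if (prev && *result <= *prev)
      return true;
    prev = result;
  }
  return false;
}

// Assumed application condition for this sketch: only fire when the read
// actually broadcasts. The no-broadcast case, including no-broadcast
// yes-transpose, is left to a separate pattern.
bool patternApplies(const PermutationMap &readMap) {
  return hasBroadcast(readMap);
}

// Usage: classify a few example maps.
void classifyExamples() {
  PermutationMap identity = {0u, 1u};                // (d0, d1) -> (d0, d1)
  PermutationMap transposed = {1u, 0u};              // (d0, d1) -> (d1, d0)
  PermutationMap broadcasted = {std::nullopt, 1u};   // (d0, d1) -> (0, d1)

  assert(!patternApplies(identity));
  assert(hasTranspose(transposed) && !patternApplies(transposed));
  assert(patternApplies(broadcasted) && !hasTranspose(broadcasted));
}
```

With this split, a map such as `(d0, d1) -> (d1, d0)` is rejected up front, so the pattern never has to create a transpose when no broadcast is involved.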

https://github.com/llvm/llvm-project/pull/143869