[Mlir-commits] [mlir] [mlir][tensor] Loosen restrictions on folding dynamic reshapes (PR #137963)
Artem Gindinson
llvmlistbot at llvm.org
Tue May 6 08:05:34 PDT 2025
================
@@ -31,59 +31,70 @@ mlir::getReassociationIndicesForReshape(ShapedType sourceType,
std::optional<SmallVector<ReassociationIndices>>
mlir::getReassociationIndicesForCollapse(ArrayRef<int64_t> sourceShape,
ArrayRef<int64_t> targetShape) {
- if (sourceShape.size() <= targetShape.size())
+ unsigned numSourceDims = sourceShape.size(),
+ numTargetDims = targetShape.size();
+ if (numSourceDims <= numTargetDims)
return std::nullopt;
- unsigned sourceDim = 0;
- SmallVector<ReassociationIndices> reassociationMap;
- reassociationMap.reserve(targetShape.size());
-
- ReassociationIndices currIndices;
- int64_t prodOfCollapsedDims = 1;
- while (sourceDim < sourceShape.size()) {
- unsigned targetDim = reassociationMap.size();
- // If we have mapped all the target dimensions stop and handle the remaining
- // tail of size-1 dimensions explicitly.
- if (targetDim == targetShape.size())
- break;
+ SmallVector<ReassociationIndices, 4> reassociationMap;
+ reassociationMap.reserve(numTargetDims);
+ unsigned sourceDim = 0, targetDim = 0;
+ for (; targetDim < numTargetDims; ++targetDim) {
int64_t currTargetShape = targetShape[targetDim];
- while (sourceDim < (sourceShape.size() - 1) &&
- sourceShape[sourceDim] != ShapedType::kDynamic &&
- prodOfCollapsedDims * sourceShape[sourceDim] < currTargetShape) {
+ ReassociationIndices currIndices;
+ // 1. Target dimension is dynamic. Source shape should contain at least
+ // one dynamic dimension.
+ if (currTargetShape == ShapedType::kDynamic) {
----------------
AGindinson wrote:
While looking into a more generalized algorithm, I've discovered some examples where the assumption doesn't apply:
> if the output contains no adjacent dynamic dims the reassociation should be uniquely inferable
`?x2x8x3x2x? into ?x48x?` would be a case where relaxing the legality checks for mixed subshapes per my proposal makes the reassociation ambiguous. Here, the original algorithm would attempt `[[0], [1, 2, 3], ...]` and then just freak out at `2x? -> ?`. A proper implementation of my version should instead determine that there's more than one valid reassociation for the static slice and early-exit based on that.
Then, on a valid example like `?x5x8x3x2x? into ?x48x?` I would have to retry the reassociation after the initial failure to map `5x8x3` onto `48`. And for an invalid reshape like `?x8x3x1x1x1x1x5x2x? into ?x48x?` that would take a while, because an initial analysis of the divisors wouldn't detect the sequential pattern.
So even if I manage to early-exit for the bulk of such cases, at the end of the day we still end up with `O(n^2)` complexity* for a bunch of valid reshapes. This isn't as scary as it sounds, because most of the subshape ranks we're likely to deal with are in single digits.
*omitting stuff like `?x2x4x?x3x5x?x...x? into ?x8x?x15x...`, because it could be divided-and-conquered by mapping the purely static slices between the source & target.
---
The bottom line: I'm not sure the general, abstract algorithm is actually worth it for tensor reshapes, and it might be better to take the other approach:
> abandoning the util usage in the ComposeExpandOfCollapseOp pattern, employing similar logic to ComposeCollapseOfExpandOp
https://github.com/llvm/llvm-project/pull/137963
More information about the Mlir-commits
mailing list