[Mlir-commits] [mlir] [vector][mlir] Canonicalize to shape_cast where possible (PR #140583)

Fri Nov 7 09:13:29 PST 2025

banach-space wrote:

Thanks @MaheshRavishankar, as promised, I am returning to this now that you've shared your example.

> see this post here https://discourse.llvm.org/t/rfc-update-to-general-design-section-of-operation-canonicalizations-in-mlir/79355?u=maheshravishankar . This talks about how vector.transpose captures more information than a vector.shape_cast and how you cannot always go from shape_cast to transpose.

I've extracted this repro as something representative (*):
```mlir
func.func @transpose_to_shape_cast_1(%0 : vector<4x1x1xf32>) -> vector<1x4x1xf32> {
  %res = vector.transpose %0, [2, 0, 1] : vector<4x1x1xf32> to vector<1x4x1xf32>
  return %res : vector<1x4x1xf32>
}

// -----

func.func @transpose_to_shape_cast_2(%0 : vector<4x1x1xf32>) -> vector<1x4x1xf32> {
  %res = vector.transpose %0, [1, 0, 2] : vector<4x1x1xf32> to vector<1x4x1xf32>
  return %res : vector<1x4x1xf32>
}
```

**QUESTION/COMMENT:** 

Aren't the examples above _identical_ operations?

**YES - LLVM example!**
Let's try these:
```bash
# Canonicalize to vector.shape_cast, then lower.
$ mlir-opt  repro.mlir -canonicalize -test-lower-to-llvm --split-input-file
# Lower as vector.transpose.
$ mlir-opt  repro.mlir -test-lower-to-llvm --split-input-file
```

In both cases I get the following (testing using this PR):

```mlir
module {
  llvm.func @transpose_to_shape_cast_1(%arg0: !llvm.array<4 x array<1 x vector<1xf32>>>) -> !llvm.array<1 x array<4 x vector<1xf32>>> {
    %0 = llvm.mlir.poison : !llvm.array<1 x array<4 x vector<1xf32>>>
    %1 = llvm.extractvalue %arg0[0, 0] : !llvm.array<4 x array<1 x vector<1xf32>>>
    %2 = llvm.insertvalue %1, %0[0, 0] : !llvm.array<1 x array<4 x vector<1xf32>>>
    %3 = llvm.extractvalue %arg0[1, 0] : !llvm.array<4 x array<1 x vector<1xf32>>>
    %4 = llvm.insertvalue %3, %2[0, 1] : !llvm.array<1 x array<4 x vector<1xf32>>>
    %5 = llvm.extractvalue %arg0[2, 0] : !llvm.array<4 x array<1 x vector<1xf32>>>
    %6 = llvm.insertvalue %5, %4[0, 2] : !llvm.array<1 x array<4 x vector<1xf32>>>
    %7 = llvm.extractvalue %arg0[3, 0] : !llvm.array<4 x array<1 x vector<1xf32>>>
    %res = llvm.insertvalue %7, %6[0, 3] : !llvm.array<1 x array<4 x vector<1xf32>>>
    llvm.return %res : !llvm.array<1 x array<4 x vector<1xf32>>>
  }
}

// -----
module {
  llvm.func @transpose_to_shape_cast_2(%arg0: !llvm.array<4 x array<1 x vector<1xf32>>>) -> !llvm.array<1 x array<4 x vector<1xf32>>> {
    %0 = llvm.mlir.poison : !llvm.array<1 x array<4 x vector<1xf32>>>
    %1 = llvm.extractvalue %arg0[0, 0] : !llvm.array<4 x array<1 x vector<1xf32>>>
    %2 = llvm.insertvalue %1, %0[0, 0] : !llvm.array<1 x array<4 x vector<1xf32>>>
    %3 = llvm.extractvalue %arg0[1, 0] : !llvm.array<4 x array<1 x vector<1xf32>>>
    %4 = llvm.insertvalue %3, %2[0, 1] : !llvm.array<1 x array<4 x vector<1xf32>>>
    %5 = llvm.extractvalue %arg0[2, 0] : !llvm.array<4 x array<1 x vector<1xf32>>>
    %6 = llvm.insertvalue %5, %4[0, 2] : !llvm.array<1 x array<4 x vector<1xf32>>>
    %7 = llvm.extractvalue %arg0[3, 0] : !llvm.array<4 x array<1 x vector<1xf32>>>
    %res = llvm.insertvalue %7, %6[0, 3] : !llvm.array<1 x array<4 x vector<1xf32>>>
    llvm.return %res : !llvm.array<1 x array<4 x vector<1xf32>>>
  }
}
``` 

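Note that `%res` holds the same elements as `%arg0`, in the same order (i.e. `%res == %arg0` element-wise; only the array type differs), which confirms that we are dealing with a NO-OP.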

**YES - SPIR-V example!**
Let's try these:
```bash
# Canonicalize to vector.shape_cast, then lower.
$ mlir-opt  repro.mlir -canonicalize -test-convert-to-spirv --split-input-file
# Lower as vector.transpose.
$ mlir-opt  repro.mlir -test-convert-to-spirv --split-input-file
```
In both cases I get the following (testing using this PR):

```mlir
module {
  func.func @transpose_to_shape_cast_1(%arg0: vector<1xf32>, %arg1: vector<1xf32>, %arg2: vector<1xf32>, %arg3: vector<1xf32>) -> (vector<1xf32>, vector<1xf32>, vector<1xf32>, vector<1xf32>) {
    return %arg0, %arg1, %arg2, %arg3 : vector<1xf32>, vector<1xf32>, vector<1xf32>, vector<1xf32>
  }
}

// -----
module {
  func.func @transpose_to_shape_cast_2(%arg0: vector<1xf32>, %arg1: vector<1xf32>, %arg2: vector<1xf32>, %arg3: vector<1xf32>) -> (vector<1xf32>, vector<1xf32>, vector<1xf32>, vector<1xf32>) {
    return %arg0, %arg1, %arg2, %arg3 : vector<1xf32>, vector<1xf32>, vector<1xf32>, vector<1xf32>
  }
}
```

SPIR-V makes it even clearer that we are dealing with a NO-OP 😅 

**FINAL THOUGHTS**

I argue that in all cases we are dealing with one operation for which we have multiple names (`vector.transpose [2, 0, 1]` vs `vector.transpose [1, 0, 2]` vs `vector.shape_cast`). This discussion is merely trying to establish a single name for all of this.
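
To sanity-check this outside of MLIR, here is a minimal Python sketch (not part of the PR; the helper names `row_major_offset`, `transposed_order` and `is_shape_cast` are made up for illustration). It assumes row-major element order and the `vector.transpose` convention that result dimension `i` is source dimension `perm[i]`. It checks whether a permutation leaves the linear element order unchanged, which is the condition under which the transpose only changes the shape:

```python
from itertools import product


def row_major_offset(index, shape):
    """Linear offset of a multi-dimensional index in a row-major layout."""
    offset = 0
    for i, extent in zip(index, shape):
        offset = offset * extent + i
    return offset


def transposed_order(src_shape, perm):
    """Return the result shape and the source offsets visited when walking
    the transposed result in row-major order.

    Assumed convention: result dim i is source dim perm[i].
    """
    res_shape = [src_shape[p] for p in perm]
    order = []
    for res_index in product(*(range(e) for e in res_shape)):
        # Map the result index back to the source index it was read from.
        src_index = [0] * len(src_shape)
        for i, p in enumerate(perm):
            src_index[p] = res_index[i]
        order.append(row_major_offset(src_index, src_shape))
    return res_shape, order


def is_shape_cast(src_shape, perm):
    """True if the transpose keeps the row-major element order intact."""
    _, order = transposed_order(src_shape, perm)
    return order == list(range(len(order)))


print(is_shape_cast([4, 1, 1], [2, 0, 1]))  # True
print(is_shape_cast([4, 1, 1], [1, 0, 2]))  # True
print(is_shape_cast([2, 3], [1, 0]))        # False
```

Both permutations from the repro produce the identity ordering on `4x1x1`, whereas a genuine 2-D transpose such as `[1, 0]` on `2x3` reorders elements and so is not expressible as a plain shape change.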

I obviously might be missing something, so please correct me if that's the case. I am sharing this to make my mental model clear and to avoid confusion.

-Andrzej

(*) Please provide other examples if this does not capture what you had in mind.

https://github.com/llvm/llvm-project/pull/140583