<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href="https://github.com/llvm/llvm-project/issues/138265">138265</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
[mlir][linalg] Simplify vectorization test output using `-canonicalize -cse`
</td>
</tr>
<tr>
<th>Labels</th>
<td>
mlir
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
banach-space
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
banach-space
</td>
</tr>
</table>
<pre>
The Linalg vectorization tests are currently quite complex and hard to navigate (see full list with links below). One area I’d like to improve is simplifying the expected test output by updating the `mlir-opt` invocation to include:
* `-canonicalize -cse`.
### Why add `-cse`?
CSE alone is a huge win. It eliminates redundant constants like:
```mlir
%c0 = arith.constant 0 : index
%c0_1 = arith.constant 0 : index
%c0_2 = arith.constant 0 : index
```
Without CSE, test updates often involve unnecessarily matching different SSA values representing the same constant, which adds noise and overhead.
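To illustrate the mechanism (a toy sketch, not MLIR's actual CSE implementation): CSE is essentially value numbering, where two operations with the same opcode and the same already-deduplicated operands collapse into one. The instruction tuples below are hypothetical, chosen to mirror the redundant-constant example above.

```python
# Toy value-numbering CSE sketch. Instructions are (result, opcode, operands)
# tuples; duplicates (same opcode, same canonicalized operands) are dropped
# and later uses are rewritten to the surviving result.

def cse(instructions):
    """Return deduplicated instructions plus an old-name -> survivor map."""
    seen = {}      # (opcode, operands) -> canonical result name
    replace = {}   # removed result name -> canonical result name
    out = []
    for result, opcode, operands in instructions:
        # Rewrite operands through earlier replacements first.
        operands = tuple(replace.get(o, o) for o in operands)
        key = (opcode, operands)
        if key in seen:
            replace[result] = seen[key]  # duplicate: drop it
        else:
            seen[key] = result
            out.append((result, opcode, operands))
    return out, replace

# Mirrors the example above: three identical `arith.constant 0 : index` ops.
prog = [
    ("%c0",   "arith.constant 0 : index", ()),
    ("%c0_1", "arith.constant 0 : index", ()),
    ("%c0_2", "arith.constant 0 : index", ()),
    ("%dim",  "tensor.dim", ("%arg0", "%c0_2")),
]
new_prog, _ = cse(prog)
# Only one constant survives, and %dim now reads the surviving %c0.
```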
### Why add `-canonicalize`?
Adding `-canonicalize` helps simplify `tensor.dim`, `affine.apply`, and other commonly duplicated constructs.
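Conceptually (a toy sketch, not MLIR's real pattern rewriter), canonicalization applies local rewrite patterns bottom-up until a fixed point. The expression encoding and the `fold_add_zero` pattern below are hypothetical, standing in for folds like simplifying `affine.apply` or resolving `tensor.dim`:

```python
# Toy canonicalization sketch: expressions are nested tuples like
# ("add", lhs, rhs); patterns return a rewritten expression or None.

def canonicalize(expr, patterns):
    """Apply rewrite patterns bottom-up until no pattern fires."""
    changed = True
    while changed:
        changed = False
        if isinstance(expr, tuple):
            # Canonicalize operands before trying patterns on the root.
            expr = (expr[0],) + tuple(canonicalize(e, patterns) for e in expr[1:])
        for pattern in patterns:
            rewritten = pattern(expr)
            if rewritten is not None:
                expr = rewritten
                changed = True
    return expr

def fold_add_zero(expr):
    # Pattern: ("add", x, ("const", 0)) -> x
    if isinstance(expr, tuple) and expr[0] == "add" and expr[2] == ("const", 0):
        return expr[1]
    return None

# ("add", ("add", "d0", ("const", 0)), ("const", 0)) folds all the way to "d0".
e = ("add", ("add", "d0", ("const", 0)), ("const", 0))
result = canonicalize(e, [fold_add_zero])
```

Running canonicalization before CSE also exposes more duplicates for CSE to remove, which is why the two flags are proposed together.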
**Current output from the vectorizer:**
```mlir
func.func @test_masked_vectorize_dynamic_pad(%arg0: tensor<?x?xf32>, %arg1: index, %arg2: index) -> tensor<?x?xf32> {
%cst = arith.constant 4.243000e+01 : f32
%c0 = arith.constant 0 : index
%c0_0 = arith.constant 0 : index
%dim = tensor.dim %arg0, %c0_0 : tensor<?x?xf32>
%0 = affine.apply #map()[%arg1, %dim]
%c1 = arith.constant 1 : index
%dim_1 = tensor.dim %arg0, %c1 : tensor<?x?xf32>
%1 = affine.apply #map()[%arg2, %dim_1]
%c0_2 = arith.constant 0 : index
%c0_3 = arith.constant 0 : index
%dim_4 = tensor.dim %arg0, %c0_3 : tensor<?x?xf32>
%c1_5 = arith.constant 1 : index
%dim_6 = tensor.dim %arg0, %c1_5 : tensor<?x?xf32>
%2 = vector.create_mask %dim_4, %dim_6 : vector<2x4xi1>
%3 = vector.mask %2 { vector.transfer_read %arg0[%c0_2, %c0_2], %cst {in_bounds = [true, true]} : tensor<?x?xf32>, vector<2x4xf32> } : vector<2x4xi1> -> vector<2x4xf32>
%4 = tensor.empty(%0, %1) : tensor<?x?xf32>
%c0_7 = arith.constant 0 : index
%c0_8 = arith.constant 0 : index
%dim_9 = tensor.dim %4, %c0_8 : tensor<?x?xf32>
%c1_10 = arith.constant 1 : index
%dim_11 = tensor.dim %4, %c1_10 : tensor<?x?xf32>
%5 = vector.create_mask %dim_9, %dim_11 : vector<2x4xi1>
%6 = vector.mask %5 { vector.transfer_write %3, %4[%c0_7, %c0_7] {in_bounds = [true, true]} : vector<2x4xf32>, tensor<?x?xf32> } : vector<2x4xi1> -> tensor<?x?xf32>
return %6 : tensor<?x?xf32>
}
```
There is a lot of duplication of `arith.constant` and `tensor.dim`.
**Output from the vectorizer after adding `-cse`:**
```mlir
func.func @test_masked_vectorize_dynamic_pad(%arg0: tensor<?x?xf32>, %arg1: index, %arg2: index) -> tensor<?x?xf32> {
%cst = arith.constant 4.243000e+01 : f32
%c0 = arith.constant 0 : index
%dim = tensor.dim %arg0, %c0 : tensor<?x?xf32>
%0 = affine.apply #map()[%arg1, %dim]
%c1 = arith.constant 1 : index
%dim_0 = tensor.dim %arg0, %c1 : tensor<?x?xf32>
%1 = affine.apply #map()[%arg2, %dim_0]
%2 = vector.create_mask %dim, %dim_0 : vector<2x4xi1>
%3 = vector.mask %2 { vector.transfer_read %arg0[%c0, %c0], %cst {in_bounds = [true, true]} : tensor<?x?xf32>, vector<2x4xf32> } : vector<2x4xi1> -> vector<2x4xf32>
%4 = tensor.empty(%0, %1) : tensor<?x?xf32>
%dim_1 = tensor.dim %4, %c0 : tensor<?x?xf32>
%dim_2 = tensor.dim %4, %c1 : tensor<?x?xf32>
%5 = vector.create_mask %dim_1, %dim_2 : vector<2x4xi1>
%6 = vector.mask %5 { vector.transfer_write %3, %4[%c0, %c0] {in_bounds = [true, true]} : vector<2x4xf32>, tensor<?x?xf32> } : vector<2x4xi1> -> tensor<?x?xf32>
return %6 : tensor<?x?xf32>
}
```
No duplication of `arith.constant`, but `tensor.dim` is still unnecessarily duplicated.
**Output from the vectorizer after adding `-canonicalize -cse`:**
```mlir
func.func @test_masked_vectorize_dynamic_pad(%arg0: tensor<?x?xf32>, %arg1: index, %arg2: index) -> tensor<?x?xf32> {
%c1 = arith.constant 1 : index
%cst = arith.constant 4.243000e+01 : f32
%c0 = arith.constant 0 : index
%dim = tensor.dim %arg0, %c0 : tensor<?x?xf32>
%0 = affine.apply #map()[%arg1, %dim]
%dim_0 = tensor.dim %arg0, %c1 : tensor<?x?xf32>
%1 = affine.apply #map()[%arg2, %dim_0]
%2 = vector.create_mask %dim, %dim_0 : vector<2x4xi1>
%3 = vector.mask %2 { vector.transfer_read %arg0[%c0, %c0], %cst {in_bounds = [true, true]} : tensor<?x?xf32>, vector<2x4xf32> } : vector<2x4xi1> -> vector<2x4xf32>
%4 = tensor.empty(%0, %1) : tensor<?x?xf32>
%5 = vector.create_mask %0, %1 : vector<2x4xi1>
%6 = vector.mask %5 { vector.transfer_write %3, %4[%c0, %c0] {in_bounds = [true, true]} : vector<2x4xf32>, tensor<?x?xf32> } : vector<2x4xi1> -> tensor<?x?xf32>
return %6 : tensor<?x?xf32>
}
```
No duplication :)
### Pros vs Cons
Pros:
* Easier to focus on the semantic intent of vectorization output.
* Reduces test maintenance (less duplication, fewer fragile SSA names).
* Aligns with [FileCheck best practices](https://mlir.llvm.org/getting_started/TestingGuide/#filecheck-best-practices): match only the minimum necessary.
Cons:
* Tests will now depend on CSE and canonicalization, making them indirectly sensitive to unrelated changes.
* Tests will no longer isolate vectorization alone; instead, they will validate a small pipeline of transformations.
### Next steps
While there are trade-offs, I believe this change will be beneficial overall.
My first patch is here:
* TODO
Assuming there are no strong objections, I’d like to use this issue for discussion and long-term context.
CC @dcaballe @hanhanW - you've reviewed most of my patches in this area. Anyone else I should include?
Thanks!
### List of test files
* https://github.com/llvm/llvm-project/blob/main/mlir/test/Dialect/Linalg/vectorization-pad-patterns.mlir
* https://github.com/llvm/llvm-project/blob/main/mlir/test/Dialect/Linalg/vectorization-scalable.mlir
* https://github.com/llvm/llvm-project/blob/main/mlir/test/Dialect/Linalg/vectorization-unsupported.mlir
* https://github.com/llvm/llvm-project/blob/main/mlir/test/Dialect/Linalg/vectorization-with-patterns.mlir
* https://github.com/llvm/llvm-project/blob/main/mlir/test/Dialect/Linalg/vectorization.mlir
* https://github.com/llvm/llvm-project/blob/main/mlir/test/Dialect/Linalg/vectorize-conv-masked-and-scalable.mlir
* https://github.com/llvm/llvm-project/blob/main/mlir/test/Dialect/Linalg/vectorize-convolution-flatten.mlir
* https://github.com/llvm/llvm-project/blob/main/mlir/test/Dialect/Linalg/vectorize-convolution.mlir
* https://github.com/llvm/llvm-project/blob/main/mlir/test/Dialect/Linalg/vectorize-tensor-extract-masked.mlir
* https://github.com/llvm/llvm-project/blob/main/mlir/test/Dialect/Linalg/vectorize-tensor-extract.mlir
</pre>
WQzrBQ5MWbpg3kcoUEkMdlTS9fp3-xcIBWosJZdMxSsCplS_3d_3UEobXM48r0IPE-T3UPe2f1p96i8InGvrHsReCW3AeWv0FkzxL-TRkqjOlVur1vXKSedahNJYENLx1rmImxYR2IFHWwM32uNzDIDlMjQwgrOCKYXh_xXTFdNfYQB70xI62SFY3El8QgG1cTGc6n1nEjqQutuWWWRDWOh9cBAqh_ARXGVaJU53XuvDSQTT3xyh6Uun_CQ78ZGxgdWuR-kyRrbSV20x5KYmdB1ipf81aKwJMBG6LpQpQjQxqfugInQdxBK6XkmmulndxSOh6wuSDRomBg3zHq12w-OF2B-theNMsULh-2nQatc2jQl55_2UCEn0_b3xHhvjgBu9G3TPFQOmxbtSotPGqDZ6pVTBI-8IS6_I-yjQdTsDfPah1PYe-jOoEnV4EPNMzLIZe8B5OhmN01kyptOHaj4dj5NRKuio5HwisqIoaTKbZOmUj_IUx5MHOacJzZM8oSnNJ8l0OJ0hivFsxjinxWg8JqMEaybVsUN5iNVmnmZTOs4fFCtQuf4tkM6I-PaHnUdzi3bryChR0nl3kuClV_HNkbggX5H8UXV25iv4fLga__XLIYdr7NZdfZLvH-QfWqvmb3ZHtMoRuu4N283pfwIAAP__bPiSQg">