<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href="https://github.com/llvm/llvm-project/issues/58713">58713</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
The generic vectorization produces wrong IR for linalg.transpose op
</td>
</tr>
<tr>
<th>Labels</th>
<td>
mlir:linalg,
mlir
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
pifon2a
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
hanhanW
</td>
</tr>
</table>
<pre>
Input IR:
```mlir
module {
func.func @foo(%arg0: tensor<1x1x1x1x8x32xf32>, %arg1: tensor<1x1x32x8xf32>) -> tensor<1x1x32x8xf32> {
%0 = tensor.empty() : tensor<1x1x32x8xf32>
%extracted_slice = tensor.extract_slice %arg0[0, 0, 0, 0, 0, 0] [1, 1, 1, 1, 8, 32] [1, 1, 1, 1, 1, 1] : tensor<1x1x1x1x8x32xf32> to tensor<1x1x8x32xf32>
%transposed = linalg.transpose ins(%extracted_slice : tensor<1x1x8x32xf32>) outs(%0 : tensor<1x1x32x8xf32>) permutation = [0, 1, 3, 2]
return %transposed : tensor<1x1x32x8xf32>
}
}
```
IR after vectorization:
```mlir
module {
func.func @foo(%arg0: tensor<1x1x1x1x8x32xf32>, %arg1: tensor<1x1x32x8xf32>) -> tensor<1x1x32x8xf32> {
%c0 = arith.constant 0 : index
%cst = arith.constant 0.000000e+00 : f32
%0 = tensor.empty() : tensor<1x1x32x8xf32>
%1 = vector.transfer_read %0[%c0, %c0, %c0, %c0], %cst {in_bounds = [true, true, true, true]} : tensor<1x1x32x8xf32>, vector<1x1x32x8xf32>
%2 = vector.transpose %1, [0, 1, 3, 2] : vector<1x1x32x8xf32> to vector<1x1x8x32xf32>
%3 = vector.transpose %2, [0, 1, 3, 2] : vector<1x1x8x32xf32> to vector<1x1x32x8xf32>
%4 = vector.transfer_write %3, %0[%c0, %c0, %c0, %c0] {in_bounds = [true, true, true, true]} : vector<1x1x32x8xf32>, tensor<1x1x32x8xf32>
return %4 : tensor<1x1x32x8xf32>
}
}
```
The read from `%arg0` (via `%extracted_slice`) is gone: the vectorized IR instead reads from the destination tensor `%0`, and the two back-to-back `vector.transpose` ops cancel each other out, so the result is just the uninitialized contents of `%0`.
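For comparison, a sketch of what the vectorized body should presumably look like (mirroring the correct `linalg.generic` output further down, but reading from `%extracted_slice`; this is an illustration, not actual compiler output):
```mlir
// Expected (sketch): read the source slice, transpose once, write into %0.
%1 = vector.transfer_read %extracted_slice[%c0, %c0, %c0, %c0], %cst {in_bounds = [true, true, true, true]} : tensor<1x1x8x32xf32>, vector<1x1x8x32xf32>
%2 = vector.transpose %1, [0, 1, 3, 2] : vector<1x1x8x32xf32> to vector<1x1x32x8xf32>
%3 = vector.transfer_write %2, %0[%c0, %c0, %c0, %c0] {in_bounds = [true, true, true, true]} : vector<1x1x32x8xf32>, tensor<1x1x32x8xf32>
```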
Vectorization works correctly if the transpose is rewritten as an equivalent `linalg.generic`. E.g.,
```mlir
#map0 = affine_map<(d0, d1, d2, d3) -> (d0, d1, d2, d3)>
#map1 = affine_map<(d0, d1, d2, d3) -> (d0, d1, d3, d2)>
module {
func.func @simple_KCRS_to_KCRSsr2(%arg0: tensor<1x1x1x1x8x32xf32>, %arg1: tensor<1x1x32x8xf32>) -> tensor<1x1x32x8xf32> {
%0 = tensor.empty() : tensor<1x1x32x8xf32>
%extracted_slice = tensor.extract_slice %arg0[0, 0, 0, 0, 0, 0] [1, 1, 1, 1, 8, 32] [1, 1, 1, 1, 1, 1] : tensor<1x1x1x1x8x32xf32> to tensor<1x1x8x32xf32>
%1 = linalg.generic {indexing_maps = [#map0, #map1], iterator_types = ["parallel", "parallel", "parallel", "parallel"]} ins(%extracted_slice : tensor<1x1x8x32xf32>) outs(%0 : tensor<1x1x32x8xf32>) {
^bb0(%in: f32, %out: f32):
linalg.yield %in : f32
} -> tensor<1x1x32x8xf32>
return %1 : tensor<1x1x32x8xf32>
}
}
```
The output looks good:
```mlir
module {
func.func @simple_KCRS_to_KCRSsr2(%arg0: tensor<1x1x1x1x8x32xf32>, %arg1: tensor<1x1x32x8xf32>) -> tensor<1x1x32x8xf32> {
%c0 = arith.constant 0 : index
%cst = arith.constant 0.000000e+00 : f32
%0 = tensor.empty() : tensor<1x1x32x8xf32>
%1 = vector.transfer_read %arg0[%c0, %c0, %c0, %c0, %c0, %c0], %cst {in_bounds = [true, true, true, true]} : tensor<1x1x1x1x8x32xf32>, vector<1x1x8x32xf32>
%2 = vector.transpose %1, [0, 1, 3, 2] : vector<1x1x8x32xf32> to vector<1x1x32x8xf32>
%3 = vector.transfer_write %2, %0[%c0, %c0, %c0, %c0] {in_bounds = [true, true, true, true]} : vector<1x1x32x8xf32>, tensor<1x1x32x8xf32>
return %3 : tensor<1x1x32x8xf32>
}
}
```
@pifon2a could you take a look at this, since you recently added the op? Thank you!
</pre>