<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/116197>116197</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
[linalg] Vectorization failure - masked "scalar read followed by broadcast"
</td>
</tr>
<tr>
<th>Labels</th>
<td>
new issue
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
banach-space
</td>
</tr>
</table>
<pre>
**REPRODUCER**
```mlir
func.func @vectorization_test(%extracted_slice : tensor<1x1x3xi32>, %arg0: index, %arg2: index, %3: tensor<2x4xi32>, %4: tensor<1x3x2x4xi32>) -> tensor<1x1x3xi32>{
%c0 = arith.constant 0 :index
%8 = linalg.generic {
indexing_maps = [affine_map<(d0, d1, d2) -> (d0, d1, d2)>],
iterator_types = ["parallel", "parallel", "parallel"]}
outs(%extracted_slice : tensor<1x1x3xi32>) {
^bb0(%out: i32):
%9 = linalg.index 0 : index
%extracted = tensor.extract %3[%9, %c0] : tensor<2x4xi32>
%14 = arith.index_cast %extracted : i32 to index
%extracted_2 = tensor.extract %4[%c0, %14, %14, %14] : tensor<1x3x2x4xi32>
linalg.yield %extracted_2 : i32
} -> tensor<1x1x3xi32>
return %8 : tensor<1x1x3xi32>
}
module attributes {transform.with_named_sequence} {
transform.named_sequence @__transform_main(%arg1: !transform.any_op {transform.readonly}) {
%0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
%1 = transform.get_parent_op %0 {isolated_from_above} : (!transform.any_op) -> !transform.any_op
// %2 = transform.structured.vectorize_children_and_apply_patterns %1 { vectorize_nd_extract } : (!transform.any_op) -> !transform.any_op
transform.structured.vectorize %0 vector_sizes [1, 1, 4] {vectorize_nd_extract} : !transform.any_op
transform.yield
}
}
```
**ERROR LOG**
```bash
../file.mlir:10:18: error: 'vector.mask' op expects a 'vector<i1>' mask for the maskable operation
%extracted = tensor.extract %3[%9, %c0] : tensor<2x4xi32>
^
../file.mlir:10:18: note: see current operation:
%17 = "vector.mask"(%6) ({
%34 = "vector.transfer_read"(%arg3, %15, %0, %16) <{in_bounds = [true, true, true], operandSegmentSizes = array<i32: 1, 2, 1, 0>, permutation_map = affine_map<(d0, d1) -> (0, 0, 0)>}> : (tensor<2x4xi32>, index, index, i32) -> vector<1x1x4xi32>
"vector.yield"(%34) : (vector<1x1x4xi32>) -> ()
}) : (vector<1x1x4xi1>) -> vector<1x1x4xi32>
```
**ANALYSIS**
This is a masked vectorization of `tensor.extract` in which the failing Op is effectively a scalar read followed by a broadcast.
```mlir
// Op that fails vectorization
%extracted = tensor.extract %3[%9, %c0] : tensor<2x4xi32>
```
`vectorizeAsTensorExtract` generates this valid `vector.transfer_read`:
```mlir
%34 = "vector.transfer_read"(%arg3, %15, %0, %16) <{in_bounds = [true, true, true], operandSegmentSizes = array<i32: 1, 2, 1, 0>, permutation_map = affine_map<(d0, d1) -> (0, 0, 0)>}> : (tensor<2x4xi32>, index, index, i32) -> vector<1x1x4xi32>
```
When creating `vector.mask`, the Linalg vectorizer builds the mask from the static loop sizes and the input vector sizes. This gives:
* `vector<1x1x4xi1>`.
However, [inferTransferOpMaskType](https://github.com/llvm/llvm-project/blob/d119d43e92333966125755353f4e6227dd2c70da/mlir/lib/Dialect/Vector/IR/VectorOps.cpp#L4118-L4129) has no access to that information and instead derives the expected mask type from the permutation map of the masked op, `vector.transfer_read`. Every result of that map is a broadcast (`0`), so the inferred type is `vector<i1>`, which does not match the mask the vectorizer created.
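The mismatch can be illustrated with a small Python model. This is only a sketch under simplifying assumptions, not the MLIR implementation: the function names are hypothetical, a permutation-map result is modelled as either a source dimension index or `None` for a broadcast (`0`), and scalable dimensions are ignored.
```python
# Simplified model of the two mask-shape computations (hypothetical helpers,
# not MLIR APIs). An empty shape stands for the 0-d type vector<i1>.

def vectorizer_mask_shape(vector_sizes):
    """Mask shape as built by the vectorizer: one entry per vectorized loop."""
    return list(vector_sizes)

def inferred_mask_shape(vector_shape, perm_results):
    """Mask shape expected for a transfer op: broadcast results are not masked.

    perm_results[i] is the source dim feeding vector dim i, or None if that
    vector dim is a broadcast.
    """
    used_dims = sorted({r for r in perm_results if r is not None})
    # For each source dim in use, take the size of the vector dim it maps to.
    return [vector_shape[perm_results.index(d)] for d in used_dims]

# affine_map (d0, d1) -> (0, 0, 0): all three results are broadcasts.
built = vectorizer_mask_shape([1, 1, 4])
expected = inferred_mask_shape([1, 1, 4], [None, None, None])
print(built, expected)  # [1, 1, 4] []
```
In this model the vectorizer produces a 1x1x4 mask, while the transfer op expects an empty shape, which is the disagreement the verifier reports.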
**SOLUTION**
TODO
</pre>