<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href=https://github.com/llvm/llvm-project/issues/116197>116197</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            [linalg] Vectorization failure - masked "scalar read followed by broadcast"
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            new issue
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          banach-space
      </td>
    </tr>
</table>

<pre>
    **REPRODUCER**
```mlir
func.func @vectorization_test(%extracted_slice : tensor<1x1x3xi32>, %arg0: index, %arg2: index, %3: tensor<2x4xi32>, %4: tensor<1x3x2x4xi32>) -> tensor<1x1x3xi32>{
%c0 = arith.constant 0 : index

%8 = linalg.generic {
  indexing_maps = [affine_map<(d0, d1, d2) -> (d0, d1, d2)>],
  iterator_types = ["parallel", "parallel", "parallel"]}
 outs(%extracted_slice : tensor<1x1x3xi32>) {
  ^bb0(%out: i32):
    %9 = linalg.index 0 : index
    %extracted = tensor.extract %3[%9, %c0] : tensor<2x4xi32>
    %14 = arith.index_cast %extracted : i32 to index
    %extracted_2 = tensor.extract %4[%c0, %14, %14, %14] : tensor<1x3x2x4xi32>
    linalg.yield %extracted_2 : i32
  } -> tensor<1x1x3xi32>

  return %8 : tensor<1x1x3xi32>
}
module attributes {transform.with_named_sequence} {
  transform.named_sequence @__transform_main(%arg1: !transform.any_op {transform.readonly}) {
    %0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
    %1 = transform.get_parent_op %0 {isolated_from_above} : (!transform.any_op) -> !transform.any_op
    // %2 = transform.structured.vectorize_children_and_apply_patterns %1  { vectorize_nd_extract } : (!transform.any_op) -> !transform.any_op
    transform.structured.vectorize %0 vector_sizes [1, 1, 4] {vectorize_nd_extract} : !transform.any_op
    transform.yield
  }
}
```

**ERROR LOG**
```bash
../file.mlir:10:18: error: 'vector.mask' op expects a 'vector<i1>' mask for the maskable operation
 %extracted = tensor.extract %3[%9, %c0] : tensor<2x4xi32>
 ^
../file.mlir:10:18: note: see current operation:
%17 = "vector.mask"(%6) ({
  %34 = "vector.transfer_read"(%arg3, %15, %0, %16) <{in_bounds = [true, true, true], operandSegmentSizes = array<i32: 1, 2, 1, 0>, permutation_map = affine_map<(d0, d1) -> (0, 0, 0)>}> : (tensor<2x4xi32>, index, index, i32) -> vector<1x1x4xi32>
 "vector.yield"(%34) : (vector<1x1x4xi32>) -> ()
}) : (vector<1x1x4xi1>) -> vector<1x1x4xi32>

```

**ANALYSIS**

This is a masked vectorization of `tensor.extract` in which the failing Op is effectively a scalar read followed by a broadcast.
```mlir
    // Op that fails vectorization
 %extracted = tensor.extract %3[%9, %c0] : tensor<2x4xi32>
```

`vectorizeAsTensorExtract` generates this valid `vector.transfer_read`:
```mlir
 %34 = "vector.transfer_read"(%arg3, %15, %0, %16) <{in_bounds = [true, true, true], operandSegmentSizes = array<i32: 1, 2, 1, 0>, permutation_map = affine_map<(d0, d1) -> (0, 0, 0)>}> : (tensor<2x4xi32>, index, index, i32) -> vector<1x1x4xi32>
```

The Linalg vectorizer, when creating `vector.mask`, generates a mask based on the static loop sizes and the input vector sizes. This gives:
* `vector<1x1x4xi1>`.

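However, [inferTransferOpMaskType](https://github.com/llvm/llvm-project/blob/d119d43e92333966125755353f4e6227dd2c70da/mlir/lib/Dialect/Vector/IR/VectorOps.cpp#L4118-L4129) has no access to that information and instead derives the mask type from the permutation map of the masked op, `vector.transfer_read`. Every result of that map, `(d0, d1) -> (0, 0, 0)`, is a broadcast, so the inferred type is `vector<i1>`. The verifier therefore rejects the `vector<1x1x4xi1>` mask built by the vectorizer, which is the error reported in the log.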
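
The mismatch can be illustrated with a small standalone model. This is only a sketch, not the MLIR implementation: `MapResult` and `inferMaskShapeSketch` are hypothetical names, a permutation map is reduced to a list of "source dim or broadcast" entries, and each source dim is assumed to appear at most once.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <optional>
#include <vector>

// Hypothetical stand-in for one result of a transfer op's permutation map:
// a source dimension index (d1 -> 1), or std::nullopt for a broadcast
// result (the constant 0 in the affine map).
using MapResult = std::optional<unsigned>;

// Simplified model of inferring a mask shape from a permutation map.
// Broadcast results contribute no mask dimension; the remaining vector
// dimensions are ordered by the source dimension they read from.
std::vector<int64_t> inferMaskShapeSketch(const std::vector<int64_t> &vecShape,
                                          const std::vector<MapResult> &permMap) {
  assert(vecShape.size() == permMap.size() && "one map result per vector dim");
  // std::map keeps its keys (source dims) sorted in increasing order.
  std::map<unsigned, int64_t> sizeBySourceDim;
  for (size_t i = 0; i < permMap.size(); ++i)
    if (permMap[i].has_value())
      sizeBySourceDim.emplace(*permMap[i], vecShape[i]);

  std::vector<int64_t> maskShape;
  for (const auto &entry : sizeBySourceDim)
    maskShape.push_back(entry.second);
  return maskShape;
}
```

For the read in this issue, the vector shape is `{1, 1, 4}` and all three map results are broadcasts, so the sketch returns an empty shape (a 0-D mask), whereas the vectorizer builds its mask from the vector sizes `[1, 1, 4]`.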

**SOLUTION**
TODO
</pre>