<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href=https://github.com/llvm/llvm-project/issues/120733>120733</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            [mlir] tiling: Invalid slice when #map(0) != 0
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            mlir:linalg,
            mlir
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          mgehre-amd
      </td>
    </tr>
</table>

<pre>
    In the [reproducer](https://godbolt.org/z/fhen1svYd),
```
func.func @test(%arg0 : tensor<9xf32>) -> tensor<6xf32> {
  %empty = tensor.empty() : tensor<6xf32>
  %generic = linalg.generic
    {indexing_maps = [affine_map<(d0) -> (d0 + 3)>,
                      affine_map<(d0) -> (d0)>],
     iterator_types = ["parallel"]} ins(%arg0: tensor<9xf32>) outs(%empty : tensor<6xf32>) {
    ^bb0(%in : f32, %out: f32):
      linalg.yield %in : f32
    } -> tensor<6xf32>
  return %generic : tensor<6xf32>
}
module attributes {transform.with_named_sequence} {
  transform.named_sequence @__transform_main(%arg1: !transform.any_op {transform.readonly}) {
    %0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
    %1, %loop = transform.structured.tile_using_for %0 tile_sizes [3] : (!transform.any_op) -> (!transform.any_op, !transform.any_op)
    transform.yield
  }
}
```
becomes
```
#map = affine_map<(d0) -> (d0 + 3)>
#map1 = affine_map<(d0) -> (d0)>
module {
  func.func @test(%arg0: tensor<9xf32>) -> tensor<6xf32> {
    %0 = tensor.empty() : tensor<6xf32>
    %c0 = arith.constant 0 : index
    %c6 = arith.constant 6 : index
    %c3 = arith.constant 3 : index
    %1 = scf.for %arg1 = %c0 to %c6 step %c3 iter_args(%arg2 = %0) -> (tensor<6xf32>) {
      %2 = affine.apply #map(%arg1)
      %extracted_slice = tensor.extract_slice %arg0[%2] [6] [1] : tensor<9xf32> to tensor<6xf32>
      %extracted_slice_0 = tensor.extract_slice %arg2[%arg1] [3] [1] : tensor<6xf32> to tensor<3xf32>
      %3 = linalg.generic {indexing_maps = [#map, #map1], iterator_types = ["parallel"]} ins(%extracted_slice : tensor<6xf32>) outs(%extracted_slice_0 : tensor<3xf32>) {
      ^bb0(%in: f32, %out: f32):
        linalg.yield %in : f32
      } -> tensor<3xf32>
      %inserted_slice = tensor.insert_slice %3 into %arg2[%arg1] [3] [1] : tensor<3xf32> into tensor<6xf32>
      scf.yield %inserted_slice : tensor<6xf32>
    }
    return %1 : tensor<6xf32>
  }
```
This accesses the input out of bounds. `%2 = affine.apply #map(%arg1)` is `%arg1 + 3`, which takes the values `3` and `6` in the two loop iterations,
and `%extracted_slice = tensor.extract_slice %arg0[%2] [6] [1] : tensor<9xf32> to tensor<6xf32>` extracts `6` elements starting at that offset.
In the second iteration this tries to extract elements `6` through `11`, but the tensor only has 9 elements (valid indices `0` through `8`).
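
The bounds arithmetic above can be replayed with plain integers. This is only a sketch in Python, not MLIR: the constants are copied from the tiled IR, and the helper names (`input_map`, `emitted_slices`) are made up for illustration.

```python
# Replay the offsets and sizes of the emitted tensor.extract_slice on %arg0.
SRC_LEN = 9              # tensor<9xf32>
LB, UB, STEP = 0, 6, 3   # scf.for %arg1 = %c0 to %c6 step %c3
SLICE_SIZE = 6           # size operand in tensor.extract_slice %arg0[%2] [6] [1]


def input_map(d0):
    """#map = affine_map<(d0) -> (d0 + 3)>."""
    return d0 + 3


def emitted_slices():
    """Return (iv, offset, last_index, in_bounds) for each loop iteration."""
    rows = []
    for iv in range(LB, UB, STEP):
        offset = input_map(iv)            # %2 = affine.apply #map(%arg1)
        last = offset + SLICE_SIZE - 1    # last element touched by the slice
        rows.append((iv, offset, last, last < SRC_LEN))
    return rows


for iv, offset, last, ok in emitted_slices():
    status = "ok" if ok else "OUT OF BOUNDS"
    print(f"%arg1={iv}: slice covers [{offset}, {last}] -> {status}")
```

Under these assumptions, only the second iteration (`%arg1 = 3`) leaves the 9-element tensor.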

It seems that the implementation computing the slices is only correct when `#map(0) = 0`. The correct offset for the extract slice would be `#map(%arg1) - #map(0)`, which here is `%arg1`.
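
As a sanity check of the proposed offset, the sketch below models in Python (not MLIR) which absolute indices of `%arg0` the op reads. It assumes the inner `linalg.generic` keeps `#map` (`d0 + 3`) on the extracted slice, as in the tiled IR above. The helper names are hypothetical.

```python
# Compare the elements of %arg0 read by the untiled op and by the tiled op
# for two choices of the extract_slice offset.
SRC_LEN = 9   # tensor<9xf32>
OUT_LEN = 6   # tensor<6xf32>
TILE = 3      # tile_sizes [3]


def input_map(d0):
    """#map = affine_map<(d0) -> (d0 + 3)>."""
    return d0 + 3


def untiled_reads():
    """Indices of %arg0 read by the original linalg.generic."""
    return [input_map(d0) for d0 in range(OUT_LEN)]


def tiled_reads(slice_offset):
    """Indices of %arg0 read by the tiled op, given offset = slice_offset(iv)."""
    reads = []
    for iv in range(0, OUT_LEN, TILE):
        base = slice_offset(iv)
        # The inner generic still indexes the slice through #map.
        for d0 in range(TILE):
            reads.append(base + input_map(d0))
    return reads


emitted = tiled_reads(lambda iv: input_map(iv))                  # #map(%arg1)
proposed = tiled_reads(lambda iv: input_map(iv) - input_map(0))  # #map(%arg1) - #map(0)

print("untiled :", untiled_reads())
print("emitted :", emitted)
print("proposed:", proposed)
```

In this model the proposed offset reads exactly the elements the untiled op reads, all within bounds. The emitted offset is shifted by `#map(0) = 3` in both iterations.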
</pre>