<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href="https://github.com/llvm/llvm-project/issues/148679">148679</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
[MLIR][PadTilingInterface] Incorrect input padding for convolutions
</td>
</tr>
<tr>
<th>Labels</th>
<td>
mlir
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
yzhang93
</td>
</tr>
</table>
<pre>
**Description:**
When padding convolution-style ops that use affine indexing maps of the form (dX + dY), the current logic in `computePaddedShape` computes the required input padding incorrectly. Directly using the affine map result (dX_size + dY_size), without accounting for the actual extent of kernel coverage, leads to incorrect padding along the convolved dimensions.
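For example, for a convolved dimension with indexing expression (d1 + d4), an output size of 24, and a kernel size of 3, the extent actually accessed is 24 + 3 - 1 = 26 elements, but summing the sizes directly gives 24 + 3 = 27; the input is therefore over-padded by one along each convolved dimension, even when those dimensions have a padding multiple of 0.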
**Example:**
For the following `conv_2d_nhwc_fhwc` op, if I pad only the `f` and `c` dimensions using padding multiples of {0, 0, 0, 32, 0, 0, 32}, the transform also pads the convolved dimensions of the input and generates:
```
%padded_2 = tensor.pad %padded low[0, 0, 0, 0] high[0, 1, 1, 1] {
^bb0(%arg0: index, %arg1: index, %arg2: index, %arg3: index):
  tensor.yield %cst_1 : bf16
} : tensor<16x26x19x287xbf16> to tensor<16x27x20x288xbf16>
%padded_4 = tensor.pad %4 low[0, 0, 0, 0] high[1, 0, 0, 1] {
^bb0(%arg0: index, %arg1: index, %arg2: index, %arg3: index):
  tensor.yield %cst_3 : bf16
} : tensor<287x3x3x287xbf16> to tensor<288x3x3x288xbf16>
%padded_6 = tensor.pad %6 low[0, 0, 0, 0] high[0, 0, 0, 1] {
^bb0(%arg0: index, %arg1: index, %arg2: index, %arg3: index):
  tensor.yield %cst_5 : f32
} : tensor<16x24x17x287xf32> to tensor<16x24x17x288xf32>
%7 = linalg.generic {indexing_maps = [affine_map<(d0, d1, d2, d3, d4, d5, d6) -> (d0, d1 + d4, d2 + d5, d6)>, affine_map<(d0, d1, d2, d3, d4, d5, d6) -> (d3, d4, d5, d6)>, affine_map<(d0, d1, d2, d3, d4, d5, d6) -> (d0, d1, d2, d3)>], iterator_types = ["parallel", "parallel", "parallel", "parallel", "reduction", "reduction", "reduction"]} ins(%padded_2, %padded_4 : tensor<16x27x20x288xbf16>, tensor<288x3x3x288xbf16>) outs(%padded_6 : tensor<16x24x17x288xf32>) {
^bb0(%in: bf16, %in_7: bf16, %out: f32):
  %10 = arith.extf %in : bf16 to f32
  %11 = arith.extf %in_7 : bf16 to f32
  %12 = arith.mulf %10, %11 : f32
  %13 = arith.addf %out, %12 : f32
  linalg.yield %13 : f32
} -> tensor<16x24x17x288xf32>
```
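For comparison, with these multiples only the channel dimension of the first input should need padding (287 -> 288), while the convolved H/W dimensions keep their original sizes, so the expected input pad would be along the lines of:
```
%padded_2 = tensor.pad %padded low[0, 0, 0, 0] high[0, 0, 0, 1] {
^bb0(%arg0: index, %arg1: index, %arg2: index, %arg3: index):
  tensor.yield %cst_1 : bf16
} : tensor<16x26x19x287xbf16> to tensor<16x26x19x288xbf16>
```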
</pre>