<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href=https://github.com/llvm/llvm-project/issues/123630>123630</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            [mlir] Improve code re-use in VectorEmulateNarrowType.cpp
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          banach-space
      </td>
    </tr>
</table>

<pre>
    Hi folks! 

Having reviewed a few recent PRs for [VectorEmulateNarrowType.cpp](https://github.com/llvm/llvm-project/blob/main/mlir/lib/Dialect/Vector/Transforms/VectorEmulateNarrowType.cpp), e.g.:

* https://github.com/llvm/llvm-project/pull/115922, or
* #121298,

I believe it’s time to improve code reusability and unify naming conventions in the file. At ~1.7k lines, it's becoming increasingly difficult to review patches.


### Proposal 1

To begin with, I would like for us to start re-using the logic from [alignedConversionPrecondition](https://github.com/llvm/llvm-project/blob/3b001db4f9668cfa29572e5f1911ec7cef8b0ac2/mlir/lib/Dialect/Vector/Transforms/VectorEmulateNarrowType.cpp#L1084-L1109). This hook effectively checks **two conditions**:
1. [Per-element alignment](https://github.com/llvm/llvm-project/blob/3b001db4f9668cfa29572e5f1911ec7cef8b0ac2/mlir/lib/Dialect/Vector/Transforms/VectorEmulateNarrowType.cpp#L1096) (e.g. `i2` and `i8` are aligned, `i3` and `i8` are not aligned).
2. Could the input/source _vector of sub-byte scalar types_ (e.g. `vector<2xi4>`) fit within the target mulit-byte container type (e.g. `i8` or `i32`), this is checked [here](https://github.com/llvm/llvm-project/blob/3b001db4f9668cfa29572e5f1911ec7cef8b0ac2/mlir/lib/Dialect/Vector/Transforms/VectorEmulateNarrowType.cpp#L1103).

Similar checks are repeated throughout the code, e.g.

* **Cond 1:** [check 1](https://github.com/llvm/llvm-project/blob/3b001db4f9668cfa29572e5f1911ec7cef8b0ac2/mlir/lib/Dialect/Vector/Transforms/VectorEmulateNarrowType.cpp#L320), [check 2](https://github.com/llvm/llvm-project/blob/3b001db4f9668cfa29572e5f1911ec7cef8b0ac2/mlir/lib/Dialect/Vector/Transforms/VectorEmulateNarrowType.cpp#L391), [check 3](https://github.com/llvm/llvm-project/blob/3b001db4f9668cfa29572e5f1911ec7cef8b0ac2/mlir/lib/Dialect/Vector/Transforms/VectorEmulateNarrowType.cpp#L499) ... .
* **Cond 2:** [check 1](https://github.com/llvm/llvm-project/blob/3b001db4f9668cfa29572e5f1911ec7cef8b0ac2/mlir/lib/Dialect/Vector/Transforms/VectorEmulateNarrowType.cpp#L340), [check 2](https://github.com/llvm/llvm-project/blob/3b001db4f9668cfa29572e5f1911ec7cef8b0ac2/mlir/lib/Dialect/Vector/Transforms/VectorEmulateNarrowType.cpp#L398), [check 3](https://github.com/llvm/llvm-project/blob/3b001db4f9668cfa29572e5f1911ec7cef8b0ac2/mlir/lib/Dialect/Vector/Transforms/VectorEmulateNarrowType.cpp#L535), ... .

In order to enable re-use, it would help to do some minor variable renaming first. Otherwise, it's quite hard to see that basically identical quantity is computed/defined multiple times throughout the code. Rather than implementing this in one big patch, I have split it into 4 independent changes:

* https://github.com/llvm/llvm-project/pull/123526
* https://github.com/llvm/llvm-project/pull/123527
* https://github.com/llvm/llvm-project/pull/123528
* https://github.com/llvm/llvm-project/pull/123529

These are basically stacked PRs. I've made sure to link the relevant commit (1 per PR) in the summary, so that it's easy for you to find it.

### Proposal 2

We should clarify the meaning of "source" and "destination" types. Currently, the usage is ambiguous. For example, for this `arith.extsi` Op,
```mlir
  %0 = arith.extsi %arg0 : vector<8xi2> to vector<8xi32>
```
`vector<8xi2>` and `vector<8xi32>` are the "source" and "destination" types, respectively (i.e. that's the interpretation we tend to use).

However, patterns like `RewriteAlignedSubByteIntExt` introduce `vector.bitcast` Ops like this:

```mlir
  %bitcast = vector.bitcast %arg0 : vector<8xi2> to vector<2xi8>
```

I've noticed that we tend to consider both:
*  `vector<2xi8>`(the result of `vector.bitcast`), and `vector<8xi32>` (the output of `arith.extsi`) 

as the destination type.  That should be clarified. I suggest two steps:
* Avoid "destination" and "source" when referring to the "emulation" logic. Reserve that for the original Op (e.g. `arith.extsi` above). Better still, use the naming from Op definitions (for [arith.extsi](https://mlir.llvm.org/docs/Dialects/ArithOps/#arithextsi-arithextsiop) that would mean `in` and `out` for "source" and "destination").
* For the "emulation" logic, use "emulated type" and "container name" for the original type (to be emulated, e.g. `i2`) and the target type that's used for emulation (e.g. `i8`).

WDYT?

### Proposal 3

Document the updated naming in the file.
</pre>
<img width="1" height="1" alt="" src="http://email.email.llvm.org/o/eJzcWEtv5DYS_jX0pWBBovohHfrQY48RA8GO4Rgb7CmgpFKr1hSpkFS3-7K_fVGU-jEZTzbBzmUGMGA1JX6qx1fFr6S8p51B3IjlB7G8v1Fj6KzbVMqourv1g6rxprLNcfMTQWv1qxcyA5FuRbr9Se3J7MDhnvCADSho8QAOazQBnp49tNaBWH74J9bBuo_9qFXAfyjn7OHlOGBSD4NY3gtZdCEMXuRbIR-EfNhR6MYqqW0v5IPW-9O_28HZf2MdhHyotK2EfOgVGf6nyfEzxGv3pPT00PRaIR9enDK-ta7359X3jZGlkHeAyS5hY6KPQm7hb5s3jFoL-ZBly1JKhrRuhhIyz2Qmy0LIu-kFj1ChJtwjUBAfpShSUZYeAvUIwQL1g7N7hNo2CA5HryrSFI6gTAOjofYIRvWchtqaPZpA1nggA6FDaEljAtsA_8mS9StoMujZHApCrj1UWNu4lUztUHkyO32EhtqW6lEHfv2UWxhUqDv0yRwT9iWf_uDJ2cF6pSGb7rxYqHBHBg4UOn7ZIxzsqBvQ9IqREaNnZB-UC-DwduT3RnO13VENrbM9s0Zp5mVzx245T9Y8OaytaYhd_P94k1dpmjXVoi1Xq6JulSyXa4nLNiuzDOt1jW1RpaqW35Rb-c9ZWixuf86ytBSyTOClIw-dta-AbYt1oD3qI9Qd1q8ehGS-hIOFs89-WpuomSUcoyd0t6ix54KL8eKr7zQ25UrIEoQsuP5ArFKSYpVGnvOPIv5wCDMvmFq8nr_7kLHh8mDJvJUJ3EUeMtPIDCOb6u3oaoTf9tE4sC34sbqtjgHB10orB-E4oP_t2qzpWZHfyTdaiPyjWKVseEshUn4uvaDcDgP0o6YwAdbWBEUGJ8zPHI1mc69kf-QEyP4FZgj5iRPYcMY7dPh9JjhL8zkVIt3-Qj1xeGe2c8ocDqgCcoKcHXedHUOMJHe-c1s-9-SpFu6saSCLQZhWlx8iImTfZYxyecr82RH5fTpSZn90JP8uHVmU3KshSRJIvmSe_HGYt_hhmFf8GMxb5svZkTP50u2jAesaPkIsoFGVxklB4aTqZqXVoR74icaCtz1CT8Y62CtH845ZMbbkfEjgU-jQHegEEqXh7yMFhE65Jmo1RAidClApT7XS-gjUsNaslYbfR2UCK1I-qGw_jIHP3IcGWzLY8AkYaNAYFa1_r7kn8KzYBH6FYcE7CZpJFVIUs9YgVLSbdOgkKju1R_CDpsCOkwkWFkCmwQEN2wZ1p8wO_bdR8jJfytW3AFl_C5DiW4CUs17v0GM8gC_J9UFFvfH07BN4FHK9R-hVg-BHF6cSTeY15s-hxr3iaNu-p8CqJoMBHTw9c9-cxZAf-165IyfO24lJM89Q-WMcCo52ZOCWTAMUkq9OGXK68yuC7yLZa60cT0H8nh6VYdrYFoSUk7gTUk7yUMoGfSCj4vwg5STtErgbnUMT9HFSXAijVztkNqu-ot1oR5_Ag3WAb4q5yY-xxZGbYpUqR6FL8C14Yhn3aZhHu1U6_cWmkG4BhFymIPJ7uNrBi8rteH0LZ2VZvJEU-UcOyPVazovX0NP1H7ddCeIvd8_6mP38qyFihx364TyiCFlQgklMZMziJKkDusFhiJvhgBDQxO4R29NJ-P1kD7hHx5iDCgGd8dNkKFbpMx4cBdxOuv2XsfpwDPhowse3wHaTCc42Y40Xz5KKQq18mAI_I3FiLlX_XhrmXTEZnwP99XzINyreSUe6nevF2EB1VLMqXEejtsYTt_DKhm42U7JZn08WxWmyKKYy8zyQM6_fcX0-J_4s5zOOHcMwnnA-J24cvqIDakroFRUiDxKAF_ZlrrsK59IjbBJ4BD_udugD8MjqAw7-4tt2b-k9ds2cu5Dw0KEBhy06F7u_PfEU4yk5b4ufCRJ4Ro9uPx9MU0EiWEc7MkrDp-F6wvpDjarK7iMn4QMyB8EH4sZ4x1yNOKcT0tmeoeJhNg3hDDt_17pGfUdiMOMSbsCJdTs-EW3tL7qAL7cM8Gnw8fk8wkW028ulHTgxE4li3LnHxUHRXNW5HWMJRLv-Z1WfilFuY1v7aohP8TjfZDYfh2vgy0xrVB9vfJGI07AbLFPmBHQa6M5zPnvJoFezc9x57jGjxyainw39YoS-dJlf7__1IvKHr54i-XTn3tZj_HYS2_7QRBfn1F9_QrtpNnlT5qW6wU22ztd5sVqX6U23WVXtalG3q6JpFm2el-tcLQtVtUW-WFZyqW5oI1O5TDOZZku5ytNktSiLdJ3jar2ul1VaiEWKvSJ9JsoNeT_iJpP5Kk9vtKpQ-9O3WbeJ53k17rxYpJp88Jd9gYKOX3Fjp1vew-Pnnw5ZJrJXf6I7b0anN39bVESD_SQr2Ob9Rv43AAD__6NuJ6Q">