[llvm] [NVPTX] Optimize v16i8 reductions (PR #67322)

Artem Belevich via llvm-commits llvm-commits at lists.llvm.org
Thu Sep 28 11:21:31 PDT 2023


Artem-B wrote:

> whether the general improvement in v4i8 lowering would be sufficient to address this particular scenario, too.

Well, that brought less benefit than I had hoped for. Making v4i8 a legal type helps avoid issues in other areas, but does not help much with this particular case. Your changes are still useful and needed.

> Is it worth separating out the `i8` extraction part from these changes and only keep the `v16i8` load part for now? I have not updated this PR with your suggestions yet as it may need reworking anyway after your `v4i8` work

Considering that these improvements are independent, it may indeed be a good idea to split them into separate patches.



https://github.com/llvm/llvm-project/pull/67322

