[llvm] [NVPTX] Disable incorrect peephole optimizations (PR #79920)

Artem Belevich via llvm-commits llvm-commits at lists.llvm.org
Thu Feb 8 17:32:16 PST 2024


Artem-B wrote:

This should fix the issue: https://github.com/llvm/llvm-project/compare/main...Artem-B:llvm-project:bfe-fix?expand=1

While tinkering with this, I noticed there may be some opportunities for further optimization of v4i8 types.
We could potentially use the PRMT instruction to split a v4i8 into v2i16 halves, do vectorized ops on v2i16, and then pack the results back into v4i8 with PRMT.
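
To make the PRMT idea concrete, here is a rough host-side sketch in C++ that models `prmt.b32` in its default mode and uses it to sign-extend a v4i8 into two v2i16 registers and pack them back. The helper names and the selector constants (0x9180, 0xB3A2, 0x6420) are my own illustration and are not taken from the patch; treat it as a model of the byte shuffling, not as what the backend would emit.

```cpp
#include <cassert>
#include <cstdint>

// Software model of PTX `prmt.b32 d, a, b, c` (default mode).
// Bytes 0-3 of the pool come from a, bytes 4-7 from b. Each 4-bit selector
// in c picks one pool byte; if bit 3 of the selector is set, the result byte
// is the sign of the picked byte replicated across all 8 bits.
uint32_t prmt(uint32_t a, uint32_t b, uint32_t c) {
    uint64_t pool = (static_cast<uint64_t>(b) << 32) | a;
    uint32_t d = 0;
    for (int i = 0; i < 4; ++i) {
        uint32_t sel = (c >> (4 * i)) & 0xF;
        uint32_t byte = static_cast<uint32_t>(pool >> (8 * (sel & 7))) & 0xFF;
        if (sel & 8) byte = (byte & 0x80) ? 0xFF : 0x00;
        d |= byte << (8 * i);
    }
    return d;
}

// Hypothetical helpers: sign-extend elements {0,1} or {2,3} of a v4i8
// into the two 16-bit lanes of a v2i16.
uint32_t split_lo_sext(uint32_t v4i8) { return prmt(v4i8, 0, 0x9180); }
uint32_t split_hi_sext(uint32_t v4i8) { return prmt(v4i8, 0, 0xB3A2); }

// Hypothetical helper: truncate each 16-bit lane of two v2i16 values
// to 8 bits and pack them back into one v4i8.
uint32_t pack_v4i8(uint32_t lo, uint32_t hi) { return prmt(lo, hi, 0x6420); }
```

Whether this beats the current lowering would depend on what ptxas does with the extra PRMTs, so it needs measuring before anything is committed to.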

One remaining quirk is that the patterns above end up extracting the same field twice, with different signedness: once for use in the comparison, and again if the same field is used for something else (which uses the BFE lowered for extractelt). I'll need to see what works best for ptxas:
- bfe.s32 + bfe.u32
- bfe.u32 + sign-extend i8->i16.
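
For reference, a small C++ model of the two ways to get a signed element out of a packed v4i8. The function names are hypothetical, and the BFE model is simplified: it assumes 1 <= len <= 32 and pos + len <= 32, and ignores the clamping the real instruction does outside that range. It only shows that both sequences yield the same signed value; it says nothing about which one ptxas handles better.

```cpp
#include <cassert>
#include <cstdint>

// Simplified model of PTX bfe.u32: extract `len` bits starting at `pos`.
uint32_t bfe_u32(uint32_t a, unsigned pos, unsigned len) {
    uint32_t mask = (len >= 32) ? 0xFFFFFFFFu : ((1u << len) - 1u);
    return (a >> pos) & mask;
}

// Simplified model of PTX bfe.s32: same field, sign-extended from its top bit.
int32_t bfe_s32(uint32_t a, unsigned pos, unsigned len) {
    uint32_t field = bfe_u32(a, pos, len);
    uint32_t sign = 1u << (len - 1);
    // (field ^ sign) - sign propagates the field's top bit upward.
    return static_cast<int32_t>((field ^ sign) - sign);
}

// Option 1: a dedicated signed extract for the comparison.
int16_t elem_via_bfe_s32(uint32_t v4i8, unsigned idx) {
    return static_cast<int16_t>(bfe_s32(v4i8, 8 * idx, 8));
}

// Option 2: reuse the unsigned extract, then sign-extend i8 -> i16.
int16_t elem_via_bfe_u32_sext(uint32_t v4i8, unsigned idx) {
    return static_cast<int16_t>(static_cast<int8_t>(bfe_u32(v4i8, 8 * idx, 8)));
}
```

With option 2 the unsigned extract can be shared with other users of the same element, at the cost of one extra conversion for the signed use.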


https://github.com/llvm/llvm-project/pull/79920

