[llvm] [AMDGPU] Eliminate unnecessary packing in wider f16 vectors for sdwa/opsel-able instruction (PR #137137)

Krzysztof Drewniak via llvm-commits llvm-commits at lists.llvm.org
Fri Jun 20 09:26:26 PDT 2025


krzysz00 wrote:

I do think solving the original problem - that is, less register-efficient lowerings of SDWA/OPSEL-able operations that are being run on a vector <4 x [i/f]16> or the like - should be done.

https://github.com/llvm/llvm-project/pull/137137

