[llvm] [AMDGPU] Eliminate unnecessary packing in wider f16 vectors for sdwa/opsel-able instruction (PR #137137)
Vikash Gupta via llvm-commits
llvm-commits at lists.llvm.org
Tue Jul 22 02:46:52 PDT 2025
vg0204 wrote:
> > You mean at the machine instruction selection phase for the given DAG of vector <4 x [i/f]16> or the like?
>
> Yeah. It'd be nice to declare <4 x [i/f]16> versions of SDWA operations legal and then lower them to the version that doesn't need to do any packing.
Handling something this target-specific, and even subtarget-specific, at such an early stage would be a bit tricky. Also, what we want to achieve is quite a specific optimization; is it worth defining new things at the ISel level for that? I am not really sure about it.
@jayfoad , @frederik-h What are your thoughts on it?
https://github.com/llvm/llvm-project/pull/137137