[llvm] [AMDGPU] fix SIPeepholeSDWA optimization for fp16 (PR #109395)
Pankaj Dwivedi via llvm-commits
llvm-commits at lists.llvm.org
Fri Sep 20 07:19:29 PDT 2024
PankajDwivedi-25 wrote:
> > Yes, the final assembly uses a different set of registers, but the subword operands are the same.
> > From what I found, both are doing the same job, as you mentioned. I'm not sure what goes wrong here; maybe I should use WORD-1 as the first operand?
>
> You've still only provided fragments of information, so I have no idea what is going on. Can you reduce this, preferably to just the relevant instructions with runnable output?
Sure, let me get a minimal MIR for this.
https://github.com/llvm/llvm-project/pull/109395