[llvm] [AMDGPU] fix SIPeepholeSDWA optimization for fp16 (PR #109395)

Pankaj Dwivedi via llvm-commits llvm-commits at lists.llvm.org
Fri Sep 20 07:19:29 PDT 2024


PankajDwivedi-25 wrote:

> > Yes, the final assembly uses a different set of registers, but the subword operands are the same.
> > From what I found, both do the same job, as you mentioned. I'm not sure what goes wrong here; maybe I should use WORD-1 as the first operand?
> 
> You've still only provided fragments of information, so I have no idea what is going on. Can you reduce this, preferably to just the relevant instructions with runnable output?

Sure, let me get a minimal MIR for the same.

https://github.com/llvm/llvm-project/pull/109395

