[llvm] [AMDGPU] fix SIPeepholeSDWA optimization for fp16 (PR #109395)

Matt Arsenault via llvm-commits llvm-commits at lists.llvm.org
Fri Sep 20 06:58:43 PDT 2024


arsenm wrote:

> Yes, the final assembly uses a different set of registers, but the subword operands are the same.
> 
> From what I found, both do the same job, as you mentioned. I'm not sure what goes wrong here; maybe I should use WORD-1 as the first operand?

You've still only provided fragments of information, so I have no idea what is going on. Can you reduce this, preferably to just the relevant instructions with runnable output?



https://github.com/llvm/llvm-project/pull/109395

