[llvm] [AMDGPU] fix SIPeepholeSDWA optimization for fp16 (PR #109395)
Matt Arsenault via llvm-commits
llvm-commits at lists.llvm.org
Fri Sep 20 06:58:43 PDT 2024
arsenm wrote:
> Yes, the final assembly has a different set of registers, but the subword operands are the same.
>
> From my findings, both are doing the same job, as you mentioned. I'm not sure what goes wrong here; maybe I should use WORD-1 as the first operand?
You've still only provided fragments of information, so I have no idea what is going on. Can you reduce this, preferably to just the relevant instructions with runnable output?
https://github.com/llvm/llvm-project/pull/109395