[llvm] [AMDGPU] fix SIPeepholeSDWA optimization for fp16 (PR #109395)
    Pankaj Dwivedi via llvm-commits 
    llvm-commits at lists.llvm.org
       
    Fri Sep 20 07:19:29 PDT 2024
    
    
  
PankajDwivedi-25 wrote:
> > Yes, the final assembly uses a different set of registers, but the subword operands are the same.
> > From what I found, both do the same job, as you mentioned. I'm not sure what goes wrong here; maybe I should use WORD-1 as the first operand?
> 
> You've still only provided fragments of information, so I have no idea what is going on. Can you reduce this, preferably to just the relevant instructions with runnable output?
Sure, let me put together minimal MIR for this.
https://github.com/llvm/llvm-project/pull/109395
    
    