[llvm] [AMDGPU] fix SIPeepholeSDWA optimization for fp16 (PR #109395)
Pankaj Dwivedi via llvm-commits
llvm-commits at lists.llvm.org
Mon Dec 16 23:01:10 PST 2024
PankajDwivedi-25 wrote:
> SDWA version is still equivalent in what you've shown here for the reasons arsenm outlined earlier.
Is it expected to be the same? Also, in the first case, I can see a mix of two encodings: the 64-bit one in v_cmp_ge_i32_e64 and the 32-bit one in v_cmp_lt_i32_e32. Does that work correctly? If so, do you have any idea what could be going wrong here?
https://github.com/llvm/llvm-project/pull/109395