[llvm] [AMDGPU] fix SIPeepholeSDWA optimization for fp16 (PR #109395)

Pankaj Dwivedi via llvm-commits llvm-commits at lists.llvm.org
Mon Dec 16 23:01:10 PST 2024


PankajDwivedi-25 wrote:

> SDWA version is still equivalent in what you've shown here for the reasons arsenm outlined earlier.

Is it expected to be the same? Also, in the first case, I can see a combination of two encodings: 64 in v_cmp_ge_i32_e64 and 32 in v_cmp_lt_i32_e32. Does that work correctly? If so, do you have any idea what could be going wrong here?

https://github.com/llvm/llvm-project/pull/109395

