[PATCH] D140135: AMDGPU: Try to unfold fneg source when matching legacy fmin/fmax

Jay Foad via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Dec 21 04:39:20 PST 2022


foad added inline comments.


================
Comment at: llvm/test/CodeGen/AMDGPU/fneg-combines.new.ll:295-297
-; SI-SAFE-NEXT:    v_mov_b32_e32 v1, s0
-; SI-SAFE-NEXT:    v_cmp_ngt_f32_e64 vcc, s0, 0
-; SI-SAFE-NEXT:    v_cndmask_b32_e32 v0, v0, v1, vcc
----------------
foad wrote:
> arsenm wrote:
> > foad wrote:
> > > I think these three lines implement roughly max(s0, 0) so how can converting it to min(0, s0) be correct?
> > It's ngt, not gt. It's more like !(v_cmp_le_f32)
> Right, the code is doing `s0 <= 0 ? v0 : v1`, which is `s0 <= 0 ? -0 : s0`, which is max.
Oh, sorry, I think I've misremembered how cndmask works.
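
For reference, here is a minimal Python sketch of how the three removed lines behave. It models `v_cndmask_b32` as selecting its second source when the condition bit is set, and `v_cmp_ngt_f32` as "not greater than", which is also true for NaN. The function names are hypothetical stand-ins, Python floats stand in for f32, and the starting value of `v0` (-0.0) is an assumption taken from the quoted comment, since the lines above do not show it.

```python
def v_cmp_ngt_f32(a: float, b: float) -> bool:
    """Model of v_cmp_ngt_f32: true when NOT (a > b), so also true for NaN."""
    return not (a > b)


def v_cndmask_b32(src0: float, src1: float, vcc: bool) -> float:
    """Model of v_cndmask_b32: src1 when vcc is set, src0 otherwise."""
    return src1 if vcc else src0


def removed_sequence(s0: float, v0: float = -0.0) -> float:
    """Model of the three removed SI-SAFE lines (v0 = -0.0 is an assumption)."""
    v1 = s0                             # v_mov_b32_e32 v1, s0
    vcc = v_cmp_ngt_f32(s0, 0.0)        # v_cmp_ngt_f32_e64 vcc, s0, 0
    return v_cndmask_b32(v0, v1, vcc)   # v_cndmask_b32_e32 v0, v0, v1, vcc
```

Under these assumptions the sequence returns s0 when s0 is not greater than zero (or is NaN) and the old v0 otherwise, which is a min-style select rather than max.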


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D140135/new/

https://reviews.llvm.org/D140135


