[PATCH] D140135: AMDGPU: Try to unfold fneg source when matching legacy fmin/fmax

Jay Foad via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Dec 20 23:30:56 PST 2022


foad added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp:1401
+}
+SDValue AMDGPUTargetLowering::combineFMinMaxLegacyImpl(
+    const SDLoc &DL, EVT VT, SDValue LHS, SDValue RHS, SDValue True,
----------------
This function doesn't use False. Might be clearer to take a Swapped flag instead of True and False.


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp:1487
+
+  // Undo the combine foldFreeOpFromSelect does if it helps us match the min/max
+  if (LHS == NegTrue && CFalse && CRHS) {
----------------
It's very unclear what you're trying to match here. I think it's something like this, depending on whether NegTrue actually stripped a neg or not:

    // LHS op RHS ? LHS : -RHS -> -min/max(LHS, RHS)
    // LHS op RHS ? -LHS : -RHS -> -min/max(LHS, RHS)

The first one doesn't make any sense to me. For the second one, don't you need to flip the condition code?
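
To make the question concrete, here is a rough scalar model of the two select patterns above, taking "op" to be "<". This is only a sketch and not code from the patch: the helper names (fmin_model, fmax_model, select_neg_both, select_neg_false) are made up for illustration, and the NaN and signed-zero behavior that the legacy min/max nodes care about is deliberately ignored.

```cpp
// Illustrative scalar model only; none of these helpers exist in LLVM.
// Assumption: NaN and signed zeros are ignored, so this says nothing about
// the operand-order rules of the legacy fmin/fmax nodes.

constexpr float fmin_model(float a, float b) { return a < b ? a : b; }
constexpr float fmax_model(float a, float b) { return a > b ? a : b; }

// Second pattern from the comment: LHS < RHS ? -LHS : -RHS
constexpr float select_neg_both(float lhs, float rhs) {
  return lhs < rhs ? -lhs : -rhs;
}

// First pattern from the comment: LHS < RHS ? LHS : -RHS
constexpr float select_neg_false(float lhs, float rhs) {
  return lhs < rhs ? lhs : -rhs;
}

// With both arms negated, the select equals -min(LHS, RHS), which is the
// same value as max(-LHS, -RHS): phrased on the negated operands, the
// comparison direction is flipped.
static_assert(select_neg_both(1.0f, 2.0f) == -fmin_model(1.0f, 2.0f), "");
static_assert(select_neg_both(1.0f, 2.0f) == fmax_model(-1.0f, -2.0f), "");

// With only the false arm negated, the result is not -min(LHS, RHS).
static_assert(select_neg_false(1.0f, 2.0f) != -fmin_model(1.0f, 2.0f), "");
```

Under those assumptions the second pattern is a min with the negation pulled outside, or a max once the negation is pushed into the operands, while the first pattern is not a min/max of LHS and RHS at all.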


================
Comment at: llvm/test/CodeGen/AMDGPU/fneg-combines.new.ll:295-297
-; SI-SAFE-NEXT:    v_mov_b32_e32 v1, s0
-; SI-SAFE-NEXT:    v_cmp_ngt_f32_e64 vcc, s0, 0
-; SI-SAFE-NEXT:    v_cndmask_b32_e32 v0, v0, v1, vcc
----------------
I think these three lines implement roughly max(s0, 0), so how can converting it to min(0, s0) be correct?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D140135/new/

https://reviews.llvm.org/D140135


