[PATCH] D140135: AMDGPU: Try to unfold fneg source when matching legacy fmin/fmax
Jay Foad via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Dec 20 23:30:56 PST 2022
foad added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp:1401
+}
+SDValue AMDGPUTargetLowering::combineFMinMaxLegacyImpl(
+ const SDLoc &DL, EVT VT, SDValue LHS, SDValue RHS, SDValue True,
----------------
This function doesn't use False. Might be clearer to take a Swapped flag instead of True and False.
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp:1487
+
+ // Undo the combine foldFreeOpFromSelect does if it helps us match the min/max
+ if (LHS == NegTrue && CFalse && CRHS) {
----------------
It's very unclear what you're trying to match here. I think it's something like this, depending on whether NegTrue actually stripped a neg or not:
// LHS op RHS ? LHS : -RHS -> -min/max(LHS, RHS)
// LHS op RHS ? -LHS : -RHS -> -min/max(LHS, RHS)
The first one doesn't make any sense to me. For the second one, don't you need to flip the condition code?
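To make the sign question in the second pattern concrete, here is a small standalone sketch, not code from the patch. It assumes `op` is `<` and only ordinary non-NaN inputs, so it says nothing about the legacy NaN behaviour. The helper names `selectNegBoth` and `negatedSelectMatchesFlippedMax` are hypothetical. It checks that `LHS < RHS ? -LHS : -RHS` equals `-min(LHS, RHS)`, which is the same value as `max(-LHS, -RHS)`: in terms of the negated operands the comparison direction is flipped.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-in for the select being matched, with op fixed to '<':
//   LHS < RHS ? -LHS : -RHS
float selectNegBoth(float LHS, float RHS) {
  return LHS < RHS ? -LHS : -RHS;
}

// Returns true if, for every sample pair, the select agrees with both
// -min(LHS, RHS) and max(-LHS, -RHS). Samples are finite and non-NaN only.
bool negatedSelectMatchesFlippedMax() {
  const float Vals[] = {-2.0f, -0.5f, 1.0f, 3.0f};
  for (float A : Vals) {
    for (float B : Vals) {
      float R = selectNegBoth(A, B);
      if (R != -std::min(A, B) || R != std::max(-A, -B))
        return false;
    }
  }
  return true;
}
```

Calling `negatedSelectMatchesFlippedMax()` from a test driver should return true. The point is only that pushing the negation into the operands turns a min into a max (and vice versa), so the condition code cannot be reused unchanged.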
================
Comment at: llvm/test/CodeGen/AMDGPU/fneg-combines.new.ll:295-297
-; SI-SAFE-NEXT: v_mov_b32_e32 v1, s0
-; SI-SAFE-NEXT: v_cmp_ngt_f32_e64 vcc, s0, 0
-; SI-SAFE-NEXT: v_cndmask_b32_e32 v0, v0, v1, vcc
----------------
I think these three lines implement roughly max(s0, 0), so how can converting it to min(0, s0) be correct?
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D140135/new/
https://reviews.llvm.org/D140135