[PATCH] D88572: AMDGPU/SelectionDAG Check for NaN, DX10Clamp and IEEE in fmed3 combine

Jay Foad via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Oct 13 05:15:40 PDT 2020


foad added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/SIISelLowering.cpp:9768
+    if (Op0->getFlags().hasNoNaNs() || DAG.getTarget().Options.NoNaNsFPMath)
+      return DAG.getNode(AMDGPUISD::CLAMP, SL, VT, Var, Op0->getFlags());
+
----------------
Is there a good reason to keep flags from the inner node, here, but not on line 9775?


================
Comment at: llvm/lib/Target/AMDGPU/SIISelLowering.cpp:9773-9774
+    if (DAG.isKnownNeverNaN(Var) ||
+        (Info->getMode().DX10Clamp &&
+         (!Info->getMode().IEEE || DAG.isKnownNeverSNaN(Var))))
+      return DAG.getNode(AMDGPUISD::CLAMP, SL, VT, Var);
----------------
You're assuming that Info->getMode().DX10Clamp affects the behaviour of AMDGPUISD::CLAMP which I think is OK, though it would be nice if that was clearly documented (I can't understand the comment about dx10_enable in AMDGPUISelLowering.h).

You're also assuming that Info->getMode().IEEE affects the behaviour of the Op0 node which could be ISD::FMAXNUM or ISD::FMAXNUM_IEEE or FMAX_LEGACY. This seems wrong to me.


================
Comment at: llvm/lib/Target/AMDGPU/SIISelLowering.cpp:9880
+  // NaNs. med3_f32(a, b, c) is equivalent to min_f32(min_f32(a,b), c) when one
+  // of the a, b or c in NaN. Result is sensitive to the order of inputs in IEEE
+  // mode when one of them is SNaN.
----------------
Typo "is NaN".


================
Comment at: llvm/lib/Target/AMDGPU/SIISelLowering.cpp:9897
+  }
+  // Find index of the operand with 1.0, remainig index is X.
+  if (!isOperandExactlyValue(N, OneIdx, 1.0)) {
----------------
Typo "remaining".


================
Comment at: llvm/lib/Target/AMDGPU/SIISelLowering.cpp:9907
+  if (N->getFlags().hasNoNaNs() || DAG.getTarget().Options.NoNaNsFPMath)
+    return DAG.getNode(AMDGPUISD::CLAMP, SL, VT, X, N->getFlags());
 
----------------
Why do you keep flags here but not on line 9925?


================
Comment at: llvm/lib/Target/AMDGPU/SIISelLowering.cpp:9910
+  // It is safe to clamp when X is not NaN.
+  // Based on v_min_f32 behaviour, IEEE and DX10Clamp settings:
+  // X=SNaN and IEEE=true: min(SNaN, Val) -> QNaN, min(QNaN, Val) -> Val
----------------
You're assuming that the behaviour of AMDGPUISD::FMED3 depends on Info->getMode().IEEE, which seems reasonable.


================
Comment at: llvm/lib/Target/AMDGPU/SIISelLowering.cpp:9912
+  // X=SNaN and IEEE=true: min(SNaN, Val) -> QNaN, min(QNaN, Val) -> Val
+  //   med3 returns either QNaN, 1.0 or 0.0 deppending on the inputs.
+  //   DX10Clamp=true:  clamps SNaN to 0.0, ok when med3's operand(2) is 0.0.
----------------
Typo "depending".


================
Comment at: llvm/lib/Target/AMDGPU/SIISelLowering.cpp:9922
+  const SIMachineFunctionInfo *Info = MF.getInfo<SIMachineFunctionInfo>();
+  if (DAG.isKnownNeverNaN(X) || isa<ConstantFPSDNode>(X) ||
+      (Info->getMode().DX10Clamp &&
----------------
A constant could be a NaN, but isKnownNeverNaN should already have handled the constant case. So I don't think you need the check for isa<ConstantFPSDNode>(X) at all.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D88572/new/

https://reviews.llvm.org/D88572



More information about the llvm-commits mailing list