[PATCH] D88572: AMDGPU/SelectionDAG Check for NaN, DX10Clamp and IEEE in fmed3 combine

Jay Foad via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Oct 14 07:36:33 PDT 2020


foad added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/SIISelLowering.cpp:9773-9774
+    if (DAG.isKnownNeverNaN(Var) ||
+        (Info->getMode().DX10Clamp &&
+         (!Info->getMode().IEEE || DAG.isKnownNeverSNaN(Var))))
+      return DAG.getNode(AMDGPUISD::CLAMP, SL, VT, Var);
----------------
Petar.Avramovic wrote:
> foad wrote:
> > You're assuming that Info->getMode().DX10Clamp affects the behaviour of AMDGPUISD::CLAMP which I think is OK, though it would be nice if that was clearly documented (I can't understand the comment about dx10_enable in AMDGPUISelLowering.h).
> > 
> > You're also assuming that Info->getMode().IEEE affects the behaviour of the Op0 node which could be ISD::FMAXNUM or ISD::FMAXNUM_IEEE or FMAX_LEGACY. This seems wrong to me.
> Check for Info->getMode().IEEE only for ISD::FMAXNUM. This node is known never SNaN when IEEE is off.
As I understand it, the behaviour of ISD::FMAXNUM should not depend on the IEEE mode. (Yes we currently codegen it to an instruction that behaves differently depending on the IEEE mode bit, but that's a bug that could be fixed, if anyone cared enough to fix it.)


================
Comment at: llvm/test/CodeGen/AMDGPU/clamp.ll:415
 ; GCN: v_mov_b32_e32 v{{[0-9]+}}, 0{{$}}
 define amdgpu_kernel void @v_clamp_constant_snan_f32(float addrspace(1)* %out) #0 {
   %tid = call i32 @llvm.amdgcn.workitem.id.x()
----------------
Petar.Avramovic wrote:
> isa<ConstantFPSDNode>(X) check in min max combine is there to cover existing test where X is NaN constant, min max will be folded to clamp and clamp will become constant in performClampCombine.
This test is wrong then, isn't it? It's checking that fmed3(0,1,snan) -> 0, with DX10Clamp=1 and IEEE=1. But that's the wrong answer. fmed3(0,1,snan) is qnan in IEEE mode. This test is relying on an incorrect folding from fmed3 to clamp.

Maybe the test should be changed to put the 0 last, i.e. fmed3(snan,1,0) ?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D88572/new/

https://reviews.llvm.org/D88572



More information about the llvm-commits mailing list