[PATCH] D88572: AMDGPU/SelectionDAG Check for NaN, DX10Clamp and IEEE in fmed3 combine

Petar Avramovic via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Oct 13 08:04:49 PDT 2020


Petar.Avramovic added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/SIISelLowering.cpp:9773-9774
+    if (DAG.isKnownNeverNaN(Var) ||
+        (Info->getMode().DX10Clamp &&
+         (!Info->getMode().IEEE || DAG.isKnownNeverSNaN(Var))))
+      return DAG.getNode(AMDGPUISD::CLAMP, SL, VT, Var);
----------------
foad wrote:
> You're assuming that Info->getMode().DX10Clamp affects the behaviour of AMDGPUISD::CLAMP which I think is OK, though it would be nice if that was clearly documented (I can't understand the comment about dx10_enable in AMDGPUISelLowering.h).
> 
> You're also assuming that Info->getMode().IEEE affects the behaviour of the Op0 node which could be ISD::FMAXNUM or ISD::FMAXNUM_IEEE or FMAX_LEGACY. This seems wrong to me.
Check for Info->getMode().IEEE only for ISD::FMAXNUM. This node is known never SNaN when IEEE is off.


================
Comment at: llvm/test/CodeGen/AMDGPU/clamp.ll:415
 ; GCN: v_mov_b32_e32 v{{[0-9]+}}, 0{{$}}
 define amdgpu_kernel void @v_clamp_constant_snan_f32(float addrspace(1)* %out) #0 {
   %tid = call i32 @llvm.amdgcn.workitem.id.x()
----------------
isa<ConstantFPSDNode>(X) check in min max combine is there to cover existing test where X is NaN constant, min max will be folded to clamp and clamp will become constant in performClampCombine.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D88572/new/

https://reviews.llvm.org/D88572



More information about the llvm-commits mailing list