[PATCH] D88572: AMDGPU/SelectionDAG Check for NaN, DX10Clamp and IEEE in fmed3 combine
Petar Avramovic via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Oct 13 08:04:49 PDT 2020
Petar.Avramovic added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/SIISelLowering.cpp:9773-9774
+ if (DAG.isKnownNeverNaN(Var) ||
+ (Info->getMode().DX10Clamp &&
+ (!Info->getMode().IEEE || DAG.isKnownNeverSNaN(Var))))
+ return DAG.getNode(AMDGPUISD::CLAMP, SL, VT, Var);
----------------
foad wrote:
> You're assuming that Info->getMode().DX10Clamp affects the behaviour of AMDGPUISD::CLAMP which I think is OK, though it would be nice if that was clearly documented (I can't understand the comment about dx10_enable in AMDGPUISelLowering.h).
>
> You're also assuming that Info->getMode().IEEE affects the behaviour of the Op0 node which could be ISD::FMAXNUM or ISD::FMAXNUM_IEEE or FMAX_LEGACY. This seems wrong to me.
Check for Info->getMode().IEEE only for ISD::FMAXNUM. This node is known never SNaN when IEEE is off.
================
Comment at: llvm/test/CodeGen/AMDGPU/clamp.ll:415
; GCN: v_mov_b32_e32 v{{[0-9]+}}, 0{{$}}
define amdgpu_kernel void @v_clamp_constant_snan_f32(float addrspace(1)* %out) #0 {
%tid = call i32 @llvm.amdgcn.workitem.id.x()
----------------
isa<ConstantFPSDNode>(X) check in min max combine is there to cover existing test where X is NaN constant, min max will be folded to clamp and clamp will become constant in performClampCombine.
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D88572/new/
https://reviews.llvm.org/D88572
More information about the llvm-commits
mailing list