[PATCH] D35218: [AMDGPU] fcanonicalize elimination optimization

Tue Jul 11 11:23:45 PDT 2017

arsenm added inline comments.

================
Comment at: lib/Target/AMDGPU/SIISelLowering.cpp:4671-4672
+  case ISD::FMAXNAN:
+    if (ST->getGeneration() >= SISubtarget::GFX9)
+      return true;
+
----------------
rampitec wrote:
> arsenm wrote:
> > rampitec wrote:
> > > arsenm wrote:
> > > > rampitec wrote:
> > > > > arsenm wrote:
> > > > > > I don't think this is true, but should have a named check in the subtarget
> > > > > I would rather think about denorm support flag in TD for every single instruction wrt subtarget. Why add just a single one?
> > > > We don't need a full fledged subtarget feature, just put the generation check in a function with name/description rather than adding more random looking generation checks
> > > I'm not sure I follow. Could you please describe a name of such check?
> > hasAddr64() or hasMed3_16()
> hasNormalizingMinMax()? It returns us to the initial point.
The SC name was SupportsMinMaxDenormModes. Either way works

================
Comment at: lib/Target/AMDGPU/SIISelLowering.cpp:4636
+  case ISD::FMA:
+  case ISD::FMAD:
+
----------------
Since FMAD always flushes I don't think it's OK to handle it

================
Comment at: lib/Target/AMDGPU/SIISelLowering.cpp:4696-4699
+    bool IsIEEEMode = Subtarget->enableIEEEBit(DAG.getMachineFunction()) ||
+                      getTargetMachine().Options.NoNaNsFPMath ||
+                      N->getFlags().hasNoNaNs();
+
----------------
This is really a check for whether SNaNs are enabled. We have the isKnownNeverSNan check for this already. I think we need to fix having separate FPExceptions and enableIEEEBit subtarget checks

================
Comment at: test/CodeGen/AMDGPU/fcanonicalize-elimination.ll:2
+; RUN: llc -march=amdgcn -mcpu=tonga -verify-machineinstrs -mattr=-fp32-denormals < %s | FileCheck -check-prefix=GCN -check-prefix=VI %s
+; RUN: llc -march=amdgcn -mcpu=gfx901 -verify-machineinstrs -mattr=+fp32-denormals < %s | FileCheck -check-prefix=GCN -check-prefix=GFX9 %s
+
----------------
This needs a run line for gfx9 with denormals disabled too

https://reviews.llvm.org/D35218