[PATCH] D35218: [AMDGPU] fcanonicalize elimination optimization

Stanislav Mekhanoshin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Jul 11 13:23:00 PDT 2017


rampitec marked 4 inline comments as done.
rampitec added inline comments.


================
Comment at: lib/Target/AMDGPU/SIISelLowering.cpp:4671-4672
+  case ISD::FMAXNAN:
+    if (ST->getGeneration() >= SISubtarget::GFX9)
+      return true;
+
----------------
arsenm wrote:
> rampitec wrote:
> > arsenm wrote:
> > > rampitec wrote:
> > > > arsenm wrote:
> > > > > rampitec wrote:
> > > > > > arsenm wrote:
> > > > > > > I don't think this is true, but should have a named check in the subtarget
> > > > > > I would rather think about denorm support flag in TD for every single instruction wrt subtarget. Why add just a single one?
> > > > > We don't need a full fledged subtarget feature, just put the generation check in a function with name/description rather than adding more random looking generation checks
> > > > I'm not sure I follow. Could you please describe a name of such check?
> > > hasAddr64() or hasMed3_16()
> > hasNormalizingMinMax()? It returns us to the initial point.
> The SC name was SupportsMinMaxDenormModes. Either way works
This part of the code is dropped for now.
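
For reference, the kind of named check discussed above could look roughly like the sketch below. It is only an illustration: MockSubtarget is a hypothetical stand-in rather than the real SISubtarget, the predicate name is borrowed from the SC name mentioned in the thread, and whether GFX9 is the right threshold is exactly what was questioned in the review.

```cpp
// Hypothetical stand-in for the subtarget class; not the actual LLVM code.
class MockSubtarget {
public:
  // Generations listed in increasing order so that >= comparisons work.
  enum Generation { SOUTHERN_ISLANDS, SEA_ISLANDS, VOLCANIC_ISLANDS, GFX9 };

  explicit MockSubtarget(Generation G) : Gen(G) {}

  Generation getGeneration() const { return Gen; }

  // Named predicate wrapping the generation check, so that call sites
  // document their intent instead of repeating a bare comparison.
  // The GFX9 threshold is an assumption taken from the dropped hunk.
  bool supportsMinMaxDenormModes() const { return getGeneration() >= GFX9; }

private:
  Generation Gen;
};
```

A call site would then read `if (ST->supportsMinMaxDenormModes()) return true;` rather than spelling out the generation comparison.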


================
Comment at: lib/Target/AMDGPU/SIISelLowering.cpp:4636
+  case ISD::FMA:
+  case ISD::FMAD:
+
----------------
arsenm wrote:
> Since FMAD always flushes I don't think it's OK to handle it
The ISD definition reads: "FMAD - Perform a * b + c, while getting the same result as the separately rounded operations."
It is essentially v_mac_f32, which always flushes, but if denorms are disabled it is lowered as fma.
So handling it here is correct, and the test test_fold_canonicalize_fmuladd_value_f32 covers it.
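
To make the "separately rounded" wording concrete, here is a small host-side sketch in plain C++. It says nothing about denormal flushing on the hardware; it only contrasts a multiply-add with two roundings against a fused one. The function names are made up for the illustration.

```cpp
#include <cmath>

// Multiply then add with two roundings, as the FMAD definition describes.
// The volatile store forces the product to be rounded to float and keeps
// the compiler from contracting the expression into a fused operation.
float mulAddSeparate(float a, float b, float c) {
  volatile float product = a * b;
  return product + c;
}

// Fused multiply-add: the exact product is added to c and rounded once.
float mulAddFused(float a, float b, float c) {
  return std::fma(a, b, c);
}
```

With a = b = 1 + 2^-12 and c = -(1 + 2^-11), the exact product is 1 + 2^-11 + 2^-24. Assuming the usual round-to-nearest-even mode, the separate version rounds the product to 1 + 2^-11 and returns 0, while the fused version keeps the low bit and returns 2^-24.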


https://reviews.llvm.org/D35218




