[PATCH] D35218: [AMDGPU] fcanonicalize elimination optimization

Stanislav Mekhanoshin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Jul 10 13:10:28 PDT 2017


rampitec added inline comments.


================
Comment at: lib/Target/AMDGPU/SIISelLowering.cpp:4638
+
+  case ISD::FCANONICALIZE:
+    return true;
----------------
arsenm wrote:
> rampitec wrote:
> > rampitec wrote:
> > > arsenm wrote:
> > > > fcanonicalize should have the idempotent optimization applied so there shouldn't be a need to handle it
> > > It is not:
> > > ```
> > >         v_mul_f32_e32 v2, 1.0, v2
> > >         v_mul_f32_e32 v2, 1.0, v2
> > > ```
> > And I do not think there are idempotent SDNodes.
> I don't think it's done now, but fcanonicalize (fcanonicalize x) should fold to just one canonicalize
...and that is what I am doing here.


================
Comment at: lib/Target/AMDGPU/SIISelLowering.cpp:4665-4668
+  // In pre-GFX9 targets V_MIN_F32 and others do not flush denorms.
+  // For such targets need to check their input recursively.
+  case ISD::FMINNUM:
+  case ISD::FMAXNUM:
----------------
arsenm wrote:
> rampitec wrote:
> > arsenm wrote:
> > > The output is never flushed anywhere?
> > GFX9 flushes.
> That is broken. If that is the case we probably shouldn't be using the regular minnum/maxnum intrinsics without denormals, and then it's an optimization to fold canonicalize (min/max) to the weird target behavior.
The library is common for different targets.


================
Comment at: lib/Target/AMDGPU/SIISelLowering.cpp:4671-4672
+  case ISD::FMAXNAN:
+    if (ST->getGeneration() >= SISubtarget::GFX9)
+      return true;
+
----------------
arsenm wrote:
> rampitec wrote:
> > arsenm wrote:
> > > I don't think this is true, but should have a named check in the subtarget
> > I would rather think about denorm support flag in TD for every single instruction wrt subtarget. Why add just a single one?
> We don't need a full fledged subtarget feature, just put the generation check in a function with name/description rather than adding more random looking generation checks
I'm not sure I follow. Could you please describe a name of such check?


https://reviews.llvm.org/D35218





More information about the llvm-commits mailing list