[PATCH] D35218: [AMDGPU] fcanonicalize elimination optimization
Stanislav Mekhanoshin via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Jul 11 10:59:02 PDT 2017
rampitec marked 12 inline comments as done.
rampitec added inline comments.
================
Comment at: lib/Target/AMDGPU/SIISelLowering.cpp:4665-4668
+ // In pre-GFX9 targets V_MIN_F32 and others do not flush denorms.
+ // For such targets need to check their input recursively.
+ case ISD::FMINNUM:
+ case ISD::FMAXNUM:
----------------
rampitec wrote:
> arsenm wrote:
> > rampitec wrote:
> > > arsenm wrote:
> > > > rampitec wrote:
> > > > > arsenm wrote:
> > > > > > The output is never flushed anywhere?
> > > > > GFX9 flushes.
> > > > That is broken. If that is the case we probably shouldn't be using the regular minnum/maxnum intrinsics without denormals, and then it's an optimization to fold canonicalize (min/max) to the weird target behavior.
> > > The library is common for different targets.
> > We need to fix that then. If it's not returning exactly one of the inputs it's not implementing the IEEE min/max
> In fact, IEEE 754-2008 (as referenced in the LLVM IR description of maxnum/minnum) says to return canonicalized numbers, so it is the opposite: GFX9 is finally IEEE compliant.
This part has been split off from the patch.
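For readers following the thread, here is a minimal host-side sketch of the distinction being argued, under stated assumptions. It is not code from the patch or from LLVM. All function names (flushDenormal, minPassThrough, minFlushing, canDropCanonicalize) are hypothetical. Canonicalization is modeled only as flushing f32 denormals to zero; NaN quieting is ignored. The sketch also assumes the host itself is not running in a flush-to-zero mode.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical model of canonicalization with f32 denormals disabled:
// a subnormal input becomes a zero of the same sign. NaN handling is
// deliberately left out of this sketch.
inline float flushDenormal(float x) {
  if (std::fpclassify(x) == FP_SUBNORMAL)
    return std::copysign(0.0f, x);
  return x;
}

// Models a min that returns one of its inputs unchanged, as the thread
// describes for pre-GFX9 targets.
inline float minPassThrough(float a, float b) { return std::fmin(a, b); }

// Models a min whose result is flushed, as the thread describes for GFX9.
inline float minFlushing(float a, float b) {
  return flushDenormal(std::fmin(a, b));
}

// In this model, canonicalize(min(a, b)) is redundant when the min itself
// flushes, or when both inputs are already known to be canonical. The
// second case corresponds to the recursive input check in the quoted hunk.
inline bool canDropCanonicalize(bool minFlushes, float a, float b) {
  auto isCanonical = [](float v) {
    return std::fpclassify(v) != FP_SUBNORMAL;
  };
  return minFlushes || (isCanonical(a) && isCanonical(b));
}
```

In this model a pass-through min given a denormal input can hand the denormal back, so the canonicalize has to stay unless the inputs are proven canonical. A flushing min makes the following canonicalize a no-op.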
https://reviews.llvm.org/D35218