[PATCH] Flag to enable IEEE-754 friendly FP optimizations

Tue May 26 10:38:35 PDT 2015

----- Original Message -----
> From: "Sergey Dmitrouk" <sdmitrouk at accesssoftek.com>
> To: "Stephen Canon" <scanon at apple.com>, "Hal Finkel" <hfinkel at anl.gov>
> Cc: "llvm-commits" <llvm-commits at cs.uiuc.edu>, "Owen Anderson" <owen at apple.com>, "Mehdi Amini"
> <mehdi.amini at apple.com>
> Sent: Tuesday, May 26, 2015 12:35:46 PM
> Subject: Re: [PATCH] Flag to enable IEEE-754 friendly FP optimizations
> 
> Hi Steve,
> 
> quick recap for anyone willing to take a look (can be useful to find
> message of the thread with review comments): it was reviewed by Hal
> and
> then Mehdi some time ago, but Mehdi didn't feel qualified enough to
> approve it and Hal didn't say anything about updated set of patches.

Thanks for the reminder; I'll put these back on my TODO list (they fell off somehow), and look at them again soon.

 -Hal

> 
> Thanks,
> Sergey
> 
> On Tue, May 26, 2015 at 09:53:59AM -0700, Stephen Canon wrote:
> > Hi Sergey —
> > 
> > I don’t feel qualified to review all of these patches, but I will
> > speak to the global direction being an important step forward.  If
> > someone feels qualified to review the low-level details of the
> > patches, I think that these should go in.
> > 
> > Thanks for your persistence!
> > – Steve
> > 
> > > On May 26, 2015, at 9:43 AM, Sergey Dmitrouk
> > > <sdmitrouk at accesssoftek.com> wrote:
> > > 
> > > Hi there,
> > > 
> > > Nope, I didn't forget about these changes :)  Ping!
> > > 
> > > Patches rebased onto ToT can be found in the attachment.
> > > 
> > > --
> > > Sergey
> > > 
> > > On Fri, Mar 27, 2015 at 09:13:44AM -0700, Mehdi Amini wrote:
> > >> Hi Sergey,
> > >> 
> > >> I’m sorry that in 6 weeks nobody was able to approve your
> > >> patches despite your pings.
> > >> 
> > >> The code LGTM, but I don’t feel qualified enough to approve the
> > >> global direction where this leads and I rather have someone
> > >> else reviewing the “conceptual” part.
> > >> 
> > >> Add CC Hal, who was involved at the beginning of the thread.
> > >> 
> > >> Best,
> > >> 
> > >> —
> > >> Mehdi
> > >> 
> > >>> On Mar 26, 2015, at 4:28 AM, Sergey Dmitrouk
> > >>> <sdmitrouk at accesssoftek.com> wrote:
> > >>> 
> > >>> Ping.
> > >>> 
> > >>> On Mon, Mar 16, 2015 at 09:55:45PM +0200, Sergey Dmitrouk
> > >>> wrote:
> > >>>> Gentle ping.
> > >>>> 
> > >>>> On Thu, Feb 26, 2015 at 01:33:19PM +0200, Sergey Dmitrouk
> > >>>> wrote:
> > >>>>> Ping.
> > >>>>> 
> > >>>>> On Thu, Feb 19, 2015 at 05:34:26PM +0200, Sergey Dmitrouk
> > >>>>> wrote:
> > >>>>>> Ping.
> > >>>>>> 
> > >>>>>> On Tue, Feb 10, 2015 at 08:12:13PM +0200, Sergey Dmitrouk
> > >>>>>> wrote:
> > >>>>>>> Attached is alternative fix for InstCombine hanging, it
> > >>>>>>> disables only a
> > >>>>>>> couple of conversions of "fadd" into "fsub" to break the
> > >>>>>>> loop.
> > >>>>>>> 
> > >>>>>>> Modifying dyn_castFNegVal() directly causes some unit tests
> > >>>>>>> to fail, so
> > >>>>>>> absence of checks seems to be intensional to enable more
> > >>>>>>> transformations.
> > >>>>>>> As there exist "fsub" to "fadd" transformations and vice
> > >>>>>>> versa, making
> > >>>>>>> some forms as canonical would disable at least one of them.
> > >>>>>>>  So the
> > >>>>>>> patch disables replacements of "fadd" with "fsub" for
> > >>>>>>> "fadd"
> > >>>>>>> instructions marked with floating-point environment access
> > >>>>>>> flags.
> > >>>>>>> 
> > >>>>>>> Regards,
> > >>>>>>> Sergey
> > >>>>>>> 
> > >>>>>>> On Thu, Feb 05, 2015 at 07:56:56PM +0200, Sergey Dmitrouk
> > >>>>>>> wrote:
> > >>>>>>>> Hello,
> > >>>>>>>> 
> > >>>>>>>> Thanks for reviewing this.
> > >>>>>>>> 
> > >>>>>>>> On Wed, Feb 04, 2015 at 10:42:35AM -0800, Mehdi Amini
> > >>>>>>>> wrote:
> > >>>>>>>>> Patch 1:
> > >>>>>>>>> 
> > >>>>>>>>> The comment for:
> > >>>>>>>>> +    unsigned AllowFPExceptAccess : 1;
> > >>>>>>>>> mentions rounding and the comment for:
> > >>>>>>>>> +    unsigned AllowFPRoundAccess : 1;
> > >>>>>>>>> mentions exceptions, they are reversed.
> > >>>>>>>> 
> > >>>>>>>> Updated.
> > >>>>>>>> 
> > >>>>>>>>> Patch 2:
> > >>>>>>>>> 
> > >>>>>>>>> Is kexc or kround flag allowed to be present at the same
> > >>>>>>>>> time as fast?
> > >>>>>>>> 
> > >>>>>>>> Yes, currently they are.  There is no particular reason
> > >>>>>>>> for this except
> > >>>>>>>> for me not being able to decide whether "fast" should
> > >>>>>>>> reset new flags or
> > >>>>>>>> not.  Left it unrelated to each other as they are unlikely
> > >>>>>>>> to be used at
> > >>>>>>>> the same time, ready to change it if you have any
> > >>>>>>>> suggestions.
> > >>>>>>>> 
> > >>>>>>>>> Patch 3:
> > >>>>>>>>> 
> > >>>>>>>>> isSafeToOptimizeFPOp() is not clear to me, especially how
> > >>>>>>>>> TLI->hasFloatingPointExceptions() disable any check,
> > >>>>>>>>> including rounding, while the description of this flag
> > >>>>>>>>> is "Whether the target supports or cares about
> > >>>>>>>>> preserving floating point exception behavior”, which
> > >>>>>>>>> seems orthogonal to
> > >>>>>>>>> getTarget().Options.AllowFPRoundAccess?
> > >>>>>>>>> I assume it is done this way to keep existing behavior?
> > >>>>>>>>> Is it transitional?
> > >>>>>>>> 
> > >>>>>>>> That's a mistake, it was fine until recent change for
> > >>>>>>>> rounding when I
> > >>>>>>>> reordered if statements incorrectly.  Thanks.
> > >>>>>>>> 
> > >>>>>>>>> Patch 4:
> > >>>>>>>>> 
> > >>>>>>>>> isSafeToSpeculativelyExecute(), comment might deserve to
> > >>>>>>>>> be updated , I don’t think it is true anymore with
> > >>>>>>>>> respect to rounding:
> > >>>>>>>>> "Return true if the instruction does not have any effects
> > >>>>>>>>> besides calculating the result and does not have
> > >>>>>>>>> undefined behavior”.
> > >>>>>>>> 
> > >>>>>>>> Updated.
> > >>>>>>>> 
> > >>>>>>>>> canTrap(), again preserving the dynamic “rounding” is
> > >>>>>>>>> included in “trap”? Maybe the comment could be updated
> > >>>>>>>>> because it does not seem obvious to me.
> > >>>>>>>> 
> > >>>>>>>> Actually, rounding flag wasn't even used and exceptions
> > >>>>>>>> flag was
> > >>>>>>>> unnecessary, so I just reverted that part.  It was
> > >>>>>>>> required with flags
> > >>>>>>>> being detached from instructions, but now that they are
> > >>>>>>>> part of
> > >>>>>>>> fast-math flags checking constant expressions should be
> > >>>>>>>> no-operation
> > >>>>>>>> (operations that go against flags must not be converted to
> > >>>>>>>> constant
> > >>>>>>>> expressions).
> > >>>>>>>> 
> > >>>>>>>>> Patch 10: if you really want to test it, maybe the C++
> > >>>>>>>>> unittests can do the job?
> > >>>>>>>> 
> > >>>>>>>> Thanks, added a unit-test and a correction for "frem"
> > >>>>>>>> instruction.
> > >>>>>>>> 
> > >>>>>>>>> Patch 12: haven’t look into detail where the infinite
> > >>>>>>>>> loop occurs, but it seems like a hammer fix for the
> > >>>>>>>>> issue. The example mentioned in the test ((a + b) => (a
> > >>>>>>>>> - (-b)) => (a + b) => etc) makes me thinks that one of
> > >>>>>>>>> the two forms should be canonical and InstCombine should
> > >>>>>>>>> not make one of these two transforms. What do you think?
> > >>>>>>>> 
> > >>>>>>>> I chose this way as it should guarantee to work for all
> > >>>>>>>> cases, but you
> > >>>>>>>> might be right in that something is wrong with pattern
> > >>>>>>>> matching in
> > >>>>>>>> InstCombine, see below.
> > >>>>>>>> 
> > >>>>>>>> There is an implicit four step loop between
> > >>>>>>>> InstCombiner::visitFAdd() and
> > >>>>>>>> InstCombiner::visitFSub():
> > >>>>>>>> 
> > >>>>>>>> 1. FAdd: Convert A to -A.
> > >>>>>>>> 2. FSub: Convert -A to A; switch operands.
> > >>>>>>>> 3. FAdd: Convert B to -B.
> > >>>>>>>> 4. FSub: Convert -B to B; switch operands.
> > >>>>>>>> 5. Go to step 1.
> > >>>>>>>> 
> > >>>>>>>> Both functions try to do something about negative values,
> > >>>>>>>> but they
> > >>>>>>>> actually hang in case of positive arguments as well.  I
> > >>>>>>>> couldn't find
> > >>>>>>>> any checks whether operands are negative, it seems to be
> > >>>>>>>> "assumed".
> > >>>>>>>> Maybe dyn_castFNegVal() used to perform the check, but it
> > >>>>>>>> doesn't anymore.
> > >>>>>>>> 
> > >>>>>>>> If I add `C->isNegative()` check to dyn_castFNegVal(), the
> > >>>>>>>> issue
> > >>>>>>>> disappears, but I'm not sure if it's the right place.
> > >>>>>>>>  Meaning of the
> > >>>>>>>> comment there:
> > >>>>>>>> 
> > >>>>>>>> // Constants can be considered to be negated values if
> > >>>>>>>> they can be
> > >>>>>>>> // folded.
> > >>>>>>>> if (ConstantFP *C = dyn_cast<ConstantFP>(V))
> > >>>>>>>>   return ConstantExpr::getFNeg(C);
> > >>>>>>>> 
> > >>>>>>>> is unclear to me.  Does it suggest that the function
> > >>>>>>>> doesn't care about
> > >>>>>>>> sign of the value?  In this case, checks for negative
> > >>>>>>>> operands should be
> > >>>>>>>> performed outside (in visitFAdd() and visitFSub()
> > >>>>>>>> functions).
> > >>>>>>>> 
> > >>>>>>>> Updated patches are attached (0000-recent-changes.diff
> > >>>>>>>> contains latest
> > >>>>>>>> changes only).  There is nothing for InstCombine yet and a
> > >>>>>>>> small new
> > >>>>>>>> change (not counting test file) for LICM pass.
> > >>>>>>>> 
> > >>>>>>>> Thanks,
> > >>>>>>>> Sergey
> > > <0001-Add-flags-and-command-line-switches-for-FPEnv.patch><0002-Add-FPEnv-access-flags-to-fast-math-flags.patch><0003-Consider-FPEnv-access-in-SelectionDAG.patch><0004-Skip-constant-folding-to-preserve-FPEnv.patch><0005-Teach-IR-builder-and-folders-about-new-flags.patch><0006-Do-not-fold-constants-on-reading-in-IR-asm-bitcode.patch><0007-Prevent-undesired-folding-by-InstSimplify.patch><0008-Do-not-simplify-expressions-with-FPEnv-access.patch><0009-Make-Strict-flag-available-for-more-clients.patch><0010-Use-Strict-in-IRBuilder.patch><0011-Don-t-convert-fpops-to-constexprs-in-SCCP.patch><0012-Prevent-InstCombine-from-hanging.patch><0013-Don-t-hoist-FP-ops-with-side-effects-in-LICM.patch>
> > 
> 

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory