[PATCH] propagate IR-level fast-math-flags to DAG nodes, disabled by default

Hal Finkel hfinkel at anl.gov
Wed Aug 5 10:27:10 PDT 2015


----- Original Message -----
> From: "Sanjay Patel" <spatel at rotateright.com>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: "escha" <escha at apple.com>, llvm-commits at lists.llvm.org
> Sent: Wednesday, August 5, 2015 12:17:50 PM
> Subject: Re: [PATCH] propagate IR-level fast-math-flags to DAG nodes, disabled by default
> 
> > Is there really any disagreement here? It seems like someone needs
> > to go through and make sure that all relevant combines propagate
> > the flags. All other > options proposed were simply suggestions
> > for short-term workarounds, no?
> 
> Correct - I don't think there's any disagreement. Hacking CSE might
> provide a bandage solution, but, as Escha noted, it would defeat the
> point of pushing FMF into the DAG in the first place.
> 
> I can go through and fix everything in-tree, but I'm wondering:
> 1. Does every change need a regression test?

That would be wonderful, but I doubt it is practical. We'll take "by inspection" auditing changes without individual regression tests per line changed.

> 2. How do we alert out-of-tree backends about the change? Is a note
> to llvm-dev good enough, or should there be an assert in the code if
> someone creates one of the FMF-capable nodes without explicitly
> specifying the flags?

Send a message to llvm-dev. I don't see any reason to break working out-of-tree code for this; it is an optimization issue, not a correctness issue.

 -Hal

> 
> On Wed, Aug 5, 2015 at 10:58 AM, Hal Finkel < hfinkel at anl.gov >
> wrote:
> 
> 
> ----- Original Message -----
> > From: "Sanjay Patel" < spatel at rotateright.com >
> > To: "escha" < escha at apple.com >, llvm-commits at lists.llvm.org
> > Sent: Wednesday, August 5, 2015 10:49:43 AM
> > Subject: Re: [PATCH] propagate IR-level fast-math-flags to DAG
> > nodes, disabled by default
> > 
> > 
> > [Using new mailing list. Several of the previous mails in this
> > thread
> > are probably lost forever for the lists; see below for context.]
> > 
> > I don't think we should leave CSE in the DAG broken while figuring
> > out how to solve this,
> 
> Is there really any disagreement here? It seems like someone needs to
> go through and make sure that all relevant combines propagate the
> flags. All other options proposed were simply suggestions for
> short-term workarounds, no?
> 
> -Hal
> 
> 
> 
> > so I reverted r243687 with:
> > http://llvm.org/viewvc/llvm-project?view=revision&revision=244053
> > 
> > 
> > Also filed a bug using the example code below:
> > https://llvm.org/bugs/show_bug.cgi?id=24363
> > 
> > 
> > So we may get at least one in-tree regression test before the flag
> > gets enabled again. :)
> > 
> > 
> > 
> > On Tue, Aug 4, 2015 at 5:39 PM, Sanjay Patel <
> > spatel at rotateright.com
> > > wrote:
> > 
> > 
> > 
> > Thanks - your example made it much clearer to me how this goes
> > wrong.
> > I think I should revert r243687 and try to fix everything under the
> > cover of that scaffolding...
> > 
> > I don't know how we'd test it all though. Here's an example that
> > shows the bug for one spot in DAGCombiner's GetNegatedExpression()
> > with an x86 target:
> > 
> > $ cat badflags.ll
> > define float @fmf(float %a, float %b) {
> > %mul1 = fmul fast float %a, %b
> > %nega = fsub fast float 0.0, %a
> > %mul2 = fmul fast float %nega, %b
> > %abx2 = fsub fast float %mul1, %mul2
> > ret float %abx2
> > }
> > 
> > 
> > $ ./llc -o - badflags.ll -mattr=fma -enable-unsafe-fp-math
> > -enable-fmf-dag=0
> > ...
> > vmulss %xmm1, %xmm0, %xmm0
> > vaddss %xmm0, %xmm0, %xmm0
> > retq
> > 
> > $ ./llc -o - badflags.ll -mattr=fma -enable-unsafe-fp-math
> > -enable-fmf-dag=1
> > ...
> > vmulss %xmm1, %xmm0, %xmm2
> > vfmadd213ss %xmm2, %xmm1, %xmm0 <--- failed to recognize that (a *
> > b)
> > was already calculated
> > retq
> > 
> > 
> > 
> > 
> > 
> > 
> > On Tue, Aug 4, 2015 at 12:03 PM, escha < escha at apple.com > wrote:
> > 
> > 
> > 
> > Unfortunately, no :/ though I can imagine one could be constructed
> > in
> > *some* fashion, since surely crippled CSE is visible somehow
> > somewhere. The particular case that came up looked like this:
> > 
> > 
> > c = a * b FAST
> > d = -c FAST
> > 
> > 
> > (combine1 occurs)
> > 
> > 
> > c = a * b FAST
> > d = a * -b
> > 
> > 
> > then we have a combine in combine2, in our target, to deduplicate
> > these, which should result in this:
> > 
> > 
> > 
> > c = a * b FAST
> > d = -c FAST
> > 
> > 
> > which we do because negates are free (modifiers on a GPU) while an
> > extra multiply is not.
> > 
> > 
> > it visits d = a * -b and then searches the DAG to see if (a * b)
> > exists. But it doesn’t, only (a * b FAST) exists, the search fails,
> > and the code is pessimized.
> > 
> > 
> > —escha
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > On Aug 4, 2015, at 10:57 AM, Sanjay Patel < spatel at rotateright.com
> > >
> > wrote:
> > 
> > 
> > Thanks! Do you have a test case that would show a difference from
> > that kind of change on an in-tree backend?
> > 
> > 
> > 
> > 
> > On Tue, Aug 4, 2015 at 11:34 AM, escha < escha at apple.com > wrote:
> > 
> > 
> > 
> > I’ve only done the following patch, just as an experiment; I
> > haven’t
> > done much work on it overall (I was just experimenting with trying
> > to kill off this individual regression).
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > —escha
> > 
> > 
> > 
> > 
> > 
> > On Aug 4, 2015, at 10:07 AM, Sanjay Patel < spatel at rotateright.com
> > >
> > wrote:
> > 
> > 
> > 
> > Ok, that's good news. The other good news is that nobody has
> > reported
> > crashes from this change...yet.
> > 
> > I'm fine proceeding either way (revert r243687 or just start
> > patching
> > as-is). Let me know what you'd prefer.
> > 
> > Can you commit any patches you've already done for the generic DAG
> > classes?
> > 
> > 
> > 
> > 
> > 
> > 
> > On Tue, Aug 4, 2015 at 10:47 AM, escha < escha at apple.com > wrote:
> > 
> > 
> > 
> > I have a patch to add propagation to our own backend, by the way;
> > it
> > wasn’t at all difficult to write (it took like 20 minutes). But if
> > we keep the patch in, we should probably put out an announcement to
> > backend maintainers that they need to add fast-math flag
> > propagation.
> > 
> > 
> > —escha
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > On Aug 4, 2015, at 8:28 AM, escha < escha at apple.com > wrote:
> > 
> > 
> > 
> > The “unsafe FP math” transforms aren’t enough; the problem is that
> > even the *safe-math* transforms need to propagate flags from their
> > source nodes, or you end up with un-CSEable duplicates (and of
> > course, lose flags for future transformations!). You need to pass
> > flags in for *every single creation of an FADD, FSUB, FADD, or
> > FDIV*, as far as I can tell. This involves a huge amount of diffs
> > that look like this:
> > 
> > 
> > 
> > // fold (fneg (fsub A, B)) -> (fsub B, A)
> > return DAG.getNode(ISD::FSUB, SDLoc(Op), Op.getValueType(),
> > - Op.getOperand(1), Op.getOperand(0));
> > + Op.getOperand(1), Op.getOperand(0),
> > + &cast<BinaryWithFlagsSDNode>(Op)->Flags);
> > 
> > 
> > One idea I had was to allow conservative CSE. That is, allow CSEing
> > nodes with fast-math flags, even if their flags differ — just, when
> > CSEing them, pick the intersection of the two flags. This will
> > guarantee that CSE will still function, even if DAG folds don’t
> > properly maintain fast-math flags. But we still need to eventually
> > add all these flag propagations; otherwise having fast-math flags
> > in
> > the DAG is useless at best, isn’t it?
> > 
> > 
> > —escha
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > On Aug 4, 2015, at 7:26 AM, Sanjay Patel < spatel at rotateright.com >
> > wrote:
> > 
> > 
> > 
> > 
> > Does anyone see a workaround to do this piecemeal with FMF enabled?
> > 
> > Otherwise, I see about 30 checks for UnsafeFPMath transforms in
> > DAGCombiner...so I could revert r243687, start working on the
> > combiner fixes, and then turn the flag back on when DAGCombiner is
> > FMF-clean.
> > 
> > But this wouldn't be perfect if you have more UnsafeFPMath
> > transforms
> > in your backend. You'd have to patch those too before we turn the
> > flag back on.
> > 
> > Any other ideas about how to proceed?
> > 
> > 
> > 
> > On Mon, Aug 3, 2015 at 4:52 PM, escha < escha at apple.com > wrote:
> > 
> > 
> > 
> > We’re getting significant CSE regressions on our backend, caused by
> > the fact that none of the DAG folds propagate fast-math flags. This
> > results in nodes not being CSE’d, since we only CSE nodes if they
> > have identical flags (and thus worse codegen).
> > 
> > 
> > I was able to get rid of many of them by adding flag propagation to
> > FNEG folding, but there’s way more than just that.
> > 
> > 
> > 
> > 
> > —escha
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > On Jul 29, 2015, at 11:51 AM, Sanjay Patel < spatel at rotateright.com
> > >
> > wrote:
> > 
> > 
> > 
> > 
> > For reference, fixes for PR24141 are in:
> > http://reviews.llvm.org/rL243293 http://reviews.llvm.org/rL243498
> > http://reviews.llvm.org/rL243500 ...assuming those don't cause new
> > breakage, it's finally, really time to flip the FMF flag...and wait
> > for more breakage.
> > 
> > 
> > 
> > On Mon, Jul 20, 2015 at 11:34 AM, Sanjay Patel <
> > spatel at rotateright.com > wrote:
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > On Fri, Jul 17, 2015 at 4:42 PM, Nick Lewycky < nlewycky at google.com
> > >
> > wrote:
> > 
> > 
> > 
> > 
> > 
> > On 9 July 2015 at 15:27, Sanjay Patel < spatel at rotateright.com >
> > wrote:
> > 
> > 
> > 
> > 
> > Hi Nick,
> > 
> > The reciprocal fix is in at r241826. Please let me know when you
> > find
> > another failure. Thanks!
> > 
> > 
> > 
> > Thank you! I tested it and found exactly nothing wrong. Hopefully I
> > didn't goof the testing, but I think it's time to flip the flag to
> > true, then wait a few days before removing the flag entirely.
> > 
> > 
> > 
> > Wow...I'm surprised, but I'll hope for the best. Thanks for the
> > testing!
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > Note however, that we still have problems with PR24141 which is
> > related to unsafe fp math, but doesn't change with the
> > -enable-fmf-dag flag.
> > 
> > Proposed fix:
> > http://reviews.llvm.org/D11345
> > 
> > 
> > _______________________________________________
> > llvm-commits mailing list
> > llvm-commits at cs.uiuc.edu
> > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> > 
> > 
> > 
> > _______________________________________________
> > llvm-commits mailing list
> > llvm-commits at cs.uiuc.edu
> > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > _______________________________________________
> > llvm-commits mailing list
> > llvm-commits at lists.llvm.org
> > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
> > 
> 
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory
> 
> 

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory


More information about the llvm-commits mailing list