[PATCH] propagate IR-level fast-math-flags to DAG nodes, disabled by default

Hal Finkel hfinkel at anl.gov
Wed Aug 5 09:58:25 PDT 2015


----- Original Message -----
> From: "Sanjay Patel" <spatel at rotateright.com>
> To: "escha" <escha at apple.com>, llvm-commits at lists.llvm.org
> Sent: Wednesday, August 5, 2015 10:49:43 AM
> Subject: Re: [PATCH] propagate IR-level fast-math-flags to DAG nodes, disabled by default
> 
> 
> [Using new mailing list. Several of the previous mails in this thread
> are probably lost forever for the lists; see below for context.]
> 
> I don't think we should leave CSE in the DAG broken while figuring
> out how to solve this,

Is there really any disagreement here? It seems like someone needs to go through and make sure that all relevant combines propagate the flags. All other options proposed were simply suggestions for short-term workarounds, no?

 -Hal

> so I reverted r243687 with:
> http://llvm.org/viewvc/llvm-project?view=revision&revision=244053
> 
> 
> Also filed a bug using the example code below:
> https://llvm.org/bugs/show_bug.cgi?id=24363
> 
> 
> So we may get at least one in-tree regression test before the flag
> gets enabled again. :)
> 
> 
> 
> On Tue, Aug 4, 2015 at 5:39 PM, Sanjay Patel < spatel at rotateright.com
> > wrote:
> 
> 
> 
> Thanks - your example made it much clearer to me how this goes wrong.
> I think I should revert r243687 and try to fix everything under the
> cover of that scaffolding...
> 
> I don't know how we'd test it all though. Here's an example that
> shows the bug for one spot in DAGCombiner's GetNegatedExpression()
> with an x86 target:
> 
> $ cat badflags.ll
> define float @fmf(float %a, float %b) {
> %mul1 = fmul fast float %a, %b
> %nega = fsub fast float 0.0, %a
> %mul2 = fmul fast float %nega, %b
> %abx2 = fsub fast float %mul1, %mul2
> ret float %abx2
> }
> 
> 
> $ ./llc -o - badflags.ll -mattr=fma -enable-unsafe-fp-math
> -enable-fmf-dag=0
> ...
> vmulss %xmm1, %xmm0, %xmm0
> vaddss %xmm0, %xmm0, %xmm0
> retq
> 
> $ ./llc -o - badflags.ll -mattr=fma -enable-unsafe-fp-math
> -enable-fmf-dag=1
> ...
> vmulss %xmm1, %xmm0, %xmm2
> vfmadd213ss %xmm2, %xmm1, %xmm0 <--- failed to recognize that (a * b)
> was already calculated
> retq
> 
> 
> 
> 
> 
> 
> On Tue, Aug 4, 2015 at 12:03 PM, escha < escha at apple.com > wrote:
> 
> 
> 
> Unfortunately, no :/ though I can imagine one could be constructed in
> *some* fashion, since surely crippled CSE is visible somehow
> somewhere. The particular case that came up looked like this:
> 
> 
> c = a * b FAST
> d = -c FAST
> 
> 
> (combine1 occurs)
> 
> 
> c = a * b FAST
> d = a * -b
> 
> 
> then we have a combine in combine2, in our target, to deduplicate
> these, which should result in this:
> 
> 
> 
> c = a * b FAST
> d = -c FAST
> 
> 
> which we do because negates are free (modifiers on a GPU) while an
> extra multiply is not.
> 
> 
> it visits d = a * -b and then searches the DAG to see if (a * b)
> exists. But it doesn’t, only (a * b FAST) exists, the search fails,
> and the code is pessimized.
> 
> 
> —escha
> 
> 
> 
> 
> 
> 
> 
> On Aug 4, 2015, at 10:57 AM, Sanjay Patel < spatel at rotateright.com >
> wrote:
> 
> 
> Thanks! Do you have a test case that would show a difference from
> that kind of change on an in-tree backend?
> 
> 
> 
> 
> On Tue, Aug 4, 2015 at 11:34 AM, escha < escha at apple.com > wrote:
> 
> 
> 
> I’ve only done the following patch, just as an experiment; I haven’t
> done much work on it overall (I was just experimenting with trying
> to kill off this individual regression).
> 
> 
> 
> 
> 
> 
> 
> 
> 
> —escha
> 
> 
> 
> 
> 
> On Aug 4, 2015, at 10:07 AM, Sanjay Patel < spatel at rotateright.com >
> wrote:
> 
> 
> 
> Ok, that's good news. The other good news is that nobody has reported
> crashes from this change...yet.
> 
> I'm fine proceeding either way (revert r243687 or just start patching
> as-is). Let me know what you'd prefer.
> 
> Can you commit any patches you've already done for the generic DAG
> classes?
> 
> 
> 
> 
> 
> 
> On Tue, Aug 4, 2015 at 10:47 AM, escha < escha at apple.com > wrote:
> 
> 
> 
> I have a patch to add propagation to our own backend, by the way; it
> wasn’t at all difficult to write (it took like 20 minutes). But if
> we keep the patch in, we should probably put out an announcement to
> backend maintainers that they need to add fast-math flag
> propagation.
> 
> 
> —escha
> 
> 
> 
> 
> 
> 
> 
> On Aug 4, 2015, at 8:28 AM, escha < escha at apple.com > wrote:
> 
> 
> 
> The “unsafe FP math” transforms aren’t enough; the problem is that
> even the *safe-math* transforms need to propagate flags from their
> source nodes, or you end up with un-CSEable duplicates (and of
> course, lose flags for future transformations!). You need to pass
> flags in for *every single creation of an FADD, FSUB, FADD, or
> FDIV*, as far as I can tell. This involves a huge amount of diffs
> that look like this:
> 
> 
> 
> // fold (fneg (fsub A, B)) -> (fsub B, A)
> return DAG.getNode(ISD::FSUB, SDLoc(Op), Op.getValueType(),
> - Op.getOperand(1), Op.getOperand(0));
> + Op.getOperand(1), Op.getOperand(0),
> + &cast<BinaryWithFlagsSDNode>(Op)->Flags);
> 
> 
> One idea I had was to allow conservative CSE. That is, allow CSEing
> nodes with fast-math flags, even if their flags differ — just, when
> CSEing them, pick the intersection of the two flags. This will
> guarantee that CSE will still function, even if DAG folds don’t
> properly maintain fast-math flags. But we still need to eventually
> add all these flag propagations; otherwise having fast-math flags in
> the DAG is useless at best, isn’t it?
> 
> 
> —escha
> 
> 
> 
> 
> 
> 
> 
> 
> 
> On Aug 4, 2015, at 7:26 AM, Sanjay Patel < spatel at rotateright.com >
> wrote:
> 
> 
> 
> 
> Does anyone see a workaround to do this piecemeal with FMF enabled?
> 
> Otherwise, I see about 30 checks for UnsafeFPMath transforms in
> DAGCombiner...so I could revert r243687, start working on the
> combiner fixes, and then turn the flag back on when DAGCombiner is
> FMF-clean.
> 
> But this wouldn't be perfect if you have more UnsafeFPMath transforms
> in your backend. You'd have to patch those too before we turn the
> flag back on.
> 
> Any other ideas about how to proceed?
> 
> 
> 
> On Mon, Aug 3, 2015 at 4:52 PM, escha < escha at apple.com > wrote:
> 
> 
> 
> We’re getting significant CSE regressions on our backend, caused by
> the fact that none of the DAG folds propagate fast-math flags. This
> results in nodes not being CSE’d, since we only CSE nodes if they
> have identical flags (and thus worse codegen).
> 
> 
> I was able to get rid of many of them by adding flag propagation to
> FNEG folding, but there’s way more than just that.
> 
> 
> 
> 
> —escha
> 
> 
> 
> 
> 
> 
> 
> On Jul 29, 2015, at 11:51 AM, Sanjay Patel < spatel at rotateright.com >
> wrote:
> 
> 
> 
> 
> For reference, fixes for PR24141 are in:
> http://reviews.llvm.org/rL243293 http://reviews.llvm.org/rL243498
> http://reviews.llvm.org/rL243500 ...assuming those don't cause new
> breakage, it's finally, really time to flip the FMF flag...and wait
> for more breakage.
> 
> 
> 
> On Mon, Jul 20, 2015 at 11:34 AM, Sanjay Patel <
> spatel at rotateright.com > wrote:
> 
> 
> 
> 
> 
> 
> 
> On Fri, Jul 17, 2015 at 4:42 PM, Nick Lewycky < nlewycky at google.com >
> wrote:
> 
> 
> 
> 
> 
> On 9 July 2015 at 15:27, Sanjay Patel < spatel at rotateright.com >
> wrote:
> 
> 
> 
> 
> Hi Nick,
> 
> The reciprocal fix is in at r241826. Please let me know when you find
> another failure. Thanks!
> 
> 
> 
> Thank you! I tested it and found exactly nothing wrong. Hopefully I
> didn't goof the testing, but I think it's time to flip the flag to
> true, then wait a few days before removing the flag entirely.
> 
> 
> 
> Wow...I'm surprised, but I'll hope for the best. Thanks for the
> testing!
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> Note however, that we still have problems with PR24141 which is
> related to unsafe fp math, but doesn't change with the
> -enable-fmf-dag flag.
> 
> Proposed fix:
> http://reviews.llvm.org/D11345
> 
> 
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> 
> 
> 
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> 
> 
> 
> 
> 
> 
> 
> 
> 
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
> 

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory


More information about the llvm-commits mailing list