[PATCH] D37686: [DAG] Consolidating Instruction->SDNode Flags propagation in one class for better code management.

Hal Finkel via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Sep 21 12:31:59 PDT 2017


hfinkel added a comment.

In https://reviews.llvm.org/D37686#877960, @spatel wrote:

> In https://reviews.llvm.org/D37686#877951, @jbhateja wrote:
>
> > In https://reviews.llvm.org/D37686#877864, @spatel wrote:
> >
> > > I've added some more FMF tests at https://reviews.llvm.org/rL313893 which I think this patch will miscompile. Please rebase/update.
> > >
> > > As I suggested before, this patch shouldn't try to enable multiple DAG combines with node-level FMF. It's not as straightforward as you might think.
> > >
> > > Pick exactly one combine if you want to show that this patch is working as intended. The llvm.muladd intrinsic test that you have here with a target that supports 'fma' (not plain x86) seems like a good choice to me. If we have a strict op in IR, it should produce an fma instruction. If we have a fast op in IR, it should produce the simpler fmul instruction?
> >
> >
> > My understanding and code changes are based LLVM Ref Manual 's section about Fast-Math flags"  (http://llvm.org/docs/LangRef.html#fast-math-flags)
> >
> > Which say for FMF flag NaN  "Allow optimizations to assume the arguments and result are not NaN".
> >
> > Now in following case which has been added by you
> >
> >   %y = call float @llvm.sqrt.f32(float %x)
> >   %z = fdiv fast float 1.0, %y
> >   ret float %z
> >   
> >
> > We dont have fast flag over intrinsic but DAGCombining for fdiv sees a fast flag and assume result (%z) and arguments (constant , %y) as not a Nan and goes ahead and generates a reciprocal sqrt. If you remove fast from fdiv and add it to intrinsic then FMF opt at fdiv will not kick in.
> >
> > Can you please let me know what you expected here.
>
>
> I expect that the sqrt result is strict. Ie, it should use sqrtss if this is x86-64. We're not allowed to use rsqrtss and lose precision on that op.


I think there's a good argument either way here...

While `fast` does imply `nnan`, and `nnan` does propagate backward, and it also implies `arcp`, and `arcp` does not propagate backward. `arcp` applied only to the instruction to which it's attached. In this case,  we're allowed to use a reciprocal approximation to the division, and not the sqrt. However, we could argue that using the rsqrt is like doing the sqrt exactly and then just approximating the division. If there are no other users of the sqrt itself, there seems to be little semantic difference.

> That said, my memory of exactly how op-level FMF should work is fuzzy. If anyone else remembers or can link to threads where we've discussed this, please feel free to jump in. :)




https://reviews.llvm.org/D37686





More information about the llvm-commits mailing list