[PATCH] D37686: [DAG] Consolidating Instruction->SDNode Flags propagation in one class for better code management.

Sanjay Patel via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Sep 21 12:34:30 PDT 2017


spatel added a comment.

In https://reviews.llvm.org/D37686#877972, @jbhateja wrote:

> In https://reviews.llvm.org/D37686#877960, @spatel wrote:
>
> > In https://reviews.llvm.org/D37686#877951, @jbhateja wrote:
> >
> > > In https://reviews.llvm.org/D37686#877864, @spatel wrote:
> > >
> > > > I've added some more FMF tests at https://reviews.llvm.org/rL313893 which I think this patch will miscompile. Please rebase/update.
> > > >
> > > > As I suggested before, this patch shouldn't try to enable multiple DAG combines with node-level FMF. It's not as straightforward as you might think.
> > > >
> > > > Pick exactly one combine if you want to show that this patch is working as intended. The llvm.muladd intrinsic test that you have here with a target that supports 'fma' (not plain x86) seems like a good choice to me. If we have a strict op in IR, it should produce an fma instruction. If we have a fast op in IR, it should produce the simpler fmul instruction?
> > >
> > >
> > > My understanding and code changes are based LLVM Ref Manual 's section about Fast-Math flags"  (http://llvm.org/docs/LangRef.html#fast-math-flags)
> > >
> > > Which say for FMF flag NaN  "Allow optimizations to assume the arguments and result are not NaN".
> > >
> > > Now in following case which has been added by you
> > >
> > >   %y = call float @llvm.sqrt.f32(float %x)
> > >   %z = fdiv fast float 1.0, %y
> > >   ret float %z
> > >   
> > >
> > > We dont have fast flag over intrinsic but DAGCombining for fdiv sees a fast flag and assume result (%z) and arguments (constant , %y) as not a Nan and goes ahead and generates a reciprocal sqrt. If you remove fast from fdiv and add it to intrinsic then FMF opt at fdiv will not kick in.
> > >
> > > Can you please let me know what you expected here.
> >
> >
> > I expect that the sqrt result is strict. Ie, it should use sqrtss if this is x86-64. We're not allowed to use rsqrtss and lose precision on that op.
> >
> > That said, my memory of exactly how op-level FMF should work is fuzzy. If anyone else remembers or can link to threads where we've discussed this, please feel free to jump in. :)
>
>
> Exactly, that is why I added a routine to get Unified flags which intersects flags of a node with flags of its operands in the earlier version of this patch , i think it will be right to inject that code with this patch [it was removed from current version of patch as per your comments]


I think you're mixing up flag propagation as it applies to creation of nodes with flag combining when folding operations. These are 2 different things. This patch is about propagating from IR to nodes at creation time (at least that's what I think it should be limited to based on the title).


https://reviews.llvm.org/D37686





More information about the llvm-commits mailing list