[PATCH] Allow FMAs in safe math mode in some cases when one operand of the fmul is either exactly 0.0 or exactly 1.0.

Hal Finkel hfinkel at anl.gov
Tue Jul 9 15:16:50 PDT 2013


----- Original Message -----
> On Tue, Jul 9, 2013 at 3:03 PM, Stephen Lin <swlin at post.harvard.edu>
> wrote:
> >>> I ended up making it work on vectors in constrained cases (since
> >>> the vector case was tested for when the function was recursive),
> >>> but it took a lot of thinking to convince myself I hadn't made a
> >>> mistake.
> >>>
> >>> Do you think the vector case will come up enough to make this
> >>> worthwhile? The problem is that the vector has to be transparently
> >>> defined in the same basic block as the fmul and fadd for it to
> >>> work, and I think most vectors will be created using some kind of
> >>> control flow that will necessarily span more than one basic block.
> >>
> >> I'm thinking of something like this: autovectorize this:
> >> for (...) {
> >>   a[i] = 1.0 + b[i]*c[i];
> >> }
> >> and you'll get a uniform vector of 1.0. I think that this is not
> >> uncommon.
> >>
> 
> By the way, did you really mean "a[i] = 1.0 + b[i]*c[i];"? The cases
> this patch handles are ones where the 0.0 or 1.0 is on one of the
> fmul operands.
> 
> The following case might matter though:
> 
>     // f is a scalar boolean...
> 
>     for (...) {
>       a[i] = b[i] + f*c[i];
>     }
> 
> I think in this case a boolean vector of all 1's or 0's might be
> formed, but it would most likely be formed outside of the loop so
> SelectionDAG would not be able to see through it.

Good point.
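
For reference, a minimal C++ sketch of why a multiplier that is exactly 0.0 or 1.0 makes the fusion safe: the product is then exact (for finite inputs), so the intermediate rounding that an FMA skips would not have changed anything. This is not the patch's logic; the helper names (mul_then_add, fused, same_bits, loop_agrees) are made up for illustration, and it assumes IEEE-754 doubles with a correctly rounded std::fma.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Unfused a + b*c. The volatile temporary forces the product to be rounded
// to double before the add, so the compiler cannot contract it into an FMA.
double mul_then_add(double a, double b, double c) {
    volatile double product = b * c;
    return a + product;
}

// Fused a + b*c: one rounding, applied to the exact result.
double fused(double a, double b, double c) {
    return std::fma(b, c, a);
}

// Bitwise equality, so +0.0 and -0.0 count as different.
// Two NaNs are treated as equal regardless of payload.
bool same_bits(double x, double y) {
    if (std::isnan(x) && std::isnan(y)) return true;
    std::uint64_t xb, yb;
    std::memcpy(&xb, &x, sizeof xb);
    std::memcpy(&yb, &y, sizeof yb);
    return xb == yb;
}

// Models the loop a[i] = b[i] + f*c[i] with the boolean f converted to
// exactly 0.0 or 1.0. Returns true if the fused and unfused forms agree
// bit-for-bit on every element.
bool loop_agrees(bool f, const std::vector<double>& b,
                 const std::vector<double>& c) {
    const double scale = f ? 1.0 : 0.0;
    for (std::size_t i = 0; i < b.size() && i < c.size(); ++i) {
        if (!same_bits(mul_then_add(b[i], scale, c[i]),
                       fused(b[i], scale, c[i])))
            return false;
    }
    return true;
}
```

With a general multiplier the two forms can differ: for eps = 2^-30, (1 + eps)*(1 - eps) rounds to 1.0 as a double, so mul_then_add(-1.0, 1 + eps, 1 - eps) gives 0.0 while fused gives -2^-60. That difference is why fusion is normally not allowed in safe math mode.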

 -Hal

> 
> >
> > I would like to be able to handle this case and others like it, but
> > I'm not sure if it will work even if I make DAGCombiner aware of
> > vectors, because the DAGCombiner cannot examine basic blocks other
> > than the one currently being processed. It depends on how the IR
> > passes decide to arrange the code. Basically, SelectionDAG has to
> > have enough information to prove that a vector is all 0's or 1's
> > just by examining SDNodes within a single basic block, which I'm not
> > sure will work in this case.
> >
> > In any case, I think that should be a separate patch. I'll add more
> > checks for the scalar case and update this one first.
> >
> > Thanks,
> > Stephen
> 

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory


