RFC: Modeling horizontal vector reductions

Wed Sep 11 15:54:32 PDT 2013

On Wed, Sep 11, 2013 at 3:49 PM, Arnold Schwaighofer <
aschwaighofer at apple.com> wrote:

>
> On Sep 11, 2013, at 5:30 PM, Chandler Carruth <chandlerc at google.com>
> wrote:
>
> >
> > On Wed, Sep 11, 2013 at 3:17 PM, Arnold Schwaighofer <
> > aschwaighofer at apple.com> wrote:
> > Therefore, I would like to model horizontal reductions as either
> > version, depending on which is deemed cheaper by the cost model.
> >
> > What would make the first pattern cheaper? I'd like to better understand
> > why we don't just always do the second form…
>
> Fewer shuffles (because shuffle vec, <0,1, undef, undef> is free), so when
> you don’t have pairwise vector operations, the first pattern is preferable.
>

I thought so, but thanks for confirming. I don't trust myself entirely on
the cost models here.
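
To make the shuffle counting concrete, here is a rough sketch in Python. It
assumes the "first pattern" is the usual halving reduction and the "second
pattern" is the pairwise one; neither is spelled out in this message, so treat
both shapes as my reading of the RFC. The helpers (shuffle, is_free, add) and
the cost rule (a mask whose defined lanes all stay in place is free, every
other shuffle costs 1) are made up for illustration and are not the real cost
model. Lanes are strings so the resulting association is visible.

```python
UNDEF = None  # stands in for an undef lane in a mask or vector


def shuffle(vec, mask):
    """Pick lanes of vec according to mask; undef mask lanes stay undef."""
    return [UNDEF if i is UNDEF else vec[i] for i in mask]


def is_free(mask):
    # Assumption: a mask that leaves every defined lane in place costs nothing.
    return all(i is UNDEF or i == lane for lane, i in enumerate(mask))


def add(a, b):
    """Lane-wise symbolic add; any undef input makes the lane undef."""
    return [UNDEF if x is UNDEF or y is UNDEF else f"({x}+{y})"
            for x, y in zip(a, b)]


def splitting_reduction(v):
    """Halving form: add the high half onto the low half, repeat."""
    width, cost = len(v), 0  # width is assumed to be a power of two
    half = width // 2
    while half >= 1:
        mask = [i + half if i < half else UNDEF for i in range(width)]
        cost += 0 if is_free(mask) else 1
        # The low operand is v itself, i.e. the free <0,1,undef,undef> case.
        v = add(v, shuffle(v, mask))
        half //= 2
    return v[0], cost


def pairwise_reduction(v):
    """Pairwise form: add even lanes to odd lanes, repeat."""
    width, cost = len(v), 0  # width is assumed to be a power of two
    live = width
    while live > 1:
        even = [2 * i if 2 * i < live else UNDEF for i in range(width)]
        odd = [2 * i + 1 if 2 * i + 1 < live else UNDEF for i in range(width)]
        cost += sum(0 if is_free(m) else 1 for m in (even, odd))
        v = add(shuffle(v, even), shuffle(v, odd))
        live //= 2
    return v[0], cost


print(splitting_reduction(["a", "b", "c", "d"]))
print(pairwise_reduction(["a", "b", "c", "d"]))
```

Under these assumptions the halving form needs one non-trivial shuffle per
step, while the pairwise form needs two per step except the last, and the two
forms associate the adds differently.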

> > It is a bit unfortunate to not have one canonical form but I don’t think
> > this justifies adding fast-math flags to isel (which will eventually go
> > away).
> >
> > I don't really understand this part.
> >
> > We have some reason at the IR level to know that we can choose either
> > association and get equivalent results. Why isn't the correct answer to
> > pick a canonical form, but preserve that information long enough to
> > reassociate when it is needed?
>
> We want to pick the right form at ISel time - which is too late to
> reassociate.
>
> At some point - before ISel - we have to reassociate because at ISel time
> we don’t have the fast-math flags that would tell us that it is legal to
> reassociate.
>
> So, we could for example reassociate in CodeGen prepare. We would still
> need an interface to tell us when to do so.

Why don't you want to propagate the flags through isel? That really seems
like the correct long-term solution: that ISel looks at the pattern, knows
that it would be cheaper to use the horizontal instructions and emits the
code that way. It doesn't even have to actually *do* the reassociation, it
can match the reduction pattern and implicitly re-associate by forming the
horizontal instruction pattern. It just needs to know that this is allowed.
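
For anyone wondering why "allowed" matters here: floating-point addition is
not associative, so switching between the two reduction shapes can change the
result unless the fast-math flags permit it. A small sketch in Python (plain
IEEE doubles, nothing LLVM-specific; the function names and the input values
are chosen only for illustration, and the two association orders are my
assumption about the two patterns under discussion):

```python
def halving_sum(v):
    # Association assumed for the first pattern: (v0 + v2) + (v1 + v3)
    return (v[0] + v[2]) + (v[1] + v[3])


def pairwise_sum(v):
    # Association assumed for the second pattern: (v0 + v1) + (v2 + v3)
    return (v[0] + v[1]) + (v[2] + v[3])


big = 2.0 ** 60          # large enough that adding 1.0 is rounded away
v = [big, 1.0, -big, 1.0]

print(halving_sum(v))    # the big values cancel first, so both 1.0s survive
print(pairwise_sum(v))   # each 1.0 is absorbed into a big value before cancelling
```

The two orders give different answers for this input, which is why something
has to carry the "reassociation is legal" bit to wherever the horizontal
instruction gets formed.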

You mentioned not wanting to thread these flags through because some of the
machinery is slowly going away, but I think that "slowly" is going to be a
*lot* of time. I think threading the flags through is a much better interim
cost (fixed, no design overhead) than having 2 patterns in the IR for the
same vector operation.