RFC: Modeling horizontal vector reductions

Arnold Schwaighofer aschwaighofer at apple.com
Thu Sep 12 07:00:14 PDT 2013


On Sep 12, 2013, at 8:52 AM, Renato Golin <renato.golin at linaro.org> wrote:

> On 12 September 2013 01:17, Arnold Schwaighofer <aschwaighofer at apple.com> wrote:
> Say somebody really wrote:
> 
> v0 = 7 * A[i];
> v1 = 7 * A[i+1];
> v2 = 7 * A[i+2];
> v3 = 7 * A[i+3];
> r += (v0 + v1) +  (v2 + v3);
> 
> VS.
> 
> v0 = 7 * A[i];
> v1 = 7 * A[i+1];
> v2 = 7 * A[i+2];
> v3 = 7 * A[i+3];
> r += (v0 + v2) +  (v1 + v3);
> 
> In this case the order dictates which pattern to use. It is just in the fast-math case that the order does not matter.
> 
> Arnold,
> 
> When you compared the two IR pieces in your original email I thought that the second was only more explicit, but they should generate the same machine code in the end, ie. the back-end should see that the instruction is redundant and not even emit it (or remove it afterwards).
> 
> Your example above reinforces it, since the user could write a number of combinations, all of them correct, only some of them redundant, and it'd be a shame to not vectorize most patterns just because they're assumed to be free-or-nothing in the vectorizer.
> 
> So, maybe there could be a DCE pass that would look at "shuffle vec, <0,1, undef, undef>", know that it's free, and not emit anything, just alias the register, no?
> 
> Does any of that make sense?
> 
> cheers,
> --renato
> 
> 
Renato,

this is floating-point arithmetic, so without unsafe-math: (v0 + v2) + (v1 + v3) != (v0 + v1) + (v2 + v3) in general.

The two forms therefore don’t compute the same value, and you are only allowed to use them interchangeably if you have “fadd fast” or unsafe-math. The problem is that by the time we come to generate code (ISel) we have lost those flags. So my proposal is to put the expression into the form we want before ISel.
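
To make the reassociation point concrete, here is a minimal sketch in plain C (not LLVM code). The function names and the input values are my own, chosen only for illustration, and it assumes IEEE-754 doubles with round-to-nearest-even. The volatile temporaries are there to force each partial sum to be rounded to double, so that excess intermediate precision on some targets does not hide the effect.

```c
/* (v0 + v1) + (v2 + v3): adjacent elements are added first. */
double sum_adjacent_pairs(double v0, double v1, double v2, double v3) {
    volatile double lo = v0 + v1; /* rounded to double here */
    volatile double hi = v2 + v3; /* rounded to double here */
    return lo + hi;
}

/* (v0 + v2) + (v1 + v3): elements two apart are added first. */
double sum_strided_pairs(double v0, double v1, double v2, double v3) {
    volatile double even = v0 + v2; /* rounded to double here */
    volatile double odd  = v1 + v3; /* rounded to double here */
    return even + odd;
}
```

With the inputs 1e16, 1.0, -1e16, 1.0 the two functions should disagree: in the first order each 1.0 is absorbed by the large value before the large values cancel, while in the second order the large values cancel first and both 1.0 terms survive. For inputs such as 1.0, 2.0, 3.0, 4.0, where every partial sum is exact, both orders agree.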


The shuffles for the first form (splitting of vectors) are:

(0,1) <- This one is free.
(2,3)

while for the second form (pairwise adds) they are:
(0,2)
(1,3)
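
The two mask pairs above can be modeled with a scalar sketch in C. This is only an illustration, not how the vectorizer or ISel represents things: shuffle2 and reduce_with_masks are hypothetical helpers, and the sketch assumes that the two shuffled 2-element vectors are added lane by lane and that the two resulting lanes are then added together.

```c
/* Hypothetical stand-in for a 2-element shuffle of a 4-lane vector:
 * out = <v[i0], v[i1]>. */
static void shuffle2(const double v[4], int i0, int i1, double out[2]) {
    out[0] = v[i0];
    out[1] = v[i1];
}

/* Shuffle with masks (a0,a1) and (b0,b1), add lane by lane,
 * then add the two remaining lanes. */
static double reduce_with_masks(const double v[4],
                                int a0, int a1, int b0, int b1) {
    double lhs[2], rhs[2];
    shuffle2(v, a0, a1, lhs);
    shuffle2(v, b0, b1, rhs);
    volatile double s0 = lhs[0] + rhs[0]; /* forced rounding to double */
    volatile double s1 = lhs[1] + rhs[1]; /* forced rounding to double */
    return s0 + s1;
}

/* Masks (0,1) and (2,3): <v0,v1> + <v2,v3>, i.e. (v0 + v2) + (v1 + v3).
 * The (0,1) mask just selects the low half of the vector. */
double reduce_split(const double v[4]) {
    return reduce_with_masks(v, 0, 1, 2, 3);
}

/* Masks (0,2) and (1,3): <v0,v2> + <v1,v3>, i.e. (v0 + v1) + (v2 + v3). */
double reduce_pairwise(const double v[4]) {
    return reduce_with_masks(v, 0, 2, 1, 3);
}
```

Under these assumptions the choice of masks fixes the order of the additions, which is why the two forms are only interchangeable when reassociation is allowed.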


Best,
Arnold




