[LLVMdev] Horizontal ADD across single vector not profitable in SLP Vectorization

suyog sarda sardask01 at gmail.com
Fri Nov 28 23:59:52 PST 2014


>
>
> Have a look at the code in HorizontalReduction::getReductionCost and
> HorizontalReduction::emitReduction.
>
> You don't need 4 extracts. This can be modeled at the IR level as a
> combination of shufflevector and vector add instruction on a <4 x i32>
> vector. TargetTransformInfo::getReductionCost can return the appropriate
> cost (for example, one for AArch64::getReductionCost(add, <4 x i32>)) if
> codegen can implement this sequence of instructions more efficiently.
>
> For a <4 x i32> reduction you only need two vector shuffles, two
> vector adds and one vector extract to get the scalar result.
>
> vadd <0, 1, 2, 3>
>          <2, 3, x, x> // shuffled
> =>
>
> <0+2, 1+3, x, x>
>
>
> vadd <0+2, 1+3, x, x>
>          <1+3, x, x, x> // shuffled
> =>
>
> <0+2+1+3, x, x, x>
>

Ahh!! Shufflevector comes to the rescue. Thanks, Arnold, for pointing that
out. I ignored it completely in my explanation above.
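
To make the sequence concrete, here is a minimal sketch in plain C++ that
models the shuffle-and-add reduction lane by lane and counts the operations
it would need. It is only an illustration: `ReductionOps` and
`shuffleReduceAdd` are hypothetical names, not LLVM API, and the counts are
a simple tally, not what any getReductionCost implementation returns.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical operation counter, for illustration only (not an LLVM type).
struct ReductionOps {
    int shuffles = 0;
    int vectorAdds = 0;
    int extracts = 0;
};

// Models a log2(N)-step shuffle-and-add reduction over N lanes.
// Each step shuffles the upper half of the live lanes down and adds it to
// the lower half; lanes above `half` become don't-care ("x") afterwards.
template <std::size_t N>
std::int32_t shuffleReduceAdd(std::array<std::int32_t, N> v, ReductionOps &ops) {
    static_assert(N > 0 && (N & (N - 1)) == 0, "N must be a power of two");
    for (std::size_t half = N / 2; half >= 1; half /= 2) {
        // Stand-in for shufflevector: <half, half+1, ..., x, x>.
        std::array<std::int32_t, N> shuffled{};
        for (std::size_t i = 0; i < half; ++i)
            shuffled[i] = v[i + half];
        ++ops.shuffles;

        // Stand-in for the vector add; only lanes below `half` matter.
        for (std::size_t i = 0; i < half; ++i)
            v[i] += shuffled[i];
        ++ops.vectorAdds;
    }
    ++ops.extracts;  // single extractelement of lane 0
    return v[0];
}
```

For a 4-lane input this tallies two shuffles, two vector adds and one
extract, matching the count quoted above, versus four extracts and three
scalar adds for the purely scalar form.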


>
> What it takes to get your example working in the SLPVectorizer is:
>
> * Get the matching code up to snuff. I think we should replace the depth
> first search matcher by explicitly matching the trees we expect in
> HorizontalReduction::matchReduction. The code should just look for:
>
>    (+ (+ (+ v1 v2) v3) v4)
>     and maybe
>     (+ (+ v1 v2) (+ v3 v4))
>
>     explicitly for v1, .., vn identical operations.
>
> * Allow a tree of size one (the vector loads) if the tree feeds a
> reduction.
>
> * Adjust the cost model AArch64::getReductionCost
>
> * AArch64 CodeGen would have to recognize the shuffle reduction if it does
> not do so already
>
>
>
It seems everything boils down to properly identifying the reduction chain.
I did look at the patch you provided in the earlier thread (a similar
discussion). It worked for a reduction chain of 4 elements,
(+ (+ (+ v1 v2) v3) v4). However, when I tried it with 8 elements, it hit an
assertion. I will look into it. Thanks for getting back on this.
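
As a rough illustration of what explicit shape matching could look like,
here is a minimal sketch on a toy expression tree. Everything in it (`Expr`,
`leaf`, `add`, `matchChain`, `matchBalanced`, `matchAddReduction`) is a
hypothetical name for this example, not LLVM's API or the code in
HorizontalReduction::matchReduction. It also assumes the leaves are already
known to be identical operations and does not check that.

```cpp
#include <algorithm>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy expression node; a hypothetical stand-in for an IR value.
struct Expr {
    char op;                          // '+' for an add, '\0' for a leaf
    std::string name;                 // leaf name, e.g. "v1"
    std::shared_ptr<Expr> lhs, rhs;   // operands (null for leaves)
};
using ExprPtr = std::shared_ptr<Expr>;

ExprPtr leaf(const std::string &name) {
    return std::make_shared<Expr>(Expr{'\0', name, nullptr, nullptr});
}
ExprPtr add(ExprPtr a, ExprPtr b) {
    return std::make_shared<Expr>(Expr{'+', "", std::move(a), std::move(b)});
}

// Matches the left-leaning chain (+ (+ (+ v1 v2) v3) v4) of any length >= 2.
bool matchChain(const ExprPtr &root, std::vector<std::string> &leaves) {
    leaves.clear();
    const Expr *cur = root.get();
    while (cur && cur->op == '+') {
        if (!cur->rhs || cur->rhs->op != '\0')
            return false;             // right operand must be a leaf
        leaves.push_back(cur->rhs->name);
        cur = cur->lhs.get();
    }
    if (!cur || leaves.empty())
        return false;                 // malformed tree, or root was a leaf
    leaves.push_back(cur->name);      // innermost left leaf (v1)
    std::reverse(leaves.begin(), leaves.end());
    return true;
}

// Matches the balanced tree (+ (+ v1 v2) (+ v3 v4)) over `width` leaves.
bool matchBalanced(const Expr *e, std::size_t width,
                   std::vector<std::string> &leaves) {
    if (!e)
        return false;
    if (width == 1) {
        if (e->op != '\0')
            return false;
        leaves.push_back(e->name);
        return true;
    }
    if (e->op != '+' || width % 2 != 0)
        return false;
    return matchBalanced(e->lhs.get(), width / 2, leaves) &&
           matchBalanced(e->rhs.get(), width / 2, leaves);
}

// Tries both explicit shapes; on success `leaves` holds v1..vn in order.
bool matchAddReduction(const ExprPtr &root, std::size_t width,
                       std::vector<std::string> &leaves) {
    if (matchChain(root, leaves) && leaves.size() == width)
        return true;
    leaves.clear();
    if (width >= 2 && matchBalanced(root.get(), width, leaves))
        return true;
    leaves.clear();
    return false;
}
```

Because the chain matcher is a loop and the balanced matcher recurses on the
width, both shapes extend from 4 to 8 operands without special cases. Any
tree that is neither shape is simply rejected.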

-- 
With regards,
Suyog Sarda