[PATCH] D24796: [SLPVectorizer] Fix for PR25748: reduction vectorization after loop unrolling.

Michael Kuperstein via llvm-commits llvm-commits at lists.llvm.org
Thu Oct 6 12:19:11 PDT 2016


Right, the intrinsic issue is mostly orthogonal (and hard: so far, nobody
has come up with a really good proposal).

On Tue, Oct 4, 2016 at 8:46 AM, Alexey Bataev <a.bataev at hotmail.com> wrote:

> Hi Suyog,
>
> thanks for your comments.
>
> 1. I believe the intrinsic is a separate problem and must be implemented
> in a different patch.
>
> 2. I checked: it works only for fast-math ops.
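
A note on why the fast-math restriction matters: floating-point addition is not associative, so a pairwise (tree) reduction can give a different result from the original left-to-right sum, and the reordering is only permitted when fast-math flags allow reassociation. The sketch below is an illustration only, not code from the patch; the function names and sample values are hypothetical.

```python
# Illustration only: the names and inputs here are assumptions, not part of D24796.

def sequential_sum(p):
    """Sum left to right, as the scalar unrolled code would."""
    total = 0.0
    for x in p:
        total += x
    return total


def tree_sum(p):
    """Sum by repeatedly adding the upper half of the vector to the lower half.

    Assumes len(p) is a power of two.
    """
    vec = list(p)
    while len(vec) > 1:
        half = len(vec) // 2
        vec = [vec[i] + vec[i + half] for i in range(half)]
    return vec[0]


# The large values cancel only if they are added to each other directly.
values = [1e100, 1.0, 1.0, 1.0, -1e100, 1.0, 1.0, 1.0]

print(sequential_sum(values))  # the 1.0 terms added to 1e100 are rounded away
print(tree_sum(values))        # 1e100 and -1e100 cancel in the first step
```

With these inputs the two orders disagree, which is why the transformation cannot be applied to strict floating-point code.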
>
> Best regards,
> Alexey Bataev
>
> On 10/04/2016 05:41 PM, suyog sarda wrote:
>
> As far as I understand, this patch tries to vectorize the horizontal sum
> in an unrolled loop in the following manner:
>
> vec0 = shuffle <p[0], p[1], p[2], p[3], p[4], p[5], p[6], p[7]>
> vec1 = shuffle vec0 <p[4], p[5], p[6], p[7], undef, undef, undef, undef>
> vec2 = add vec0, vec1    ---------> this will result in
>   <p[0]+p[4], p[1]+p[5], p[2]+p[6], p[3]+p[7], undef, undef, undef, undef>
> vec3 = shuffle vec2 <p[2]+p[6], p[3]+p[7], undef, undef, undef, undef, undef, undef>
> vec4 = add vec2, vec3    ---------> this will result in
>   <p[0]+p[4]+p[2]+p[6], p[1]+p[5]+p[3]+p[7], undef, undef, undef, undef, undef, undef>
> vec5 = shuffle vec4 <p[1]+p[5]+p[3]+p[7], undef, undef, undef, undef, undef, undef, undef>
> vec6 = add vec4, vec5    ---------> this will result in
>   <p[0]+p[4]+p[2]+p[6]+p[1]+p[5]+p[3]+p[7], undef, undef, undef, undef, undef, undef, undef>
> sum = extractelement vec6, 0
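
The shuffle/add sequence quoted above can be modelled in a few lines. This is a minimal sketch of that log-step pattern, assuming a power-of-two vector width; it is not the SLPVectorizer implementation, and `UNDEF`, `shuffle_high_half`, `lanewise_add`, and `horizontal_sum` are hypothetical names chosen for illustration.

```python
# Sketch of the log-step horizontal reduction; all names are hypothetical.

UNDEF = None  # stands in for an undef lane


def shuffle_high_half(vec, width):
    """Move lanes [width/2, width) down to lane 0; the remaining lanes become undef."""
    half = width // 2
    return vec[half:width] + [UNDEF] * (len(vec) - (width - half))


def lanewise_add(a, b):
    """Add lane by lane; a lane is undef if either input lane is undef."""
    return [
        x + y if x is not UNDEF and y is not UNDEF else UNDEF
        for x, y in zip(a, b)
    ]


def horizontal_sum(p):
    """Reduce p to a single sum using log2(len(p)) shuffle + add steps."""
    n = len(p)
    assert n > 0 and n & (n - 1) == 0, "sketch assumes a power-of-two length"
    vec = list(p)
    width = n  # number of lanes that still hold live partial sums
    while width > 1:
        vec = lanewise_add(vec, shuffle_high_half(vec, width))
        width //= 2
    return vec[0]  # corresponds to: extractelement vec, 0


print(horizontal_sum([1, 2, 3, 4, 5, 6, 7, 8]))
```

For eight elements the loop runs three times, matching the vec1/vec2, vec3/vec4, and vec5/vec6 pairs in the quoted sequence.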
>
> This was discussed earlier too, in a similar manner
> (https://marc.info/?l=llvm-dev&m=141106671810521&w=4),
> and it was suggested to generate an intrinsic for the horizontal sum so
> that it can be lowered to target-specific code.
>
> Also, does this patch take care of floating-point ops as described in
> https://marc.info/?l=llvm-commits&m=141892087031143&w=3 ?
> I haven't checked the patch. Just pitching in with some relevant data
> from the past.
>
> Regards,
> Suyog
>
> On Tue, Oct 4, 2016 at 5:31 PM, Alexey Bataev <a.bataev at hotmail.com>
> wrote:
>
>> ABataev updated this revision to Diff 73458.
>> ABataev added a comment.
>>
>> Added a comment + updated
>>
>>
>> https://reviews.llvm.org/D24796
>>
>> Files:
>>   lib/Transforms/Vectorize/SLPVectorizer.cpp
>>   test/Transforms/SLPVectorizer/X86/reduction_unrolled.ll
>>   test/Transforms/SLPVectorizer/X86/scheduling.ll
>>
>>
>
>