[PATCH] D24796: [SLPVectorizer] Fix for PR25748: reduction vectorization after loop unrolling.

Alexey Bataev via llvm-commits llvm-commits at lists.llvm.org
Tue Oct 4 08:46:44 PDT 2016


Hi Suyog,

Thanks for your comments.

1. I believe the intrinsic is a separate problem and should be implemented in a different patch.

2. I checked it: the vectorization works only for fast-math ops.
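
For context on point 2, here is a minimal C++ sketch (not taken from the patch; the function names are hypothetical) of why a reordered floating-point reduction is presumably only legal under fast-math. Float addition is not associative, so the pairwise order that a shuffle-and-add reduction uses can give a different result from the strict left-to-right order of the scalar code.

```cpp
#include <array>

// Strict left-to-right order, as scalar code for an unrolled loop computes it.
float sequential_sum(const std::array<float, 4>& p) {
  return ((p[0] + p[1]) + p[2]) + p[3];
}

// Pairwise order, as a two-step shuffle + add reduction computes it:
// lanes 0 and 2 are combined, lanes 1 and 3 are combined, then the two halves.
float pairwise_sum(const std::array<float, 4>& p) {
  return (p[0] + p[2]) + (p[1] + p[3]);
}
```

With the input {1e30f, 1.0f, -1e30f, 1.0f}, sequential_sum returns 1 because the first 1.0f is absorbed by 1e30f, while pairwise_sum returns 2 because the large terms cancel first. Fast-math permits this kind of reassociation; strict IEEE semantics do not.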

Best regards,
Alexey Bataev

On 10/04/2016 05:41 PM, suyog sarda wrote:
As far as I understand, this patch tries to vectorize the horizontal sum in an unrolled loop in the following manner:

vec0 = shuffle<p[0], p[1], p[2], p[3], p[4], p[5], p[6], p[7]>
vec1 = shuffle vec0 <p[4], p[5], p[6], p[7],undef, undef, undef, undef>
vec2 = add vec0, vec1                ---------> this will result in <p[0]+p[4], p[1]+p[5], p[2]+p[6], p[3]+p[7], undef, undef, undef, undef>
vec3 = shuffle vec2 <p[2]+p[6], p[3]+p[7], undef, undef, undef, undef, undef, undef>
vec4 = add vec2, vec3                ---------> this will result in <p[0]+p[4]+p[2]+p[6], p[1]+p[5]+p[3]+p[7], undef, undef, undef, undef, undef, undef>
vec5 = shuffle vec4 <p[1]+p[5]+p[3]+p[7], undef, undef, undef, undef, undef, undef, undef>
vec6 = add vec4, vec5                ---------> this will result in <p[0]+p[4]+p[2]+p[6]+p[1]+p[5]+p[3]+p[7], undef, undef, undef, undef, undef, undef, undef>
sum = extractelement vec6, 0
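
The same halving pattern can be sketched in scalar C++. This is only an illustration of the sequence above, not code from the patch; the helper name horizontal_sum8 and the fixed width of 8 are assumptions made for the example.

```cpp
#include <array>
#include <cstddef>

// Hypothetical helper: reduces 8 lanes in log2(8) = 3 steps.
int horizontal_sum8(std::array<int, 8> v) {
  // Each step adds the upper half of the live lanes onto the lower half,
  // mirroring one shuffle + add pair (width 4, then 2, then 1).
  for (std::size_t width = 4; width >= 1; width /= 2) {
    for (std::size_t i = 0; i < width; ++i)
      v[i] += v[i + width];
  }
  // Lane 0 now holds the full sum, like "extractelement vec6, 0".
  return v[0];
}
```

The lanes above the live width keep stale values after each step; they play the role of the undef lanes in the vector sequence and are never read again.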

This was discussed earlier too (https://marc.info/?l=llvm-dev&m=141106671810521&w=4)
in a similar manner, and it was suggested to generate an intrinsic for the horizontal sum so that it can be lowered to target-specific code.

Also, does this patch take care of floating-point ops as described in https://marc.info/?l=llvm-commits&m=141892087031143&w=3 ?
I haven't checked the patch. Just pitching in with some relevant data from the past.

Regards,
Suyog

On Tue, Oct 4, 2016 at 5:31 PM, Alexey Bataev <a.bataev at hotmail.com> wrote:
ABataev updated this revision to Diff 73458.
ABataev added a comment.

Added a comment + updated


https://reviews.llvm.org/D24796

Files:
  lib/Transforms/Vectorize/SLPVectorizer.cpp
  test/Transforms/SLPVectorizer/X86/reduction_unrolled.ll
  test/Transforms/SLPVectorizer/X86/scheduling.ll


