[PATCH] [SLPVectorization] Enhance Ability to Vectorize Horizontal Reductions from Consecutive Loads
suyog
suyog.sarda at samsung.com
Sun Dec 21 09:49:32 PST 2014
Hi Michael,
Thanks for the review. I will take care of the typos and the extra space in a fresh upload.
For the logic part, please find my comments inline.
Your comments are awaited :)
REPOSITORY
rL LLVM
================
Comment at: lib/Transforms/Vectorize/SLPVectorizer.cpp:1838-1841
@@ +1837,6 @@
+ return;
+ if (!(isConsecutiveAccess(Left[i], Right[i])))
+ continue;
+ else
+ std::swap(Left[i + 1], Right[i]);
+ }
----------------
mzolotukhin wrote:
> I think we should only swap if `Left[i]` and `Left[i+1]` are consecutive - that's the only case we get something from this reordering.
>
> In your current approach we might lose consecutiveness in `Right[i]` and `Right[i+1]` by swapping `Left[i+1]` and `Right[i]` even if later on `Left[i]` and `Left[i+1]` won't be consecutive.
For consecutive loads, if Left[i] and Left[i+1] are already consecutive, we never arrive at this check, since they can already be bundled into a vector.
Let's take an example:
return (a[0] + a[1]) + (a[2] + a[3])
The tree for this will be:
              +
            /   \
           /     \
          +       +
         / \     / \
        /   \   /   \
     a[0]  a[1] a[2]  a[3]
where
        Left    Right
i=0     a[0]    a[1]
i=1     a[2]    a[3]
(Please note the contents of Left and Right :). It seems confusing, but that is how they are populated :) )
Here Left[0] = a[0] and Left[1] = a[2], which are not consecutive and hence cannot be bundled. Right[0] = a[1] and Right[1] = a[3], which are also not consecutive and hence cannot be bundled.
But here Left[i] and Right[i] are consecutive, and hence we exchange Left[i+1] and Right[i]. Left and Right after re-arranging:
        Left    Right
i=0     a[0]    a[2]
i=1     a[1]    a[3]
Now, since Left[i] and Left[i+1] are consecutive, they can be bundled into a vector. The same holds for Right.
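The exchange described above can be sketched outside of LLVM as follows. This is only an illustration under assumptions: each load is modelled as a plain element index, and `Load`, `isConsecutive` and `reorderForConsecutiveLoads` are hypothetical names, not the patch's code (the real `isConsecutiveAccess` works on LLVM values).

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a load: the element index k of a[k].
using Load = int;

// Hypothetical helper: true when B reads the element right after A.
// The real isConsecutiveAccess works on LLVM Values, not plain indices.
bool isConsecutive(Load A, Load B) { return B == A + 1; }

// Mirrors the idea of the loop in the hunk: when Left[i] and Right[i]
// are consecutive, exchange Left[i + 1] with Right[i].
void reorderForConsecutiveLoads(std::vector<Load> &Left,
                                std::vector<Load> &Right) {
  assert(Left.size() == Right.size() && "operand lists must match");
  for (std::size_t i = 0; i + 1 < Left.size(); ++i) {
    if (!isConsecutive(Left[i], Right[i]))
      continue;
    std::swap(Left[i + 1], Right[i]);
  }
}
```

Starting from Left = {0, 2} and Right = {1, 3} (that is, a[0], a[2] and a[1], a[3]), the sketch ends with Left = {0, 1} and Right = {2, 3}, matching the table above.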
Now, this disturbs the original order of addition, since we were supposed to add a[0] to a[1] and a[2] to a[3], and finally add the two sums.
But after rearranging, which in turn vectorizes the code, we are now adding
a[0]   a[1]
 +      +
a[2]   a[3]
This doesn't affect the result for integers, but for floating-point data types, which have precision issues, it might change the final answer. Hence, we are doing this re-arrangement for integer data types only.
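To make the integer versus floating-point point concrete, here is a small standalone sketch (not part of the patch; `sumOriginal`, `sumReordered` and `demo` are hypothetical names, and the input values are chosen only for illustration). It assumes IEEE-754 doubles and a build without fast-math style reassociation.

```cpp
#include <cassert>
#include <cstdint>

// Original pairing: (a[0] + a[1]) + (a[2] + a[3]).
template <typename T> T sumOriginal(const T (&A)[4]) {
  return (A[0] + A[1]) + (A[2] + A[3]);
}

// Pairing after the re-arrangement: (a[0] + a[2]) + (a[1] + a[3]).
template <typename T> T sumReordered(const T (&A)[4]) {
  return (A[0] + A[2]) + (A[1] + A[3]);
}

// Usage sketch: integers agree, doubles need not.
inline void demo() {
  // Unsigned integer addition wraps modulo 2^32, so any pairing agrees.
  const std::uint32_t I[4] = {4000000000u, 500000000u, 7u, 9u};
  assert(sumOriginal(I) == sumReordered(I));

  const double D[4] = {1e30, -1e30, 1.0, 1.0};
  assert(sumOriginal(D) == 2.0);  // (1e30 - 1e30) + (1 + 1)
  assert(sumReordered(D) == 0.0); // each 1.0 is absorbed by 1e30
}
```

With these inputs the two pairings give 2.0 and 0.0 for doubles, while the integer sums are identical, which is why the re-arrangement is restricted to integer types.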
I think you assumed that Left contains the left subtree (a[0] and a[1]) while Right contains the right subtree (a[2] and a[3]), which is not the case. Please confirm, and correct my understanding if it is wrong :)
Your suggestions/corrections are most welcome :)
http://reviews.llvm.org/D6675