[PATCH] D36130: [SLP] Vectorize jumbled memory loads.
Shahid via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Mar 6 08:22:13 PST 2018
ashahid added inline comments.
================
Comment at: lib/Transforms/Vectorize/SLPVectorizer.cpp:1660
+ "permutation of loaded lanes.\n");
+ newTreeEntry(Sorted, /*Vectorized*/ true, UserTreeIdx,
+ ReuseShuffleIndicies, Mask);
----------------
ABataev wrote:
> ashahid wrote:
> > ABataev wrote:
> > > No, use original `VL` here, do not use `Sorted`. In this case you won't need an additional argument in `sortLoadAccesses` and you don't need all that complex stuff with the lambda on line 3528
> > If I am not wrong, for LOADs, VL0 must be the 1st element of the buffer whose base address will be used for vector load.
> > So using VL will break this assumption.
> Why? And why you can't choose the right VL0 during vectorization?
For example, if we have two arrays A[4] and B[1] laying one after another in memory and the selected VF is 4 for the scalar loads of A[1], A[2], A[0], A[3] in order of use, the generated vector load will load the elements A[1], A[2], A[3], B[1] which is not desired.
Of-course we can choose the right VL0 during vectorization but we have to compute it again here using the mask which can be avoided if we use Sorted VL.
If I am missing something?
Repository:
rL LLVM
https://reviews.llvm.org/D36130
More information about the llvm-commits
mailing list