[PATCH] D36130: [SLP] Vectorize jumbled memory loads.

Tue Mar 6 08:22:13 PST 2018

ashahid added inline comments.

================
Comment at: lib/Transforms/Vectorize/SLPVectorizer.cpp:1660
+                            "permutation of loaded lanes.\n");
+            newTreeEntry(Sorted, /*Vectorized*/ true, UserTreeIdx,
+                         ReuseShuffleIndicies, Mask);
----------------
ABataev wrote:
> ashahid wrote:
> > ABataev wrote:
> > > No, use original `VL` here, do not use `Sorted`. In this case you won't need an additional argument in `sortLoadAccesses` and you don't need all that complex stuff with the lambda on line 3528
> > If I am not wrong, for LOADs, VL0 must be the 1st element of the buffer whose base address will be used for vector load. 
> > So using VL will break this assumption.
> Why? And why you can't choose the right VL0 during vectorization?
For example, suppose we have two arrays A[4] and B[1] lying one after another in memory, and the selected VF is 4 for the scalar loads A[1], A[2], A[0], A[3] (in order of use). A vector load based at the first element of VL, i.e. A[1], will then load the elements A[1], A[2], A[3], B[0], which is not what we want.

Of course we can choose the right VL0 during vectorization, but then we have to compute it again here using the mask, which can be avoided if we use the Sorted VL.
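
To make the A/B case concrete, here is a minimal standalone sketch. It is not the SLPVectorizer code: `vectorLoad`, `computeMask` and `shuffle` are hypothetical helpers, memory is modelled as a flat `int` array, and the loads are assumed to be consecutive once sorted by address.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>

constexpr std::size_t VF = 4;
using Lanes = std::array<int, VF>;
using Offsets = std::array<std::size_t, VF>;

// Models a contiguous vector load of VF elements starting at Mem[Base].
inline Lanes vectorLoad(const int *Mem, std::size_t Base) {
  Lanes V{};
  for (std::size_t I = 0; I < VF; ++I)
    V[I] = Mem[Base + I];
  return V;
}

// Mask[I] is the lane of the address-sorted vector that holds the I-th
// scalar in use order. Assumes the offsets are consecutive once sorted.
inline Offsets computeMask(const Offsets &UseOffsets) {
  const std::size_t Lowest =
      *std::min_element(UseOffsets.begin(), UseOffsets.end());
  Offsets Mask{};
  for (std::size_t I = 0; I < VF; ++I)
    Mask[I] = UseOffsets[I] - Lowest;
  return Mask;
}

// Permutes the loaded lanes back into use order.
inline Lanes shuffle(const Lanes &V, const Offsets &Mask) {
  Lanes R{};
  for (std::size_t I = 0; I < VF; ++I)
    R[I] = V[Mask[I]];
  return R;
}

// Memory layout: A[0..3] = 10, 11, 12, 13 followed by B[0] = 99.
// Use order A[1], A[2], A[0], A[3] corresponds to offsets {1, 2, 0, 3}.
//   vectorLoad(Mem, 1)  reads A[1], A[2], A[3] and the first element of B.
//   shuffle(vectorLoad(Mem, 0), computeMask({1, 2, 0, 3}))
//                       reads only A and restores the use order.
```

Basing the load at the lowest address keeps it inside A, and the mask then only has to describe the permutation of lanes. Basing it at the first scalar in use order needs the lowest address to be recovered from the mask before the load can be emitted.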

Am I missing something?

Repository:
  rL LLVM

https://reviews.llvm.org/D36130