[PATCH] D30305: [LV] Consider non-consecutive vectorizable accesses in max VF selection

Tue Feb 28 15:17:11 PST 2017

mkuper added a comment.

LGTM

================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:6341
+      //
+      if (T->isPointerTy() && !isConsecutiveLoadOrStore(&I) &&
+          !Legal->isAccessInterleaved(&I) && !Legal->isLegalGatherOrScatter(&I))
----------------
mssimpso wrote:
> mkuper wrote:
> > Why would this only apply to pointer types, though?  What's special about them?
> > (It looks like it was a heuristic of some sort, but I'm not sure it makes sense anymore.)
> I'm not sure it makes sense anymore either - I'm happy to remove it. It was added when we could only choose the VF based on the size of the largest type. Maybe the pointer size was just used in place of "a large scalar type size that will cause the max VF to be too small"?
> 
> I'm actually hoping we can enable -vectorizer-maximize-bandwidth at some point, though. For some context, once I commit D29466 and D29675, ARM/AArch64 should be prepared for the change. I discovered the bug here while testing those two patches (with -vectorizer-maximize-bandwidth=false). I was expecting these patches to be NFC, but for the loop in the test case, we were choosing a very large VF by mistake.
Hm, right, if we have a loop that mostly works on i8, but gathers the pointers, we'll have a bad time with the MaxVF.
Let's just keep ignoring it for now... :-)
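
To make the concern concrete, here is a rough sketch of how the max VF shrinks when it is derived only from the widest type. This is not the LoopVectorize code: maxVFFromWidestType is a hypothetical helper, and the 128-bit register and 64-bit pointer sizes are assumptions chosen for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper, not an LLVM API: the max VF obtained when only the
// widest scalar type in the loop is taken into account.
unsigned maxVFFromWidestType(unsigned RegisterBits,
                             const std::vector<unsigned> &TypeBits) {
  assert(!TypeBits.empty() && "need at least one type");
  // Widest scalar type (in bits) among the types being considered.
  unsigned Widest = *std::max_element(TypeBits.begin(), TypeBits.end());
  // Number of widest-type elements that fit in one register, at least 1.
  return std::max(1u, RegisterBits / Widest);
}
```

Under those assumptions, a loop whose considered types are only i8 gets maxVFFromWidestType(128, {8}) == 16, while also counting a gathered 64-bit pointer gives maxVFFromWidestType(128, {8, 64}) == 2.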

And I'd really like to enable maximize-bandwidth as well. I need to run our tests again and see whether we have any regressions on x86.

https://reviews.llvm.org/D30305