[PATCH] D27919: [Loop Vectorizer] Interleave vs Gather - in some cases Gather is better.

Tue Feb 7 09:36:18 PST 2017

mssimpso added inline comments.

================
Comment at: ../lib/Transforms/Vectorize/LoopVectorize.cpp:7036
+        if (InterleaveCost < NumAccesses * 2) {
+          // The interleaving cost is good enough.
+          setWideningDecision(Group, VF, CM_DECISION_INTERLEAVE,
----------------
mkuper wrote:
> delena wrote:
> > mkuper wrote:
> > > Why the NumAccesses * 2 cut-off?
> > I consider a cost per instruction. In this case InterleaveCost / NumAccesses == 1 (Matthew asked to avoid divisions).
> > About 1 instruction per access is good enough. I added more comments.
> Well, this isn't really "about 1", this is "below 2".
> I'd be more conservative here (InterleaveCost <= NumAccesses? Can this even happen? Or are you trying to catch the cases where the ratio is ~1.1-1.2?). Or maybe remove this altogether. Is getGatherScatterCost() expensive in terms of compile time?
The TTI estimates are supposed to be cheap to compute. I think it makes sense to remove this altogether in favor of greater simplicity.
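
To make the two options concrete, here is a rough sketch of what I have in mind. It is not code from the patch: WideningChoice, belowTwoPerAccess, and chooseWidening are hypothetical names, the costs are plain unsigned values standing in for whatever the TTI queries return, and breaking ties in favor of interleaving is my assumption.

```cpp
// Hypothetical stand-ins for the cost-model decision discussed in this
// thread; none of these names come from LoopVectorize.cpp.
enum class WideningChoice { Interleave, GatherScatter };

// The cut-off currently in the patch, written without a division:
// for NumAccesses > 0 it is equivalent to
// "InterleaveCost / NumAccesses < 2" under integer division.
bool belowTwoPerAccess(unsigned InterleaveCost, unsigned NumAccesses) {
  return InterleaveCost < NumAccesses * 2;
}

// The simpler alternative: drop the cut-off and always compare the two
// estimates directly. Ties go to interleaving (an assumption, not
// something the review settled).
WideningChoice chooseWidening(unsigned InterleaveCost,
                              unsigned GatherScatterCost) {
  return InterleaveCost <= GatherScatterCost ? WideningChoice::Interleave
                                             : WideningChoice::GatherScatter;
}
```

With 4 accesses and an interleave cost of 7, the cut-off accepts interleaving without ever asking for the gather cost, even if the gather estimate were lower. The direct comparison would pick the gather in that case, at the price of one more cost query.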

Repository:
  rL LLVM

https://reviews.llvm.org/D27919