[PATCH] D26905: [SLP] Vectorize loads of consecutive memory accesses, accessed in non-consecutive (jumbled) way.

Michael Kuperstein via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Jan 6 14:46:31 PST 2017


mkuper added inline comments.


================
Comment at: lib/Transforms/Vectorize/SLPVectorizer.cpp:468
+      assert(VL.size() == Scalars.size() && "Invalid size");
+      for (auto *Sval : Scalars)
+        if (llvm::none_of(VL, [&](Value *Val) { return Val == Sval; }))
----------------
I'm still not sure we want this to be quadratic.
I'd suggest one of two things:
1) Change this to presort. For VL.size() == 4 it may be slower, but for VL.size() == 16 I'd expect it to be faster.
2) If there's evidence that presorting is actually bad for small sizes, add a FIXME and bail out for VL.size() > 16. I'd prefer that we fail to vectorize at larger VLs rather than silently introduce a quadratic algorithm for larger Ns.
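
To make the two options concrete, here is a rough standalone sketch, not the code from the patch. It assumes the check only needs pointer identity, so a hypothetical empty `Value` struct stands in for `llvm::Value`; the function names and the `MaxQuadraticVL` constant are likewise made up for illustration. The first function mirrors the quadratic scan quoted above, the second sorts a copy of VL once and binary-searches it, and the third keeps the scan but bails out above a size limit.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stand-in for llvm::Value; only pointer identity matters here.
struct Value {};

// Quadratic form, as in the quoted hunk: one linear scan of VL per scalar.
bool containsAllQuadratic(const std::vector<Value *> &VL,
                          const std::vector<Value *> &Scalars) {
  assert(VL.size() == Scalars.size() && "Invalid size");
  for (Value *Sval : Scalars)
    if (std::none_of(VL.begin(), VL.end(),
                     [&](Value *Val) { return Val == Sval; }))
      return false;
  return true;
}

// Option 1: presort a copy of VL (taken by value), then binary-search it.
// O(N log N) overall. std::less gives a total order over pointers.
bool containsAllPresorted(std::vector<Value *> VL,
                          const std::vector<Value *> &Scalars) {
  assert(VL.size() == Scalars.size() && "Invalid size");
  std::sort(VL.begin(), VL.end(), std::less<Value *>());
  for (Value *Sval : Scalars)
    if (!std::binary_search(VL.begin(), VL.end(), Sval,
                            std::less<Value *>()))
      return false;
  return true;
}

// Option 2: keep the quadratic scan but refuse large inputs.
// MaxQuadraticVL is an assumed name; 16 is the limit suggested above.
constexpr std::size_t MaxQuadraticVL = 16;

bool containsAllBounded(const std::vector<Value *> &VL,
                        const std::vector<Value *> &Scalars) {
  // FIXME: quadratic in VL.size(); give up rather than scale badly.
  if (VL.size() > MaxQuadraticVL)
    return false;
  return containsAllQuadratic(VL, Scalars);
}
```

The presorted variant orders by pointer value, so the sorted order itself varies from run to run, but the membership answer does not depend on it.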


https://reviews.llvm.org/D26905




