[PATCH] D26905: [SLP] Vectorize loads of consecutive memory accesses, accessed in non-consecutive (jumbled) way.
Michael Kuperstein via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Jan 6 14:46:31 PST 2017
mkuper added inline comments.
================
Comment at: lib/Transforms/Vectorize/SLPVectorizer.cpp:468
+ assert(VL.size() == Scalars.size() && "Invalid size");
+ for (auto *Sval : Scalars)
+ if (llvm::none_of(VL, [&](Value *Val) { return Val == Sval; }))
----------------
I'm still not sure we want this to be quadratic.
I'd suggest one of two things:
1) Change this to presort instead of scanning VL for every scalar. For VL.size() == 4 that may be slower, but for VL.size() == 16 I'd expect it to be faster.
2) If there's evidence that presorting is actually bad for small sizes, add a FIXME and bail out for VL.size() > 16. I'd rather we fail to vectorize at larger VLs than silently introduce a quadratic algorithm for larger Ns.
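To make the two options concrete, here is a minimal standalone sketch. It is not the code from the patch. `Value` is a hypothetical stand-in for `llvm::Value`, the function names are invented for illustration, and the sketch assumes the quoted loop rejects the bundle as soon as one scalar is missing from VL (the quoted diff does not show the body of the `if`).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stand-in for llvm::Value; only pointer identity matters here.
struct Value {};

// Shape of the check under review: for every scalar, scan all of VL.
// Cost is O(N^2) pointer comparisons for N == VL.size().
bool sameElementsQuadratic(const std::vector<Value *> &VL,
                           const std::vector<Value *> &Scalars) {
  assert(VL.size() == Scalars.size() && "Invalid size");
  for (Value *Sval : Scalars)
    if (std::none_of(VL.begin(), VL.end(),
                     [&](Value *Val) { return Val == Sval; }))
      return false; // Assumed outcome: a scalar is missing from VL.
  return true;
}

// Option 1: sort copies of both lists by pointer value, then compare.
// Cost is O(N log N). The sorted order depends on addresses, but the
// boolean result does not.
bool sameElementsPresorted(std::vector<Value *> VL,
                           std::vector<Value *> Scalars) {
  assert(VL.size() == Scalars.size() && "Invalid size");
  std::sort(VL.begin(), VL.end(), std::less<Value *>());
  std::sort(Scalars.begin(), Scalars.end(), std::less<Value *>());
  return VL == Scalars;
}

// Option 2: keep the quadratic scan but refuse large inputs.
// 16 is the threshold suggested in the review comment.
bool sameElementsBounded(const std::vector<Value *> &VL,
                         const std::vector<Value *> &Scalars) {
  const std::size_t MaxQuadraticVL = 16; // Hypothetical name.
  if (VL.size() > MaxQuadraticVL)
    return false; // FIXME: bail out rather than run an O(N^2) check.
  return sameElementsQuadratic(VL, Scalars);
}
```

One caveat with this sketch: the two checks agree only when neither list contains duplicates. The scan tests that every scalar appears somewhere in VL, while the presorted comparison tests that both lists hold exactly the same elements with the same multiplicity.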
https://reviews.llvm.org/D26905