[PATCH] D18940: Loop vectorization with uniform load
Adam Nemet via llvm-commits
llvm-commits at lists.llvm.org
Mon Apr 11 09:55:53 PDT 2016
anemet added a subscriber: anemet.
anemet added a comment.
> If the "uniform load" is not hoisted before vectorization, the cost of the uniform load is "scalar load + broadcast".
This is not necessarily a criticism of your change, but in most cases this cost is still too conservative.
If we need memchecks to disambiguate the uniform access against the other memory accesses, the uniform load becomes loop-invariant after vectorization and a subsequent LICM will hoist it out of the loop (http://reviews.llvm.org/D17191). Thus the per-iteration cost is really zero.
I also think that this is the common case, as in your testcase. Just schedule LICM afterward and the load+shuffles will be hoisted out of the loop.
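
To illustrate the point above, here is a rough source-level sketch of the effect. It is a hand-written model, not vectorizer output: the function names (scalar_loop, versioned_loop, no_overlap) are made up for this example, and the fast path is still written as a scalar loop so the sketch stays portable. It only checks the store against the uniform load, whereas a real memcheck would cover every pair of accesses that may alias.

```cpp
#include <cstddef>
#include <cstdint>

// Original loop: *c is a uniform load (same address on every iteration).
// Without aliasing information the store to a[i] may modify *c, so the
// load has to stay inside the loop.
void scalar_loop(int *a, const int *b, const int *c, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i)
    a[i] = b[i] + *c;
}

// Hypothetical helper standing in for a runtime memcheck: true when the
// byte ranges [p, p + pbytes) and [q, q + qbytes) do not overlap.
static bool no_overlap(const void *p, std::size_t pbytes,
                       const void *q, std::size_t qbytes) {
  const auto ps = reinterpret_cast<std::uintptr_t>(p);
  const auto qs = reinterpret_cast<std::uintptr_t>(q);
  return ps + pbytes <= qs || qs + qbytes <= ps;
}

// Model of the loop after versioning plus LICM. On the checked path the
// stores provably do not touch *c, so the load (and, in the vector code,
// its broadcast) happens once before the loop instead of once per iteration.
void versioned_loop(int *a, const int *b, const int *c, std::size_t n) {
  if (no_overlap(a, n * sizeof(int), c, sizeof(int))) {
    const int splat = *c;  // hoisted: loop-invariant under the check
    for (std::size_t i = 0; i < n; ++i)
      a[i] = b[i] + splat;
  } else {
    scalar_loop(a, b, c, n);  // fallback: original loop, load stays inside
  }
}
```

On the checked path the load no longer contributes any per-iteration cost, which is why charging "scalar load + broadcast" inside the loop overestimates it.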
> It is not correctly calculated in the current version and a huge cost for one splat vector prevents loop vectorization.
Can you please elaborate on how the cost was computed before?
Thanks,
Adam
Repository:
rL LLVM
http://reviews.llvm.org/D18940