[PATCH] D18940: Loop vectorization with uniform load

Mon Apr 11 09:55:53 PDT 2016

anemet added a subscriber: anemet.
anemet added a comment.

> If the "uniform load" is not hoisted before vectorization, the cost of the uniform load is "scalar load + broadcast".

This is not necessarily a criticism of your change, but in most cases this cost is still too conservative.

If we need memchecks to disambiguate the uniform access against the other memory accesses, the uniform load becomes loop-invariant after vectorization and a subsequent LICM will hoist it out of the loop (http://reviews.llvm.org/D17191).  Thus the per-iteration cost is really zero.

I also think that this is the common case, as in your testcase.  Just schedule an LICM pass afterward and the load and shuffles will be hoisted out of the loop.
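
To make the argument concrete, here is a minimal hand-written C sketch of the effect. It is not taken from the patch or from D17191, and the function names (add_uniform_scalar, add_uniform_versioned, ranges_overlap) are hypothetical. The second function only imitates what the vectorizer's runtime check plus a later LICM would achieve. The loop body is left scalar, and only the out-versus-k check is modeled, whereas a real memcheck would also cover out versus in.

```c
#include <stddef.h>
#include <stdint.h>

/* Original loop: *k is a uniform load (same address on every iteration).
 * Without alias information it must be reloaded each time, because the
 * store to out[i] might modify *k. */
void add_uniform_scalar(int *out, const int *in, const int *k, size_t n) {
    for (size_t i = 0; i < n; ++i)
        out[i] = in[i] + *k;
}

/* Hypothetical helper: do the byte ranges [a, a+a_bytes) and
 * [b, b+b_bytes) overlap? This stands in for the runtime memcheck. */
static int ranges_overlap(const void *a, size_t a_bytes,
                          const void *b, size_t b_bytes) {
    uintptr_t a0 = (uintptr_t)a;
    uintptr_t b0 = (uintptr_t)b;
    return a0 < b0 + b_bytes && b0 < a0 + a_bytes;
}

/* Sketch of the loop after versioning and LICM. */
void add_uniform_versioned(int *out, const int *in, const int *k, size_t n) {
    if (!ranges_overlap(out, n * sizeof *out, k, sizeof *k)) {
        /* Checked path: no store in the loop can change *k, so the load
         * (and, in vector code, the broadcast) is loop-invariant and can
         * be done once before the loop. */
        const int kv = *k;
        for (size_t i = 0; i < n; ++i)
            out[i] = in[i] + kv;
    } else {
        /* Fallback path: possible aliasing, keep the per-iteration load. */
        for (size_t i = 0; i < n; ++i)
            out[i] = in[i] + *k;
    }
}
```

On the checked path the uniform load is executed once regardless of the trip count, which is why charging "scalar load + broadcast" per vector iteration overestimates the cost in this situation.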

> It is not correctly calculated in the current version and a huge cost for one splat vector prevents loop vectorization.

Can you please elaborate?  How was the cost computed before?

Thanks,
Adam

Repository:
  rL LLVM

http://reviews.llvm.org/D18940