[PATCH] D18940: Loop vectorization with uniform load

Mon Apr 11 14:06:42 PDT 2016

anemet added a comment.

In http://reviews.llvm.org/D18940#397500, @delena wrote:

> > Can you please elaborate, how was the cost computed before?
>
>
> 25 for VF=2, 51 for VF=4 and 103 for VF=8

I don't mean the actual number.

Did we assume that we needed VF number of loads for each element rather than a single one with a shuffle/broadcast?

I am just trying to understand the before-picture.  You only said that we were building a splat but that is true even after.
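To make the question concrete, here is a minimal scalar sketch of the two ways a splat of a uniform load could be accounted for. The names `VF`, `splat_per_lane` and `splat_broadcast` are hypothetical and exist only for this illustration. It does not show what the cost model actually did before or after the patch; it only contrasts "VF loads" with "one load plus a broadcast".

```c
#include <stddef.h>

#define VF 4  /* hypothetical vectorization factor for this sketch */

/* Assumption A: one scalar load per lane.
 * Returns the number of loads performed. */
int splat_per_lane(const int *p, int out[VF]) {
    int loads = 0;
    for (size_t lane = 0; lane < VF; lane++) {
        out[lane] = *p;  /* a separate load for every lane */
        loads++;
    }
    return loads;
}

/* Assumption B: a single load followed by a broadcast/shuffle.
 * Returns the number of loads performed. */
int splat_broadcast(const int *p, int out[VF]) {
    int v = *p;          /* the only load */
    for (size_t lane = 0; lane < VF; lane++)
        out[lane] = v;   /* models the broadcast, no memory access */
    return 1;
}
```

Both versions produce the same splat; they differ only in how many loads would be charged.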

> > Thus the cost is really zero.
>
>
> I know that the actual cost is 0, but I can't put 0 when the load is inside the loop.

Why not, if we know that it will be hoisted out?  I don't see how this load could fail to be loop-invariant if it's legal to vectorize the loop.  For example, in:

for (i = 0; i < 10; i++) {
  ... = a[5]
  a[i] = ...
}

a[5] is loop-variant (the store to a[i] overwrites it when i == 5), but dependence analysis would not allow this loop to be vectorized because the dependence distance between a[5] and a[i] is not constant.
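A small self-contained sketch of why a[5] cannot simply be treated as invariant in that loop. The function names and the stored value `i + 100` are made up for the illustration; the point is only that reading a[5] once before the loop gives a different result from re-reading it on every iteration.

```c
#include <stddef.h>

#define N 10

/* Original shape of the loop: a[5] is re-read on every iteration. */
int sum_reload(int *a) {
    int sum = 0;
    for (size_t i = 0; i < N; i++) {
        sum += a[5];          /* .. = a[5]  */
        a[i] = (int)i + 100;  /* a[i] = ... */
    }
    return sum;
}

/* Illegally "hoisted" variant: a[5] is read once before the loop. */
int sum_hoisted(int *a) {
    int sum = 0;
    int invariant = a[5];
    for (size_t i = 0; i < N; i++) {
        sum += invariant;
        a[i] = (int)i + 100;
    }
    return sum;
}
```

With every element initialized to 1, the first version sees the value stored at i == 5 in all later iterations, while the second keeps using the stale value, so the two sums differ.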

> > the uniform load becomes loop-invariant after vectorization and a subsequent LICM will hoist it out of the loop
>
>
> Do you know why the load wasn't hoisted before vectorization?

Because hoisting it requires multi-versioning the loop with memchecks: we couldn't disambiguate the invariant load against the stores in the loop at compile time.

LICM does not currently perform multiversioning by default.
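For illustration, a hand-written sketch of what loop multi-versioning with a runtime memcheck looks like at the source level. This is not the code LLVM generates; `may_alias` and `add_invariant` are hypothetical names, and the overlap test is a simplified stand-in for the real runtime checks.

```c
#include <stddef.h>
#include <stdint.h>

/* Runtime memcheck: does the int at p overlap the range [a, a + n)? */
int may_alias(const int *p, const int *a, size_t n) {
    uintptr_t lo = (uintptr_t)a;
    uintptr_t hi = (uintptr_t)(a + n);
    uintptr_t q  = (uintptr_t)p;
    return q + sizeof(int) > lo && q < hi;
}

/* Computes a[i] = *p + i. Returns 1 if the hoisted version ran,
 * 0 if the conservative fallback ran. */
int add_invariant(int *a, const int *p, size_t n) {
    if (!may_alias(p, a, n)) {
        int v = *p;  /* safe to hoist: the stores cannot touch *p */
        for (size_t i = 0; i < n; i++)
            a[i] = v + (int)i;
        return 1;
    }
    /* Fallback: keep the load inside the loop. */
    for (size_t i = 0; i < n; i++)
        a[i] = *p + (int)i;
    return 0;
}
```

The check runs once before the loop, so the hoisted version is only entered when the load provably cannot be clobbered by the stores.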

Repository:
  rL LLVM

http://reviews.llvm.org/D18940