[PATCH] Vectorizing Global Structures - Take 2

Arnold Schwaighofer aschwaighofer at apple.com
Thu Feb 14 09:46:47 PST 2013


On Feb 14, 2013, at 10:39 AM, Renato Golin <renato.golin at linaro.org> wrote:

> On 12 February 2013 19:47, Arnold Schwaighofer <aschwaighofer at apple.com> wrote:
> Alias analysis is not dependence analysis and as such it does not know about strides. It will only answer questions such as "Does access to location A and access to location B possibly alias". The Location object encapsulates how big our access is and the address we are accessing.
> 
> AFAICS, the Location for a pointer has Size of the pointer's type. So a Location would have to have its Size changed after we have used getLocation().
> 
> I'm guessing AA->alias(ThisLoc.getWithNewSize(VF*Size), ThatLoc.getWithNewSize(VF*Size)); or something like that.
> 

Probably, I would have to look at the details myself. But this looks reasonable.

>  
> The existing analysis in the loop vectorizer is conservative: if there is both a load and a store to A via two different pointers (for example &A[i] and &A[i+1]), it will give up.
> 
> Ok, now I see the extension of the current implementation. Maybe I should be worrying about array aliasing first, though I have a feeling that the implementation is very similar, if not identical, to the global structures case, by using AA with the correct stride.
> 
> 
> If we add a query using alias analysis (we now allow several accesses of a common underlying object) we need to look at all accesses &X[…] for possible aliasing with &A[i] where X == A. (This is assuming we insert dynamic checks for unknown objects).
> 
> First I want to not rely on RT checks. I was hoping that AA would tell me "don't know" if it couldn't tell and I'd then bail.
> 

The existing implementation already relies on runtime checks (it has to make sure that an unknown object and a known object do not overlap). Yes, AA will conservatively return MayAlias/PartialAlias if it cannot tell whether two objects overlap. You just have to make sure that you actually query it with that unknown object.

Say you have three accesses in your program: one has underlying object A, one has underlying object B, and one has an unknown underlying object U. If you just rely on AA, you have to query both pairs ((Access A), (Access U)) and ((Access B), (Access U)).
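
To make the A/B/U example concrete, here is a minimal toy sketch. It does not use LLVM's AliasAnalysis; ToyAccess, toyAlias, UnknownObject and pairsThatMayAlias are hypothetical names invented for illustration, and the toy oracle simply assumes that two distinct known objects never alias while anything involving an unknown object may alias.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Toy model only: hypothetical stand-ins, not LLVM classes.
enum class ToyAliasResult { NoAlias, MayAlias };

constexpr int UnknownObject = -1;

struct ToyAccess {
  int Id;     // which access this is
  int Object; // id of the underlying object, or UnknownObject
};

// Conservative toy oracle: an unknown underlying object may alias anything,
// and two accesses to the same known object may alias each other.
ToyAliasResult toyAlias(const ToyAccess &X, const ToyAccess &Y) {
  if (X.Object == UnknownObject || Y.Object == UnknownObject)
    return ToyAliasResult::MayAlias;
  return X.Object == Y.Object ? ToyAliasResult::MayAlias
                              : ToyAliasResult::NoAlias;
}

// Query every pair of accesses, including the pairs that involve the unknown
// object, and collect the ids of the pairs that are not proven NoAlias.
std::vector<std::pair<int, int>>
pairsThatMayAlias(const std::vector<ToyAccess> &Accesses) {
  std::vector<std::pair<int, int>> Result;
  for (std::size_t I = 0; I < Accesses.size(); ++I)
    for (std::size_t J = I + 1; J < Accesses.size(); ++J)
      if (toyAlias(Accesses[I], Accesses[J]) != ToyAliasResult::NoAlias)
        Result.push_back({Accesses[I].Id, Accesses[J].Id});
  return Result;
}
```

With accesses to A, B and U, only the two pairs involving U come back as possibly aliasing, and those are the pairs that would need either a bail-out or a runtime check.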


> 
> Note the difference between &A[i] , which is what Store->getPointerOperand() returns, and A which is what GetUnderlyingObject(Store->getPointerOperand()) returns.
> 
> If we see a store to &A[i] we now need to look at all other memory accesses to see whether they alias with it. In the multimap you only store &A[i] so you can't query for all other accesses to A.
> 
> So, my idea was the following (braces indicate new behaviour):
> 
> * Store all write pointers (and respective stores)
> * Store all read pointers (and respective loads)
> (...)
> * For each write (pointer, store) -> calculate underlying object
>   * Have I seen it?
>     * (Is it not a GlobalValue) [1] <- this simulates old behaviour for what I don't care
>       * bail
>     * (does this store alias with other stores I have seen?) [2]
>       * bail
> * For each read -> calculate underlying object
>   * Have I seen it being written to?
>     * (Is it not a GlobalValue) [1] <- this simulates old behaviour for what I don't care
>       * bail
>     * (does this load alias with other stores I have seen?) [2]
>       * (bail)
> 
> [1] Not sure this is correct for all cases
   I think it should be.

> [2] Updating the correct range
    What do you mean by updating the correct range?
> 
> The point here being that I'll only check aliasing of location of the stores and loads based on the fact that I already found that their underlying objects might alias, which means that all other cases will go unnoticed.
> 
> This strikes me as very close to what you're suggesting…

Yes this looks right.
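
For reference, the check order in the list above can be sketched as a small self-contained toy. This is not the loop vectorizer's code: MemAccess, overlaps and canVectorize are hypothetical names, each access is reduced to an underlying-object id plus a byte range, and a plain range-overlap test stands in for the AA query.

```cpp
#include <cstdint>
#include <map>
#include <vector>

// Toy model only: hypothetical types, not the vectorizer's data structures.
struct MemAccess {
  int Object;           // id of the underlying object
  bool IsGlobal;        // stands in for "is a GlobalValue"
  std::uint64_t Offset; // byte offset of the access inside the object
  std::uint64_t Size;   // access size in bytes (already widened by VF)
};

// Stand-in for the alias query: do the two byte ranges overlap?
static bool overlaps(const MemAccess &A, const MemAccess &B) {
  return A.Offset < B.Offset + B.Size && B.Offset < A.Offset + A.Size;
}

// Returns false where the list above says "bail".
bool canVectorize(const std::vector<MemAccess> &Writes,
                  const std::vector<MemAccess> &Reads) {
  std::map<int, std::vector<MemAccess>> SeenWrites;
  for (const MemAccess &W : Writes) {
    auto It = SeenWrites.find(W.Object);
    if (It != SeenWrites.end()) {
      if (!W.IsGlobal)
        return false; // [1] old behaviour for non-globals
      for (const MemAccess &Prev : It->second)
        if (overlaps(W, Prev))
          return false; // [2] store aliases an earlier store
    }
    SeenWrites[W.Object].push_back(W);
  }
  for (const MemAccess &R : Reads) {
    auto It = SeenWrites.find(R.Object);
    if (It == SeenWrites.end())
      continue; // object is never written
    if (!R.IsGlobal)
      return false; // [1] old behaviour for non-globals
    for (const MemAccess &Prev : It->second)
      if (overlaps(R, Prev))
        return false; // [2] load aliases a store
  }
  return true;
}
```

Note that the sketch only compares accesses that share an underlying object, which matches the point made above: accesses with an unknown underlying object are not covered by it and still need their own query or runtime check.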
> 
> 
> I also believe that if you implement this, that alias analysis will tell you that you have possibly overlapping accesses in your example:
> 
> struct {
> int a[100];
> int b[100];
> } S;
> 
> because
> 
> alias((&S.a[99], 4*sizeof(int)), (&S.b[0], 4*sizeof(int))) == partial alias
> 
> That's great! So I can rely on AA to do the hard work for me, just need to give it the correct Size.
> 
> 

I think you read too quickly :). Partial alias is bad. You want NoAlias. I am warning that if we use alias analysis the way we have to, we might not get the answer we would like to see.

Because the unvectorized code looks like:

for (i in 0..99):
  access(S.a[i], sizeof(int)) 

and we have to change the size of the access

  access(S.a[i], sizeof(int)*VF)

AA will analyze address computations (it looks at the existing IR which still has the scalar bounds):

(&S.a[0..99], sizeof(int)*VF) and (&S.b[0], sizeof(int)*VF)

and there is an overlap due to the increased access size.
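
The arithmetic behind that overlap can be checked with a small standalone sketch. It only models byte ranges inside the struct and does not call into LLVM; ByteRange, rangesOverlap, accessA and accessB are hypothetical helpers, and the struct is given the tag S here just so offsetof can be used.

```cpp
#include <cstddef>

struct S {
  int a[100];
  int b[100];
};

// Half-open byte range [Begin, Begin + Size) inside an S object.
struct ByteRange {
  std::size_t Begin;
  std::size_t Size;
};

bool rangesOverlap(ByteRange X, ByteRange Y) {
  return X.Begin < Y.Begin + Y.Size && Y.Begin < X.Begin + X.Size;
}

// Bytes touched by an access of Elems ints starting at S.a[Index].
ByteRange accessA(std::size_t Index, std::size_t Elems) {
  return {offsetof(S, a) + Index * sizeof(int), Elems * sizeof(int)};
}

// Bytes touched by an access of Elems ints starting at S.b[Index].
ByteRange accessB(std::size_t Index, std::size_t Elems) {
  return {offsetof(S, b) + Index * sizeof(int), Elems * sizeof(int)};
}
```

With scalar accesses, rangesOverlap(accessA(99, 1), accessB(0, 1)) is false. With VF = 4 and the scalar index range kept, rangesOverlap(accessA(99, 4), accessB(0, 4)) is true, because the widened access starting at a[99] runs past the end of a into b. The last index a vectorized loop would actually use, accessA(96, 4), does not overlap b, which is why the partial-alias answer is more pessimistic than necessary.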



> cheers,
> --renato
> 
> PS: Having separated ReadWrites and WriteObject makes me search every write to every write N times, which is not efficient. I'll try to mitigate that once I know that at least the algorithm is correct.
