[PATCH] Vectorizing Global Structures - Take 2

Arnold Schwaighofer aschwaighofer at apple.com
Tue Feb 12 09:43:35 PST 2013


Hi Renato,

The existing analysis tests for "overlap" between memory regions (underlying pointers). As such, it can ignore the access pattern and operate using just the pointer and the underlying object.

When we use alias analysis, we need to be extremely careful to look at *all* loads and stores and, this is the important part, at what their vectorized locations would be, in order to determine whether there is an overlap.

int A[N];
for (i = 0; i < N-4; ++i)
  A[i+1] = A[i];

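If we ask "AA->alias((&A[i+1], 4), (&A[i], 4))" we could get NoAlias, because those two 4-byte locations do indeed not alias within a single iteration. However, we clearly cannot vectorize this: each iteration reads the value that the previous iteration stored.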
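To make the problem concrete, here is a small standalone sketch (not LLVM code) that runs the loop above once as written and once as a naive vectorizer would, with a wide load followed by a wide store. The function names and the way the "vector" is modelled with a temporary buffer are my own assumptions for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Scalar reference: executes A[i+1] = A[i] one iteration at a time,
// with the same bound as the loop in the mail (i < N-4).
std::vector<int> runScalar(std::vector<int> a) {
  for (std::size_t i = 0; i + 4 < a.size(); ++i)
    a[i + 1] = a[i];
  return a;
}

// Naive vectorization by a factor of vf: each step loads vf elements
// starting at A[i], then stores them starting at A[i+1].
// This models what a vectorizer would emit if it believed the loop was safe.
std::vector<int> runNaiveVectorized(std::vector<int> a, std::size_t vf) {
  assert(vf > 0 && a.size() >= 4);
  const std::size_t end = a.size() - 4;
  std::vector<int> wide(vf);
  std::size_t i = 0;
  for (; i + vf <= end; i += vf) {
    for (std::size_t j = 0; j < vf; ++j)
      wide[j] = a[i + j];          // wide load of A[i .. i+vf-1]
    for (std::size_t j = 0; j < vf; ++j)
      a[i + 1 + j] = wide[j];      // wide store to A[i+1 .. i+vf]
  }
  for (; i < end; ++i)             // scalar remainder loop
    a[i + 1] = a[i];
  return a;
}
```

With A = {0, 1, ..., 11}, the scalar loop propagates A[0] into A[1] through A[8], while the vf = 4 version only shifts each block by one element, so the two results differ. With vf = 1 the two functions agree.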

We will have to ask "Do the future vectorized accesses overlap?" Something like:

AA->alias((&A[i+1], MaxVF*4), (&A[i], MaxVF*4)).
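The difference between the two queries can be sketched with plain byte ranges. This is only a toy model under my own assumptions: ByteRange, mayOverlap and widen are hypothetical helpers, the offsets are concrete numbers rather than symbolic pointers, and none of this is the AliasAnalysis interface.

```cpp
#include <cassert>
#include <cstdint>

// A memory access modelled as a half-open byte range [start, start + size).
struct ByteRange {
  std::uint64_t start;
  std::uint64_t size;
};

// Conservative overlap test for two half-open ranges.
bool mayOverlap(ByteRange a, ByteRange b) {
  return a.start < b.start + b.size && b.start < a.start + a.size;
}

// The range a scalar access would cover once it is widened to maxVF lanes.
ByteRange widen(ByteRange scalar, std::uint64_t maxVF) {
  assert(maxVF > 0);
  return ByteRange{scalar.start, scalar.size * maxVF};
}
```

For i = 0 and 4-byte elements, the load covers bytes [0, 4) and the store covers bytes [4, 8), so mayOverlap reports no overlap. After widen(..., 4) the ranges become [0, 16) and [4, 20), which do overlap, and that is the answer the vectorizer needs.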

Also, for the writes you use the underlying object to look up a value in the ReadWrites multimap, but that map is populated with "getPointerOperand()" as the key, which is not necessarily the underlying object.

+        // If global alias, make sure they do alias
+        std::pair<StoreAliasMap::iterator, StoreAliasMap::iterator> range =
+            ReadWrites.equal_range(*UI);

As a result, the lookup will miss many interesting cases and you will incorrectly report that it is safe to vectorize.

And for the reads you use the pointer operand to look up in the multimap, which won't give you all the accesses to the same underlying object (the interesting accesses will have a different getPointerOperand()).

+      AliasAnalysis::Location ThisLoc = AA->getLocation((*MIL).second);
+      std::pair<StoreAliasMap::iterator, StoreAliasMap::iterator> range =
+          ReadWrites.equal_range(Val);
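Both lookup problems come down to a key mismatch, which can be reproduced with a plain std::multimap. This is a sketch under my own assumptions, not the patch's code: strings stand in for Value pointers, and Access, record and countMatches are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <map>
#include <string>

// A store, described by the address it uses directly (e.g. a GEP) and by
// the object that address is ultimately derived from.
struct Access {
  std::string pointerOperand;
  std::string underlyingObject;
};

using AccessMap = std::multimap<std::string, Access>;

// Record an access under an explicitly chosen key.
void record(AccessMap &m, const std::string &key, const Access &a) {
  assert(!key.empty());
  m.emplace(key, a);
}

// Number of recorded accesses found by an equal_range lookup on key.
std::size_t countMatches(const AccessMap &m, const std::string &key) {
  auto range = m.equal_range(key);
  return static_cast<std::size_t>(std::distance(range.first, range.second));
}
```

Take a store to &A[i+1], recorded as {"A.idx1", "A"}. If the map is keyed by pointer operand, a lookup by the underlying object "A" finds nothing, and a lookup by the pointer operand of a load from &A[i] ("A.idx0") finds nothing either. If the map is keyed by underlying object and queried with "A", the store is found.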


I would also need to spend some quality time thinking about whether ignoring store/store pairs to overlapping memory locations is okay if there is no overlapping read.



Best,
Arnold


On Feb 11, 2013, at 10:40 AM, Renato Golin <renato.golin at linaro.org> wrote:

> Hi all,
> 
> Second attempt, using Alias Analysis and trying to avoid extra work for previous behaviour at all costs. 
> 
> Disclaimer: It's still not good enough, I'm not paying much attention to details (containers, iterations, variable names, formatting), but to make sure the code is sensible and not adding too much extra cost. I'll create more tests when at least I know that the path is correct.
> 
> The extra costs are mainly memory, not CPU. For instance, I'm now using multimap<Value*, Instruction*> instead of SmallVector<Value*>, because I need to know the original store instructions associated with a particular value (pointer operand). Do I need this? Or is just iterating over all uses more efficient?
> 
> Another issue is that I'm trying to drop as early as possible, so I test if the value is a GlobalValue or not, and only when it is, I try the AA->alias(), failing if not (safe bet). Again, GlobalValue could be too generic, and maybe I need to be more specific (a global struct, for instance).
> 
> I thought about using a Location cache (between write loop and read loop), but calculating the location doesn't seem too complicated...
> 
> Finally, the part that checks for alias is similar on Write and Read loops, but extracting it would require me to typedef the multimap outside the class. If the code ends up identical, I'll move the typedef to a private part of the class.
> 
> Comments welcome! ;)
> 
> cheers,
> --renato
> <global_vectorize.patch>



