[LLVMdev] alias set collapse and LICM

Fri Jun 12 16:14:01 PDT 2015

> On Jun 12, 2015, at 2:52 PM, Philip Reames <listmail at philipreames.com> wrote:
> 
> 
> 
> On 06/12/2015 02:10 PM, Andrew Trick wrote:
>>> On Jun 12, 2015, at 2:06 PM, Daniel Berlin <dberlin at dberlin.org> wrote:
>>> 
>>> On Fri, Jun 12, 2015 at 2:03 PM, Andrew Trick <atrick at apple.com> wrote:
>>>> On Jun 12, 2015, at 1:51 PM, Daniel Berlin <dberlin at dberlin.org> wrote:
>>>> 
>>>> So, you can't have disjoint sets, and have one of set that says "i
>>>> access everything". Because it would contain everything :)
>>>> 
>>>> Thus, I assume by your description you meant "we want to maintain
>>>> multiple disjoint partitions of the same set of accesses" (which is
>>>> not quite the same as just having one special partition).
>>>> 
>>>> When one partition is a refinement of the other (as should be the case
>>>> here), this is known as the nested set union problem, and in fact,
>>>> some very recent research was done into it:
>>>> 
>>>> http://www.cs.princeton.edu/~dhlarkin/papers/esa14-nested-set-union.pdf
>>>> 
>>>> 
>>>> I’m suggesting that the tracker have a special set that is not included in
>>>> the partition and assumed to alias with everything in the partition. So we
>>>> would have a partition over the subset of memory accesses that are actually
>>>> interesting. All the sets in that partition are disjoint. The set outside of
>>>> the partition is not. Every time you query for aliasing, you need to check
>>>> if one of the accesses is in the special set.
>>> Ah, so this seems identical to what Sanjoy suggested initially :)
>> I think he suggested maintaining two partitions, one for Refs and one for Mods. ModRef accesses would be in both partitions. That might also be helpful, but might not be necessary if all the readonly calls are simply removed from the partition.
> I just want to point out that the readonly case is illustrative of a broader issue.  The alias set tracker mechanism is currently inherently transitive.  That is, a load can be considered clobbered even if doesn't alias a store; if it aliases another load which aliases the store, it's considered clobbered.  The readonly call is the case we've hit and the one that really matters to us, but it's worth keeping the broader problem in mind.
> 
> I came across this issue again this week and ended up implementing a much less principled fix in our local tree.  For sufficiently small loops, we now resort to pair wise modref checks if AST reports a possible clobber.  In several test cases I looked at, the precise aliasing was sufficient to enable LICM sufficient for us to recognize vectorizable loops which were otherwise getting missed. As you can imagine, that's a major performance win on those benchmarks.
> 
> For upstream, I now know of four possible implementation strategies:
> 
> Option 1 - Resort to pairwise modref queries for sufficiently small loops.  This has the advantage of being already implemented and easy to upstream, but would introduce a capped N^2 algorithm and thus a performance cliff.  It might be worth upstreaming this with the threshold defaulted to 0 (i.e. nop) just as a testing mechanism for identifying places AST's imprecision is hurting us.
> 
> Option 2 - As a variant of option 1, we can keep track of potentially clobbering instructions seen during a single walk of the loop.  With this, the above N^2 algorithm becomes #L*#S, and we can easily bound the maximum number of clobbering instructions we track.  I think this is actually a reasonable compromise and is worth considering.  This gives us fully precise aliasing for most loops within LICM at reasonable cost.
> 
> Option 3 - We can track readonly calls separately within AST.  This wouldn't change the inherent purpose of AST (i.e. it would still be about transitive aliasing), but would enable more precise results in the specific case we care about.  This only helps the readonly call case and does not help with general aliasing collapse issue.  It also requires every consumer of the AST interface to be updated to explicit check the "alias all" set.  (Note: This could help other "unknown instruction" cases as well as readonly calls.  It just doesn't solve the imprecise aliasing transitivity for known instructions.)
> 
> Option 4 - We can split AST into two sets of alias sets.  One set of set is based on aliasing ref behaviour, the other on mod behaviour. The rules for adding to the first set of sets are the existing AST rules.  A load would only be added to an mod based set if a store within the set aliased it directly.  We would still merge all of sets that a load belong to.  The difference is that a load would not be considered to belong to a set unless a store within that set aliases it, not just another load.  With LICM, we'd use the mod-set for hoisting of loads, and the ref-set for sinking of stores.  This is a fairly involved change.
> 
> My mild preference would be for option 2, followed by 4, then 1, then 3.  What do others think?  I'm willing to implement any of 1, 2, and likely 3.  Option 4 is likely more work than I can commit to given that the symptom biting me is now fixed locally.

Makes sense to me. I think Option 3 is also more work than anyone wants to take on. I suggested it because what you’re calling a special case is really the common and interesting case, and intuitively AST is almost useless at tracking reads without handling unknown readers. But looking at where AST is used, I don’t think it makes sense for it to track reads anyway.

For sinking, I think you could give up on the pairwise alias check after the first unsuccessful attempt in each AliasSet (tracking mods only) without losing useful precision.

Andy

> 
> Philip
>