[LLVMdev] alias set collapse and LICM

Daniel Berlin dberlin at dberlin.org
Mon Apr 27 17:47:03 PDT 2015


On Mon, Apr 27, 2015 at 5:32 PM, Sanjoy Das
<sanjoy at playingwithpointers.com> wrote:
> On Mon, Apr 27, 2015 at 4:21 PM, Daniel Berlin <dberlin at dberlin.org> wrote:
>> You can't win here (believe me, i've tried, and better people than me have
>> tried, for years :P).
>> No matter what you do, the partitioning will never be 100% precise.  The
>> only way to solve that in general is to pairwise query over the
>> partitioning.
>>
>> Your basic problem is that all of the alias analyses do not come up with the
>> same partitioning of the heap.
>>
>> Worse, some are not even transitive, or even symmetric.
>
> Am I correct in saying that the lack of transitivity fundamentally
> cannot be helped -- a MayAlias composed with a MayAlias is not a
> MayAlias.

Correct.  I also imagine you can abuse noalias to get a noalias b, b
noalias c, a mayalias c.

(But i haven't thought too hard about it)


> As a trivial example: the simplest one of this I can come up with is {A ,
> C?A:B , B} where A `NoAlias` B and C is a runtime value that cannot be
> computed at compile-time.  Then A `MayAlias` C?A:B and C?A:B
> `MayAlias` B, but A `NoAlias` B.

Yes.

>
>> This means one single variable may be in multiple disjoint alias sets in
>> reality.
>
> Agreed.  I did not propose solving the problem within ASTracker for
> this very reason.
>
>> (For example, GEP, PHI will give you different AA results than PHI, GEP in
>> some cases. That is mostly a FIXME, but there are plenty of cases you can
>> come up with where info tells you something about one variable and its
>> relationship to another, but not to a third. The non-symmetric case is
>> rarer, but again, possible.)
>>
>> Let's say you modify AliasSetTracker to not collapse, and handle multiple
>> disjoint partitions at once.
>>
>> (Here is where I will draw the equivalence between this and what we've done
>> before in GCC :) )
>>
>> The only way to maintain correctness (in general) and not collapse is to
>> maintain multiple alias sets, and say which are affected by which
>> instructions.
>>
>> (IE you have 1 instruction, and it may MOD 5 different alias sets, each of
>> which represents a different partitioning of the heap).
>>
>> You will discover very quickly that, for most partitionings you can think of,
>> the cost of maintaining the alias sets over a set of instructions is
>> greater than the cost of the additional queries you may come up with in the
>> first place.
>>
>> In other words, depending on how precise your partitioning is, you may have
>> variables that clobber 10 or 20 or 30 of the disjoint alias sets.  Handling
>> this is a lot more expensive than saying "hey, i want to move this one load
>> up another instruction. Does it really conflict?".  This is because you
>> really only care about individual aliases in the alias set at any given
>> time, but unless you go through all the addresses ahead of time, and had a
>> way to say "okay alias set tracker, i only care about variable x, y, z, and
>> even then, only x in block a, y in block b, etc", you are paying the cost of
>> doing whatever queries are necessary to prove something about *all*
>> variables in an alias set.
>>
>> MemorySSA was essentially built with this problem in mind (and thus tries
>> to provide chains that you can do querying on).  You ask it about a load, and
>> it can tell you in O(1) the nearest dominating thing that MODs it (IE the
>> access to which getModRefInfo(Instruction, Location) says Mod).
>
> So my proposal is somewhat related to what you're suggesting --
> tracking the stores through a separate ASTracker lets me basically ask
> a conservative version of getModRefInfo over the entire loop body.

Yes.
MemorySSA just lets you do this per-load instead.

> And given that the loop body will be strongly connected, I think the
> list of clobbering stores reported by MemorySSA will be exactly equal
> to the list of stores reported by AST; control flow should not matter,
> since everything in the loop body reaches everything else.

Depends on how good AST is.

Given a loop like this:

; Function Attrs: nounwind ssp
define void @test10(i32 %N, double* nocapture %G) #0 {
entry:
  %0 = add i32 %N, -1
  %1 = icmp sgt i32 %0, 1
  br i1 %1, label %bb.nph, label %return

bb.nph:                                           ; preds = %entry
  %tmp = sext i32 %0 to i64
  %tmp8 = add i64 %tmp, -1
  br label %bb

bb:                                               ; preds = %bb, %bb.nph
  %indvar = phi i64 [ 0, %bb.nph ], [ %tmp11, %bb ]
  %scevgep = getelementptr double, double* %G, i64 %indvar
  %tmp9 = add i64 %indvar, 2
  %scevgep10 = getelementptr double, double* %G, i64 %tmp9
  %tmp11 = add i64 %indvar, 1
  %scevgep12 = getelementptr double, double* %G, i64 %tmp11
  %2 = load double, double* %scevgep12, align 8
  %3 = load double, double* %scevgep10, align 8
  %4 = fadd double %2, %3
  %5 = load double, double* %scevgep, align 8
  %6 = fadd double %4, %5
  store double %6, double* %scevgep12, align 8
  %exitcond = icmp eq i64 %tmp11, %tmp8
  br i1 %exitcond, label %return, label %bb

return:                                           ; preds = %bb, %entry
  ret void
}


MemorySSA will return that the load of scevgep12 is not clobbered in the
loop: in one path through the phi it leads all the way back to the entry
block, and in the other path it also leads all the way back to the entry
block, because scevgep12 becomes scevgep10 along that path.  The load of
scevgep10 likewise reaches back to liveOnEntry and is not clobbered by
the loop, and the load of scevgep is clobbered by the store below it.

IE:
; Function Attrs: nounwind ssp
define void @test10(i32 %N, double* nocapture %G) #0 {
entry:
  %0 = add i32 %N, -1
  %1 = icmp sgt i32 %0, 1
  br i1 %1, label %bb.nph, label %return

bb.nph:                                           ; preds = %entry
  %tmp = sext i32 %0 to i64
  %tmp8 = add i64 %tmp, -1
  br label %bb

bb:                                               ; preds = %bb, %bb.nph
; 3 = MemoryPhi({bb.nph,liveOnEntry},{bb,1})
  %indvar = phi i64 [ 0, %bb.nph ], [ %tmp11, %bb ]
  %scevgep = getelementptr double, double* %G, i64 %indvar
  %tmp9 = add i64 %indvar, 2
  %scevgep10 = getelementptr double, double* %G, i64 %tmp9
  %tmp11 = add i64 %indvar, 1
  %scevgep12 = getelementptr double, double* %G, i64 %tmp11
; MemoryUse(liveOnEntry)
  %2 = load double, double* %scevgep12, align 8
; MemoryUse(liveOnEntry)
  %3 = load double, double* %scevgep10, align 8
  %4 = fadd double %2, %3
; MemoryUse(3)
  %5 = load double, double* %scevgep, align 8
  %6 = fadd double %4, %5
; 1 = MemoryDef(3)
  store double %6, double* %scevgep12, align 8
  %exitcond = icmp eq i64 %tmp11, %tmp8
  br i1 %exitcond, label %return, label %bb

return:                                           ; preds = %bb, %entry
; 2 = MemoryPhi({entry,liveOnEntry},{bb,1})
  ret void
}

>
> -- Sanjoy



More information about the llvm-dev mailing list