[LLVMdev] Alias Analysis Problem in LICM

Nick Lewycky nicholas at mxc.ca
Fri Nov 4 01:29:00 PDT 2011


Gan wrote:
> Hi,
>
> I met an alias analysis problem in the LICM phase. I am using the following
> piece of code to present this problem.
>
>        1 int size;
>        2 int ** AAA;
>        3 void * xalloc(int);
>        4
>        5 void foo(void)
>        6 {
>        7   int i;
>        8   AAA = (int**) xalloc(size * sizeof(int*));
>        9
>       10   for (i=0; i<size; i++)
>       11     AAA[i] = 0;
>       12 }
>
> This code tries to initialize an array of pointers. The array is
> allocated from heap.
> "AAA" points to the beginning of the array and it is a global variable.
>
> I got the following IR dump after LICM:
>
>      82 *** IR Dump After Loop Invariant Code Motion ***
>      83 for.body:                                         ; preds =
> %for.body.lr.ph, %for.body
>      84   %i.02 = phi i32 [ 0, %for.body.lr.ph ], [ %inc, %for.body ]
>      85   %4 = load i32*** @AAA, align 4, !tbaa !3
>      86   %arrayidx = getelementptr inbounds i32** %4, i32 %i.02
>      87   store i32* null, i32** %arrayidx, align 4, !tbaa !3
>      88   %inc = add nsw i32 %i.02, 1
>      89   %cmp = icmp slt i32 %inc, %3
>      90   br i1 %cmp, label %for.body, label %for.cond.for.end_crit_edge
>
> I was expecting that, line 85: "%4 = load i32*** @AAA", would be
> treated as loop
> invariant code and be moved out of the loop body. However, it didn't.
> The reason is
> that, LLVM thinks that pointer "@AAA" would alias with pointer
> "&AAA[i]". I found GCC
> has no problem to move this load out of the loop body.
>
> Then I went to read the current LLVM (v2.9) alias analysis code, i.e.
> BasicAA and TBAA.
> I found that TBAA does not differentiate pointers. Any pointer will be
> given the same tbaa meta-data
> named "any pointer". After reading the TBAA code, I don't believe that
> my problem can be
> easily fixed in its framework.
>
> Since the "Anderson" alias analysis code is already moved out from
> LLVM, we can only count
> on BasicAA and TBAA for all alias analysis problems at this moment.
> But these two piece of code
> can only do very limited alias analysis and would become a big
> performance bottleneck to
> other optimization phases, e.g. instruction scheduling, LICM. Can
> anybody tell me:
>
> 1. How to solve the above alias analysis problem in an acceptable short time?
> 2. Is there any existing project that tries to enhance LLVM alias analysis?

Thanks for this testcase, I think this is a serious problem. I see two 
possible approaches, hopefully there are more.

   1. Use the fact that there's only one store and then make a select 
out of it. Pretending for a moment that partially overlapping writes 
don't exist over an i32*, the value loaded in %.pre is either %1 because 
we didn't overwrite it, or null because we did. Thus, we can transform 
it into:

     %cmp = icmp eq i32** %arrayidx2, bitcast (i32*** @AAA to i32**)
     %.pre = select i1 %cmp, i32** null, i32** %1

   which saves a load at the cost of a compare. Sadly, LLVM fails to 
optimize out the compare afterward. (You'd think that the prospect of 
storing through a null pointer would be enough for llvm to realize that 
one arm of the select is unreachable. That optimization would not be 
hard to add.)

   2. Find the least fixed point. Currently, your "xalloc" method could 
potentially return &AAA itself so the first step is to mark it as only 
returning pointers which do not alias any existing pointers, as such:

     void * xalloc(int) __attribute__((malloc));

   Then we're starting the loop with two pointers (&AAA and AAA) which 
we know do not alias, and we'll continue to assume that pointers can't 
alias until there is evidence to the contrary. Because we know they 
don't alias when entering the loop, we'll never encounter evidence that 
%arrayidx could alias @AAA, then GVN would eliminate the load %.pre 
against the store before the start of the loop.

   Unfortunately, this would require writing a new AA pass, as it's too 
expensive to add to -basicaa.

I suspect that GCC is doing option 2, or something else I haven't 
thought of (TBAA? if so, why doesn't our TBAA do as well?).

Nick



More information about the llvm-dev mailing list