[LLVMdev] Alias Analysis Problem in LICM

Thu Nov 3 14:07:31 PDT 2011

Hi,

I met an alias analysis problem in the LICM phase. I am using the following
piece of code to present this problem.

      1 int size;
      2 int ** AAA;
      3 void * xalloc(int);
      4
      5 void foo(void)
      6 {
      7   int i;
      8   AAA = (int**) xalloc(size * sizeof(int*));
      9
     10   for (i=0; i<size; i++)
     11     AAA[i] = 0;
     12 }

This code tries to initialize an array of pointers. The array is
allocated from heap.
"AAA" points to the beginning of the array and it is a global variable.

I got the following IR dump after LICM:

    82 *** IR Dump After Loop Invariant Code Motion ***
    83 for.body:                                         ; preds =
%for.body.lr.ph, %for.body
    84   %i.02 = phi i32 [ 0, %for.body.lr.ph ], [ %inc, %for.body ]
    85   %4 = load i32*** @AAA, align 4, !tbaa !3
    86   %arrayidx = getelementptr inbounds i32** %4, i32 %i.02
    87   store i32* null, i32** %arrayidx, align 4, !tbaa !3
    88   %inc = add nsw i32 %i.02, 1
    89   %cmp = icmp slt i32 %inc, %3
    90   br i1 %cmp, label %for.body, label %for.cond.for.end_crit_edge

I was expecting that, line 85: "%4 = load i32*** @AAA", would be
treated as loop
invariant code and be moved out of the loop body. However, it didn't.
The reason is
that, LLVM thinks that pointer "@AAA" would alias with pointer
"&AAA[i]". I found GCC
has no problem to move this load out of the loop body.

Then I went to read the current LLVM (v2.9) alias analysis code, i.e.
BasicAA and TBAA.
I found that TBAA does not differentiate pointers. Any pointer will be
given the same tbaa meta-data
named "any pointer". After reading the TBAA code, I don't believe that
my problem can be
easily fixed in its framework.

Since the "Anderson" alias analysis code is already moved out from
LLVM, we can only count
on BasicAA and TBAA for all alias analysis problems at this moment.
But these two piece of code
can only do very limited alias analysis and would become a big
performance bottleneck to
other optimization phases, e.g. instruction scheduling, LICM. Can
anybody tell me:

1. How to solve the above alias analysis problem in an acceptable short time?
2. Is there any existing project that tries to enhance LLVM alias analysis?

Thank you!

-- 
Best Regards

Gan