[LLVMdev] Alias analysis and instruction level parallelism
Chris Lattner
sabre at nondot.org
Thu Apr 3 19:25:16 PDT 2008
On Apr 3, 2008, at 2:20 PM, Pertti Kellomäki wrote:
> Dan Gohman wrote:
>> I think this is trickier than it sounds; the reason GEPs are lowered is to
>> allow strength-reduction and other things to do transformations on them.
>> It would require those passes to know how to update the mapping.
>
> Yes, I do appreciate the amount of work involved, and I am
> very open to other suggestions.
How about a much simpler approach? Here's a silly but reasonable
example:
int A[100];
void test(int x) {
  while (x)
    A[--x] = 0;
}
We compile this into (x86 with PIC codegen):
..
LBB1_1: ## bb.preheader
movl L_A$non_lazy_ptr, %ecx
leal -4(%ecx,%eax,4), %ecx
xorl %edx, %edx
.align 4,0x90
LBB1_2: ## bb
movl $0, (%ecx)
addl $4294967292, %ecx
incl %edx
cmpl %eax, %edx
jne LBB1_2 ## bb
...
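One detail in the listing that can look odd: the addl immediate 4294967292 is simply -4 printed as an unsigned 32-bit value, so each iteration steps the pointer back by one int. A minimal C sketch of that wraparound (the function name step_down is made up for illustration):

```c
#include <stdint.h>

/* Adding 4294967292 to a 32-bit value wraps modulo 2^32,
   which gives the same result as subtracting 4. */
uint32_t step_down(uint32_t addr) {
    return addr + 4294967292u;
}
```

For example, step_down(100) yields 96.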
This is pretty reasonable code. What does the output of LSR look like,
though?
$ llvm-gcc t.c -S -o - -O3 -emit-llvm | llvm-as | llc -print-isel-input -relocation-model=pic
...
%tmp = mul i32 %x, 4 ; <i32> [#uses=1]
%tmp2 = add i32 ptrtoint ([100 x i32]* @A to i32), %tmp ; <i32> [#uses=1]
%tmp4 = add i32 %tmp2, -4 ; <i32> [#uses=1]
br label %bb
bb: ; preds = %bb.preheader, %bb
%iv.5 = phi i32 [ %tmp4, %bb.preheader ], [ %iv.5.inc, %bb ] ; <i32> [#uses=2]
...
%iv.57 = inttoptr i32 %iv.5 to i32* ; <i32*> [#uses=1]
store i32 0, i32* %iv.57, align 4
...
%iv.5.inc = add i32 %iv.5, -4 ; <i32> [#uses=1]
br i1 %exitcond, label %return, label %bb
return: ; preds = %bb, %entry
ret void
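Wow, that's horrible! The address is computed as a plain integer
(ptrtoint, mul, add) and only turned back into a pointer with inttoptr
right before the store. No wonder basicaa gets confused; I don't blame
it. However, none of this is needed. It would be much better for LSR
to lower the code into: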
%tmp3 = add i32 %x, -1 ; i32, so that %tmp4 is &A[x-1], matching the -4 byte offset above
%tmp4 = getelementptr [100 x i32]* @A, i32 0, i32 %tmp3 ; i32*
br label %bb
bb:
%iv.5 = phi i32* [ %tmp4, %bb.preheader ], [ %iv.5.inc, %bb ]
..
store i32 0, i32* %iv.5
..
%iv.5.inc = getelementptr i32* %iv.5, i32 -1
br i1 %exitcond, label %return, label %bb
return: ; preds = %bb, %entry
ret void
With this code, basicaa will have little problem understanding what is
going on, and generating this from LSR should not be very hard at
all. Better yet, no new crazy infrastructure change is required.
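To make the difference concrete, here is a rough source-level analogy in C. It is a sketch, not what LSR actually emits. The names clear_int_iv, clear_ptr_iv and cleared_prefix are made up for illustration, 0 <= x <= 100 is assumed, and the integer variant assumes a uintptr_t round trip preserves the address, as it does on mainstream targets.

```c
#include <stddef.h>
#include <stdint.h>

int A[100];

/* Integer induction variable: roughly the shape LSR emits today. The address
   lives in an integer and is cast back to a pointer for each store, so the
   link to A is not visible at the store. */
void clear_int_iv(int x) {
    uintptr_t base = (uintptr_t)A;
    uintptr_t iv = base + (uintptr_t)x * sizeof(int);
    while (iv != base) {
        iv -= sizeof(int);
        *(int *)iv = 0;
    }
}

/* Pointer induction variable: roughly the suggested shape. Every value of p
   is derived from A by pointer arithmetic alone. */
void clear_ptr_iv(int x) {
    int *p = A + x;
    while (p != A) {
        --p;
        *p = 0;
    }
}

/* Hypothetical helper: fills A with 1s, runs one variant, and returns how many
   leading elements were cleared, or -1 if anything past them was touched. */
int cleared_prefix(void (*clear)(int), int x) {
    size_t i, n = 0;
    for (i = 0; i < 100; ++i) A[i] = 1;
    clear(x);
    while (n < 100 && A[n] == 0) ++n;
    for (i = n; i < 100; ++i)
        if (A[i] != 1) return -1;
    return (int)n;
}
```

Both variants clear exactly A[0] through A[x-1]; the only difference is whether the induction variable keeps its pointer type, which is the property an analysis like basicaa can use to trace the store back to A.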
What do you think?
-Chris