[LLVMdev] Alias analysis and instruction level parallelism

Chris Lattner sabre at nondot.org
Thu Apr 3 19:25:16 PDT 2008

On Apr 3, 2008, at 2:20 PM, Pertti Kellomäki wrote:

> Dan Gohman wrote:
>> I think this is trickier than it sounds; the reason GEPs are lowered
>> is to allow strength-reduction and other things to do transformations
>> on them. It would require those passes to know how to update the
>> mapping.
> Yes, I do appreciate the amount of work involved, and I am
> very open to other suggestions.

How about a much simpler approach?  Here's a silly, but reasonable example:

int A[100];

void test(int x) {
   while (x)
     A[--x] = 0;
}

We compile this into (x86 with PIC codegen):

LBB1_1:	## bb.preheader
	movl	L_A$non_lazy_ptr, %ecx
	leal	-4(%ecx,%eax,4), %ecx
	xorl	%edx, %edx
	.align	4,0x90
LBB1_2:	## bb
	movl	$0, (%ecx)
	addl	$4294967292, %ecx
	incl	%edx
	cmpl	%eax, %edx
	jne	LBB1_2	## bb
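
A note on reading the listing: the immediate 4294967292 in the addl is
2^32 - 4, so on a 32-bit register the add is the same as subtracting 4,
i.e. stepping back one int per iteration. A minimal sketch of that
arithmetic in C, where step_back_one_int is a hypothetical helper name
that does not appear anywhere in LLVM:

```c
#include <stdint.h>

/* Hypothetical helper (not part of LLVM): models one
   "addl $4294967292, %ecx" on a 32-bit register value. */
uint32_t step_back_one_int(uint32_t addr) {
    /* Unsigned arithmetic wraps modulo 2^32, so adding 2^32 - 4
       gives the same result as subtracting 4. */
    return addr + UINT32_C(4294967292);
}
```

For example, step_back_one_int(100) yields 96.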

This is pretty reasonable code.  What does the output of LSR look like?

$ llvm-gcc t.c -S -o - -O3 -emit-llvm | llvm-as | \
    llc -print-isel-input -relocation-model=pic
	%tmp = mul i32 %x, 4		; <i32> [#uses=1]
	%tmp2 = add i32 ptrtoint ([100 x i32]* @A to i32), %tmp		; <i32>  
	%tmp4 = add i32 %tmp2, -4		; <i32> [#uses=1]
	br label %bb
bb:		; preds = %bb.preheader, %bb
	%iv.5 = phi i32 [ %tmp4, %bb.preheader ], [ %iv.5.inc, %bb ]		; <i32>  
	%iv.57 = inttoptr i32 %iv.5 to i32*		; <i32*> [#uses=1]
	store i32 0, i32* %iv.57, align 4
	%iv.5.inc = add i32 %iv.5, -4		; <i32> [#uses=1]
	br i1 %exitcond, label %return, label %bb
return:		; preds = %bb, %entry
	ret void

Wow, that's horrible!  No wonder basicaa gets confused, I don't blame  
it.  However, none of this is needed.  It would be much better for LSR  
to lower the code into:

    %tmp3 = add i32 %x, -1
    %tmp4 = getelementptr [100 x i32]* @A, i32 0, i32 %tmp3   ; i32*
    br label %bb
bb:
    %iv.5 = phi i32* [ %tmp4, %bb.preheader ], [ %iv.5.inc, %bb ]
    store i32 0, i32* %iv.5
    %iv.5.inc = getelementptr i32* %iv.5, i32 -1
    br i1 %exitcond, label %return, label %bb
return:		; preds = %bb, %entry
	ret void

With this code, basicaa will have little problem understanding what is  
going on, and generating this from LSR should not be very hard at  
all.  Better yet, no new crazy infrastructure change is required.
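
To make the contrast concrete, here is a rough source-level analogy in C.
It is only a sketch, not what LSR emits: the function names are made up
for illustration, the loops are restructured to stay within defined C
behavior, and the integer version assumes a flat address space where a
pointer survives a round trip through uintptr_t. The first function
carries the address as an integer and casts it back for each store (the
inttoptr pattern); the second keeps a real pointer that is visibly
derived from A (the getelementptr pattern).

```c
#include <stddef.h>
#include <stdint.h>

int A[100];

/* Integer induction variable: the address lives in an integer and is
   cast back to a pointer for every store, so the link to A is hidden. */
void clear_with_integer_iv(size_t x) {
    uintptr_t base = (uintptr_t)A;
    uintptr_t iv = base + x * sizeof(int);   /* address of A[x] */
    while (iv != base) {
        iv -= sizeof(int);                   /* step back one int */
        *(int *)iv = 0;
    }
}

/* Pointer induction variable: every address is plainly an offset
   from A. The caller must pass x <= 100. */
void clear_with_pointer_iv(size_t x) {
    int *iv = A + x;                         /* points at A[x] */
    while (iv != A) {
        --iv;                                /* step back one int */
        *iv = 0;
    }
}
```

Both functions zero A[0] through A[x-1]; only the second keeps the
base object visible at every step.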

What do you think?

