[llvm-dev] Memory Store/Load Optimization Issue (Emulating stack)

Tue Feb 9 15:26:24 PST 2016

Two points:
- Using inttoptr is a mistake here.  GEPs are strongly preferred and 
provide strictly more aliasing information to the optimizer.
- The zext is a bit weird.  I'm not sure where that came from, but I'd 
not bother looking into it until the preceding point is addressed.
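
To make the first point concrete, here is a rough sketch of the same 
push/pop sequence written with GEPs instead of inttoptr.  It assumes the 
emulated stack pointer can be passed around as an i32* rather than an 
i32 (an assumption on my part, not something from the original code), 
so the function signature and the third struct field change type.  It 
uses the same typed-pointer syntax as the IR in this thread; newer LLVM 
releases would spell the pointer type as ptr.

```llvm
; Sketch only: %sp is assumed to be an i32* here, not an i32.
define { i32, i32, i32* } @test(i32 %foo, i32 %bar, i32* %sp) {
  ; push foo: move the stack pointer down one i32 slot, then store
  %sp_1 = getelementptr i32, i32* %sp, i64 -1
  store i32 %foo, i32* %sp_1, align 4

  ; push bar
  %sp_2 = getelementptr i32, i32* %sp_1, i64 -1
  store i32 %bar, i32* %sp_2, align 4

  ; val1 = pop (val1 = bar): load, then move the stack pointer up
  %val1 = load i32, i32* %sp_2, align 4
  %sp_3 = getelementptr i32, i32* %sp_2, i64 1

  ; val2 = pop (val2 = foo)
  %val2 = load i32, i32* %sp_3, align 4
  %sp_4 = getelementptr i32, i32* %sp_3, i64 1

  %ret_1 = insertvalue { i32, i32, i32* } undef, i32 %val1, 0
  %ret_2 = insertvalue { i32, i32, i32* } %ret_1, i32 %val2, 1
  %ret_3 = insertvalue { i32, i32, i32* } %ret_2, i32* %sp_4, 2
  ret { i32, i32, i32* } %ret_3
}
```

Every address here is a constant offset from the same base pointer, 
which is the kind of relationship alias analysis can reason about, so 
the optimizer should have a much better chance of seeing that the two 
stores hit different slots and forwarding both stored values to the 
loads.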

In general, you may find these docs useful:
http://llvm.org/docs/Frontend/PerformanceTips.html

Philip


On 02/08/2016 06:54 AM, Paul Peet via llvm-dev wrote:
> Hello,
>
> I am trying to emulate the x86 "stack" (push/pop) so that afterwards I 
> can use LLVM's optimizer passes to simplify the code (remove junk).
>
> The LLVM IR code:
>
> define { i32, i32, i32 } @test(i32 %foo, i32 %bar, i32 %sp) {
>   ; push foo (On "stack")
>   %sp_1 = sub i32 %sp, 4
>   %sp_1_ptr = inttoptr i32 %sp_1 to i32*
>   store i32 %foo, i32* %sp_1_ptr, align 4
>
>   ; push bar
>   %sp_2 = sub i32 %sp_1, 4
>   %sp_2_ptr = inttoptr i32 %sp_2 to i32*
>   store i32 %bar, i32* %sp_2_ptr, align 4
>
>   ; val1 = pop (val1 = bar)
>   %sp_3_ptr = inttoptr i32 %sp_2 to i32*
>   %val1 = load i32, i32* %sp_3_ptr, align 4
>   %sp_3 = add i32 %sp_2, 4
>
>   ; val2 = pop (val2 = foo)
>   %sp_4_ptr = inttoptr i32 %sp_3 to i32*
>   %val2 = load i32, i32* %sp_4_ptr, align 4
>   %sp_4 = add i32 %sp_3, 4
>
>   %ret_1 = insertvalue { i32, i32, i32 } undef, i32 %val1, 0
>   %ret_2 = insertvalue { i32, i32, i32 } %ret_1, i32 %val2, 1
>   %ret_3 = insertvalue { i32, i32, i32 } %ret_2, i32 %sp_4, 2
>
>   ret { i32, i32, i32 } %ret_3
> }
>
> This code will "push" two values onto the stack and pop them in 
> reverse order so afterwards "foo" and "bar" will be swapped and 
> returned back.
>
> After running this through "opt -O2 ./test.ll", I am getting this:
>
> define { i32, i32, i32 } @test(i32 %foo, i32 %bar, i32 %sp) #0 {
>   %sp_1 = add i32 %sp, -4
>   %1 = zext i32 %sp_1 to i64
>   %sp_1_ptr = inttoptr i64 %1 to i32*
>   store i32 %foo, i32* %sp_1_ptr, align 4
>   %sp_2 = add i32 %sp, -8
>   %2 = zext i32 %sp_2 to i64
>   %sp_2_ptr = inttoptr i64 %2 to i32*
>   store i32 %bar, i32* %sp_2_ptr, align 4
>   %val2 = load i32, i32* %sp_1_ptr, align 4
>   %ret_1 = insertvalue { i32, i32, i32 } undef, i32 %bar, 0 ; Swapped
>   %ret_2 = insertvalue { i32, i32, i32 } %ret_1, i32 %val2, 1 ; Not Swapped (Not optimized; Should be %foo)
>   %ret_3 = insertvalue { i32, i32, i32 } %ret_2, i32 %sp, 2
>   ret { i32, i32, i32 } %ret_3
> }
>
> As you can see, the IR has gained additional code, e.g. the zext. But 
> the main problem here is that val2 hasn't been optimized.
> Could anyone give me some hints about what is preventing the second 
> value from being optimized? (My guess would be the zext, because I am 
> using %sp as a 32-bit pointer although the "target" is 64-bit.)
>
> Regards,
> Paul