<div dir="ltr">Hello,<div><br></div><div>I am trying to emulate the x86 "stack" (as used by push/pop) so that I can afterwards run LLVM's optimizer passes to simplify the code (remove junk).</div><div><br></div><div>The LLVM IR code:</div><div><br></div><div><div>define { i32, i32, i32 } @test(i32 %foo, i32 %bar, i32 %sp) {</div><div>  ; push foo (on the "stack")</div><div>  %sp_1 = sub i32 %sp, 4</div><div>  %sp_1_ptr = inttoptr i32 %sp_1 to i32*</div><div>  store i32 %foo, i32* %sp_1_ptr, align 4</div><div><br></div><div>  ; push bar</div><div>  %sp_2 = sub i32 %sp_1, 4</div><div>  %sp_2_ptr = inttoptr i32 %sp_2 to i32*</div><div>  store i32 %bar, i32* %sp_2_ptr, align 4</div><div><br></div><div>  ; val1 = pop (val1 = bar)</div><div>  %sp_3_ptr = inttoptr i32 %sp_2 to i32*</div><div>  %val1 = load i32, i32* %sp_3_ptr, align 4</div><div>  %sp_3 = add i32 %sp_2, 4</div><div><br></div><div>  ; val2 = pop (val2 = foo)</div><div>  %sp_4_ptr = inttoptr i32 %sp_3 to i32*</div><div>  %val2 = load i32, i32* %sp_4_ptr, align 4</div><div>  %sp_4 = add i32 %sp_3, 4</div><div><br></div><div>  %ret_1 = insertvalue { i32, i32, i32 } undef, i32 %val1, 0</div><div>  %ret_2 = insertvalue { i32, i32, i32 } %ret_1, i32 %val2, 1</div><div>  %ret_3 = insertvalue { i32, i32, i32 } %ret_2, i32 %sp_4, 2</div><div><br></div><div>  ret { i32, i32, i32 } %ret_3</div><div>}</div></div><div><br></div><div>This code "pushes" two values onto the stack and pops them in reverse order, so "foo" and "bar" are returned swapped.</div><div><br></div><div>After running this through "opt -O2 ./test.ll", I get this:</div><div><br></div><div><div>define { i32, i32, i32 } @test(i32 %foo, i32 %bar, i32 %sp) #0 {</div><div>  %sp_1 = add i32 %sp, -4</div><div>  %1 = zext i32 %sp_1 to i64</div><div>  %sp_1_ptr = inttoptr i64 %1 to i32*</div><div>  store i32 %foo, i32* %sp_1_ptr, align 4</div><div>  %sp_2 = add i32 %sp, -8</div><div>  %2 = zext i32 %sp_2 to i64</div><div>  
%sp_2_ptr = inttoptr i64 %2 to i32*</div><div>  store i32 %bar, i32* %sp_2_ptr, align 4</div><div>  %val2 = load i32, i32* %sp_1_ptr, align 4</div><div>  %ret_1 = insertvalue { i32, i32, i32 } undef, i32 %bar, 0 ; Swapped</div><div>  %ret_2 = insertvalue { i32, i32, i32 } %ret_1, i32 %val2, 1 ; Not swapped (not optimized; should be %foo)</div><div>  %ret_3 = insertvalue { i32, i32, i32 } %ret_2, i32 %sp, 2</div><div>  ret { i32, i32, i32 } %ret_3</div><div>}</div></div><div><br></div><div>As you can see, the optimized IR has gained additional code, e.g. the zext instructions. The main problem, however, is that the load of val2 has not been optimized away: the first pop was forwarded to %bar, but the second was not forwarded to %foo.</div><div>Could anyone give me a hint as to what is preventing the second value from being optimized? (My guess is the zext, because I am using %sp as a 32-bit pointer although the "target" is 64-bit.)</div><div><br></div><div>Regards,</div><div>Paul</div></div>