[LLVMdev] Question about fastcc assumptions and seemingly superfluous %esp updates

Arnold Schwaighofer aschwaighofer at apple.com
Thu Feb 14 17:30:31 PST 2013


When you enable -tailcallopt you get support for tail calls between functions with arbitrary stack space requirements. That means the calling convention has to change slightly. E.g., the callee is responsible for removing its arguments from the stack. The caller cannot transitively know the stack requirement of the tail callee's own tail callee. Also, care must be taken to make sure the stack stays aligned.

On Feb 14, 2013, at 4:45 PM, Eli Bendersky <eliben at google.com> wrote:

> Hello,
> 
> While investigating one of the existing tests
> (test/CodeGen/X86/tailcallpic2.ll), I ran into IR that produces some
> interesting code. The IR is very straightforward:
> 
> define protected fastcc i32 @tailcallee(i32 %a1, i32 %a2, i32 %a3, i32 %a4) {
> entry:
> ret i32 %a3
> }
> 
> define fastcc i32 @tailcaller(i32 %in1, i32 %in2) {
> entry:
> %tmp11 = tail call fastcc i32 @tailcallee( i32 %in1, i32 %in2, i32
> %in1, i32 %in2)
> ret i32 %tmp11
> }
> 
> define i32 @foo(i32 %in1, i32 %in2) {
> entry:
>  %q = call fastcc i32 @tailcaller(i32 %in2, i32 %in1)
>  %ww = sub i32 %q, 6
>  ret i32 %ww
> }
> 
> Built with (ToT LLVM):
> llc < ~/temp/z.ll  -march=x86 -tailcallopt -O3
> 
> The produced code is (cleaned up a bit)
> 
> tailcallee:                             # @tailcallee
>  movl  4(%esp), %eax
>  ret  $12
> 
> tailcaller:                             # @tailcaller
>  subl  $12, %esp
>  movl  %edx, 20(%esp)
>  movl  %ecx, 16(%esp)
>  addl  $12, %esp
>  jmp  tailcallee              # TAILCALL
> 
> foo:                                    # @foo
>  subl  $12, %esp
>  movl  20(%esp), %ecx
>  movl  16(%esp), %edx
>  calll  tailcaller
>  subl  $12, %esp
>  addl  $-6, %eax
>  addl  $12, %esp
>  ret
> 
> A number of questions arise here:
> 
> 1) Notice that 'tailcaller' goes beyond its own stack frame when
> arranging arguments for 'tailcallee'. It subs 12 from %esp, but then
> writes to 20(%esp). Clearly, something in the fastcc convention allows
> it to assume that stack space will be available there? What is it?
> 
It writes to its incoming parameter space. When you tail call, your caller's outgoing parameter area is your outgoing parameter area. Now you are going to say: Wait a minute, the required space for tailcaller is zero! And you are right, there is a bug in the code that computes the required space (which needs to be 16-byte aligned): see X86ISelLowering::GetAlignedArgumentStackSize

In general, the tail caller knows that there is space because it was called and its parameters were put there (plus some empty space to keep the stack 16-byte aligned). To keep the stack aligned the parameter area changes in increments of 16. There was a bug (apparently, I did not upstream the fix for it :( ) in the way we calculate this "adjustment" that causes a stack space of 0 to be bumped up to 12 bytes (16 - return addr). It is safe (because we are calling this function consistently), though we are wasting stack space.
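To make the rounding concrete, here is a minimal Python sketch of the adjustment described above: the parameter area is padded so that area plus return address is a multiple of 16. This is an illustration only, not the LLVM source. The name aligned_arg_stack_size and its parameters are made up for this example, and the 16-byte alignment and 4-byte return address slot are the x86-32 values from this thread.

```python
def aligned_arg_stack_size(stack_size, stack_align=16, slot_size=4):
    """Pad an argument area so that (area + return address) is aligned.

    Hypothetical helper. It models the rounding as described in this
    message, including the behaviour that 0 bytes of arguments grows to 12.
    """
    target = stack_align - slot_size       # 12 on x86-32
    rem = stack_size % stack_align
    if rem <= target:
        # Small remainder: pad up to the next "16n + 12" value.
        return stack_size + (target - rem)
    # Remainder above 12: move to the next 16-byte block, then add 12.
    return stack_size - rem + stack_align + target


for size in (0, 8, 12, 16):
    print(size, "->", aligned_arg_stack_size(size))
```

With these assumptions both tailcaller (0 bytes of stack arguments) and tailcallee (8 bytes, since the first two i32 arguments travel in %ecx and %edx) come out at 12 bytes, which matches the subl $12 in foo and the ret $12 in tailcallee.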

When we do the tail call we ask: what is the required stack space of the caller and what is the required stack space of the callee? We subtract them. If the difference is zero you can just move the arguments; otherwise you have to adjust the stack (frame), possibly moving the return address around.

tailcaller f(i32,i32) -> bug: GetAlignedArgumentStackSize returns that there is space for three arguments on the stack (as you can see in foo this space was really allocated).

tailcallee f(i32,i32,i32,i32) -> also has space for three arguments on the stack
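
As a sanity check on the offsets in the generated code, the following Python sketch translates the stores in tailcaller back to offsets from %esp at function entry and checks that they land inside the 12-byte incoming parameter area. The constant and function names are hypothetical; the numbers are read off the assembly quoted above.

```python
RET_ADDR = 4        # bytes pushed by `calll` (return address at 0(%esp) on entry)
PARAM_AREA = 12     # aligned parameter area for both functions, per this message
FRAME_ADJUST = 12   # the `subl $12, %esp` at the top of tailcaller


def entry_offset(offset_after_sub):
    """Convert an offset used after the subl into an offset from %esp at entry."""
    return offset_after_sub - FRAME_ADJUST


def in_incoming_area(entry_off):
    """True if the offset lies in the incoming parameter area above the return address."""
    return RET_ADDR <= entry_off < RET_ADDR + PARAM_AREA


for label, off in (("movl %ecx, 16(%esp)", 16), ("movl %edx, 20(%esp)", 20)):
    e = entry_offset(off)
    print(label, "-> entry offset", e, "in incoming area:", in_incoming_area(e))

# Caller and callee need the same aligned area, so no stack adjustment is needed.
delta = PARAM_AREA - PARAM_AREA
print("stack delta:", delta)
```

The store of %ecx ends up at entry offset 4, which is exactly where tailcallee reads %a3 from (movl 4(%esp), %eax).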

And there is another bug that causes the code to assume a tail call is a normal call and as such you end up with

tailcaller:                             # @tailcaller
 subl  $12, %esp
...
 addl  $12, %esp
 jmp  tailcallee   

Again safe but not very efficient :).

> 2) Note the %esp dance 'tailcaller' is doing - completely useless sub
> followed by add. Does this have an inherent goal or can it be
> eliminated?
> 
> 3) The %esp dance of 'foo' is even stranger:
> 
>  subl  $12, %esp
>  addl  $-6, %eax
>  addl  $12, %esp

Because of what I said in the beginning, if a non-fastcc function calls a fastcc function with tailcallopt on, it has to readjust the stack because the called fastcc function popped its arguments off the stack. Imagine you had two such call sites in a row.
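
The two-call-sites case can be traced with a small Python sketch that tracks %esp relative to its value on entry to foo. The function run_foo and its readjust flag are hypothetical and only model the instruction sequence quoted above (subl $12, calll, a callee ending in ret $12, addl $12).

```python
def run_foo(readjust):
    """Return %esp at foo's final ret, relative to %esp on entry to foo."""
    esp = 0
    esp -= 12               # subl $12, %esp : reserve the outgoing parameter area
    for _ in range(2):      # two fastcc call sites in a row (assumed for illustration)
        esp -= 4            # calll pushes the return address
        esp += 4 + 12       # callee's `ret $12` pops the return address and 12 bytes
        if readjust:
            esp -= 12       # subl $12, %esp : re-reserve the parameter area
    esp += 12               # addl $12, %esp before ret
    return esp


print("with readjustment:   ", run_foo(True))
print("without readjustment:", run_foo(False))
```

Under these assumptions the readjusted version returns with %esp back where it started, while the version without the subl after each call drifts by 12 bytes per call site, so the final ret would not find foo's return address.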

> 
> The subl and addl to %esp cancel out, and with an unrelated operation
> in between. Why are they needed?
> 
> I'll be very grateful if someone could shed some light on this.
> 
> Eli


After rummaging around on one of my old machines I found this patch. I must have forgotten to commit it (this is from 2011, so it probably does not apply cleanly anymore).

-------------- next part --------------
A non-text attachment was scrubbed...
Name: tailcall_stack.patch
Type: application/octet-stream
Size: 2573 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20130214/a1bcf802/attachment.obj>

