[LLVMdev] Tail call optimization thoughts

Thu Aug 9 15:26:19 PDT 2007

On 8/9/07, Anton Korobeynikov <asl at math.spbu.ru> wrote:
> Hello, Arnold.
>
> Only quick comments, I'll try to make a full review a little bit later.
>
> > 0.) a fast calling convention (maybe use the current
> >     CallingConv::Fast, or create a CallingConv::TailCall)
> > 1.) lowering of formal arguments,
> >     like for example x86_LowerCCCArguments in stdcall mode;
> >     we need to make sure that the later mentioned CALL_CLOBBERED_REG
> >     is not used (remove it from the available registers in the
> >     calling conv for argument passing?)
> This can be acceptable, only if function has internal linkage.
> Otherwise, we cannot change calling convention of the function. So
> extension of this problem:
Again, I did not express myself clearly, sorry. What I meant is not to
change the calling convention of the function, but to do tail call
optimization only for caller/callee pairs that both obey a special
calling convention. That convention would be like x86_stdcall, with one
exception: x86_stdcall as currently implemented in the backend allows
params to be passed in eax, ecx and edx, but when it is used with
tailcallopt in the future, one of those registers must be reserved for
the tail callee's address, leaving only 2 free registers for param
passing. If both caller and callee obey that "CallingConv::Fast"
(maximum 2 args in registers, one call-clobbered register kept free),
then do tailcallopt.

What I also noticed (in my first attempt) is that one has to keep the
stack aligned (on Mac OS X, calls to dynamically loaded functions want
the stack to be 16 byte aligned). The naive way of doing tail call
optimization (my first try) was to just adjust the stack by the
difference in argument size, but that can mess up the stack alignment.
So one has to round the size of the arguments on the stack up to 16n+12,
such that the frame pointer difference of two functions is always a
multiple of 16 (that is, if the stack alignment has to be 16 byte).
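
The rounding can be sketched as below. This is only an illustration:
the helper name is made up, and it hard-codes the 16 byte alignment
case. Any two sizes of the form 16n+12 differ by a multiple of 16,
which is the property wanted above.

```cpp
#include <cstdint>

// Hypothetical helper (name invented): round the size of the stack
// arguments up to the smallest value >= argBytes of the form 16n + 12.
constexpr std::uint32_t roundArgStackSize(std::uint32_t argBytes) {
  // Adding 3 and clearing the low 4 bits gives the largest multiple of
  // 16 that is <= argBytes + 3; adding 12 then lands on 16n + 12.
  return ((argBytes + 3u) & ~15u) + 12u;
}
```

For example, 8 bytes of arguments are padded to 12, while 13 bytes are
padded to 28; the difference between the two is 16.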

> Let us have the following sequence of calls: a->b->c
> Call b->c is tail call.
>
> We need to formulate conditions of the call to be "suitable for tail
> call". On x86 it's the following:
> 1. The calling conventions of b and c are "compatible". There are two
> scenarios: caller cleans the stack (C calling convention), callee cleans
> the stack (stdcall calling convention). The call is not suitable for
> tail call, if b and c clean the stack differently. For example, let b
> have C CC and c stdcall CC. In this case making a tail call results in
> double cleaning (one in a and one in c), which is incorrect.
I was thinking of doing tailcallopt with just one CC in the beginning
(~stdcall), but yes, that is true.
> 2. The same situation holds for struct return functions: the caller pushes
> extra pointer to return value, but callee is required to clean this
> pointer out of the stack. So, call b->c is "suitable for tail call" iff
> both b and c are struct return.
> 3. Various tricky cases with FP. I don't remember correctly, but there
> can be some problems with functions returning FP values.
Yes, I will have to look at those.
> 4. PIC. %ebx should be live in order to make a call via PLT.
> Unfortunately, %ebx is callee-saved, that's why we can assemble tail
> calls only to functions within the same module (PLT is not required).
> 5. %ecx is livein for regparm(3) functions.
I guess you mean functions that have their address loaded into a
register (call via a function pointer) by regparm functions.
> ... (maybe something I forget)
... and even more that I don't know :)
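
To make conditions 1, 2 and 4 from the list concrete, here is a minimal
C++ sketch. The types and names are invented for illustration and are
not LLVM's. Condition 2 is read here as "b and c must agree on struct
return", which is an assumption about the intended meaning; the FP
cases (3) and the regparm case (5) are left out because they are not
spelled out in enough detail above.

```cpp
// Hypothetical sketch; none of these names come from LLVM.
enum class StackCleanup { Caller, Callee };  // C style vs. stdcall style

struct FnTraits {
  StackCleanup cleanup;  // who pops the arguments off the stack
  bool structReturn;     // takes a hidden struct-return pointer
};

// b is the function making the call, c is the function being called.
bool suitableForTailCall(const FnTraits &b, const FnTraits &c,
                         bool pic, bool sameModule) {
  if (b.cleanup != c.cleanup) return false;            // condition 1
  if (b.structReturn != c.structReturn) return false;  // condition 2
  if (pic && !sameModule) return false;                // condition 4
  return true;
}
```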
>
> --
> WBR, Anton Korobeynikov
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>