[llvm-commits] [llvm] r55807 - in /llvm/trunk: lib/Target/X86/X86CallingConv.td lib/Target/X86/X86ISelLowering.cpp test/CodeGen/X86/2007-08-13-SpillerReuse.ll test/CodeGen/X86/2008-02-22-ReMatBug.ll test/CodeGen/X86/coalescer-commute3.ll test/Cod

Mon Sep 22 02:31:22 PDT 2008

On Mon, Sep 22, 2008 at 9:54 AM, Duncan Sands <baldrick at free.fr> wrote:
>> > Does this mean that "PerformTailCallOpt" changes the ABI?
>
> Arnold already agreed that tailcall could use the new calling convention,
> i.e. be the same as fastcc.
>
>> First of all, the comment about "nested function not being supported
>> by fastcc" looks wrong.
>
> It is correct.  The problem is that the tailcall logic reserves a
> register (ECX; would be EAX in new fastcc) for its own use.  The
> trampoline stuff wants that same register to do trampoline stuff
> (it's the only spare register).  If the two calling conventions are
> merged (i.e. tailcall cc dropped), there would be the problem of
> detecting that tailcall and trampoline are fighting for the same
> register, so an error can be issued.  I think the right solution is
> to explicitly represent in the td that tailcall wants the register,
> in much the same way as trampoline stuff does.  Something like this:
Yes, as Duncan says, tail call optimization requires a (caller saved)
register for a function call via a pointer. That is why the tailcall
convention differs from the others. I initially created it by copying
the std convention and modifying it. That's where the wrong comment
regarding the three registers comes from; it should actually read that
two registers are available. There is actually no reason that prevents
CC_X86_32_TailCall from being replaced by CC_X86_32_FastCC (eax would
be used as the function pointer register; see also the discussion at
<http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20080908/067075.html>).
I wanted to change that but have not gotten around to doing it yet.

BUT there still is an ABI difference when -tailcallopt is enabled.

To support general tail call optimization (in cases where the callee
can require more room for its arguments than the caller provides), the
calling convention is changed from 'caller pops arguments' to 'callee
pops arguments'. This is necessary because the caller does not know how
many arguments there are to pop when the called function performs a
tail call itself. (If caller() calls callee(a,b), which tail calls
calleecallee(a,b,c,d), how many arguments should caller() pop: 2 or 4?)
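
To make the example concrete, here is a toy model of the argument area
in Python. It is only a sketch: the function name run() and the slot
counting are made up for illustration and are not LLVM code. It counts
how many argument slots are left on the stack once control is back in
caller().

```python
# Toy model of the argument area, counted in argument slots.
# Hypothetical names throughout; this is not LLVM code.

def run(convention):
    """Return the leftover argument slots after caller() calls callee(a, b)
    and callee() tail calls calleecallee(a, b, c, d)."""
    slots = 0

    # caller() pushes two arguments for callee(a, b).
    pushed_by_caller = 2
    slots += pushed_by_caller

    # callee() tail calls calleecallee(a, b, c, d): the two-slot argument
    # area is replaced by a four-slot one, and control never returns to
    # callee().
    slots += 4 - pushed_by_caller

    if convention == "caller-pops":
        # caller() only knows about the two arguments it pushed itself.
        slots -= pushed_by_caller
    elif convention == "callee-pops":
        # calleecallee() returns straight to caller() and pops its own four.
        slots -= 4
    else:
        raise ValueError(convention)
    return slots


print(run("caller-pops"))  # slots left behind with caller pops
print(run("callee-pops"))  # slots left behind with callee pops
```

With 'caller pops arguments' two slots are left behind; with 'callee
pops arguments' the stack is balanced again.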

So if a library were compiled with -tailcallopt enabled and the program
using the library were compiled without -tailcallopt, all hell would
break loose.

Possible solutions:
- Document this fact! ;)
- Only optimize sibling tail calls (callee argument area <= caller
argument area, so 'caller pops arguments' can still be used) for fastcc
calls, and create a new calling convention TailCall which is always
'callee pops arguments', even if -tailcallopt is not enabled.
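
The second option roughly amounts to the check below. Again this is
only a sketch in Python: may_tail_call() and the convention strings are
hypothetical names, not anything that exists in the X86 backend.

```python
# Hypothetical sketch of the second option; names are made up, not LLVM APIs.

def may_tail_call(cc, caller_arg_bytes, callee_arg_bytes):
    """Decide whether a call in tail position may be optimized."""
    if cc == "tailcall":
        # Proposed new convention: always callee pops, so the callee's
        # argument area may be larger than the caller's.
        return True
    if cc == "fastcc":
        # Caller pops: only sibling calls whose arguments fit into the
        # caller's own argument area are safe.
        return callee_arg_bytes <= caller_arg_bytes
    # Any other convention is left alone in this sketch.
    return False
```

For example, a fastcc caller with 8 bytes of arguments could tail call
a fastcc callee needing 8 bytes, but not one needing 16; under the
proposed TailCall convention both would be allowed.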

regards arnold