[LLVMdev] Tail call optimization deeds

Evan Cheng evan.cheng at apple.com
Mon Aug 13 14:12:12 PDT 2007


Hi Arnold and Anton,

Sorry I have been ignoring your emails on this topic. It's an  
important task and I really need sometime to think about it (and talk  
to Chris about it!) But this has been an especially hectic week. I am  
also going to vacation soon so I am not sure when I would get around  
to it.

If Chris has time, I am sure he has lots to say on this topic. :-)  
Otherwise, please be patient. :-)

Evan

On Aug 11, 2007, at 1:25 PM, Arnold Schwaighofer wrote:

> Okay so i implemented an(other :) initial version for X86-32 backend,
> this time based on TOT:
>
> It is not very generic at the moment. Only functions with
> callingconv::fastcc and the tail call attribute will be optimized.
> Maybe the next step should be to integrate the code into the other
> calling convention lowering. Here is what i have at the moment:
>
> If callingconv::fastcc is used the function will be custom lowered.
> (LowerX86_32FastCCCallTo and LowerX86_32FastCCArguments, the code is
> based on the std calling convention minus ecx as inreg)
> The lowering code decides whether the function call really is
> eligible for tail call optimization tco (caller-callee agree, tail
> call opt is enabled, tail position, TODO: check for PIC code).
> If it is okay to do tco it calculates the stack adjustment.
> Repositions the return address,
> Puts the arguments on the correct position relative to the (virtual)
> frame pointer.
> Moves the function address to register ECX if the tailcallee is in a
> register. (TODO: set ECX as live out?)
> TODO: check what to do with pic code
> Adds register arguments (EAX, EDX) as live in if they are used for
> arguments.
> Create a TAILCALL node (a pseudo instruction) holding the Callee
> (Register or TargetGlobalAddress/ExternalSymbol), the stack
> adjustment as a constant, and the live-in registers. The
> TargetLowering class has a IsLastCallTailCall state field which is
> set to true such that the LowerRET function nows it has to produce
> different code. The TAILCALL node is then returned.
>
> If it is not okay the code produces a function call like the std
> calling convention (callee pops arguments on return).
> There is one difference though: since tail call functions might
> arbitrarily change stack alignment (since they adjust the stack by
> the difference of the argument stack size) the stack adjustment (by
> that i mean the number of bytes a function requires as its stack
> size) i when i calculate that stack size/ stack adjustment i take
> care that the stack sizes are always multiples of the targets stack
> alignment. (this code at the moment can be turned off by an option
> eg.:if it's known that no dynamically linked functions are called).
> I observed the behavior of dynamically linked functions requiring 16
> byte stack alignment on mac os x. This might also render it difficult
> to do general tail call optimization (for normal std call, c call
> functions it would only bem possible if the stack size difference is
> a multiple of the stack alignment). But maybe i am misinterpreting
> the situation. (TODO: look into that:)
>
> The lowering of arguments is pretty similar to  std call.
>
> The LowerRET function checks for the IsLastCallTailCall field and if
> its true does the following otherwise it does its usual duty.
> It looks for the TAILCALL node and uses the operand holding the
> callee and the stack adjustment to build and return a TC_RETURN node
> with preceeding operands as its own operands.
>
> The TAILCALL node is converted to a noop machine instruction.
> Ditto the TC_RETURN node. (it declares the iscall=1, isbarrier=1,
> isret=1)
>
> The machine instructions needed to adjust the stack and jump to the
> callee is added in the epilogue generating function in
> X86RegisterInfo. It uses the two operands of the TC_RETURN node for
> the required information.
>
>
> So modifications are:
> a LowerX86_32FastCCCallTo function
> a LowerX86_32FastCCArguments function
> change to LowerRET to generate the TC_RETURN node
> some code at the end of X86RegisterInfo to do stack adjustment/
> jumping to tailcallee (like the EH_RETURN code, thanks Anton for the
> pointers!)
> small modifications to X86InstrInfo.td
> (a yes and at the moment i remove generating TAILCALL nodes in the
> other functions - LowerCCCall and the like)
>
> Another TODO: look a floating point code to see whether there are any
> kinks. Look at other calling conventions to see what is possible
> there. (see Anton's remarks)
>
> I am away now for 2 weeks (isle of cres - camping). but after that i
> will be looking into further improving the code (and also look at the
> other architectures).
> Or maybe i have answers telling me that i am going into a totally
> wrong direction , and i should let the experts do this stuff and stop
> nagging people :).
>
> regards arnold
>
>
>> need to formulate conditions of the call to be "suitable for tail
>> call". On x86 it's the following:
>> 1. The calling conventions of b and c are "compatible". There are two
>> scenarios: caller cleans the stack (C calling convention), callee
>> cleans
>> the stack (stdcall calling convention). The call is not suitable for
>> tail call, if b and c cleans the stack differently. For example,  
>> let b
>> has C CC and c - stdcall CC. In this case making tail call results to
>> double cleaning (on in a and one in c), which is incorrect.
>> 2. The same situtation is for struct return functions: the caller
>> pushes
>> extra pointer to return value, but callee is required to clean this
>> pointer out of the stack. So, call b->c is "suitable for tail call"
>> iff
>> both b and c are struct return.
>> 3. Various tricky cases with FP. I don't remeber correctly, but there
>> can be some problems with functions returning FP values.
>> 4. PIC. %ebx should be live in order to make a call via PLT.
>> Unfortunately, %ebx is callee-saved, that's why we can assemble tail
>> calls only to functions within the same module (PLT is not required).
>> 5. %ecx is livein for regparm(3) functions.
>> ... (maybe something I forget)
>>
>> --
>> WBR, Anton Korobeynikov
>> _______________________________________________
>> LLVM Developers mailing list
>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev




More information about the llvm-dev mailing list