[LLVMdev] RFC: Tail call optimization X86

Arnold Schwaighofer arnold.schwaighofer at gmail.com
Mon Sep 24 02:25:34 PDT 2007


On 24 Sep 2007, at 09:18, Evan Cheng wrote:
> +; RUN: llvm-as < %s | llc -march=x86 -mattr=+sse2 -stats -info-
> output-file - | grep asm-printer | grep 9
> +; change preceeding line form ... | grep 8 to ..| grep 9 since
> +; with new fastcc has std call semantics causing a stack adjustment
> +; after the function call
>
> Not sure if I understand this. Can you illustrate with an example?

Sure

The code generated used to be
_array:
         subl    $12, %esp
         movss   LCPI1_0, %xmm0
         mulss   16(%esp), %xmm0
         movss   %xmm0, (%esp)
         call    L_qux$stub
         mulss   LCPI1_0, %xmm0
         addl    $12, %esp
         ret

FastCC use to be caller pops arguments so there was no stack  
adjustment after the
call to qux. Now FastCC has callee pops arguments on return semantics  
so the
x86 backend inserts a stack adjustment after the call.

_array:
  	subl    $12, %esp
         movss   LCPI1_0, %xmm0
         mulss   16(%esp), %xmm0
         movss   %xmm0, (%esp)
         call    L_qux$stub
         subl    $4, %esp   << stack adjustment because qux pops  
arguments on return
         mulss   LCPI1_0, %xmm0
         addl    $12, %esp
         ret     $4

> Index: include/llvm/Target/TargetLowering.h
> ===================================================================
> --- include/llvm/Target/TargetLowering.h	(revision 42247)
> +++ include/llvm/Target/TargetLowering.h	(working copy)
> @@ -851,8 +851,18 @@
>     virtual std::pair<SDOperand, SDOperand>
>     LowerCallTo(SDOperand Chain, const Type *RetTy, bool  
> RetTyIsSigned,
>                 bool isVarArg, unsigned CallingConv, bool isTailCall,
> -              SDOperand Callee, ArgListTy &Args, SelectionDAG &DAG);
> +              bool isNextInstRet, SDOperand Callee, ArgListTy &Args,
> +              SelectionDAG &DAG);
> +  // IsEligibleForTailCallElimination - Check whether the call is
> eligible for
> +  // tailcall elimination
> +  virtual bool IsEligibleForTailCallElimination(SelectionDAG& DAG,
> +                                                bool IsNextInstRet,
> +                                                SDOperand Callee,
> +                                                unsigned CalleeCC)
> const {
> +    return false;
> +  }
> +
>
> I am not too keen on "isNextInstRet". We should never use instruction
> ordering before scheduling to determine whether it's possible to
> perform an optimization. IsEligibleForTailCallElimination() should
> determine the feasibility on its own, no?

My assumption:
Tail call optimization is performed if the instruction following the  
call is
a return and the return uses the result of the call or is a void return.

The problem is that at the point (in TargetLowering:LowerCallTo) where
IsEligibleForTailCallElimination is called the SDOperand node for the
following instructions (we are looking for a return) has not been build.
So IsEligibleForTailCallElimination can't determine - using 'Callee'  -
whether the instruction following the call is a return. That is the  
reason
why i pass  the IsNextInstRet flag to TargetLowering::LowerCallTo.

TargetLowering::LowerCallTo(SDOperand Chain, const Type *RetTy,
                             bool RetTyIsSigned, bool isVarArg,
                             unsigned CallingConv, bool isTailCall,
                             bool isNextInstRet, SDOperand Callee,
                             ArgListTy &Args, SelectionDAG &DAG) {
   SmallVector<SDOperand, 32> Ops;
   Ops.push_back(Chain);   // Op#0 - Chain
   Ops.push_back(DAG.getConstant(CallingConv, getPointerTy())); //  
Op#1 - CC
   Ops.push_back(DAG.getConstant(isVarArg, getPointerTy()));    //  
Op#2 - VarArg
   if (isTailCall &&
       IsEligibleForTailCallElimination(DAG, isNextInstRet, Callee,  
CallingConv))
     Ops.push_back(DAG.getConstant(true, getPointerTy())); // Op#3 -  
Tail

The instructions i am refering here to are LLVM IR instructions not
target machine instructions. I could remove isNextInstRet from
IsEligibleForTailCallElimination because in TargetLowering::LowerCallTo
it is clear that we can't do tail call optimization if isNextInstRet  
(this time coming from
SelectionDAGLowering::LowerCallTo) is false.

if (isTailCall && isNextInstRet &&
       IsEligibleForTailCallElimination(DAG,, Callee, CallingConv))

Note that I interpreted the following paragraph of you:
> IsEligibleForTailCallElimination() should be a target hook. This way
> TargetLowering::LowerCallTo() can determine whether a call is
> eligible for tail call elimination and insert the current ISD::CALL
> node.  X86TargetLowering::LowerX86_32FastCCCallTo() will not have to
> handle non-tail calls.

in the following way:
I check in TargetLowering:LowerCallTo whether the call  
IsEligibleForTailCallElimination
and insert a ISD:CALL node with the tailcall attribute set or not.

Then in X86TargetLowering::LowerCALL i can dispatch the right  
lowering code:

> SDOperand X86TargetLowering::LowerCALL(SDOperand Op, SelectionDAG  
> &DAG) {
>   unsigned CallingConv = cast<ConstantSDNode>(Op.getOperand(1))- 
> >getValue();
>   bool isTailCall = cast<ConstantSDNode>(Op.getOperand(3))->getValue 
> () != 0;
>
>   case CallingConv::Fast:
>       if (isTailCall && PerformTailCallOpt)
>         return LowerX86_TailCallTo(Op, DAG, CallingConv);
>       else
>         return LowerCCCCallTo(Op,DAG, CallingConv);


> Some stylistic nitpicks. Please write the comments as:
> /// IsNextInstructionReturn - Check whether..
Will do.
> +  assert(BI != BB->end() && "Woohooa");
> Better assertion messages please. :-)
>
> Why not write it like this:
>
okay
> Also, shouldn't this function be "static"?
okay
> Please fix the inconsistency: "TAILCALL" vs. "TAIL CALL".
>
okay
> +SDOperand GetPossiblePreceedingTailCall(SDOperand Chain) {
>
> This should be "static"?

okay
> +
> +  Chain = DAG.getNode( X86ISD::CALL,
>                         NodeTys, &Ops[0], Ops.size());
>
> Stylistic nitpick: please merge 2 lines and remove the space after  
> '('.
okay
> -    Chain = DAG.getNode(X86ISD::FP_SET_RESULT, Tys, Ops, 2);
> -    Flag = Chain.getValue(1);
> +      Chain = DAG.getNode(X86ISD::FP_SET_RESULT, Tys, Ops, 2);
> +      Flag = Chain.getValue(1);
>
> What happened here?
Ups corrected.
>
> +// Check to see whether the next instruction following the call is a
> return
> +// A function is eligible if caller/callee calling conventions
> match, currently
> +// only fastcc supports tail calls, and the function CALL is
> immediatly followed
> +// by a RET
> +bool X86TargetLowering::IsEligibleForTailCallElimination
> (SelectionDAG& DAG,
> +                                                         bool
> IsNextInstRet,
> +                                                         SDOperand
> Callee,
> +                                                         unsigned
> CalleeCC)
> +  const {
>
> Having "const {" on a separate line looks funky. :-)
>
will change ;)
> Please remove code that's commented out.

> if( G  ==> if (G  :-)

> No need to set IsEligible to false since it's initialized to false.
>
;) okay dokey

regards



More information about the llvm-dev mailing list