[LLVMdev] RFC: Tail call optimization X86
Evan Cheng
evan.cheng at apple.com
Mon Sep 24 14:37:03 PDT 2007
On Sep 24, 2007, at 2:25 AM, Arnold Schwaighofer wrote:
>
> On 24 Sep 2007, at 09:18, Evan Cheng wrote:
>> +; RUN: llvm-as < %s | llc -march=x86 -mattr=+sse2 -stats -info-
>> output-file - | grep asm-printer | grep 9
>> +; change preceeding line form ... | grep 8 to ..| grep 9 since
>> +; with new fastcc has std call semantics causing a stack adjustment
>> +; after the function call
>>
>> Not sure if I understand this. Can you illustrate with an example?
>
> Sure
>
> The code generated used to be
> _array:
> subl $12, %esp
> movss LCPI1_0, %xmm0
> mulss 16(%esp), %xmm0
> movss %xmm0, (%esp)
> call L_qux$stub
> mulss LCPI1_0, %xmm0
> addl $12, %esp
> ret
>
> FastCC use to be caller pops arguments so there was no stack
> adjustment after the
> call to qux. Now FastCC has callee pops arguments on return semantics
> so the
> x86 backend inserts a stack adjustment after the call.
>
> _array:
> subl $12, %esp
> movss LCPI1_0, %xmm0
> mulss 16(%esp), %xmm0
> movss %xmm0, (%esp)
> call L_qux$stub
> subl $4, %esp << stack adjustment because qux pops
> arguments on return
> mulss LCPI1_0, %xmm0
> addl $12, %esp
> ret $4
Ok. Please make sure for now this only takes effect if tail call
optimization is turned on though. Also, we need to fold the subl into
addl if possible.
>
>
>> Index: include/llvm/Target/TargetLowering.h
>> ===================================================================
>> --- include/llvm/Target/TargetLowering.h (revision 42247)
>> +++ include/llvm/Target/TargetLowering.h (working copy)
>> @@ -851,8 +851,18 @@
>> virtual std::pair<SDOperand, SDOperand>
>> LowerCallTo(SDOperand Chain, const Type *RetTy, bool
>> RetTyIsSigned,
>> bool isVarArg, unsigned CallingConv, bool isTailCall,
>> - SDOperand Callee, ArgListTy &Args, SelectionDAG &DAG);
>> + bool isNextInstRet, SDOperand Callee, ArgListTy &Args,
>> + SelectionDAG &DAG);
>> + // IsEligibleForTailCallElimination - Check whether the call is
>> eligible for
>> + // tailcall elimination
>> + virtual bool IsEligibleForTailCallElimination(SelectionDAG& DAG,
>> + bool IsNextInstRet,
>> + SDOperand Callee,
>> + unsigned CalleeCC)
>> const {
>> + return false;
>> + }
>> +
>>
>> I am not too keen on "isNextInstRet". We should never use instruction
>> ordering before scheduling to determine whether it's possible to
>> perform an optimization. IsEligibleForTailCallElimination() should
>> determine the feasibility on its own, no?
>
> My assumption:
> Tail call optimization is performed if the instruction following the
> call is
> a return and the return uses the result of the call or is a void
> return.
Ok. But using whether the next instruction is a return at this time is
too restrictive. It's possible there may be instructions that are
between the call and the return that do not interfere with tail call
optimization.
>
>
> The problem is that at the point (in TargetLowering:LowerCallTo) where
> IsEligibleForTailCallElimination is called the SDOperand node for the
> following instructions (we are looking for a return) has not been
> build.
> So IsEligibleForTailCallElimination can't determine - using
> 'Callee' -
> whether the instruction following the call is a return. That is the
> reason
> why i pass the IsNextInstRet flag to TargetLowering::LowerCallTo.
Right. But I think the right way to deal with this is to assume all
the tail calls are eligible for tail call optimizations at this point.
Then checks for eligibility and fix the call nodes after all the
instructions are lowered.
Evan
>
>
> TargetLowering::LowerCallTo(SDOperand Chain, const Type *RetTy,
> bool RetTyIsSigned, bool isVarArg,
> unsigned CallingConv, bool isTailCall,
> bool isNextInstRet, SDOperand Callee,
> ArgListTy &Args, SelectionDAG &DAG) {
> SmallVector<SDOperand, 32> Ops;
> Ops.push_back(Chain); // Op#0 - Chain
> Ops.push_back(DAG.getConstant(CallingConv, getPointerTy())); //
> Op#1 - CC
> Ops.push_back(DAG.getConstant(isVarArg, getPointerTy())); //
> Op#2 - VarArg
> if (isTailCall &&
> IsEligibleForTailCallElimination(DAG, isNextInstRet, Callee,
> CallingConv))
> Ops.push_back(DAG.getConstant(true, getPointerTy())); // Op#3 -
> Tail
>
> The instructions i am refering here to are LLVM IR instructions not
> target machine instructions. I could remove isNextInstRet from
> IsEligibleForTailCallElimination because in
> TargetLowering::LowerCallTo
> it is clear that we can't do tail call optimization if isNextInstRet
> (this time coming from
> SelectionDAGLowering::LowerCallTo) is false.
>
> if (isTailCall && isNextInstRet &&
> IsEligibleForTailCallElimination(DAG,, Callee, CallingConv))
>
> Note that I interpreted the following paragraph of you:
>> IsEligibleForTailCallElimination() should be a target hook. This way
>> TargetLowering::LowerCallTo() can determine whether a call is
>> eligible for tail call elimination and insert the current ISD::CALL
>> node. X86TargetLowering::LowerX86_32FastCCCallTo() will not have to
>> handle non-tail calls.
>
> in the following way:
> I check in TargetLowering:LowerCallTo whether the call
> IsEligibleForTailCallElimination
> and insert a ISD:CALL node with the tailcall attribute set or not.
>
> Then in X86TargetLowering::LowerCALL i can dispatch the right
> lowering code:
>
>> SDOperand X86TargetLowering::LowerCALL(SDOperand Op, SelectionDAG
>> &DAG) {
>> unsigned CallingConv = cast<ConstantSDNode>(Op.getOperand(1))-
>>> getValue();
>> bool isTailCall = cast<ConstantSDNode>(Op.getOperand(3))->getValue
>> () != 0;
>>
>> case CallingConv::Fast:
>> if (isTailCall && PerformTailCallOpt)
>> return LowerX86_TailCallTo(Op, DAG, CallingConv);
>> else
>> return LowerCCCCallTo(Op,DAG, CallingConv);
>
>
>> Some stylistic nitpicks. Please write the comments as:
>> /// IsNextInstructionReturn - Check whether..
> Will do.
>> + assert(BI != BB->end() && "Woohooa");
>> Better assertion messages please. :-)
>>
>> Why not write it like this:
>>
> okay
>> Also, shouldn't this function be "static"?
> okay
>> Please fix the inconsistency: "TAILCALL" vs. "TAIL CALL".
>>
> okay
>> +SDOperand GetPossiblePreceedingTailCall(SDOperand Chain) {
>>
>> This should be "static"?
>
> okay
>> +
>> + Chain = DAG.getNode( X86ISD::CALL,
>> NodeTys, &Ops[0], Ops.size());
>>
>> Stylistic nitpick: please merge 2 lines and remove the space after
>> '('.
> okay
>> - Chain = DAG.getNode(X86ISD::FP_SET_RESULT, Tys, Ops, 2);
>> - Flag = Chain.getValue(1);
>> + Chain = DAG.getNode(X86ISD::FP_SET_RESULT, Tys, Ops, 2);
>> + Flag = Chain.getValue(1);
>>
>> What happened here?
> Ups corrected.
>>
>> +// Check to see whether the next instruction following the call is a
>> return
>> +// A function is eligible if caller/callee calling conventions
>> match, currently
>> +// only fastcc supports tail calls, and the function CALL is
>> immediatly followed
>> +// by a RET
>> +bool X86TargetLowering::IsEligibleForTailCallElimination
>> (SelectionDAG& DAG,
>> + bool
>> IsNextInstRet,
>> + SDOperand
>> Callee,
>> + unsigned
>> CalleeCC)
>> + const {
>>
>> Having "const {" on a separate line looks funky. :-)
>>
> will change ;)
>> Please remove code that's commented out.
>
>> if( G ==> if (G :-)
>
>> No need to set IsEligible to false since it's initialized to false.
>>
> ;) okay dokey
>
> regards
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
More information about the llvm-dev
mailing list