[LLVMdev] RFC: Tail call optimization X86
Arnold Schwaighofer
arnold.schwaighofer at gmail.com
Fri Oct 5 02:42:44 PDT 2007
Hi Evan,
I incorporated the changes you requested, but I have a question about
the following:
> Also, moving the option
> there will allow us to change fastcc ABI (callee popping arguments)
> only when this option is on. See Chris' email:
I am not too sure about that, because it would make modules compiled
with the flag on incompatible with modules compiled with the flag off:
the stack behaviour would mismatch.
It would be no problem to make the behaviour dependent on the
-tailcallopt flag, but I am not sure that this is a good idea.
pseudocode:
module 1: compiled with -tailcallopt
fastcc int callee(int arg1, int arg2) {
  ...
  -> on return: pops 8 bytes
}
module 2: compiled without -tailcallopt
fastcc int caller() {
  int x = call fastcc callee();
  //!! caller pops the arguments => stack mismatch
}
The callee pops the arguments, but the caller also wants to pop the
arguments off the stack.
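The mismatch above can be modelled as plain stack-pointer bookkeeping. The following is only a minimal sketch: the Cleanup enum and the stackDriftAfterCall helper are hypothetical names made up for illustration, not LLVM API, and the model ignores the return address and everything else on the stack.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of who removes the arguments after a fastcc call.
enum class Cleanup { CallerPops, CalleePops };

// Returns the net stack-pointer drift in bytes after one call.
// 0 means balanced, > 0 means the arguments were popped twice,
// < 0 means nobody popped them.
std::int64_t stackDriftAfterCall(Cleanup calleeCompiledAs,
                                 Cleanup callerAssumes,
                                 std::int64_t argBytes) {
  assert(argBytes >= 0 && "argument area cannot be negative");
  std::int64_t sp = 0;
  sp -= argBytes;                              // caller pushes the arguments
  if (calleeCompiledAs == Cleanup::CalleePops)
    sp += argBytes;                            // callee pops on return
  if (callerAssumes == Cleanup::CallerPops)
    sp += argBytes;                            // caller pops after the call
  return sp;
}
```

With 8 bytes of arguments, a callee compiled as CalleePops and a caller assuming CallerPops give a drift of 8 bytes (popped twice); the opposite combination gives -8 (never popped). Only the matching combinations return 0.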
Apparently I forgot to send my answer to Chris' response. Sorry
for that.
>
>> Hmmm. Ok. So this is due to X86CallingConv.td changes? Unfortunately
>> that's not controlled by options. Ok then.
>>
>
> Sure it can be, you can set up custom predicates, for example the
> X86CallingConv.td file has:
>
> class CCIfSubtarget<string F, CCAction A>
>  : CCIf<!strconcat("State.getTarget().getSubtarget<X86Subtarget>().", F), A>;
>
> It would be straight-forward to have a CCIf defined to check some
> command line argument. I think enabling this as llcbeta for a few
> nights makes sense before turning it on by default.
No, not directly. The code related to "caller/callee cleans arguments
off the stack" is not controlled by the .td file. It is controlled in
code by the operands of CALLSEQ_END, for example in
SDOperand X86TargetLowering::LowerCCCCallTo:
...
if (CC == CallingConv::X86_StdCall || CC == CallingConv::Fast) {
  if (isVarArg)
    NumBytesForCalleeToPush = isSRet ? 4 : 0;
  else
    NumBytesForCalleeToPush = NumBytes;
  assert(!(isVarArg && CC==CallingConv::Fast) &&
         "CallingConv::Fast does not support varargs.");
} else {
  // If this is a call to a struct-return function, the callee
  // pops the hidden struct pointer, so we have to push it back.
  // This is common for Darwin/X86, Linux & Mingw32 targets.
  NumBytesForCalleeToPush = isSRet ? 4 : 0;
}
NodeTys = DAG.getVTList(MVT::Other, MVT::Flag);
Ops.clear();
Ops.push_back(Chain);
Ops.push_back(DAG.getConstant(NumBytes, getPointerTy()));
Ops.push_back(DAG.getConstant(NumBytesForCalleeToPush, getPointerTy()));
Ops.push_back(InFlag);
Chain = DAG.getNode(ISD::CALLSEQ_END, NodeTys, &Ops[0], Ops.size());
InFlag = Chain.getValue(1);
The third operand is the number of bytes the callee pops off the
stack on return (on x86). This gets lowered to an ADJCALLSTACKUP
pseudo machine instruction.
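The choice of that third operand can be restated as a standalone function. This is only a sketch that mirrors the branch structure of the quoted LowerCCCCallTo excerpt; the CallConv enum and the bytesPoppedByCallee name are stand-ins invented here, not LLVM's own types.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for the calling conventions named in the excerpt (not LLVM's enum).
enum class CallConv { C, X86StdCall, Fast };

// Number of bytes the callee is assumed to pop on return, following the
// same cases as the excerpt: stdcall/fastcc callees pop all argument bytes,
// everything else pops only the hidden struct-return pointer, if any.
std::uint32_t bytesPoppedByCallee(CallConv cc, bool isVarArg, bool isSRet,
                                  std::uint32_t numBytes) {
  assert(!(isVarArg && cc == CallConv::Fast) &&
         "fastcc does not support varargs");
  const std::uint32_t sretOnly = isSRet ? 4 : 0;
  if (cc == CallConv::X86StdCall || cc == CallConv::Fast)
    return isVarArg ? sretOnly : numBytes;
  return sretOnly;
}
```

For a non-vararg fastcc call with 8 bytes of arguments this returns 8, while the same call under the C convention returns 0. That difference is the value both sides of a call have to agree on.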
Later, when X86RegisterInfo::eliminateCallFramePseudoInstr is called
and frame pointer elimination is enabled, the following code runs:
...
else if (I->getOpcode() == X86::ADJCALLSTACKUP) {
  // If we are performing frame pointer elimination and if the callee pops
  // something off the stack pointer, add it back. We do this until we have
  // more advanced stack pointer tracking ability.
  if (uint64_t CalleeAmt = I->getOperand(1).getImm()) {
    unsigned Opc = (CalleeAmt < 128) ?
      (Is64Bit ? X86::SUB64ri8 : X86::SUB32ri8) :
      (Is64Bit ? X86::SUB64ri32 : X86::SUB32ri);
    MachineInstr *New =
      BuildMI(TII.get(Opc), StackPtr).addReg(StackPtr).addImm(CalleeAmt);
    MBB.insert(I, New);
  }
}
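The effect of that compensation can be sketched outside of LLVM. In the sketch below, subOpcodeFor and spAfterCallAndFixup are hypothetical helpers written for illustration; the opcode names are returned as plain strings taken from the excerpt. The "< 128" test matches the fact that the ri8 forms carry a sign-extended 8-bit immediate, so only values up to 127 fit.

```cpp
#include <cstdint>
#include <string>

// Picks the SUB variant the same way the excerpt does, as a string.
std::string subOpcodeFor(std::uint64_t calleeAmt, bool is64Bit) {
  if (calleeAmt < 128)                       // fits a sign-extended imm8
    return is64Bit ? "SUB64ri8" : "SUB32ri8";
  return is64Bit ? "SUB64ri32" : "SUB32ri";
}

// Models the stack pointer around a call whose callee pops calleeAmt bytes.
// spAtCall is the (modelled) stack pointer once the arguments are pushed.
std::int64_t spAfterCallAndFixup(std::int64_t spAtCall,
                                 std::uint64_t calleeAmt) {
  const std::int64_t amt = static_cast<std::int64_t>(calleeAmt);
  std::int64_t sp = spAtCall + amt;          // callee popped on return
  if (calleeAmt != 0)
    sp -= amt;                               // inserted SUB adds it back
  return sp;
}
```

In this model the stack pointer after the fixup equals its value at the call site, which is what the comment in the excerpt means by "add it back". It only holds if the callee really popped calleeAmt bytes.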
I am not sure about having a command line switch toggle the
stack-adjusting behaviour of a function. If the function is called
from a different module that was compiled with the opposite setting
of the switch, all hell would break loose (because the caller assumes
the callee pops the arguments when it does not, or vice versa).