[LLVMdev] [RFC] [PATCH] add tail call optimization to thumb1-only targets
John Brawn
john.brawn at arm.com
Tue Jan 13 02:55:30 PST 2015
Forcing tail calls, even when it isn’t profitable in terms of performance (or space in the case of
-Os, though off the top of my head I can’t think of any case where a faster tail call would also
be larger), is what the -tailcallopt option is for: see
http://llvm.org/docs/CodeGenerator.html#tail-call-optimization
John
From: bruce.hoult at gmail.com [mailto:bruce.hoult at gmail.com] On Behalf Of Bruce Hoult
Sent: 13 January 2015 00:01
To: John Brawn
Subject: Re: [LLVMdev] [RFC] [PATCH] add tail call optimization to thumb1-only targets
On Tue, Jan 13, 2015 at 3:50 AM, John Brawn <john.brawn at arm.com> wrote:
> During epilog generation, spill register restoring will be done within
> the emit epilogue function.
> If LR happens to be spilled on the stack by the prologue, it's restored
> by use of a scratch register
> just before restoring the other registers.
POP is 1+N cycles whereas LDR is 2 cycles. If we need to LDR lr from the
stack and then POP r4, that's 2 (LDR) + 1+1 (POP) + 1 (MOV to lr) + 1
(ADD sp) = 6 cycles, whereas a POP {r4,lr} is just 3 cycles.
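
As a rough check of the arithmetic above, here is a minimal C sketch of that cycle model. The per-instruction costs are the ones stated in the message (POP of N registers = 1+N, LDR = 2, MOV = 1, ADD = 1), taken as assumptions rather than measurements; the function and constant names are hypothetical and exist only for illustration.

```c
/* Cycle costs as stated in the message; assumed, not measured. */
enum { LDR_CYCLES = 2, MOV_CYCLES = 1, ADD_CYCLES = 1 };

/* POP of nregs registers costs 1 + N cycles under this model. */
static int pop_cycles(int nregs) {
    return 1 + nregs;
}

/* Sequence described above: LDR lr's value into a scratch register,
   POP {r4}, MOV scratch to lr, ADD sp to skip the stale slot. */
static int scratch_restore_cycles(void) {
    return LDR_CYCLES + pop_cycles(1) + MOV_CYCLES + ADD_CYCLES;
}

/* The single-instruction alternative quoted above: POP {r4,lr}. */
static int combined_pop_cycles(void) {
    return pop_cycles(2);
}
```

Under these assumed costs the two sequences come out at 6 and 3 cycles, matching the figures quoted.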
You appear to be using speed as the figure of merit, but that is not the point of tail call optimisation (except incidentally).
TCO is to minimise the use of precious stack space, and in fact to allow certain algorithms and program transformations to run in constant stack space.
If a programmer assumes that TCO is available and writes their program using continuation-passing style, and then TCO does not actually happen, that is a correctness issue and the program will overflow the stack and crash very quickly.
A few register loads or spills are a distant second consideration.
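
To make the stack-space point concrete, here is a small C sketch. The function names are hypothetical and not taken from the patch. Every recursive call below is in tail position, so with TCO each call can become a jump and the pair runs in constant stack; without TCO, stack depth grows linearly with n. The loop version shows roughly what the tail-call form amounts to once the calls are turned into jumps.

```c
#include <stdbool.h>

static bool is_odd(unsigned n);

/* The recursive call is the last thing this function does (a tail call):
   nothing in this frame is needed after is_odd returns. */
static bool is_even(unsigned n) {
    if (n == 0)
        return true;
    return is_odd(n - 1);
}

static bool is_odd(unsigned n) {
    if (n == 0)
        return false;
    return is_even(n - 1);
}

/* Constant-stack equivalent: each "call" just flips the answer
   and decrements n, with no new frame. */
static bool is_even_loop(unsigned n) {
    bool even = true;
    while (n != 0) {
        even = !even;
        n--;
    }
    return even;
}
```

For small n both forms give the same answer either way. The difference only shows for large n, where the recursive form depends on the compiler actually performing the tail calls in order not to exhaust the stack.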