[PATCH] Add tail call optimization for thumb1-only targets

Bjoern Haase bjoern.m.haase at web.de
Thu Jan 15 12:22:49 PST 2015

Hi jmolloy,

For tail calls identified during DAG generation, the target address is
loaded into a register from the constant pool.
If R3 is used for argument passing, the target address is forced
into hard reg R12 in order to work around limitations of the thumb1
register allocator with respect to the upper registers.
According to the review by Jonathan Roelofs, this aspect needs a closer look. He suspects there is some risk that the register scavenger might interfere with data in R12.

During epilogue generation, spilled registers are restored within
the emit epilogue function. Three different cases are to be distinguished.

1) LR is not pushed on the stack. Then simply a BX is generated.
2) LR is pushed on the stack and R3 is available as scratch. LR is restored after the pop { ... } of the remaining callee saved regs.
3) R3 is not available for the LR restore. LR is restored before the pop { ... } and the stack pointer is re-adjusted afterwards.
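
To make the case distinction easier to follow, here is a minimal sketch of the selection logic in C++. It is not code from the patch: the names EpilogueCase, classifyEpilogue and sketchEpilogue are hypothetical, and the instruction strings are only my assumption of what such sequences could look like for a prologue of "push {r4, r5, r6, r7, lr}". The sequences the patch really emits are in the test cases.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical names throughout; this is an illustration, not the patch.
enum class EpilogueCase { BranchOnly, RestoreLRAfterPop, RestoreLRBeforePop };

// Select one of the three cases described in the mail.
EpilogueCase classifyEpilogue(bool lrPushed, bool r3Free) {
  if (!lrPushed)
    return EpilogueCase::BranchOnly;                // case 1
  return r3Free ? EpilogueCase::RestoreLRAfterPop   // case 2
                : EpilogueCase::RestoreLRBeforePop; // case 3
}

// Assumed sequences for a prologue "push {r4, r5, r6, r7, lr}", so the
// saved LR sits at [sp, #16]. Case 1 assumes nothing was pushed at all.
// targetReg holds the tail call target (must not be r3 in case 2).
std::vector<std::string> sketchEpilogue(EpilogueCase c,
                                        const std::string &targetReg) {
  assert(!targetReg.empty() && "tail call target must be in a register");
  std::vector<std::string> insns;
  switch (c) {
  case EpilogueCase::BranchOnly:
    break;
  case EpilogueCase::RestoreLRAfterPop:
    // thumb1 pop cannot write LR directly, so go through scratch r3.
    insns = {"pop {r4, r5, r6, r7}", "pop {r3}", "mov lr, r3"};
    break;
  case EpilogueCase::RestoreLRBeforePop:
    // Borrow r4 before it is restored, then skip the LR slot on the stack.
    insns = {"ldr r4, [sp, #16]", "mov lr, r4", "pop {r4, r5, r6, r7}",
             "add sp, #4"};
    break;
  }
  insns.push_back("bx " + targetReg);
  return insns;
}
```

For example, sketchEpilogue(classifyEpilogue(true, false), "r12") corresponds to case 3 with the target held in R12, which is the situation where R3 carries an argument.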

For a Cortex-M0 I counted that sequence 2) takes one cycle longer than a version based on BL / pop { ..., pc } without a tail call. Sequence 3) is 3 cycles slower than a version without a tail call. Both cases 2) and 3) generate slightly larger code (have a look at the test cases).

In discussions on llvm-dev some argued that, for this reason, tail call optimization should not be part of the default options. In my personal view, the precious stack memory saved is well worth it.




-------------- next part --------------
A non-text attachment was scrubbed...
Name: D7005.18243.patch
Type: text/x-patch
Size: 17089 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20150115/86c2efe9/attachment.bin>
