[PATCH] Add tail call optimization for thumb1-only targets rev. 3
bjoern.m.haase at web.de
Sat Jan 17 01:37:31 PST 2015
I've looked at the code of the register scavenger for thumb1.
There definitely *is* an issue. The register scavenger takes R12 without checking whether it is already in use.
We will need to change the scavenger to be on the safe side. Using the stack for scavenging will prove difficult once possible alloca() uses are taken into account.
Scavenging will never be necessary within the epilogue code itself. So if the "ldr rX; mov R12, rX" sequence shows up immediately before the epilogue, we will not have a problem. One might therefore look for a way to force the "mov R12, rX" to be the very last instruction before the epilogue.
At a minimum, I'd add checks for uses of R12 in the target scavenger, so that we run into an assert instead of silently generating bad code.
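To make the idea of such a check concrete, here is a minimal, self-contained sketch. It does not use the real LLVM classes: Reg, Instr, isLiveAt and scavengeR12Checked are hypothetical stand-ins, and liveness is computed only for a single straight-line block.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical register model, not the LLVM register enum.
enum Reg { R0, R1, R2, R3, R12 = 12, LR = 14 };

// Hypothetical instruction model: just the registers written and read.
struct Instr {
  std::vector<Reg> defs;
  std::vector<Reg> uses;
};

// Returns true if `reg` still holds a needed value just before block[at]:
// scanning forward, a read is reached before any overwrite, or the end of
// the block is reached and `reg` is live-out.
bool isLiveAt(const std::vector<Instr>& block, std::size_t at, Reg reg,
              const std::set<Reg>& liveOut) {
  for (std::size_t i = at; i < block.size(); ++i) {
    for (Reg u : block[i].uses)
      if (u == reg) return true;   // read before being overwritten
    for (Reg d : block[i].defs)
      if (d == reg) return false;  // overwritten before being read
  }
  return liveOut.count(reg) != 0;
}

// Hands out R12 as scratch register, but asserts instead of silently
// clobbering a live value (for example a tail-call target held in R12).
Reg scavengeR12Checked(const std::vector<Instr>& block, std::size_t at,
                       const std::set<Reg>& liveOut) {
  assert(!isLiveAt(block, at, R12, liveOut) &&
         "R12 is live here; scavenging it would generate bad code");
  return R12;
}
```

With a block modelling "mov R12, rX" followed by a branch through R12, a scavenging request between the two instructions would trip the assert, while a request before the mov would pass.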
The other option I see is to make the register scavenger use LR whenever LR is in the CSI list. LR is pushed and popped as soon as any GPR is spilled and restored. When the call target address is loaded into R12 for a tail call, LR is always pushed, so in this case we can readily use LR instead of R12 for scavenging. The only issue I can imagine with this approach is a possible use of __builtin_return_address, which might try to read the address from LR and interfere with scavenging.
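The selection rule could look roughly like the sketch below. Again, this is only an illustration with hypothetical names (PhysReg, FrameInfo, pickScavengeReg), not the real LLVM frame-lowering API. Falling back to R12 when the function reads its return address is a conservative assumption of this sketch, not something the analysis above settles.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical register model, not the LLVM register enum.
enum class PhysReg { R4, R5, R6, R7, R12, LR };

// Hypothetical summary of what the scavenger needs to know about a function.
struct FrameInfo {
  std::vector<PhysReg> calleeSaved;  // stand-in for the CSI list
  bool readsReturnAddress;           // function uses __builtin_return_address
};

// Prefer LR as scratch register when the prologue pushes it anyway and
// nothing reads the return address from it; otherwise keep using R12.
PhysReg pickScavengeReg(const FrameInfo& fi) {
  const bool lrSaved =
      std::find(fi.calleeSaved.begin(), fi.calleeSaved.end(), PhysReg::LR) !=
      fi.calleeSaved.end();
  if (lrSaved && !fi.readsReturnAddress)
    return PhysReg::LR;
  return PhysReg::R12;  // the caller still has to check that R12 is free
}
```

In the R12 fallback case, the scavenger would still need some check that R12 is not currently holding a tail-call target.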