[llvm-dev] Finding scratch register after function call
Michael Stellmann via llvm-dev
llvm-dev at lists.llvm.org
Sun Jul 22 00:26:43 PDT 2018
Thanks Bruce,
as elaborate as ever. Again, I'm surprised by your very thorough
Z80 knowledge, given that you said you only did a little on the ZX81 in the
eighties :D
OK, understood. I was first thinking about doing something like this for
small frames:
1. push bc    # 1 byte; 11 cycles - part of call frame cleanup: save scratch register
+-----begin call-related
2. ld <rr>,stack-param
3. push <rr>
... more code to load non-stack params into registers
4. call ...
5. pop bc     # 1 byte; 10 cycles - call frame cleanup: restore stack pointer (value in BC is not used)
+-----end call-related
6. pop bc     # part of call frame cleanup: restore scratch reg's value
The stack cleanup would insert line 5, and would also have to insert lines
1 and 6, summing up to 3 bytes of instructions. Maybe the outer two could
be eliminated in a late optimization pass, when register usage is known.
But then again, looking at your math and the *total* memory and cycles,
incl. setup and tear-down, convinced me to drop my complex idea of
saving "BC" and using it for cleanup, or of sacrificing the calling
convention. The complexity just doesn't justify the gains. Instead, I'm
going for easy-to-implement solutions:
For small call frames with only 1 or 2 params on the stack, two "inc sp"
(1 byte, 6 cycles per inst) per parameter can be used, and your "big
stack frame" suggestion for larger ones.
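
To make the comparison concrete, here is a rough sketch that adds up the byte and cycle counts quoted in this message for the two small-frame cleanups. It assumes 16-bit stack parameters and assumes the save/restore scheme would need one extra "pop bc" per additional stack parameter. The function names are made up for illustration, and the "big stack frame" variant is left out because its exact sequence is not given here.

```python
# Z80 instruction costs as quoted in this message: (bytes, cycles)
PUSH_BC = (1, 11)
POP_BC = (1, 10)
INC_SP = (1, 6)


def total(instructions):
    """Sum (bytes, cycles) over a sequence of instruction costs."""
    return (sum(i[0] for i in instructions), sum(i[1] for i in instructions))


def save_restore_cleanup(stack_params):
    """Lines 1, 5 and 6 of the listing: save BC, pop once per stack param, restore BC.

    Assumption: every stack parameter is 16 bits wide, so one 'pop bc' removes it.
    """
    return total([PUSH_BC] + [POP_BC] * stack_params + [POP_BC])


def inc_sp_cleanup(stack_params):
    """Two 'inc sp' per 16-bit stack parameter; no scratch register is needed."""
    return total([INC_SP] * (2 * stack_params))


if __name__ == "__main__":
    for n in (1, 2):
        print(n, "stack param(s):",
              "save/restore =", save_restore_cleanup(n),
              "inc sp =", inc_sp_cleanup(n))
```

Under these assumptions, one stack parameter costs 3 bytes and 31 cycles with the save/restore scheme versus 2 bytes and 12 cycles with "inc sp"; for two parameters it is 4 bytes and 41 cycles versus 4 bytes and 24 cycles.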
Using these simple cleanups also allows keeping a "beneficial" calling
convention for params and return values:
I want to assign the first 3 (HL + DE + BC - or at least 2) function
params to registers, so stack cleanup is only required at all for
functions with more than 3 parameters - or vararg funcs.
And only functions with more than 5 params will need the "big stack
frame" cleanup. Those cases are rare (or at least can be avoided easily
by a developer), or, knowing the mechanics, shouldn't be used for
time-critical inner loops anyway.
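
As a rough sketch of how that rule of thumb plays out, the snippet below assigns parameters to HL, DE and BC in order and picks a cleanup strategy from the number of parameters left on the stack. It assumes every parameter is 16 bits wide and ignores vararg functions; the names `assign_params` and `cleanup_strategy` are hypothetical and not part of any existing backend.

```python
# Register order proposed in this message for the first three parameters.
ARG_REGISTERS = ("HL", "DE", "BC")
# "inc sp" cleanup is proposed for 1 or 2 parameters on the stack.
MAX_INC_SP_PARAMS = 2


def assign_params(num_params):
    """Return where each parameter lives: a register name or 'stack'."""
    return [
        ARG_REGISTERS[i] if i < len(ARG_REGISTERS) else "stack"
        for i in range(num_params)
    ]


def cleanup_strategy(num_params):
    """Pick the caller-side stack cleanup for a non-vararg function."""
    stack_params = assign_params(num_params).count("stack")
    if stack_params == 0:
        return "none"
    if stack_params <= MAX_INC_SP_PARAMS:
        return "inc sp"
    return "big stack frame"


if __name__ == "__main__":
    for n in range(1, 8):
        print(n, assign_params(n), cleanup_strategy(n))
```

With these assumptions, up to 3 parameters need no cleanup, 4 or 5 use "inc sp", and 6 or more fall back to the "big stack frame" cleanup.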
Being able to keep HL for the return value allows very efficient nested
function calls of the form "Func1(Func2(nnn));", as register shuffling
can be avoided - the result of Func2 can be passed directly to Func1.
Thanks for pointing me in the right direction again!
Michael
Oh, and BTW, I'm planning to do the backend primarily for the MSX - my
first computer in 1984. Just for the fun of it, I have now started writing a
small game for it after 25+ years of absence, and was wondering what 30+
years of compiler technology would be able to achieve on such a simple (but
challenging, as in "not-always-straightforward") CPU ;-)