[llvm-dev] Finding scratch register after function call

Bruce Hoult via llvm-dev llvm-dev at lists.llvm.org
Sat Jul 21 19:46:16 PDT 2018


Seems like an idea for bigger stack frames would be:

ld ix, 0xNNNN  # 4 bytes, 14 cycles (or iy)
add ix,sp      # 2 bytes, 15 cycles
ld sp,ix       # 2 bytes, 10 cycles

You could then pop whatever registers you actually saved at the start of
the function (maybe including IX) at one byte and 10 cycles for each 16 bit
register.

Using hl instead of ix/iy would be 3 bytes smaller and 12 cycles faster, but
then you'd need to keep any 16 bit result somewhere else (e.g. bc) first and
then move it back:

ld hl, 0xNNNN  # 3 bytes, 10 cycles
add hl,sp      # 1 byte, 11 cycles
ld sp,hl       # 1 byte, 6 cycles
ld h,b         # 1 byte, 4 cycles
ld l,c         # 1 byte, 4 cycles

So in the end you only save 1 byte and 4 cycles.
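
The totals behind that comparison can be checked mechanically. Below is a minimal C++ sketch that only sums the per-instruction figures quoted in the two listings above; the `Insn` struct, the `total` helper and the `viaIX`/`viaHL` names are illustrative and not part of any LLVM API.

```cpp
#include <vector>

// Size and timing of one Z80 instruction, as quoted in the listings above.
struct Insn {
    const char *text;
    int bytes;
    int cycles;
};

// Sum the bytes and cycles of an instruction sequence.
Insn total(const std::vector<Insn> &seq) {
    Insn sum{"total", 0, 0};
    for (const Insn &i : seq) {
        sum.bytes += i.bytes;
        sum.cycles += i.cycles;
    }
    return sum;
}

// Stack adjustment through IX (or IY): the return value in HL is untouched.
const std::vector<Insn> viaIX = {
    {"ld ix,0xNNNN", 4, 14},
    {"add ix,sp", 2, 15},
    {"ld sp,ix", 2, 10},
};

// Stack adjustment through HL, plus moving a 16 bit result back from BC.
const std::vector<Insn> viaHL = {
    {"ld hl,0xNNNN", 3, 10},
    {"add hl,sp", 1, 11},
    {"ld sp,hl", 1, 6},
    {"ld h,b", 1, 4},
    {"ld l,c", 1, 4},
};
```

With these figures, total(viaIX) comes to 8 bytes and 39 cycles and total(viaHL) to 7 bytes and 35 cycles, which is where the 1 byte and 4 cycles come from.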

Annoying that different instructions have different register restrictions:

load 16 bit constant: BC, DE, HL, SP, IX, IY
destination of 16 bit add: HL, IX, IY
source of 16 bit add: BC, DE, SP, same as dest
destination of 16 bit move: only SP!
source of 16 bit move to SP: HL, IX, IY
16 bit push/pop: AF, BC, DE, HL, IX, IY

So both the add and the move to SP restrict you to HL, IX, IY as the
possibilities. BC, DE, AF aren't even options.
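
One way to see this is to write the restrictions down as register sets and intersect them. This is a rough C++ sketch, assuming one bit per 16 bit register pair; the enum, the constant names and the `names` helper are made up for illustration and do not correspond to anything in LLVM.

```cpp
#include <cstdint>
#include <string>

// One bit per 16 bit register (illustrative encoding, not a real Z80 encoding).
enum Reg : std::uint8_t {
    AF = 1 << 0, BC = 1 << 1, DE = 1 << 2, HL = 1 << 3,
    SP = 1 << 4, IX = 1 << 5, IY = 1 << 6
};

// Register restrictions, transcribed from the list above.
constexpr std::uint8_t LoadImm16   = BC | DE | HL | SP | IX | IY;
constexpr std::uint8_t Add16Dest   = HL | IX | IY;
constexpr std::uint8_t MoveToSPSrc = HL | IX | IY;
constexpr std::uint8_t PushPop16   = AF | BC | DE | HL | IX | IY;

// A scratch register for "load constant, add sp, move to sp" has to be
// legal in all three roles.
constexpr std::uint8_t FrameAdjustScratch = LoadImm16 & Add16Dest & MoveToSPSrc;

// Render a register set as a space-separated list of names.
std::string names(std::uint8_t mask) {
    static const struct { std::uint8_t bit; const char *name; } table[] = {
        {AF, "AF"}, {BC, "BC"}, {DE, "DE"}, {HL, "HL"},
        {SP, "SP"}, {IX, "IX"}, {IY, "IY"}};
    std::string out;
    for (const auto &e : table) {
        if (mask & e.bit) {
            if (!out.empty()) out += ' ';
            out += e.name;
        }
    }
    return out;
}
```

Here names(FrameAdjustScratch) yields "HL IX IY", and the set has no overlap with BC, DE or AF.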


Again, you don't "check if a register is free at that point". You *tell*
llvm that the function return needs IX (or whatever) free, and it makes
sure that happens.

On Sat, Jul 21, 2018 at 12:14 PM, Michael Stellmann via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

> For a Z80 backend, "eliminateCallFramePseudoInstr()" shall adjust the
> stack pointer in three possible ways, e.g. after a function call, depending
> on the amount (= adjustment size) *and some other rules*:
>
> 1. via one or more target "pop <reg>" instructions (SP increments +2 per
> instruction), using an unused reg (disregarding the contents after the
> operation), followed by an optional +1 increment on the SP for odd amounts
> (SP can be inc/dec'd by 1 directly).
>
> 2. incrementing the SP register directly with a special target operation
> (increments +1 per operation), without using any other register. Requires
> twice as many instructions as "pop" in sequence, though.
>
> 3. via a long sequence of target-specific arithmetic instructions that
> involves a scratch reg (which would have to be saved before and restored
> after the call). This should only be used for larger sizes of call frame
> index.
>
>
> Option 1 ("pop"s) is by far preferred for small call frame sizes. However,
> this requires finding a suitable register. And this is where it gets
> complicated:
> "Suitable" for option 1 means that it shall be any of the 4 physical
> registers AF, HL, DE, BC. When invoked after a function call to do
> caller-cleans-stack, none of the register(s) used for return value of the
> function call shall be used. The calling convention uses
> - one specific physical 8 bit register (lower 8 bits of AF) for 8 bit
> return values
> - one specific physical 16 bit register (reg HL) for 16 bit return values
> - two 16 bit regs (regs HL + DE) for 32 bit return values
>
> When determining if one of those registers is available for the option 1
> way of cleaning-up the stack after a function call, reverse-order priority
> is preferred: BC, DE, HL, AF - due to the highly asymmetric command set of
> the Z80, the "least otherwise usable" register should be used for that
> operation, starting with "BC".
>
> Now my questions are:
> A) Is there a way to check if any of those registers are free at that
> point (in eliminateCallFramePseudoInstr()) - i.e. not used as return or
> to hold other values?
> B) If it could be determined that none of the registers are free, option 2
> (adjusting the SP by a series of +1) should be used for small amounts of
> call frame size, option 1 with the forced register "BC" for "pop" for mid
> amounts, and option 3 for larger amounts.
>
> Now if register "BC" is forced to be used to clean the stack up after a
> function call, it should be saved (via "push") on the stack before the
> function is called, or to be specific, even *before* the first function
> parameter for the upcoming function call is pushed to the stack. And
> restored after call frame cleanup (after the last
> call-frame-elimination-"pop") - by another "pop", restoring the original
> value.
>
> Would "createVirtualRegister" with a register class containing only that
> register do exactly that?
>
> Or is there a better way to do this?
>
> Thanks,
> Michael