[llvm-dev] Question about target instruction optimization
Michael Stellmann via llvm-dev
llvm-dev at lists.llvm.org
Thu Jul 26 00:23:52 PDT 2018
Yes, "crippled" is the right word to describe some areas of the
instruction set.
You are right about the comparison with X86's registers.
Indeed, the idea is to save IX and assign the stack pointer to it in a
function's prologue, and in the epilogue to restore the SP from IX and
then IX's original value, *if* the function needs a frame (i.e. has
parameters or variables on the stack).
Plus, even adjusting the stack pointer past the local storage area
requires sacrificing the HL register pair for larger amounts (or a
series of "inc sp" instructions or dummy pushes for smaller ones). So
the emitted code for the frame setup will be quite dynamic, depending on
the circumstances, but can be done with ~3-4 cases.
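For illustration, a sketch of what such a prologue / epilogue could look
like (the local-storage size of 10 bytes is just a placeholder):

    ; prologue, frame needed, 10 bytes of locals
        push ix            ; save caller's IX
        ld   ix,0
        add  ix,sp         ; IX = frame base
        ld   hl,-10
        add  hl,sp
        ld   sp,hl         ; reserve local storage (sacrifices HL)

    ; epilogue
        ld   sp,ix         ; discard locals
        pop  ix            ; restore caller's IX
        ret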
Stack access to locals / spilled vars within the boundaries of the
offset range (-128 to +127) will be done via IX, and any access outside
that range will require address calculation from the current stack base
via HL (loading the offset into HL and adding SP).
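For example, an access to a slot outside the IX displacement range might
come out like this (the offset of +200 from SP is just made up):

        ld   hl,200
        add  hl,sp         ; HL = address of the stack slot
        ld   e,(hl)
        inc  hl
        ld   d,(hl)        ; DE = 16-bit value from the slot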
And even then, saving / spilling a physical register (16 bit pair) into
a stack position within the range of IX is a costly sequence (LD
(IX+n),low8 / LD (IX+(n+1)),high8 to store a value, and the reverse to
restore it): 12 bytes in total, and IX/IY instructions are always slower
than operations with other regs. So "push" and "pop" for temporary
short-term (single) register spilling would be preferred: those need
only 1 byte each and save / restore the whole 16 bit value in one go.
Not sure how to tell LLVM to do so, though.
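Side by side (the IX offsets are just placeholders), spilling HL via the
frame pointer:

        ld   (ix-2),l      ; 3 bytes, 19 T-states each
        ld   (ix-1),h
        ; ...
        ld   l,(ix-2)      ; reload
        ld   h,(ix-1)

versus a short-lived spill via the stack:

        push hl            ; 1 byte, 11 T-states
        ; ...
        pop  hl            ; 1 byte, 10 T-states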
However, for functions small enough to do all computation in the
available registers, or where spilling can be limited to a few push and
pop operations, the whole call frame setup can be skipped entirely. If a
few params can be passed through registers, the resulting code can be as
efficient as hand-written assembly (or even better, given the
capabilities of SSA).
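A frameless leaf function could then boil down to something like this
(the register-based parameter passing shown here is just an assumption,
nothing is fixed yet):

    ; int16 add16(int16 a, int16 b), a in HL, b in DE
    add16:
        add  hl,de         ; result in HL, no frame at all
        ret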
Knowing that, a developer will have control over it to at least *some*
degree. Making a local var "static" would allow the compiler to use the
efficient instruction to store and load a 16 bit variable directly at a
memory address (at the cost of losing recursion and the chance that the
optimizer could keep it in a register).
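That is, something along these lines (the variable name is made up):

    counter: defw 0        ; 16-bit static storage at a fixed address

        ld   hl,(counter)  ; 3 bytes, 16 T-states
        inc  hl
        ld   (counter),hl  ; 3 bytes, 16 T-states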
But then, using existing C compilers for the Z80 (from *way* back then)
was always a game of compiling, checking the emitted code, rearranging,
building and checking again if the function was time-critical, or
writing it in assembly. In most cases, the code just needs to be "good
enough", and good compilers achieved 50-90% of the performance of
hand-written assembly.
And of course, I expect LLVM to beat that >;->
I started off with jacobly0's (E)Z80 backend heritage, made it build
with a recent version of LLVM again, and tried to understand the
shortcomings and areas of improvement.
It is targeted at the eZ80's more powerful instruction set, introduced a
lot of custom code into the LLVM base to support the eZ80's 24 bit
native pointers and custom binary file output (which caused it to no
longer work with the current LLVM codebase), and, last but not least, is
incomplete.
I created a project area on GitHub
(https://github.com/MI-CHI/llvm-z80-backend) where I've started working
with a friend from the MSX community on that, with - apart from making a
decent Z80 backend - some long term goals such as banking support for a
"far" memory model, to be able to compile "bigger" applications such as
a ZIP archiver. Might end up horribly slow, though :D
As of now, we are in the phase of getting a feel for how to do things in
the LLVM backend and of defining the rules for instruction lowering,
calling conventions and frame setup (shamelessly studying the code that
old Z80 C compilers emit ;-)
Michael