[llvm-dev] Question about target instruction optimization

Bruce Hoult via llvm-dev llvm-dev at lists.llvm.org
Wed Jul 25 14:33:44 PDT 2018


This is so far down the list of problems you'll have (and the difference to
program size and speed is so trivial) that I think you should ignore it until
you have a working compiler.

As for two registers getting the same value, that should be picked up by
common subexpression elimination in the optimiser anyway.

You might want to consider having a pseudo-instruction for LD
{BC,DE,HL,IX,IY},{BC,DE,HL,IX,IY} (all combinations are valid except those
containing two of HL,IX,IY). You could expand this very late in the
assembler, or during legalisation.
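
A rough sketch of what that expansion could look like, written in Python
rather than LLVM C++ and purely illustrative: expand_ld16, HALVES and
PREFIX_GROUP are made-up names, not anything in LLVM. It assumes the
undocumented IXH/IXL/IYH/IYL halves are acceptable on your target, and it
treats a same-register copy as a no-op, which is my own choice.

```python
# Hypothetical sketch: expand a 16-bit "LD rr,rr'" pseudo-instruction into
# two real 8-bit register-to-register loads. Not LLVM API.

# (high, low) 8-bit halves of each 16-bit register.
HALVES = {
    "BC": ("B", "C"),
    "DE": ("D", "E"),
    "HL": ("H", "L"),
    "IX": ("IXH", "IXL"),  # undocumented halves, assumed usable here
    "IY": ("IYH", "IYL"),  # undocumented halves, assumed usable here
}

# H/L, IXH/IXL and IYH/IYL share the same encoding slots (chosen by a
# prefix byte), so a single 8-bit LD cannot mix halves of two different
# registers from this group.
PREFIX_GROUP = {"HL", "IX", "IY"}


def expand_ld16(dst, src):
    """Return the list of 8-bit LD instructions that copy src into dst."""
    if dst not in HALVES or src not in HALVES:
        raise ValueError(f"unknown 16-bit register: {dst},{src}")
    if dst == src:
        return []  # assumption: a self-copy needs no code at all
    if dst in PREFIX_GROUP and src in PREFIX_GROUP:
        raise ValueError(f"LD {dst},{src} cannot be expanded to 8-bit loads")
    dst_hi, dst_lo = HALVES[dst]
    src_hi, src_lo = HALVES[src]
    return [f"LD {dst_hi},{src_hi}", f"LD {dst_lo},{src_lo}"]
```

For example, expand_ld16("DE", "HL") gives the LD D,H / LD E,L pair, while
expand_ld16("HL", "IX") is rejected and would need some other sequence.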


On Wed, Jul 25, 2018 at 10:42 AM, Michael Stellmann via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

> This is a question about optimizing the code generation in a (new) Z80
> backend:
>
> The CPU has a couple of 8 bit physical registers, e.g. H, L, D and E,
> which are overlaid in 16 bit register pairs named HL and DE.
>
> It also has a native instruction to load a 16 bit immediate value into a
> 16 bit register pair (HL or DE), e.g.:
>
>     LD HL,<imm16>
>
> Now, when loading two 16 bit register pairs in sequence with the
> *same* immediate value, the simple approach is:
>
>     LD HL,<imm16>
>     LD DE,<imm16>
>
> However, the second line can be shortened (in opcode bytes and cycles) to
> load the overlaid 8 bit registers of HL (H and L) into the overlaid 8 bit
> registers of DE (D and E), so the desired result is:
>
>     ; optimized version: saves 1 byte and 2 cycles
>     LD D,H    ; sets the high 8 bits of DE from the high 8 bits of HL
>     LD E,L    ; same for the lower 8 bits
>
>
> Another example: If reg pair DE needs to be loaded with imm16 = 0, and
> another physical(!) register is known to be 0 (from a previous immediate
> load, directly or indirectly) - assuming that L = 0 (H might be something
> else) - the following code:
>
>     LD DE,0x0000
>
> should become:
>
>     LD D,L
>     LD E,L
>
> I would expect that this needs to be done in a peephole optimizer pass, as
> during the lowering process, the physical registers are not yet assigned.
>
> Now my questions:
> 1. Is that correct (peephole instead of lowering)? Should the lowering
> always emit the generic, not always optimal "LD DE,<imm16>"? Or should the
> lowering process always split the 16 bit immediate load into two 8 bit
> immediate loads (via two new virtual 8 bit registers), which would be
> eliminated later automatically?
> 2. And if peephole is the better choice, which of these is recommended:
> the SSA-based Machine Code Optimizations, or the Late Machine Code
> Optimizations? Both places in the LLVM code generator docs say "To be
> written", so I don't really know which one to choose. Or should I even
> write a custom pass?
>
> ...and more importantly, how would I check whether a physical register
> contains a specific fixed value at a certain point (in which case the
> optimization can be done)?
>
> Michael

