[LLVMdev] LLC ARM Backend maintainer

Thu Oct 13 11:42:28 PDT 2011

On Thu, Oct 13, 2011 at 11:25 AM, Joe Abbey <jabbey at arxan.com> wrote:
> LLVM Supports:
> ARMv4T  -> ARM7TDMI
> ARMv5TE -> ARM926EJ-S
>         -> XScale
> ARMv6   -> ARM1136J(F)-S
> ARMv6ZK -> ARM1176JZ(F)-S
> ARMv7A  -> Cortex-A8
>            Cortex-A9
> ARMv7M  -> Cortex-M3

Does the LLVM code generator generate Thumb code in addition to ARM code?

For those who don't know ARM, Thumb is a compact encoding of a subset
of the ARM instruction set in which each instruction is 16 bits in size.

ARM instructions are 32 bits wide.  Besides working poorly with 16-bit
Flash, they often go underused: compilers have a hard time taking
advantage of the ARM ISA's unique advantages, such as nearly every
instruction being conditional and most instructions having an optional
shift or rotate of one of the register operands.  The result is poor
code density.
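
To put a rough number on the 16-bit Flash point, here is a minimal
sketch.  It assumes a memory bus that delivers a fixed number of bits
per read and ignores caches, wait states and prefetching.  The function
name bus_reads_per_instruction is made up for this illustration.

```python
def bus_reads_per_instruction(instr_bits: int, bus_bits: int) -> int:
    """Bus reads needed to fetch one instruction (ceiling division)."""
    return -(-instr_bits // bus_bits)


# Assumed setup: a 16-bit-wide Flash part feeding the core directly.
arm_reads = bus_reads_per_instruction(32, 16)    # one 32-bit ARM instruction
thumb_reads = bus_reads_per_instruction(16, 16)  # one 16-bit Thumb instruction
print(arm_reads, thumb_reads)
```

Under those assumptions each ARM instruction costs two reads from the
Flash where a Thumb instruction costs one.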

Don't confuse the 16-bit instructions with a 16-bit ISA.  Thumb can
still address 32 bits of memory and has 32-bit registers.  In fact,
when executing Thumb code (at least on the ARM7TDMI), a single Thumb
instruction is placed in an instruction prefetch buffer, decompressed
to a 32-bit ARM instruction, and only then is it executed.
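
As a rough illustration of what "decompressed" means, the sketch below
expands one Thumb encoding, MOV Rd, #imm8, into the equivalent 32-bit
ARM encoding, MOVS Rd, #imm8 with the "always" condition.  It is a
software model of a single instruction format only, not a description
of how the hardware decoder is built, and the function name is invented
for this example.

```python
def thumb_mov_imm_to_arm(thumb: int) -> int:
    """Expand Thumb 'MOV Rd, #imm8' into ARM 'MOVS Rd, #imm8'.

    Thumb layout: 00100 Rd(3 bits) imm8(8 bits)
    ARM layout:   cond=1110 (always), immediate MOV with the S bit set,
                  Rd in bits 15..12, imm8 in bits 7..0 (no rotation).
    """
    if (thumb >> 11) != 0b00100:
        raise ValueError("not a Thumb MOV Rd, #imm8 encoding")
    rd = (thumb >> 8) & 0x7   # Thumb can only name r0-r7 here
    imm8 = thumb & 0xFF
    return 0xE3B00000 | (rd << 12) | imm8


# Thumb 0x2001 is 'MOV r0, #1'.
print(hex(thumb_mov_imm_to_arm(0x2001)))  # prints 0xe3b00001
```

The 16-bit form has no room for a condition field or a rotate amount,
so the expansion fills those in with fixed values.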

I read somewhere that the instructions present in the Thumb ISA were
chosen to be what most C compilers generate.

There is not just the original Thumb that the ARM7TDMI executes, but
also Thumb-2, an extended and hopefully improved instruction set that
later cores like the Cortex-A8 can execute.

I'm pretty sure any CPU that can run Thumb-2 code can also run the
original Thumb code.

One can mix the two ISAs through the use of instructions that switch
the modes; this is called "interworking".  The regular way to return
from a subroutine is to put the return address into register 15 (the
program counter), but the instruction BX (if I remember correctly) will
both return from a subroutine and switch the ISA.
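
For BX (Branch and Exchange), the instruction set to switch to is
selected by bit 0 of the target address held in the register operand:
set means Thumb, clear means ARM.  The sketch below models only that
rule.  The function name and the returned tuple are made up for
illustration, and alignment checks are left out.

```python
from typing import Tuple


def bx(target: int) -> Tuple[str, int]:
    """Model 'BX Rm': return (instruction set, address execution resumes at)."""
    if target & 1:
        # Bit 0 set: continue in Thumb state at the address with bit 0 cleared.
        return ("Thumb", target & ~1)
    # Bit 0 clear: continue in ARM state at the address as given.
    return ("ARM", target)


print(bx(0x8001))
print(bx(0x8000))
```

So returning with BX LR drops the caller back into whichever
instruction set it was using, provided the return address carries the
right low bit.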

I took advantage of this in my AES encryptor, which I wrote in tightly
hand-optimized assembly, by writing the less time-critical outer loops
in slower but more-compact Thumb code, with the more time-critical
inner loops being in the ARM ISA.

Hopefully, some day LLVM will be able to do this as well; if
you have a compilation unit that is mostly Thumb but you inline some
functions or unroll some loops, you should consider interworking to
ARM mode there as well.

I would expect that we would want profiler-guided optimization to drive that.

-- 
Don Quixote de la Mancha
Dulcinea Technologies Corporation
Software of Elegance and Beauty
http://www.dulcineatech.com
quixote at dulcineatech.com