[LLVMdev] MC disassembler for ARM

Tim Northover t.p.northover at gmail.com
Thu Jun 7 05:11:16 PDT 2012


Hi David,

On Thu, Jun 7, 2012 at 10:17 AM, Fan Dawei <fandawei.s at gmail.com> wrote:
> Could you please tell me more about $a, $t and $d symbols? How these symbols
> are used to define different regions? Where I can find this symbols in ELF
> object file?

At the start of each range of ARM code, an assembler or compiler
should produce a "$a" symbol with that address, and put it (naturally
enough) in the ELF symbol-table. Similarly, each stretch of Thumb code
gets a "$t" and each stretch of data a "$d".

For example if I assemble:

    .arm
    mov r0, r3
    ldr r2, Lit
Lit:
    .word 42
    add r0, r0, r0
    .thumb
    mov r5, r2

then the symbol table contains these entries:
     4: 00000000     0 NOTYPE  LOCAL  DEFAULT    1 $a
     [...]
     6: 00000008     0 NOTYPE  LOCAL  DEFAULT    1 $d
     7: 0000000c     0 NOTYPE  LOCAL  DEFAULT    1 $a
     8: 00000010     0 NOTYPE  LOCAL  DEFAULT    1 $t

which shows that an ARM region begins at offset 0x0 and a data region
at offset 0x8, that we switch back to ARM at 0xc, and that Thumb
finally takes over at 0x10.
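
If it helps to see that interpretation spelled out, here is a rough
Python sketch that turns a listing like the one above into explicit
regions. The names (KINDS, ROW, LISTING, regions) are made up for
illustration, the row layout is assumed to match the listing above, and
the section size of 0x12 is my assumption that the final Thumb mov
occupies two bytes. It also treats every symbol as belonging to one
section, which is only true of this small example.

```python
import re

# Mapping symbol prefixes: $a = ARM code, $t = Thumb code, $d = data.
KINDS = {"$a": "ARM", "$t": "Thumb", "$d": "data"}

# Assumed row layout: "<num>: <hex value> ... <name>", as in the listing above.
# A name may also carry a ".suffix" after the two-character prefix.
ROW = re.compile(r"^\s*\d+:\s+([0-9a-fA-F]+)\s+.*\s(\$[atd])(?:\.\S*)?\s*$")

def regions(listing, section_size):
    """Return (start, end, kind) tuples, one per mapping symbol found."""
    starts = []
    for line in listing.splitlines():
        m = ROW.match(line)
        if m:  # rows that are not mapping symbols are skipped
            starts.append((int(m.group(1), 16), KINDS[m.group(2)]))
    starts.sort()
    result = []
    for i, (start, kind) in enumerate(starts):
        # A region runs until the next mapping symbol, or to the section end.
        end = starts[i + 1][0] if i + 1 < len(starts) else section_size
        result.append((start, end, kind))
    return result

LISTING = """
     4: 00000000     0 NOTYPE  LOCAL  DEFAULT    1 $a
     [...]
     6: 00000008     0 NOTYPE  LOCAL  DEFAULT    1 $d
     7: 0000000c     0 NOTYPE  LOCAL  DEFAULT    1 $a
     8: 00000010     0 NOTYPE  LOCAL  DEFAULT    1 $t
"""

# 0x12 is an assumed section size, not something the listing states.
for start, end, kind in regions(LISTING, 0x12):
    print(f"{start:#x}-{end:#x}: {kind}")
```

A disassembler would want to do this per section and decode each region
with the matching instruction set, skipping the data ones.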

GNU objdump hides the symbols by default when printing the
symbol-table (you can give it the --special-syms option to show them),
but readelf shows them always.

If you want the really deep details, they're fully documented in the
ARM ELF ABI here (section 4.6.5):

http://infocenter.arm.com/help/topic/com.arm.doc.ihi0044d/IHI0044D_aaelf.pdf

That is all nice to know, but I'm afraid it probably doesn't offer an
immediate solution to the undefined instructions:
+ libc.so isn't a relocatable object file (well, it is dynamically,
but that doesn't count).
+ llvm-objdump ignores the mapping symbols anyway at the moment, as far
as I can tell.

Tim.
