[LLVMdev] R_ARM_ABS32 disassembly with integrated-as

Tim Northover t.p.northover at gmail.com
Fri Oct 5 00:15:19 PDT 2012


Hi Greg,

> Is this a bug?  If so, how can I fix it?

It's somewhere between a bug and a quality-of-implementation issue.
ARM code often places literal pools in the middle of code when it needs
to materialize a large constant (or, more likely for R_ARM_ABS32, the
address of a variable). This results in a sequence roughly like:

    ldr r0, special_lit_sym
    [...]
    b past_literals
special_lit_sym:
    .word variable_desired
past_literals:
    [...instructions...]

In general, deciding whether to disassemble a given location as code
or data is a very hard problem (think of all the evil tricks you could
play with bytes that serve as both), so the ARM ELF ABI
(http://infocenter.arm.com/help/topic/com.arm.../IHI0044D_aaelf.pdf)
specifies something called mapping symbols, which assemblers should
insert to tell disassemblers what is actually at each address.

The idea is that a $a should be inserted at the start of each run of
ARM code, $t before Thumb code and $d before data (including these
embedded litpools). In the above example, $a would be somewhere before
the first ldr, $d at "special_lit_sym" and $a again at
"past_literals". objdump will then use these to decide how to display
a given address.
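
To make the lookup concrete, here is a minimal Python sketch of how a
disassembler could use mapping symbols to classify an address. This is
not objdump's actual code. The addresses are invented for the example
above (three 4-byte instructions, then the literal), and the names
MAPPING_SYMBOLS, KINDS and classify are hypothetical.

```python
from bisect import bisect_right

# Hypothetical mapping symbols for the example above, as (address, name)
# pairs. The addresses are made up for illustration.
MAPPING_SYMBOLS = [
    (0x00, "$a"),  # start of ARM code, before the first ldr
    (0x0C, "$d"),  # special_lit_sym: the embedded literal
    (0x10, "$a"),  # past_literals: ARM code again
]

KINDS = {"$a": "arm", "$t": "thumb", "$d": "data"}


def classify(address, symbols=MAPPING_SYMBOLS):
    """Return 'arm', 'thumb' or 'data' for address.

    The nearest mapping symbol at or before the address decides.
    Returns None if no mapping symbol precedes the address.
    """
    ordered = sorted(symbols)
    addresses = [addr for addr, _ in ordered]
    index = bisect_right(addresses, address) - 1
    if index < 0:
        return None
    name = ordered[index][1]
    # Only the first two characters matter, so "$d.foo" still means data.
    return KINDS[name[:2]]
```

With these made-up addresses, classify(0x0C) gives "data" and
classify(0x10) gives "arm". Without the $d symbol the literal word would
be classified (and displayed) as an ARM instruction.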

If you dump the symbol table with "readelf -s" (objdump hides them on
my system at least) you should see these in the GCC binary, but almost
certainly not in the LLVM one.
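
If you want to check this mechanically, something like the Python
sketch below would pick the mapping symbols out of saved "readelf -s"
output. It assumes the usual binutils column layout (Num, Value, Size,
Type, Bind, Vis, Ndx, Name); the function name mapping_symbols is
hypothetical and the parsing is deliberately simple.

```python
import re

# Mapping symbol names: $a, $t or $d, optionally followed by ".<anything>".
MAPPING_NAME = re.compile(r"^\$[atd](\..*)?$")


def mapping_symbols(readelf_output):
    """Return (value, name) pairs for mapping symbols in `readelf -s` text."""
    found = []
    for line in readelf_output.splitlines():
        fields = line.split()
        # Assumed row layout: Num: Value Size Type Bind Vis Ndx Name
        if len(fields) != 8 or not fields[0].endswith(":"):
            continue
        # Skip the column header row, whose first field is "Num:".
        if not fields[0][:-1].isdigit():
            continue
        if MAPPING_NAME.match(fields[7]):
            found.append((int(fields[1], 16), fields[7]))
    return found
```

An empty result for an ARM object that contains literal pools is a
strong hint that the assembler did not emit mapping symbols.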

There's some kind of half-written support already in LLVM I believe,
but it's been broken for as long as I can remember. You'd need to make
the MC emitters properly understand when they're switching between
code and data areas, and insert the appropriate symbols.
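
As a rough illustration of what "understanding the switch" means, here
is a toy Python model of an emitter that remembers what it last emitted
and records a mapping symbol whenever that changes. It is only a sketch
of the idea, not LLVM's MC API; the class ToyEmitter and its methods are
hypothetical, and a real implementation would need to track this state
per section.

```python
class ToyEmitter:
    """Toy emitter that tracks code/data state; not LLVM's MC API."""

    SYMBOL_FOR = {"arm": "$a", "thumb": "$t", "data": "$d"}

    def __init__(self):
        self.offset = 0     # current offset within the (single) section
        self.state = None   # kind of the most recently emitted bytes
        self.symbols = []   # recorded (offset, mapping symbol name) pairs

    def _switch(self, kind):
        # Record a mapping symbol only when the kind of content changes.
        if kind != self.state:
            self.symbols.append((self.offset, self.SYMBOL_FOR[kind]))
            self.state = kind

    def emit_instruction(self, size=4, kind="arm"):
        self._switch(kind)
        self.offset += size

    def emit_data(self, size=4):
        self._switch("data")
        self.offset += size


emitter = ToyEmitter()
emitter.emit_instruction()  # ldr r0, special_lit_sym
emitter.emit_instruction()  # [...]
emitter.emit_instruction()  # b past_literals
emitter.emit_data()         # .word variable_desired
emitter.emit_instruction()  # first instruction after past_literals
```

After this run, emitter.symbols holds $a at offset 0, $d at offset 12
and $a at offset 16, which is the placement described above.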

Hope this helps.

Tim.


