[LLVMdev] R_ARM_ABS32 disassembly with integrated-as

Tim Northover t.p.northover at gmail.com
Fri Oct 5 00:15:19 PDT 2012

Hi Greg,

> Is this a bug?  If so, how can I fix it?

It's somewhere between a bug and a quality-of-implementation issue.
ARM often uses literal pools in the middle of code when it needs to
materialize a large constant (or, more likely for R_ARM_ABS32, the
address of a variable). This results in a sequence roughly like:

        ldr r0, special_lit_sym
        b past_literals
    special_lit_sym:
        .word variable_desired
    past_literals:

In general, deciding whether to disassemble a given location as code
or data is a very hard problem (think of all the evil tricks you could
play with dual-purpose bytes), so the ARM ELF ABI
specifies something called mapping symbols, which assemblers should
insert to tell disassemblers what's actually at each location.

The idea is that a $a should be inserted at the start of each region
of ARM code, $t before Thumb code and $d before data (including these
embedded literal pools). In the above example, $a would be somewhere
before the first ldr, $d at "special_lit_sym" and $a again at
"past_literals". objdump will then use these to decide how to display
a given address.
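
As a rough illustration of that lookup (a sketch, not how objdump is
actually implemented), the snippet below picks the nearest mapping
symbol at or before an address. The table and its addresses are
hypothetical: they assume the example above is placed at address 0
with 4-byte ARM instructions.

```python
from bisect import bisect_right

# Hypothetical mapping-symbol table for the example sequence,
# as (address, name) pairs sorted by address.
MAPPING_SYMBOLS = [
    (0x0, "$a"),  # ARM code: the ldr and the b
    (0x8, "$d"),  # literal pool at special_lit_sym
    (0xC, "$a"),  # ARM code resumes at past_literals
]

KINDS = {"$a": "arm", "$t": "thumb", "$d": "data"}


def classify(address, symbols=MAPPING_SYMBOLS):
    """Return 'arm', 'thumb', 'data' or 'unknown' for an address."""
    addresses = [addr for addr, _ in symbols]
    # Index of the last mapping symbol whose address is <= `address`.
    index = bisect_right(addresses, address) - 1
    if index < 0:
        return "unknown"  # no mapping symbol covers this address
    # Only the first two characters matter, so suffixed names
    # such as "$a.1" are handled as well.
    return KINDS[symbols[index][1][:2]]
```

With this table, classify(0x4) gives "arm" for the branch and
classify(0x8) gives "data" for the literal.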

If you dump the symbol table with "readelf -s" (objdump hides them on
my system at least) you should see these in the GCC binary, but almost
certainly not in the LLVM one.
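
If you'd rather not eyeball the table, something like the following
could pick the mapping symbols out of the "readelf -s" text. It is
only a sketch: SAMPLE is made-up output in the usual eight-column
layout, not taken from either of your binaries, and the parsing
assumes that layout.

```python
import re

# Made-up "readelf -s" output for illustration only.
SAMPLE = """\
   Num:    Value  Size Type    Bind   Vis      Ndx Name
     0: 00000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 00000000     0 SECTION LOCAL  DEFAULT    1
     2: 00000000     0 NOTYPE  LOCAL  DEFAULT    1 $a
     3: 00000008     0 NOTYPE  LOCAL  DEFAULT    1 $d
     4: 0000000c     0 NOTYPE  LOCAL  DEFAULT    1 $a
     5: 00000000    16 FUNC    GLOBAL DEFAULT    1 main
"""

# Matches $a, $t, $d and suffixed forms such as $d.2.
MAPPING_NAME = re.compile(r"^\$[atd](\..*)?$")


def mapping_symbols(readelf_output):
    """Return (address, name) pairs for every mapping symbol listed."""
    found = []
    for line in readelf_output.splitlines():
        fields = line.split()
        # Symbol rows start with "<number>:" and have eight columns;
        # this skips the header and rows with an empty name.
        if len(fields) != 8 or not fields[0].rstrip(":").isdigit():
            continue
        value, name = fields[1], fields[7]
        if MAPPING_NAME.match(name):
            found.append((int(value, 16), name))
    return found
```

An empty result for the LLVM-built object and a non-empty one for the
GCC-built object would confirm the diagnosis.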

There's some kind of half-written support already in LLVM I believe,
but it's been broken for as long as I can remember. You'd need to make
the MC emitters properly understand when they're switching between
code and data areas, and insert the appropriate symbols.

Hope this helps.


