[LLVMdev] R_ARM_ABS32 disassembly with integrated-as

Jim Grosbach grosbach at apple.com
Fri Oct 5 09:48:30 PDT 2012


On Oct 5, 2012, at 12:15 AM, Tim Northover <t.p.northover at gmail.com> wrote:

> Hi Greg,
> 
>> Is this a bug?  If so, how can I fix it?
> 
> It's somewhere between a bug and a quality-of-implementation issue.
> ARM often uses literal pools in the middle of code when it needs to
> materialize a large constant (or, more likely for R_ARM_ABS32, the
> address of a variable). This results in a sequence roughly like:
> 
>    ldr r0, special_lit_sym
>    [...]
>    b past_literals
> special_lit_sym:
>    .word variable_desired
> past_literals:
>    [...instructions...]
> 
> In general, deciding whether to disassemble a given location as code
> or data is a very hard problem (think of all the evil tricks you could
> play with bytes that serve as both), so the ARM ELF ABI
> (http://infocenter.arm.com/help/topic/com.arm.../IHI0044D_aaelf.pdf)
> specifies something called mapping symbols, which assemblers should
> insert to tell disassemblers how each region should be interpreted.
> 
> The idea is that a $a should be inserted at the start of each region
> of ARM code, $t before Thumb code and $d before data (including these
> embedded litpools). In the above example, $a would be somewhere before
> the first ldr, $d at "special_lit_sym" and $a again at
> "past_literals". objdump will then use these to decide how to display
> a given address.
> 
> If you dump the symbol table with "readelf -s" (objdump hides them on
> my system at least) you should see these in the GCC binary, but almost
> certainly not in the LLVM one.
> 
> There's some kind of half-written support already in LLVM I believe,
> but it's been broken for as long as I can remember. You'd need to make
> the MC emitters properly understand when they're switching between
> code and data areas, and insert the appropriate symbols.

The recent MachO data-in-code support should have fixed a lot of the problems. There are probably still some quirks in the specifics ($a vs. $t and making sure the symbols get into the ELF properly), but the core functionality to know how to mark data regions is there and works very well.
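
For anyone following along, here is a rough Python sketch of how a disassembler could consume the mapping symbols Tim describes: each $a/$t/$d symbol marks the start of a region, and an address belongs to the most recent marker at or below it. The function names, the (address, name) input format and the addresses in the usage note are hypothetical, made up for illustration; this is not how objdump or LLVM actually implement it.

```python
from bisect import bisect_right

# Mapping symbol prefixes from the ARM ELF ABI: ARM code, Thumb code, data.
KINDS = {"$a": "arm", "$t": "thumb", "$d": "data"}


def build_map(symbols):
    """Collect mapping symbols from (address, name) pairs of one section.

    Hypothetical input format; a real tool would read these pairs from
    the ELF symbol table.
    """
    marks = []
    for addr, name in symbols:
        prefix = name[:2]
        # Accept the bare form ("$d") and the suffixed form ("$d.1").
        if prefix in KINDS and (len(name) == 2 or name[2] == "."):
            marks.append((addr, KINDS[prefix]))
    marks.sort()
    return marks


def classify(marks, addr):
    """Return "arm", "thumb" or "data" for addr, or None if nothing covers it."""
    addrs = [a for a, _ in marks]
    i = bisect_right(addrs, addr) - 1  # last marker at or below addr
    return marks[i][1] if i >= 0 else None
```

With made-up addresses for the example above (say $a at 0x0, $d at 0x10 for special_lit_sym and $a at 0x14 for past_literals), classify() would report the .word as data and everything else as ARM code. Ordinary symbols such as "main" are ignored by build_map().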

-Jim

> 
> Hope this helps.
> 
> Tim.