[LLVMdev] Looking for ideas on how to make llvm-objdump handle both arm and thumb disassembly from the same object file

Renato Golin renato.golin at linaro.org
Wed Aug 6 12:50:29 PDT 2014


On 6 August 2014 19:31, Kevin Enderby <enderby at apple.com> wrote:
> First, a little background: the way darwin’s otool(1) does this is that it creates an llvm disassembler for both arm and thumb when disassembling a binary with a 32-bit ARM cpu.  It uses the C API in <llvm-c/Disassembler.h> and calls LLVMCreateDisasmCPU() twice, once with an arm TripleName and once with a matching thumb TripleName.  Then for each 32-bit ARM cpu it will default to one or the other disassembler.  Then, as it disassembles and finds a symbol in the symbol table for the current PC being disassembled, it will see if the symbol has the N_ARM_THUMB_DEF bit set or not, and then switch between the arm and thumb disassemblers.  While this is a bit of a hack, there is a limited set of Mach-O cpus otool(1) deals with.
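
The switching scheme quoted above can be sketched roughly as follows. This is a minimal illustration that does not use LLVM at all: Mode, Symbol and selectModes are hypothetical names, and the IsThumb flag merely stands in for the N_ARM_THUMB_DEF bit. It assumes the mode changes only when a symbol sits exactly at the current PC, and that the last mode seen stays in effect otherwise.

```cpp
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical stand-ins for illustration; none of these are LLVM types.
enum class Mode { ARM, Thumb };

struct Symbol {
  uint64_t Addr;
  bool IsThumb; // stands in for the N_ARM_THUMB_DEF bit of a Mach-O symbol
};

// For each PC (in disassembly order), pick the disassembler to use.
// The mode switches only when a symbol is defined exactly at that PC;
// otherwise the previously selected mode stays in effect.
std::vector<Mode> selectModes(const std::vector<Symbol> &Syms,
                              const std::vector<uint64_t> &PCs,
                              Mode Default) {
  std::map<uint64_t, bool> ThumbAt;
  for (const Symbol &S : Syms)
    ThumbAt[S.Addr] = S.IsThumb;

  std::vector<Mode> Out;
  Mode Cur = Default;
  for (uint64_t PC : PCs) {
    auto It = ThumbAt.find(PC);
    if (It != ThumbAt.end())
      Cur = It->second ? Mode::Thumb : Mode::ARM;
    Out.push_back(Cur);
  }
  return Out;
}
```

In a real tool, each Mode would map to one of the two disassembler contexts created up front, and the PCs would come from walking the section rather than from a precomputed list.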

Hi Kevin,

I guess it depends on how many other targets need to deal with the
same problem, and how much their maintainers want to cope with the
change on their side.

Creating multiple disassemblers is wasteful, but not critical for tools
like objdump, which are rarely on the hot path. It would be
simpler/quicker to instantiate both in objdump and have it choose one
or the other based on the Thumb bit. However, I think this would
not be the best solution, for a couple of reasons:

1. This is disassembler logic, and having objdump do this at a
higher level means that other tools that (eventually) need the same
functionality will have to re-implement it.
2. It shouldn't be that hard to join the ARM and Thumb disassemblers,
given that they're in the same file and share most of the static
functions. getInstruction() could simply decide which to use,
getThumbInstruction() or getARMInstruction(), with the
Thumb/ARMDisassembler functions renamed to match.
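
To make point 2 concrete, here is a rough sketch of the delegation idea. It is a toy model and not the actual LLVM disassembler interface: UnifiedDisassembler, Decoded and the IsThumb parameter are assumptions made for illustration, and the two helpers only compute instruction sizes instead of decoding anything. The only architectural fact it relies on is that a Thumb instruction is 32 bits wide when the top five bits of its first halfword are 0b11101, 0b11110 or 0b11111.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical result type: Size is the number of bytes consumed,
// and 0 means the decode failed (not enough bytes left).
struct Decoded {
  std::string Text;
  std::size_t Size;
};

// Toy model of a single disassembler covering both instruction sets.
class UnifiedDisassembler {
public:
  // Single entry point: the caller passes the mode (e.g. from the Thumb
  // bit of the nearest symbol) and the right decoder is chosen here.
  Decoded getInstruction(const std::vector<uint8_t> &Bytes,
                         std::size_t Offset, bool IsThumb) const {
    return IsThumb ? getThumbInstruction(Bytes, Offset)
                   : getARMInstruction(Bytes, Offset);
  }

private:
  static std::size_t remaining(const std::vector<uint8_t> &Bytes,
                               std::size_t Offset) {
    return Offset > Bytes.size() ? 0 : Bytes.size() - Offset;
  }

  // Placeholder: ARM instructions are always 4 bytes.
  Decoded getARMInstruction(const std::vector<uint8_t> &Bytes,
                            std::size_t Offset) const {
    if (remaining(Bytes, Offset) < 4)
      return {"", 0};
    return {"<arm insn>", 4};
  }

  // Placeholder: Thumb instructions are 2 or 4 bytes, depending on the
  // top five bits of the first (little-endian) halfword.
  Decoded getThumbInstruction(const std::vector<uint8_t> &Bytes,
                              std::size_t Offset) const {
    if (remaining(Bytes, Offset) < 2)
      return {"", 0};
    uint16_t HW =
        static_cast<uint16_t>(Bytes[Offset] | (Bytes[Offset + 1] << 8));
    bool Wide = (HW >> 11) >= 0x1D; // 0b11101, 0b11110, 0b11111
    if (!Wide)
      return {"<thumb16 insn>", 2};
    if (remaining(Bytes, Offset) < 4)
      return {"", 0};
    return {"<thumb32 insn>", 4};
  }
};
```

With something shaped like this, objdump and any other client would only ever call getInstruction(), and the ARM/Thumb choice stays inside the disassembler.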

That said, I haven't worked on the disassembler that much, so
people with more experience in that area could chime in and give more
reasons on either side.

My personal opinion is that it'd be more elegant and stable that way,
and any work we have to do now would pay off in the future, but
other back-end maintainers could disagree.

cheers,
--renato



