[LLVMdev] Looking for ideas on how to make llvm-objdump handle both arm and thumb disassembly from the same object file
Kevin Enderby
enderby at apple.com
Wed Aug 6 13:18:08 PDT 2014
Hi Renato,
Thanks for your reply. A few comments in line below.
Kev
On Aug 6, 2014, at 12:50 PM, Renato Golin <renato.golin at linaro.org> wrote:
> On 6 August 2014 19:31, Kevin Enderby <enderby at apple.com> wrote:
>> First a little background: the way darwin’s otool(1) does this is that it creates an llvm disassembler for both arm and thumb when disassembling a binary with a 32-bit ARM cpu. It uses the C API in <llvm-c/Disassembler.h> and calls LLVMCreateDisasmCPU() twice, once with an arm TripleName and once with a matching thumb TripleName. Then for each 32-bit ARM cpu it will default to one or the other disassembler. Then as it disassembles and finds a symbol in the symbol table for the current PC being disassembled, it will see if the symbol has the N_ARM_THUMB_DEF bit set or not, and then switch between the arm and thumb disassemblers. While this is a bit of a hack, there is a limited set of Mach-O cpus otool(1) deals with.
>
> Hi Kevin,
>
> I guess it depends on how many other targets need to deal with the
> same problem, and how much their maintainers want to cope with the
> change on their side.
That’s the rub. I think only 32-bit arm has this issue with multiple disassemblers
and I would hate to add a bunch of stuff that all targets would have to deal with.
Love to hear if any other target maintainer could even use this.
>
> Creating multiple disassemblers is wasteful, but not critical to tools
> like objdump, that are rarely on the hot path. It would be
> simpler/quicker to instantiate them on objdump and then, based on the
> Thumb bit, it chooses one or the other.
I agree with all that. And this would be pretty simple to deal with inside
objdump and be done with it.
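To make the "inside objdump" option concrete, here is a rough sketch of the switching logic as standalone C++, not written against the real llvm-objdump sources. The Sym struct, the Mode enum and modesFor() are made-up names for illustration; the only real constant is N_ARM_THUMB_DEF (0x0008 in <mach-o/nlist.h>). In an actual tool each Mode would select one of the two disassemblers created up front.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Mach-O n_desc bit marking a symbol as a Thumb function (<mach-o/nlist.h>).
constexpr uint16_t N_ARM_THUMB_DEF = 0x0008;

// Which of the two pre-built disassemblers should be used (hypothetical).
enum class Mode { ARM, Thumb };

// Hypothetical, simplified symbol record: address plus the n_desc field.
struct Sym {
  uint64_t addr;
  uint16_t desc;
};

// Return the mode in effect at each PC of a linear sweep. The mode only
// changes when a symbol sits exactly at the current PC; otherwise the
// previous choice carries over, starting from the per-cpu default.
std::vector<Mode> modesFor(const std::vector<uint64_t> &pcs,
                           const std::vector<Sym> &syms, Mode initial) {
  // A linear sweep visits PCs in increasing order.
  assert(std::is_sorted(pcs.begin(), pcs.end()));
  std::vector<Mode> out;
  Mode cur = initial;
  for (uint64_t pc : pcs) {
    for (const Sym &s : syms)
      if (s.addr == pc)
        cur = (s.desc & N_ARM_THUMB_DEF) ? Mode::Thumb : Mode::ARM;
    out.push_back(cur);
  }
  return out;
}
```

The linear scan over the symbols is only for brevity; a real tool would presumably keep the symbols sorted by address and walk them alongside the PC.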
> However, I think this would
> not be the best solution for some reasons:
>
> 1. This is disassembler logic, and having objdump doing this on a
> higher level means that other tools that (eventually) need the same
> functionality will have to re-implement.
Also agreed. But can you or anyone else think of other tools that would
need this logic? Hate to do a whole bunch of work adding to the lower
layers just to make objdump a bit cleaner.
> 2. It shouldn't be that hard to join the ARM and Thumb disassembler,
> given that they're on the same file, share most of the static
> functions and could easily delegate with getInstruction() deciding
> which to use: getThumbInstruction() or getARMInstruction() and
> renaming Thumb/ARMDisassembler functions to match.
Yep, could do that. But it seems like a lot of work for very little payoff.
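For comparison, the delegation suggested above might look roughly like the following. This is only a sketch under my own assumptions: MergedARMDisassembler and Decoded are hypothetical, the per-call isThumb flag is just one possible way to pass the mode in, and no actual decoding is done beyond working out the instruction length. The getARMInstruction()/getThumbInstruction() names are taken from the suggestion.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical result type: which decoder ran and how many bytes it consumed.
struct Decoded {
  bool thumb;
  unsigned size;
};

// Hypothetical merged disassembler with a single entry point that delegates
// to an ARM or a Thumb decoder depending on the requested mode.
class MergedARMDisassembler {
public:
  Decoded getInstruction(const uint8_t *bytes, size_t len, bool isThumb) const {
    return isThumb ? getThumbInstruction(bytes, len)
                   : getARMInstruction(bytes, len);
  }

private:
  // ARM-mode instructions are always 4 bytes.
  Decoded getARMInstruction(const uint8_t *, size_t len) const {
    assert(len >= 4);
    return {false, 4u};
  }

  // Thumb instructions are 2 bytes, or 4 bytes when the top five bits of
  // the first halfword are 0b11101, 0b11110 or 0b11111 (Thumb-2 encodings).
  Decoded getThumbInstruction(const uint8_t *bytes, size_t len) const {
    assert(len >= 2);
    // Little-endian halfword, as on the usual ARM targets.
    uint16_t hw = static_cast<uint16_t>(bytes[0] | (bytes[1] << 8));
    unsigned top5 = hw >> 11;
    bool wide = top5 == 0x1D || top5 == 0x1E || top5 == 0x1F;
    return {true, wide ? 4u : 2u};
  }
};
```

Either way the caller still has to know the mode for each PC, so the symbol table lookup does not go away; it only moves to wherever the flag gets computed.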
> Though, I haven't got my hands on the disassembler that much, so other
> people with more experience in that area could chime in and give more
> reasons on either side.
Me too.
> My personal opinion is that it'd be more elegant and stable that way,
Absolutely agree about this being more elegant.
> and any work we have to do now would compensate in the future, but
> other back-end maintainers could disagree.
As a maintainer of darwin’s otool(1) for some 20 plus years and plugging in
some 9 or more different disassemblers, the use of two llvm disassemblers for
32-bit arm is no big deal at all. Heck it still has the old arm disassemblers in
it and many other old ones and it is very stable and I rarely if ever have to touch
those interfaces.
>
> cheers,
> --renato