[LLVMdev] MC disassembler for ARM

Tim Northover t.p.northover at gmail.com
Wed Jun 6 12:57:46 PDT 2012


Hi David,

> I've try to use llvm-objdump to disassemble some ARM binary, such as busybox
> in android.
>
> ./llvm-objdump -arch=arm -d busybox

It's probably assuming the wrong architecture revision. I don't have
an Android busybox handy, but I see similar problems on binaries
compiled for ARMv7. The trick is to use:

llvm-objdump -triple=armv7 -d whatever

(ARMv7 covers virtually anything Android will be running on these days).

There are a couple of other things to be wary of at the moment though:
1. PC-relative data, as you said: ARM code often includes literal data
inline with code, and this may well *not* have a valid disassembly. In
relocatable object files, these regions should be marked[*], but I
believe LLVM has problems with that currently. In executable files
(like "busybox") the regions won't necessarily even be marked.

2. ARM object files may contain mixed ARM and Thumb code: two
different instruction sets. Obviously, disassembling ARM as Thumb or
the reverse won't give you anything sensible. Again, relocatable files
mark these regions[*] but executables don't. If you know that what you
want is Thumb code, you can use the triple "thumbv7" instead for
llvm-objdump.

So a combination of those probably explains why you're getting
problems. Picking the right triple may improve matters, but it
probably won't make things perfect (and arguably can't in the case of
the ARM/Thumb distinction without reconstructing all possible
control-flow graphs).

Tim.

[*] The marking is via symbols $a, $t and $d, which reference the
beginning of each stretch of ARM code, Thumb code and data.
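
To make the footnote concrete, here is a rough sketch (not LLVM's
implementation) of how a tool could turn those mapping symbols into
regions. It assumes the symbol table has already been read into
(address, name) pairs by some other means; the helper names
mapping_kind and regions are made up for illustration, and accepting
suffixed forms such as "$a.1" is my reading of the ARM ELF ABI rather
than something stated above.

```python
# Sketch only: classify ARM mapping symbols into code/data regions.
# The (address, name) input is assumed to come from an ELF symbol table
# that was parsed elsewhere; nothing here reads a real file.

KINDS = {"$a": "arm", "$t": "thumb", "$d": "data"}


def mapping_kind(name):
    """Return 'arm', 'thumb' or 'data' for a mapping symbol, else None."""
    # Assumption: "$a.anything" is treated the same as plain "$a".
    base = name.split(".", 1)[0]
    return KINDS.get(base)


def regions(symbols, section_end):
    """Turn (address, name) pairs into sorted (start, end, kind) tuples.

    Each mapping symbol starts a region that runs until the next mapping
    symbol, or until section_end for the last one.
    """
    marks = sorted(
        (addr, mapping_kind(name))
        for addr, name in symbols
        if mapping_kind(name) is not None
    )
    out = []
    for i, (start, kind) in enumerate(marks):
        end = marks[i + 1][0] if i + 1 < len(marks) else section_end
        if end > start:  # skip empty regions
            out.append((start, end, kind))
    return out


if __name__ == "__main__":
    # Hand-written example symbols, not taken from a real binary.
    example = [(0x00, "$a"), (0x10, "main"), (0x20, "$d"), (0x28, "$t.1")]
    for start, end, kind in regions(example, 0x40):
        print(f"{start:#06x}-{end:#06x} {kind}")
```

A disassembler would then decode "arm" and "thumb" regions with the
matching instruction set and dump "data" regions as raw words. None of
this helps with an executable that carries no mapping symbols, which
is the hard case described above.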


