[LLVMdev] MC disassembler for ARM

Fan Dawei fandawei.s at gmail.com
Thu Jun 7 02:17:26 PDT 2012


Hi Tim,

Thanks a lot for the reply.

I tested libc.so, which is a shared library. llvm-objdump also reports some
disassembly errors on it.

Could you please tell me more about the $a, $t and $d symbols? How are these
symbols used to define the different regions? Where can I find these symbols
in an ELF object file?

Thanks,
David





I'm now trying to find a decoder of ARM instructions in order

On Thu, Jun 7, 2012 at 3:57 AM, Tim Northover <t.p.northover at gmail.com> wrote:

> Hi David,
>
> > I've tried to use llvm-objdump to disassemble some ARM binaries, such as
> > busybox in Android.
> >
> > ./llvm-objdump -arch=arm -d busybox
>
> It's probably assuming the wrong architecture revision. I don't have
> an android busybox handy, but I see similar on binaries compiled for
> ARMv7. The trick is to use:
>
> llvm-objdump -triple=armv7 -d whatever
>
> (ARMv7 covers virtually anything Android will be running on these days).
>
> There are a couple of other things to be wary of at the moment though:
> 1. PC-relative data, as you said: ARM code often includes literal data
> inline with code, which may well *not* have a valid disassembly. In
> relocatable object files, these regions should be marked[*], but I
> believe LLVM has problems with that currently. In executable files
> (like "busybox") the regions won't necessarily even be marked.
>
> 2. ARM object files may contain mixed ARM and Thumb code: two
> different instruction sets. Obviously, disassembling ARM as Thumb or
> the reverse won't give you anything sensible. Again, relocatable files
> mark these regions[*] but executables don't. If you know that what you
> want is Thumb code, you can use the triple "thumbv7" instead for
> llvm-objdump.
>
> So a combination of those probably explains why you're getting
> problems and may improve matters, but it probably won't make things
> perfect (and arguably can't in the case of the ARM/Thumb distinction
> without reconstructing all possible control-flow graphs).
>
> Tim.
>
> [*] The marking is via symbols $a, $t and $d, which reference the
> beginning of each stretch of ARM code, Thumb code and data.
>
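
To make the footnote above concrete, here is a minimal sketch of how the
$a/$t/$d mapping symbols could be turned into regions. It assumes the symbols
of one section have already been read from the ELF symbol table as
(address, name) pairs, and that the end address of the section is known. The
names Region, mapping_kind, build_regions and section_end are hypothetical and
only for illustration; none of this is LLVM's actual code. It also assumes the
ARM ELF convention that a mapping symbol may carry a suffix after a period
(for example "$t.1").

```python
from typing import List, NamedTuple, Optional, Tuple

# Mapping symbol prefix -> kind of content that starts at the symbol's address.
KINDS = {"$a": "arm", "$t": "thumb", "$d": "data"}


class Region(NamedTuple):
    start: int  # first address of the region
    end: int    # one past the last address of the region
    kind: str   # "arm", "thumb" or "data"


def mapping_kind(name: str) -> Optional[str]:
    """Return the region kind for a mapping symbol, or None for other symbols."""
    base = name.split(".", 1)[0]  # "$t.1" -> "$t"
    return KINDS.get(base)


def build_regions(symbols: List[Tuple[int, str]], section_end: int) -> List[Region]:
    """Split [first mapping symbol, section_end) into ARM/Thumb/data regions.

    Each mapping symbol starts a region that runs until the next mapping
    symbol, or until section_end for the last one.
    """
    marks = sorted(
        (addr, kind)
        for addr, name in symbols
        for kind in [mapping_kind(name)]
        if kind is not None
    )
    regions = []
    for i, (start, kind) in enumerate(marks):
        end = marks[i + 1][0] if i + 1 < len(marks) else section_end
        if end > start:  # skip empty regions
            regions.append(Region(start, end, kind))
    return regions


if __name__ == "__main__":
    # Made-up symbol table for illustration only.
    example = [(0x00, "$a"), (0x10, "main"), (0x20, "$d"), (0x28, "$t.1")]
    for region in build_regions(example, section_end=0x40):
        print(f"{region.start:#06x}-{region.end:#06x} {region.kind}")
```

A disassembler could then decode "arm" regions as ARM, "thumb" regions as
Thumb, and print "data" regions as raw words instead of trying to decode them.
When the mapping symbols are absent (as may be the case in a linked
executable), this approach has nothing to work from.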