[llvm-dev] Distinguish between ARM and Thumb
Peter Smith via llvm-dev
llvm-dev at lists.llvm.org
Thu Jun 28 06:07:32 PDT 2018
If you are disassembling a non-stripped ELF binary you can find out
the Arm/Thumb state by looking at the mapping symbols $a (Arm) and $t
(Thumb). Alternatively, each ELF symbol of type STT_FUNC has bit 0 of
its value set to 0 for Arm state and 1 for Thumb state. Hence with the
symbol table you can reconstruct the state at each address by finding
the closest mapping symbol at or below that address.
More information is available in ELF for the Arm Architecture.
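
As a rough illustration of the symbol-table approach, here is a minimal
Python sketch. It assumes you have already extracted (address, name)
pairs from the symbol table with whatever ELF reader you prefer, and
that all addresses live in one address space (a linked image, or a
single section). The function names build_state_map, state_at and
func_state are made up for this example and are not part of LLVM or
any library.

```python
import bisect

# Mapping symbol prefixes: $a = Arm code, $t = Thumb code, $d = data.
KINDS = {"$a": "arm", "$t": "thumb", "$d": "data"}

def build_state_map(symbols):
    """Collect the mapping symbols from (address, name) pairs, sorted by address."""
    marks = []
    for addr, name in symbols:
        base = name.split(".", 1)[0]  # names such as "$t.1" are also allowed
        if base in KINDS:
            marks.append((addr, KINDS[base]))
    marks.sort()
    return marks

def state_at(marks, addr):
    """Return the state set by the closest mapping symbol at or below addr."""
    addrs = [a for a, _ in marks]
    i = bisect.bisect_right(addrs, addr) - 1
    return marks[i][1] if i >= 0 else None

def func_state(st_value):
    """Decode an STT_FUNC symbol value: bit 0 set means Thumb."""
    state = "thumb" if st_value & 1 else "arm"
    return state, st_value & ~1  # second item is the real code address

# Hypothetical symbol table contents, for illustration only.
symbols = [(0x8000, "$a"), (0x8010, "$t"), (0x8020, "$d"),
           (0x8024, "$t.1"), (0x8001, "main")]
marks = build_state_map(symbols)
```

With these sample symbols, state_at(marks, 0x8014) falls in the Thumb
range that starts at 0x8010, and func_state(0x8011) reports a Thumb
function whose code starts at 0x8010.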
If you have got a stripped binary without any symbolic information
then life gets a lot more difficult. There are some encoding rules 
that can help you find out whether a Thumb instruction is 2 or 4 bytes
long but in general you'll at least need to know whether you are
starting on an Arm or Thumb instruction and will need to trace control
flow instructions to track state changes and to avoid interpreting
literal data as instructions.
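
To make the 2-versus-4-byte rule concrete, here is a small Python
sketch of the Thumb-2 length check: a halfword whose top five bits are
0b11101, 0b11110 or 0b11111 is the first half of a 32-bit instruction,
anything else is a 16-bit instruction. The helper name
thumb_insn_length is invented for this example, and the sketch assumes
you already know the bytes at that offset are Thumb code rather than
Arm code or literal data.

```python
import struct

def thumb_insn_length(code, offset=0, little_endian=True):
    """Return 2 or 4: the byte length of the Thumb instruction at offset."""
    fmt = "<H" if little_endian else ">H"
    (halfword,) = struct.unpack_from(fmt, code, offset)
    # Bits [15:11] decide whether a second halfword follows.
    return 4 if (halfword >> 11) in (0b11101, 0b11110, 0b11111) else 2

# Sample little-endian encodings, for illustration only.
bx_lr = bytes([0x70, 0x47])              # halfword 0x4770 (bx lr)
bl = bytes([0x00, 0xF0, 0x00, 0xF8])     # halfwords 0xF000 0xF800 (bl)
```

This only tells you how far to step once you are already in Thumb
state; it does nothing to detect the state itself.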
For the non-stripped case I don't think you need to do much beyond
reading the symbol table. I don't think LLVM has passes to reconstruct
binaries; that logic would usually live in a tool like objdump.
Hope this helps
(search for mapping symbols)
(search for Thumb instruction encoding)
On 28 June 2018 at 13:32, Muhui Jiang via llvm-dev
<llvm-dev at lists.llvm.org> wrote:
> Nowadays I am using LLVM to do ARM binary analysis. I was wondering is llvm
> available to provide some debugging information on the mode of ARM.
> For example, llvm-dwarfdump could dump some instructions information for
> debugging. Is it able to know the mode for each instruction? Or we may
> write some llvm pass to help us to know the instruction mode? Any
> suggestions are welcomed. Many Thanks