[LLVMdev] llvm-objdump

Kevin Enderby enderby at apple.com
Tue Aug 26 13:17:44 PDT 2014


Hi Steve,

For the labeltest.s I get:

% llvm-mc -triple x86_64-apple-darwin10 -filetype=obj -o x86_labeltest.o labeltest.s

First, with just -v, which produces disassembly (without verbose operands):
% otool -tv x86_labeltest.o 
x86_labeltest.o:
(__TEXT,__text) section
foo:
0000000000000000	nop
bar:
0000000000000001	nop
0000000000000002	jmp	0x7
0000000000000007	jmp	0x1
0000000000000009	jmp	0xe
000000000000000e	nop
baz:
000000000000000f	nop


Second, with -V, which produces “verbose operands”:

% otool -tV x86_labeltest.o 
x86_labeltest.o:
(__TEXT,__text) section
foo:
0000000000000000	nop
bar:
0000000000000001	nop
0000000000000002	jmp	bar
0000000000000007	jmp	bar
0000000000000009	jmp	baz
000000000000000e	nop
baz:
000000000000000f	nop

Third, adding -j, which prints the opcode bytes:
% otool -tVj x86_labeltest.o 
x86_labeltest.o:
(__TEXT,__text) section
foo:
0000000000000000	90              	nop
bar:
0000000000000001	90              	nop
0000000000000002	e900000000      	jmp	bar
0000000000000007	ebf8            	jmp	bar
0000000000000009	e900000000      	jmp	baz
000000000000000e	90              	nop
baz:
000000000000000f	90              	nop

For me, operands of -3, -5 and 1 are of little use.  If I think a branch was assembled wrong, I want to see where it thinks it is going (the hex address in the object file) and the opcode bytes, so I can hand-decode what is going on (this matters more on targets like ARM that don’t have simple displacements).
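
As a rough illustration of that hand decoding (a sketch only, not code taken from otool or llvm-objdump): for the x86 jmp rel8 (0xeb) and jmp rel32 (0xe9) encodings, the target is the address of the next instruction plus the signed displacement. The helper name branch_target is made up for this example, and it deliberately ignores relocations.

```python
import struct


def branch_target(address, opcode_bytes):
    """Return the target address of an x86 'jmp rel8' or 'jmp rel32'.

    address      -- address of the jmp instruction itself
    opcode_bytes -- the raw bytes of that one instruction
    """
    opcode = opcode_bytes[0]
    if opcode == 0xEB:      # jmp rel8: one signed displacement byte
        (disp,) = struct.unpack("<b", opcode_bytes[1:2])
    elif opcode == 0xE9:    # jmp rel32: four signed little-endian bytes
        (disp,) = struct.unpack("<i", opcode_bytes[1:5])
    else:
        raise ValueError("not a jmp rel8/rel32 instruction")
    # The displacement is relative to the address of the NEXT instruction.
    return address + len(opcode_bytes) + disp


# The three branches from the ELF dump, whose operands printed as -3, -5 and 1.
for addr, raw in [(0x2, "ebfd"), (0x4, "ebfb"), (0x6, "eb01")]:
    print(hex(addr), raw, "->", hex(branch_target(addr, bytes.fromhex(raw))))
```

Run on those bytes, this gives targets 0x1, 0x1 and 0x9, which line up with the GNU objdump output for bar and baz. For the Mach-O object the unrelocated e900000000 at address 2 decodes to 0x7, which is what otool -tv shows.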

Also if I’m printing symbolic operands like “bar” I don’t want to see the address of bar or the displacement in that case.  Basically I want to see as close to real assembly code as possible.

Also note that for Mach-O we work hard to avoid having symbols at the same address, and to avoid symbols that are not assembler temporary names.  Instead we use things like 1f, 2b or L21, because by default sections are broken into “atoms” at the symbol addresses (when the assembly has the directive .subsections_via_symbols, which produces the SUBSECTIONS_VIA_SYMBOLS flag in the header).
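
To make the “atoms” idea concrete, here is a deliberately simplified model (an assumption for illustration, not the linker's actual algorithm): each symbol starts an atom that runs up to the next symbol's address or the end of the section. The function name split_into_atoms and the tuple layout are made up for this example.

```python
def split_into_atoms(section_size, symbols):
    """Split a section into (name, start, end) atoms at symbol addresses.

    symbols -- list of (name, address) pairs; end addresses are exclusive.
    This is a toy model: assembler temporary labels (1f, 2b, L21) never
    reach the symbol table, so they would not appear in 'symbols' at all.
    """
    ordered = sorted(symbols, key=lambda sym: sym[1])
    atoms = []
    for i, (name, start) in enumerate(ordered):
        # An atom ends where the next symbol begins, or at the section end.
        end = ordered[i + 1][1] if i + 1 < len(ordered) else section_size
        atoms.append((name, start, end))
    return atoms


# Symbols and section size as seen in the otool output (16 bytes of text).
print(split_into_atoms(0x10, [("foo", 0x0), ("bar", 0x1), ("baz", 0xF)]))
```

In this toy model, adding a second symbol at address 1 would give the first of the two an empty atom (start equal to end), which is one way to picture why symbols at the same address are awkward.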

Kev

P.S. We also display raw text bytes with just -t and no -v or -V, which is useful when debugging very broken objects:

% otool -t x86_labeltest.o 
x86_labeltest.o:
(__TEXT,__text) section
0000000000000000 90 90 e9 00 00 00 00 eb f8 e9 00 00 00 00 90 90


On Aug 26, 2014, at 12:43 PM, Steve King <steve at metrokings.com> wrote:

> Hi Kev,
> I'm glad to hear llvm-objdump is getting attention.  I'm unclear on
> how much output specialization one could (or should) do for ELF vs.
> Mach-O.  If you're game, let's compare an example:
> 
> $ cat labeltest.s
> .text
> foo:
>    nop
> bar:
> bum:
>    nop
>    jmp   bar
>    jmp   bum
>    jmp   baz
>    nop
> baz:
>    nop
> 
> Assembling for x86 and llvm-objdump'ing, I get
> 
> $ llvm-mc -arch=x86 -filetype=obj labeltest.s -o x86_labeltest.o
> $ llvm-objdump -d  x86_labeltest.o
> 
> x86_labeltest.o: file format ELF32-i386
> 
> Disassembly of section .text:
> foo:
>       0: 90                                           nop
> 
> bum:
>       1: 90                                           nop
>       2: eb fd                                         jmp -3
>       4: eb fb                                         jmp -5
>       6: eb 01                                         jmp 1
>       8: 90                                           nop
> 
> baz:
>       9: 90                                           nop
> 
> I get the dump above with or without -symbolize.
> 
> My personal golden reference, GNU objdump, does this:
> 
> $ objdump -dw x86_labeltest.o
> 
> x86_labeltest.o:     file format elf32-i386
> 
> 
> Disassembly of section .text:
> 
> 00000000 <foo>:
>   0: 90                   nop
> 
> 00000001 <bar>:
>   1: 90                   nop
>   2: eb fd                 jmp    1 <bar>
>   4: eb fb                 jmp    1 <bar>
>   6: eb 01                 jmp    9 <baz>
>   8: 90                   nop
> 
> 00000009 <baz>:
>   9: 90                   nop
> 
> What does otool produce?
> 
> 
> On Tue, Aug 26, 2014 at 11:16 AM, Kevin Enderby <enderby at apple.com> wrote:
>> For branch targets my preference is to print the target’s address (not the displacement of the branch), and preferably in hex.
> 
> I like this too.
> 
>> I don’t think having multiple addresses for a target is a real problem with the exception of the address 0 (which is often an unrelocated no addend value).  So the trick is to not print the symbol name in the object with the address of zero in those cases
> 
> Right, relocations are a special case.

The trick here is that with otool(1) and -V we will “guess” at an operand’s symbolic value.  That is, even if there is no relocation entry, if we have a target address that matches a symbol table value we will use that symbol.  We also special-case the zero value to try, as best we can, to print zero and not the symbol at address zero.  I can dig out the logic I came up with for that if you want it.
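
A minimal sketch of that kind of guessing, assuming a symbol table keyed by address (this is not the actual otool logic, and the name symbolic_operand and its parameters are hypothetical):

```python
def symbolic_operand(target, symbol_table, reloc_symbol=None):
    """Pick the text to print for a branch operand.

    target       -- decoded target address of the branch
    symbol_table -- dict mapping address -> symbol name
    reloc_symbol -- symbol named by a relocation entry, if there is one
    """
    if reloc_symbol is not None:
        # A relocation entry says exactly which symbol is meant.
        return reloc_symbol
    if target == 0:
        # Zero is often an unrelocated, no-addend value, so do not
        # guess the symbol that happens to live at address zero.
        return "0x0"
    # Otherwise guess: use a symbol whose address matches the target,
    # falling back to the hex address when nothing matches.
    return symbol_table.get(target, hex(target))


symbols = {0x0: "foo", 0x1: "bar", 0xF: "baz"}
print(symbolic_operand(0x1, symbols))   # no relocation, address matches bar
print(symbolic_operand(0x0, symbols))   # zero is printed as zero, not foo
```

The ordering is the design choice worth noting: a relocation entry wins, the zero special case comes next, and the symbol-table guess is only the last resort.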
