[LLVMdev] llvm-objdump

Tue Aug 26 11:16:37 PDT 2014

Hi Steve,

I too have been working on improving llvm-objdump for Mach-O files, a format I guess I would be called an expert in.  My long-term goal is for llvm-objdump to match the functionality of darwin’s otool(1) and then improve beyond it.

For branch targets my preference is to print the target’s address (not the displacement of the branch), preferably in hex, with a way to toggle between non-symbolic and symbolic output, since non-symbolic is needed for debugging.  Symbolic should go all out: use the symbol table, relocation entries, past instructions, indirect tables, literal tables, Objective-C metadata, C++ demanglers, and even debug info to print the best operand and comment along with the instruction.  For symbolic output we already go to all these lengths (short of debug info) in darwin’s otool(1) using llvm’s disassembler hooks.
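
To make the address-versus-displacement point concrete, here is a minimal Python sketch. It is not llvm-objdump or otool code: the function name, the `symbols` dictionary, and the assumption that the displacement is relative to the end of the instruction (as on x86) are all made up for illustration.

```python
def format_branch_target(insn_addr, insn_size, displacement, symbols=None):
    """Return operand text for a PC-relative branch.

    Assumption: the displacement is relative to the address of the next
    instruction (x86-style).  `symbols` is a hypothetical dict mapping
    address -> name; pass None to get non-symbolic output.
    """
    # Print the absolute target, not the raw displacement.
    target = insn_addr + insn_size + displacement
    if symbols is not None and target in symbols:
        return symbols[target]      # symbolic: best name we know
    return f"0x{target:x}"          # non-symbolic: plain hex address


# Non-symbolic output for debugging, symbolic output for reading.
print(format_branch_target(0x1000, 5, 0x20))
print(format_branch_target(0x1000, 5, 0x20, {0x1025: "_foo"}))
```

A real symbolizer would consult far more than a single address-to-name map, but the toggle itself can stay this simple.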

I do think it makes sense for the default to be symbolic, with non-symbolic available through an option.  I would love to extend the non-symbolic option to things like printing the private headers, relocation entries, etc. as raw values.  Again, this is very useful for debugging and for dealing with broken object files, when you need to see the actual values to work out what could be going on.  The name -bare seems fine to me for this option.

I don’t think having multiple symbols for a target address is a real problem, with the exception of address 0 (which is often an unrelocated value with no addend).  So the trick is to not print the name of the symbol that sits at address zero in the object in those cases.  Generally in Mach-O we don’t see multiple symbols at the same address.
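
One conservative reading of the address-zero rule, sketched in Python.  The function and parameter names are hypothetical, and a real tool would get `reloc_symbol` from the object’s relocation entries rather than from a caller-supplied argument.

```python
def pick_symbol(operand_value, symbols_by_addr, reloc_symbol=None):
    """Choose a name to print for an operand, or None to print it raw.

    `symbols_by_addr` is a hypothetical dict of address -> name.
    `reloc_symbol` is the name referenced by a relocation entry on this
    operand, if there is one.
    """
    if reloc_symbol is not None:
        # The relocation says what the operand really refers to.
        return reloc_symbol
    if operand_value == 0:
        # Likely an unrelocated, no-addend placeholder: do not claim it
        # points at whatever symbol happens to live at address 0.
        return None
    return symbols_by_addr.get(operand_value)


symbols = {0x0: "_first_function", 0x40: "_helper"}
print(pick_symbol(0x0, symbols))                # no name for a bare zero
print(pick_symbol(0x0, symbols, "_printf"))     # relocation wins
print(pick_symbol(0x40, symbols))               # ordinary lookup
```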

In Mach-O we don’t have typed symbols in the symbol table without looking at debugging info.  But what you say about using type FUNC symbols for ELF seems to make sense to me.
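
For the ELF case, the preference described in the quoted message (FUNC first, NOTYPE as a fallback) could be sketched as follows.  This is only an illustration in Python: the `(name, type)` tuples stand in for real symbol table entries, and the function name is made up.

```python
def choose_code_symbol(candidates):
    """Pick the name to print for a code address.

    `candidates` is a list of (name, sym_type) pairs for all symbols at
    one address, where sym_type is a string such as "FUNC", "NOTYPE" or
    "OBJECT".  Returns None when nothing suitable is found.
    """
    # Try the preferred type first, then the fallback.
    for wanted in ("FUNC", "NOTYPE"):
        for name, sym_type in candidates:
            if sym_type == wanted:
                return name
    return None


print(choose_code_symbol([("local_label", "NOTYPE"), ("foo", "FUNC")]))
print(choose_code_symbol([("asm_entry", "NOTYPE")]))
```

Symbols of other types (for example OBJECT) are ignored here, so a data symbol is never used to name a branch target.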

My thoughts,
Kev

On Aug 26, 2014, at 9:52 AM, Steve King <steve at metrokings.com> wrote:

> I would like to improve llvm-objdump.  However, many unit tests depend
> precisely on the current output, making the picture a little tricky.
> My experience is limited to ELF format objects, so experts in other
> formats please sanity check.
> 
> Suggested changes:
> 1) Symbolize conditional branch targets.  Currently, llvm-objdump
> prints branch targets numerically regardless of -symbolize.
> 
> 2) Make -symbolize the default behavior for human friendliness.
> 
> 3) Add new -bare option to suppress symbolizing.  Many unit tests will
> use -bare to preserve expected output in today's format.
> 
> 4) When multiple symbols exist for a given address, print all of them.
> Today, llvm-objdump only prints the last symbol found, but symbolizes
> references with the first symbol found.  So, it's a bit of a mess.
> 
> 5) When symbolizing code references, prefer matching symbols with type
> FUNC, but fall back to matches with type NOTYPE.  This matches GNU
> objdump behavior and many hand written assembly files don't specify
> .type directives anyway.
> 
> How does this sound?
> 
> Regards,
> -steve