[lld] r198728 - [mach-o] properly extract atom content from subrange of section content

Thu Aug 28 12:01:36 PDT 2014

Jalin,

The compiler often aligns functions to start on an aligned address (e.g. a multiple of 8 or 16 bytes).  So there are padding bytes (usually NOPs) between the end of one function and the start of the next.
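
As a rough illustration of that padding arithmetic (a sketch only, not taken from any compiler or linker source; the helper names `align_up` and `padding_after` are made up for this example, and the 8-byte alignment is an assumption, since the actual alignment of the file in question is not stated):

```python
def align_up(addr: int, alignment: int) -> int:
    """Round addr up to the next multiple of alignment (must be a power of two)."""
    return (addr + alignment - 1) & ~(alignment - 1)


def padding_after(func_end: int, alignment: int) -> int:
    """Number of pad bytes between a function's end and the next aligned address."""
    return align_up(func_end, alignment) - func_end


# End addresses taken from the dwarfdump output quoted below;
# the alignment of 8 is an assumed value for illustration.
print(hex(align_up(0x6d046, 8)))   # 0x6d048
print(padding_after(0x6d1ee, 8))   # 2
```

With that assumed alignment, the rounded-up end addresses land exactly on the start addresses of the following symbols.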

dwarfdump has access to the debug info, which does record the size of each function.  So dwarfdump can show the size of a function without the alignment padding at the end.  The linker, by contrast, just looks at the address of the next symbol.  There is not much point in the linker digging through the debug info to get a function’s true size, because when writing the output the linker has to align the next function, which would just add back the pad bytes.
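
To make the difference concrete, here is a minimal sketch that applies the "distance to the next symbol" rule to the addresses quoted below and compares the result with the dwarfdump ranges. It is not the lld code; the function name `sizes_by_next_symbol` is hypothetical, and the last symbol is skipped because its size would need the section's end address, which is not given in this thread.

```python
# Symbol start addresses from the nlist-based output quoted below.
symtab = {
    0x6d030: "___arclite_objc_autoreleasePoolPop",
    0x6d048: "_patch_lazy_pointers",
    0x6d1f0: "___arclite_objc_autoreleasePoolPush",
}

# start -> end of each [start, end) range from the dwarfdump output quoted below.
dwarf_end = {
    0x6d030: 0x6d046,
    0x6d048: 0x6d1ee,
    0x6d1f0: 0x6d212,
}


def sizes_by_next_symbol(addrs):
    """Size each symbol as the distance to the next higher symbol address.

    The highest symbol is omitted: sizing it needs the end of the section.
    """
    ordered = sorted(addrs)
    return {cur: nxt - cur for cur, nxt in zip(ordered, ordered[1:])}


linker_sizes = sizes_by_next_symbol(symtab)
for start, size in linker_sizes.items():
    dwarf_size = dwarf_end[start] - start
    print(f"{symtab[start]}: next-symbol size {size}, "
          f"DWARF size {dwarf_size}, padding {size - dwarf_size}")
```

For the two symbols that can be sized this way, the next-symbol size exceeds the DWARF size by 2 bytes, which is consistent with alignment padding.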

-Nick

On Aug 27, 2014, at 10:06 PM, jalin.cwk at foxmail.com wrote:
> Hello Nick,
>     I'm Jalen, a student from SCUT, China. I am very fortunate to have the opportunity to contact you. 
>     Before reading your mailing list post (http://webcache.googleusercontent.com/search?q=cache:D_s11w9oq3IJ:unix.superglobalmegacorp.com/xnu/newsrc/osfmk/mach-o/loader.h.html+&cd=26&hl=zh-CN&ct=clnk&gl=cn), I was confused by the lack of a "size" field in the symbol table of a Mach-O file. I found the solution in "lld/trunk/lib/ReaderWriter/MachO/MachONormalizedFileToAtoms.cpp", which you posted in that e-mail and which notes that "Mach-O symbol table does not have size in it, so need to scan ahead to find symbol with next highest address." 
>     But when I parsed the symbol table out of a Mach-O (.dSYM) file (I got the symbol table from the symtab_command and the nlist entries that follow it) and tried to calculate the size of a global symbol in the same way, I was confused again when I compared my result with the symbol table in the output of dwarfdump (dwarfdump -ae). The end address of each symbol in the dwarfdump output differs from the one my program computes. Is there a problem with the symbol table I parsed out? Or is there some other way to work it out? 
> 
> Some of the output from my program:
> <start address>   <section index>   <method>
> 0x0006d030        1                 ___arclite_objc_autoreleasePoolPop
> 0x0006d048        1                 _patch_lazy_pointers
> 0x0006d1f0        1                 ___arclite_objc_autoreleasePoolPush
> 
> The corresponding part of the output from dwarfdump:
> 0x0014a37b: [0x0006d030 - 0x0006d046) __arclite_objc_autoreleasePoolPop 
> 0x0014a122: [0x0006d048 - 0x0006d1ee) patch_lazy_pointers 
> 0x0014a3a0: [0x0006d1f0 - 0x0006d212) __arclite_objc_autoreleasePoolPush
> 
> So if I use the approach in "MachONormalizedFileToAtoms.cpp" to calculate the end address of a symbol (look ahead to find the symbol with the next highest address), the result will differ from the output of dwarfdump. Do you know how dwarfdump calculates it?
> 
> The Mach-O file (dSYM) I tested above is attached.
> 
> Thank you for reading. I sincerely look forward to hearing from you soon.
>  
> 
> jalin.cwk at foxmail.com
> <RqdApp>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
