[llvm-dev] Reimplementing Darwin's dsymutil as an lld helper

Tue Nov 17 16:34:29 PST 2015

On Tue, Nov 17, 2015 at 4:24 PM, Frédéric Riss <friss at apple.com> wrote:

>
> On Nov 17, 2015, at 4:10 PM, David Blaikie <dblaikie at gmail.com> wrote:
>
> (oops, switch mailing list)
>
> On Tue, Nov 17, 2015 at 4:07 PM, David Blaikie <dblaikie at gmail.com> wrote:
>
>> Wee, delayed response, but nothing drastic:
>>
>> I just noticed the tool is "llvm-dsymutil" but it's in tools/dsymutil,
>> unlike all the other tools that have the llvm-prefix in the directory name.
>> Could we move it to "tools/llvm-dsymutil" for consistency?
>>
>
> The idea was to have the tool renamed to dsymutil when it’s ready to
> replace the system one, so that it gets picked up when it’s in your PATH.
>

Fair enough - I'm certainly not making any sort of hardline/firm argument,
just nice to have consistency (& directories can be renamed if the tool
changes name in the future, etc) & one fewer custom mappings to remember.

Though I imagine we might end up aliasing dsymutil to llvm-dsymutil (as gcc
is aliased to clang (symlink, copy, whatever mechanism is used) on OSX,
etc), but I could see it going one of many ways.

I think for now I'll make the LLVM dwp tool llvm-dwp and worry about how
people want to use it/refer to it later.

>
> Fred
>
> On Fri, Nov 7, 2014 at 8:09 AM, Frédéric Riss <friss at apple.com> wrote:
>>
>>> Hi,
>>>
>>> [ I Cc'd lld people and debug info people. Apologies if I omitted some
>>> stakeholder. ]
>>>
>>> As stated in the subject, I’d like to start working on an in-tree
>>> reimplementation of Darwin’s dsymutil utility. This is an initial step on
>>> the path to having lld handle the debug information itself.
>>>
>>> For those who are not familiar with the debug flow on MacOS, dsymutil is
>>> a DWARF linker. Darwin’s linker (ld64) doesn’t link the DWARF debug info
>>> found in the object files, instead it writes a “debug-map” in the linked
>>> binary. This debug-map describes what objects were linked together and what
>>> atoms of each object file are present in the binary along with their
>>> addresses. The debug-map has two uses:
>>> 1) During the build->debug cycle, lldb reads the debug-map and uses it
>>> to find the .o files and extract the relevant DWARF debug info.
>>> 2) For Release builds, dsymutil reads the debug-map then loads, merges,
>>> and optimizes all the DWARF debug info and writes it as a .dSYM bundle.
>>>
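As a rough illustration of the debug-map described above, here is a minimal Python sketch of the information it carries: which object files were linked, which atoms of each survived, and where they ended up. All class names, method names, paths and addresses are hypothetical and chosen for illustration only; this is not ld64's on-disk format nor any LLVM API.

```python
from dataclasses import dataclass, field


@dataclass
class DebugMapEntry:
    """One atom (symbol) from an object file that made it into the binary."""
    symbol: str
    object_address: int   # address of the atom inside the .o file
    binary_address: int   # address of the atom in the linked binary


@dataclass
class DebugMapObject:
    """One object file that took part in the link."""
    path: str
    entries: list = field(default_factory=list)

    def add(self, symbol, object_address, binary_address):
        self.entries.append(DebugMapEntry(symbol, object_address, binary_address))


@dataclass
class DebugMap:
    """Hypothetical in-memory model of the debug-map for one linked binary."""
    binary_path: str
    objects: list = field(default_factory=list)

    def object_paths(self):
        # What a debugger needs for use (1): where to find the .o files.
        return [obj.path for obj in self.objects]

    def lookup(self, object_path, symbol):
        # What a DWARF linker needs for use (2): where an atom ended up.
        for obj in self.objects:
            if obj.path == object_path:
                for entry in obj.entries:
                    if entry.symbol == symbol:
                        return entry
        return None  # the atom was not kept in the binary


# Illustrative data only; paths, names and addresses are made up.
foo = DebugMapObject("/tmp/foo.o")
foo.add("_main", 0x0, 0x100000F40)
foo.add("_helper", 0x40, 0x100000F80)
debug_map = DebugMap("/tmp/a.out", [foo])
```

In this model `object_paths()` corresponds to the debugger use case and `lookup()` to the DWARF-linking use case; a symbol that returns `None` was discarded by the linker.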
>>> The long term goal is that DWARF linking functionality be available as a
>>> library for LLVM tools. Eventually, we’d like lld to be able to make use of
>>> the DWARF linking library and not need a standalone dsymutil tool. The
>>> first step is to use the DWARF linking library in a standalone dsymutil
>>> replacement tool. We want this tool to be bit-for-bit compatible with the
>>> existing Darwin dsymutil.
>>>
>>> The main reason we want to take the first step of a separate tool is
>>> testability. The code committed to the LLVM repository will feature unit
>>> tests, but they won’t offer the coverage that a real world usage would. I
>>> plan to run the new tool through big internal validation campaigns during
>>> which the output of the LLVM-powered dsymutil would be compared to that of
>>> the system’s dsymutil. This is also the reason we aim for bit-for-bit
>>> compatibility.
>>>
>>> The current plan is to host the code in the llvm repository. dsymutil
>>> will make heavy use of libDebugInfo and won’t share anything with the lld
>>> codebase (The underlying concepts are just too different). It’s also not
>>> clear yet where most of the implementation logic will end up. I expect most
>>> of the core logic to be in tools/dsymutil, but some of it might be better
>>> folded directly into libDebugInfo.
>>>
>>> So how does it work? dsymutil doesn’t simply paste the debug sections
>>> together while applying relocations to them. This wouldn’t work for ld64 as
>>> it is able (like lld) to split the sections apart and discard/reorder the
>>> contents. Thus dsymutil needs some semantic knowledge of the DWARF contents
>>> to be able to “patch” the relocatable debug info with accurate values. It
>>> is also able to remove parts of the DIE tree that aren’t needed or to
>>> unique types across the compilation unit boundaries. In libDebugInfo, we
>>> have the needed tooling to read the debug info, but we currently lack the
>>> ability to write it back to disk. Maybe what’s in lib/CodeGen/AsmPrinter to
>>> emit the debug info would fit the bill, but I won't be sure until I try to
>>> write the code. I’ll see along the way if libDebugInfo should grow its own
>>> DWARF streaming capabilities. Opinions welcome.
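To make the "patching" idea above concrete, here is a toy Python sketch, assuming a drastically simplified DIE model. It rewrites object-file addresses to linked addresses and drops DIEs whose code the linker discarded. The `Die` class, the `link_die` function and the address map are hypothetical; real DWARF linking also has to handle ranges, line tables, type uniquing and more.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Die:
    """Toy stand-in for a DWARF DIE; real DIEs carry many more attributes."""
    tag: str
    name: str
    low_pc: Optional[int] = None   # object-file address, if the DIE has one
    children: list = field(default_factory=list)


def link_die(die, address_map):
    """Return a patched copy of `die`, or None if it should be dropped.

    `address_map` maps object-file addresses of kept atoms to their
    addresses in the linked binary (the information a debug-map provides).
    """
    if die.low_pc is not None:
        if die.low_pc not in address_map:
            return None  # the linker discarded this atom, so drop its DIE
        new_pc = address_map[die.low_pc]
    else:
        new_pc = None
    linked_children = (link_die(child, address_map) for child in die.children)
    kept = [child for child in linked_children if child is not None]
    return Die(die.tag, die.name, new_pc, kept)


# Illustrative input: one compile unit with a kept function, a discarded
# function, and a type that has no address at all.
unit = Die("compile_unit", "foo.c", children=[
    Die("subprogram", "main", low_pc=0x0),
    Die("subprogram", "unused", low_pc=0x40),
    Die("base_type", "int"),
])
linked = link_die(unit, {0x0: 0x100000F40})
```

Here `main` keeps its DIE with a relocated address, `unused` is pruned because its address is absent from the map, and the address-less `int` type passes through unchanged.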
>>>
>>> Although the implementation of the dsymutil command line tool will be
>>> fairly Darwin specific (it accepts Mach-O files as input and emits a dSYM
>>> bundle), most of the implementation will be format agnostic. I’ll make an
>>> effort to split the Mach-O specific parts into their own files so that this
>>> code can be reused in a generic way. Would there be interest in that kind
>>> of code for other platforms also? What’s the story of lld DWARF support for
>>> ELF?
>>>
>>> I plan on sending the initial code (which basically only parses the
>>> debug map of Mach-O files) out for review in the coming days if there are
>>> no objections to the general principle.
>>>
>>> Fred
>>
>>
>>
>
>