[LLVMdev] Reimplementing Darwin's dsymutil as an lld helper

Alexey Samsonov vonosmas at gmail.com
Fri Nov 7 11:26:00 PST 2014


On Fri, Nov 7, 2014 at 8:09 AM, Frédéric Riss <friss at apple.com> wrote:

> Hi,
>
> [ I Cc'd lld people and debug info people. Apologies if I omitted some
> stakeholder. ]
>
> As stated in the subject, I’d like to start working on an in-tree
> reimplementation of Darwin’s dsymutil utility. This is an initial step on
> the path to having lld handle the debug information itself.
>
> For those who are not familiar with the debug flow on MacOS, dsymutil is a
> DWARF linker. Darwin’s linker (ld64) doesn’t link the DWARF debug info
> found in the object files; instead, it writes a “debug-map” in the linked
> binary. This debug-map describes what objects were linked together and what
> atoms of each object file are present in the binary along with their
> addresses. The debug-map has two uses:
> 1) During the build->debug cycle, lldb reads the debug-map and uses it to
> find the .o files and extract the relevant dwarf debug info.
> 2) For Release builds, dsymutil reads the debug-map then loads, merges,
> and optimizes all the dwarf debug info and writes it as a .dSYM
>
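
To make the debug-map description above concrete, here is a minimal C++ sketch of the information it carries: which objects were linked, and where each surviving atom ended up. Every type and function name below is hypothetical and chosen for illustration; this is not the layout ld64 writes nor an existing LLVM API.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical types for illustration only.
struct SymbolMapping {
  uint64_t ObjectAddress; // address of the atom inside the .o file
  uint64_t BinaryAddress; // address of the same atom in the linked binary
};

struct DebugMapObject {
  std::string ObjectPath;                       // which .o file was linked in
  std::map<std::string, SymbolMapping> Symbols; // atoms that survived linking
};

struct DebugMap {
  std::vector<DebugMapObject> Objects;
};

// Records that Name moved from ObjectAddress to BinaryAddress during linking.
void addSymbol(DebugMapObject &Obj, const std::string &Name,
               uint64_t ObjectAddress, uint64_t BinaryAddress) {
  bool Inserted =
      Obj.Symbols.emplace(Name, SymbolMapping{ObjectAddress, BinaryAddress})
          .second;
  assert(Inserted && "each atom should be listed once per object");
  (void)Inserted;
}

// Stores the linked address of Name from the object at ObjectPath in Result
// and returns true. Returns false if the linker dropped that atom or the
// object is not part of the map.
bool lookupLinkedAddress(const DebugMap &Map, const std::string &ObjectPath,
                         const std::string &Name, uint64_t &Result) {
  for (const DebugMapObject &Obj : Map.Objects) {
    if (Obj.ObjectPath != ObjectPath)
      continue;
    auto It = Obj.Symbols.find(Name);
    if (It == Obj.Symbols.end())
      return false;
    Result = It->second.BinaryAddress;
    return true;
  }
  return false;
}
```

Both consumers mentioned above would start from a lookup of this kind: a debugger to find the right .o file and address, a DWARF linker to decide what to keep and how to relocate it.
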
> The long term goal is that dwarf linking functionality be available as a
> library for LLVM tools. Eventually, we’d like lld to be able to make use of
> the dwarf linking library and not need a standalone dsymutil tool.  The
> first step is to use the dwarf linking library in a standalone dsymutil
> replacement tool. We want this tool to be bit-for-bit compatible with the
> existing Darwin dsymutil.
>
> The main reason we want to take the first step of a separate tool is
> testability. The code committed to the LLVM repository will feature unit
> tests, but they won’t offer the coverage that a real world usage would. I
> plan to run the new tool through big internal validation campaigns during
> which the llvm powered dsymutil output would be compared to the system’s
> dsymutil one. This is also the reason we aim for bit-for-bit compatibility.
>
> The current plan is to host the code in the llvm repository. dsymutil will
> make heavy use of libDebugInfo and won’t share anything with the lld
> codebase (the underlying concepts are just too different). It’s also not
> clear yet where most of the implementation logic will end up. I expect most
> of the core logic to be in tools/dsymutil, but some of it might be better
> folded directly into libDebugInfo.
>
> So how does it work? dsymutil doesn’t simply paste the debug sections
> together while applying relocations to them. This wouldn’t work for ld64 as
> it is able (like lld) to split the sections apart and discard/reorder the
> contents. Thus dsymutil needs some semantic knowledge of the DWARF contents
> to be able to “patch” the relocatable debug info with accurate values. It
> is also able to remove parts of the DIE tree that aren’t needed or to
> unique types across the compilation unit boundaries. In libDebugInfo, we
> have the needed tooling to read the debug info, but we currently lack the
> ability to write it back to disk. Maybe what’s in lib/CodeGen/AsmPrinter to
> emit the debug info would fit the bill, but I won't be sure until I try to
> write the code. I’ll see along the way if libDebugInfo should grow its own
> Dwarf streaming capabilities. Opinions welcome.
>
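
The "patch and prune" idea in the paragraph above can be sketched roughly as follows. This is only an illustration under strong simplifying assumptions: the Die struct, the AddressMap alias and both functions are hypothetical, a DIE is reduced to a name and a single low_pc value, and type uniquing is left out entirely.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, heavily simplified DIE. Real entries carry a tag and a list
// of attributes; here a function is just a name plus its start address.
struct Die {
  std::string Name;
  bool IsFunction;
  uint64_t LowPC; // object-file address, only meaningful for functions
  std::vector<Die> Children;
};

// Maps an object-file address to its address in the linked binary.
// Addresses of atoms the linker discarded are simply absent.
using AddressMap = std::map<uint64_t, uint64_t>;

// Returns true if D should be kept. A function absent from the map was
// dropped by the linker, so its whole subtree is discarded. A kept function
// gets its LowPC rewritten to the linked address.
bool pruneAndPatch(Die &D, const AddressMap &Linked) {
  if (D.IsFunction) {
    auto It = Linked.find(D.LowPC);
    if (It == Linked.end())
      return false;
    D.LowPC = It->second;
  }
  std::vector<Die> Kept;
  for (Die &Child : D.Children)
    if (pruneAndPatch(Child, Linked))
      Kept.push_back(std::move(Child));
  D.Children = std::move(Kept);
  return true;
}

// Entry point for one compile unit; the unit root itself is never a function.
void linkCompileUnit(Die &CU, const AddressMap &Linked) {
  assert(!CU.IsFunction && "compile unit root must not be a function DIE");
  pruneAndPatch(CU, Linked);
}
```

The point of the sketch is that the decision to keep or drop a subtree depends on what the DIE means (a function with an address), which is why pasting sections together and applying relocations is not enough.
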
> Although the implementation of the dsymutil command line tool will be
> fairly Darwin specific (it accepts mach-o files as input and emits a dSYM
> bundle), most of the implementation will be format agnostic. I’ll make an
> effort to split the mach-o specific parts into their own files so that this
> code can be reused in a generic way. Would there be interest in that kind
> of code for other platforms also? What’s the story of lld Dwarf support for
> ELF?
>
> I plan on sending the initial code (which basically only parses the
> debug map of mach-o files) out for review in the coming days if there are
> no objections to the general principle.
>

Sounds reasonable to me. It would be nice to have dsymutil implemented as
an LLVM tool, update it as needed as we change the debug info emitted by
the compiler, ensure that it understands and behaves well with reduced
-gline-tables-only debug info, etc.

It also sounds like you'd have to extend libDebugInfo with DWARF emission
capabilities, that is, reuse part of the code currently stored in
AsmPrinter. Note that currently LLVM backend tools (and Clang) don't
depend on libDebugInfo, and that's probably the right thing - they don't
need to read/analyze DWARF or symbolize addresses. I wonder if we'd have to
change the library layout - have some generic library that would describe
DWARF entities (something more powerful than a bunch of enums declared in
Support/Dwarf.h), and make current AsmPrinter and DebugInfo its two
specialized users. In this way, Clang would need only the former,
llvm-dwarfdump and llvm-symbolizer only the latter, and DWARF
transformation tools (be it a linker or dsymutil) both.
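
As a rough illustration of that layering, here is a minimal C++ sketch with one shared description layer and two independent users. The namespace and function names are hypothetical and do not correspond to existing LLVM libraries; only the two DW_FORM values and the unsigned LEB128 encoding are taken from the DWARF standard.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Shared description layer (hypothetical): facts about DWARF that both the
// emitter and the reader need, independent of either one.
namespace dwarfdesc {
enum Form : uint16_t {
  DW_FORM_data1 = 0x0b, // one fixed byte
  DW_FORM_udata = 0x0f, // unsigned LEB128
};
} // namespace dwarfdesc

// Emission side (the role AsmPrinter plays today).
namespace emitter {
void emitValue(dwarfdesc::Form F, uint64_t Value, std::vector<uint8_t> &Out) {
  if (F == dwarfdesc::DW_FORM_data1) {
    assert(Value <= 0xff && "value does not fit in DW_FORM_data1");
    Out.push_back(static_cast<uint8_t>(Value));
    return;
  }
  // DW_FORM_udata: 7 payload bits per byte, high bit set on all but the last.
  do {
    uint8_t Byte = Value & 0x7f;
    Value >>= 7;
    if (Value != 0)
      Byte |= 0x80;
    Out.push_back(Byte);
  } while (Value != 0);
}
} // namespace emitter

// Reading side (the role libDebugInfo plays today).
namespace reader {
uint64_t readValue(dwarfdesc::Form F, const std::vector<uint8_t> &In,
                   std::size_t &Offset) {
  assert(Offset < In.size() && "read past end of buffer");
  if (F == dwarfdesc::DW_FORM_data1)
    return In[Offset++];
  uint64_t Result = 0;
  unsigned Shift = 0;
  while (true) {
    assert(Offset < In.size() && "truncated LEB128");
    uint8_t Byte = In[Offset++];
    Result |= static_cast<uint64_t>(Byte & 0x7f) << Shift;
    if ((Byte & 0x80) == 0)
      return Result;
    Shift += 7;
  }
}
} // namespace reader

// A transformation tool is the only client that needs both sides: it reads
// a value in one form and writes it back in another.
std::vector<uint8_t> reencode(const std::vector<uint8_t> &In,
                              dwarfdesc::Form From, dwarfdesc::Form To) {
  std::size_t Offset = 0;
  uint64_t Value = reader::readValue(From, In, Offset);
  std::vector<uint8_t> Out;
  emitter::emitValue(To, Value, Out);
  return Out;
}
```

In this arrangement a compiler would link only dwarfdesc and emitter, a dumper or symbolizer only dwarfdesc and reader, and neither would pay for code it does not use.
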


>
> Fred
>



-- 
Alexey Samsonov
vonosmas at gmail.com