[llvm-dev] DWARF Fission + ThinLTO

Thu May 4 11:36:22 PDT 2017

On Thu, May 4, 2017 at 11:23 AM Robinson, Paul <paul.robinson at sony.com>
wrote:

> Sorry, trying to catch up a bit late…
>
>
>
> It sounds like having more than one CU per .dwo is outside of the
> intention of the DWARF specification (though not explicitly forbidden),
> since there is an implied 1-1 relationship between skeleton CU and .dwo.
>
>
>
> There is an explicit 1-1 relationship between skeleton CU and split-full
> CU (not .dwo).  This suggests to me that if you want a .dwo to have
> multiple CUs, then the .o should have multiple corresponding skeleton CUs.
>

Right - that's closest to/as things are today, for sure.

>   Each skeleton/split-full pair would have the same dwo_name but different
> dwo_IDs.  You're making it sound like LTO is actually producing only one
> skeleton CU?  That would be a bug.
>

It isn't - at the moment it produces a skeleton per CU and a single DWO
file with multiple CUs, and all this (with a relatively small patch to GDB)
works in GDB today so long as there aren't any cross-CU references and so
long as it's DWO files and not a DWP file.

>  I am not finding anywhere that explicitly says a .dwo file can have only
> one unit?  Although it does seem that the definition of the cu_index
> assumes each unit should be treated as being in an independent .debug_info
> section, which is a problem for making cross-CU references in a given .dwo
> file.  Thanks for bringing that up on dwarf-discuss already.
>

Right - I'm not sure if binutils DWP was implemented with this in mind, or
if it fell out naturally from the debug_types support (walk the info/types
section for each unit, keep it if it's not already present in the output -
so it would naturally want to trim the info/types contribution to that
specific unit since other units may not be included in the output). My
llvm-dwp gets this wrong - it brings in only the first CU, but I think it
may take the whole info contribution for it - so that's a bit broken here
in any case.

>
>
> So this leads us to a few options - the 'simplest', that changes the
> semantics but not the syntax of the table - would be to widen the INFO
> range to cover the whole portion that came from the original DWO file.
>
> Consumers would have to be adjusted to cope with the fact that the INFO
> contribution for a given DWO ID would be a range that contains the CU
> somewhere, along with other CUs - and they'd have to search (in full LTO,
> this would mean searching through /all/ the CUs).
>
>
>
> But each skeleton/split-full pair needs a distinct DWO ID (per the v5
> spec).
>

Each skeleton/split-full pair would still have a distinct DWO ID (each CU
still has a separate hash) from what I was describing above/in the section
you quoted. But the debug_info contribution in the cu_index would be
'vague' - it would be widened to describe the whole info contribution from
that DWO file.
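
To make the arithmetic concrete, here is a rough sketch of resolving a
DW_FORM_ref_addr against a cu_index INFO contribution. It uses the cab.dwp
ranges quoted further down in this thread and assumes the rows with
signatures 0x7bd765349b7e7631 and 0x32dd6d7121dd1d9a both came from ab.dwo
(they share the same ABBR, LINE and STR_OFF ranges). The names Range,
resolveRefAddr and widen are hypothetical, not taken from GDB, binutils or
llvm-dwp.

```cpp
#include <cassert>
#include <cstdint>

// Half-open byte range [begin, end) within the DWP's debug_info section.
struct Range {
  std::uint64_t begin;
  std::uint64_t end;
};

// Hypothetical helper: a DW_FORM_ref_addr value in a .dwo is an offset from
// the start of that .dwo's debug_info, so in the DWP it has to be rebased
// onto the start of the contribution that came from the same .dwo.
std::uint64_t resolveRefAddr(Range info, std::uint64_t refAddr) {
  std::uint64_t target = info.begin + refAddr;
  assert(target < info.end && "reference falls outside the contribution");
  return target;
}

// Hypothetical helper: the smallest range covering two contributions, i.e.
// the "widened" INFO entry shared by all CUs from one .dwo.
Range widen(Range a, Range b) {
  return Range{a.begin < b.begin ? a.begin : b.begin,
               a.end > b.end ? a.end : b.end};
}
```

With today's per-CU range [65, 98) for the referencing CU, the reference
0x31 rebases to 0x96, which is not where the abstract f2 lives. Against the
widened range [2d, 98) it rebases to 0x5e, which is 0x31 bytes into
ab.dwo's contribution.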

>   I guess we could address this by replicating the index entry for each
> DWO ID in the .dwo file, so lookup-by-ID would still work, but the INFO
> ranges would always describe the full .debug_info section from the .dwo
> file.
>

Right, that ^ so, yeah "replicated" in the sense that for each DWO ID in
the same DWO file, they'd have a separate entry in the cu_index with each
unique DWO ID - but the contributions would be the same for every one of
those DWO IDs from the same DWO file.
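
A minimal sketch of what a consumer would then have to do, under a
deliberately simplified model: IndexRow keeps only the INFO column of a
cu_index row, and Unit stands in for whatever a real consumer gets from
parsing each unit in debug_info (its offset and its DWO ID). All names here
are hypothetical and this is not how any existing tool is implemented.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct Range {
  std::uint64_t begin;
  std::uint64_t end;
};

// One cu_index row, reduced to the DWO ID and its INFO contribution.
struct IndexRow {
  std::uint64_t dwoId;
  Range info;
};

// Stand-in for a parsed unit: where it starts and which DWO ID it carries.
struct Unit {
  std::uint64_t offset;
  std::uint64_t dwoId;
};

// Lookup by DWO ID still works, because every ID keeps its own row.
const IndexRow* lookupRow(const std::vector<IndexRow>& index,
                          std::uint64_t dwoId) {
  for (const IndexRow& row : index)
    if (row.dwoId == dwoId)
      return &row;
  return nullptr;
}

// The row no longer pinpoints the CU, so scan the units that start inside
// the widened contribution until the matching DWO ID turns up.
const Unit* findUnit(const std::vector<Unit>& units, Range info,
                     std::uint64_t dwoId) {
  for (const Unit& u : units)
    if (u.offset >= info.begin && u.offset < info.end && u.dwoId == dwoId)
      return &u;
  return nullptr;
}
```

The scan in findUnit is linear in the number of units from the same DWO
file. That is cheap for a ThinLTO shard, but in full LTO it touches every
CU.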

> You'd still need to search the .dwo file's part of the INFO range to find
> the actual CU, unless we change the syntax of the table to provide those
> offsets.  But then ref_addr would work across CUs from the same .dwo file.
>

Precisely.

In the worst case, with full LTO, that'd potentially be a lot of searching,
because there'd be a single DWO file containing all the CUs.

Hopefully we can talk about this more on the dwarf-discuss list (I haven't
seen any replies there - maybe I should just subscribe to the dwarf
committee list & discuss it there? I dunno - open to suggestions).

- Dave

> --paulr
>
>
>
>
>
> *From:* llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] *On Behalf Of *David
> Blaikie via llvm-dev
> *Sent:* Wednesday, May 03, 2017 7:52 PM
> *To:* Adrian Prantl
> *Cc:* llvm-dev
> *Subject:* Re: [llvm-dev] DWARF Fission + ThinLTO
>
>
>
>
>
> On Wed, May 3, 2017 at 7:48 PM Adrian Prantl <aprantl at apple.com> wrote:
>
> On May 3, 2017, at 7:43 PM, Adrian Prantl via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>
>
>
> On May 3, 2017, at 2:59 PM, David Blaikie <dblaikie at gmail.com> wrote:
>
>
>
>
>
> On Wed, May 3, 2017 at 2:09 PM Adrian Prantl <aprantl at apple.com> wrote:
>
>
> > On May 3, 2017, at 2:00 PM, David Blaikie <dblaikie at gmail.com> wrote:
> >
> > So Dehao and I have been dealing with some of the nitty gritty details
> of debug info with ThinLTO, specifically with Fission(Split DWARF).
> >
> > This applies to LTO as well, so I won't single out ThinLTO here.
> >
> > 1) Multiple CUs in a .dwo file
> > Clang/LLVM produces a CU for each original source file - these CUs are
> kept through IR linking (thin or full) and produced as distinct CUs in the
> resulting DWARF.
> > This preserves semantics as much as possible - eg: file-local functions
> and types (those in anonymous namespaces or declared file-static) can
> co-exist even if their names collide.
> > GDB does good things with this, except with Fission.
>
> Can you lay out the terminology you are using here? Is Fission as used
> here different/more than creating .dwo files?
>
>
> Fission as used in that sentence means .dwo files and anything beyond
> (.dwp files - https://gcc.gnu.org/wiki/DebugFissionDWP ). GDB
> warns/errors/complains if two CUs are in a single .dwo, and ignores all but
> the first.
>
>
>
> It sounds like having more than one CU per .dwo is outside of the
> intention of the DWARF specification (though not explicitly forbidden),
> since there is an implied 1-1 relationship between skeleton CU and .dwo.
>
>
>
> What do you think is the *right* solution here:
>
> - Allowing more than one CU in gdb?
>
> - Emitting many little .dwo files for the cross-inlined functions' CUs?
>
>
>
> Hmm.. I guess that would make the cross-CU references impossible. (Unless
> we can refer to everything via signatures).
>
>
> Precisely!
>
> It'd be pretty tricky to do by signature - /maybe/ possible. But these are
> internal symbols - so they're not very uniqueable - in terms of mangled
> name, etc to hash.
>
> and I hadn't mentioned one other wrinkle:
>
> The CU fragments that result from selectively importing functions in
> ThinLTO are going to be a problem - if two ThinLTO shards both import the
> same chunks from a third module - they'll potentially produce two CUs with
> the same DWO ID hash. So I was thinking there might be some need to
> cross-pollinate the hashes of the various CUs in ThinLTO, to create 'more
> unique' identifiers... but I don't know exactly how it'd all look there.
>
>
>
>
>
>
>
> > Binutils DWP produces usable DWPs from DWO files with multiple CUs
> >
> > 2) Cross-CU references
> > This is where it gets trickier.
> > LLVM produces cross-CU references (DW_FORM_ref_addr) to refer to types
> or functions defined in one CU used from another. This only happens with
> (thin or plain) LTO. LLVM's had this for a while.
> > This helps fully express interesting cases outlined in (1) - it's
> possible that a file-local function is inlined into another CU - by using
> ref_addr the semantics described in (1) can be preserved, while also
> explaining the inlining that has occurred. (similar cases can arise with
> intra-CU type references too)
> > GDB handles this, except with Fission (I mean, it doesn't get this far -
> but even with patches to handle (1)+Fission, it's still not enough - needs
> more work)
> > Binutils DWP and the DWP format in general... maybe can't cope with this.
> >
> > Talking about (2)+DWP:
>
> You mean (2)+DWO+DWP?
>
>
> DWP is a package containing the contents of multiple DWOs - so, yes and
> no? Not sure how to answer.
>
> (2)+DWO could be made to work without changing much in the
> contents/format/etc, I think - consumers could be made to understand the
> 'obvious' form (no change to producers I think would be needed).
>
> But once you go to a DWP file, then there are representational problems
> that make it currently impossible to use cross-CU references. The semantics
> (& possibly the syntax) of DWP files would have to change to enable this
> functionality.
>
>
>
> Why can a .dwo cope with cross-CU-references but not a .dwp?
>
>
>
>
> > It looks like this may require adjusting at least the semantics, if not
> the syntax of the indexes in the DWP file. I've started a thread on
> dwarf-discuss to start trying to hash that out.
> >
> >
> >
> > So, to cut a long story short: Fission+LTO is currently unusable and may
> take a while to make the DWARF side of this usable.
> >
> > In the interim, I'd like to propose adding a flag to LLVM to support
> something not very nice: When merging modules (or importing things into a
> module in ThinLTO), do not maintain separate CUs but redirect the cu
> pointer in any DISubprogram (& probably DIGlobalVariable too - I forget if
> ThinLTO can import them) metadata to refer to the original CU in the
> destination module. (in the full LTO case, it'd also be necessary to import
> any retained types)
> >
> > This flag could be passed to the task doing the merging/importing, or
> could be passed to the frontend compile and stashed in per-CU metadata
> (where it would be respected when importing/merging from that CU).
> >
> > The intent would be to support this flag until the DWARF functionality
> can be decided on and implemented - and kept for some time after that for
> backwards compatibility for those who want to use Fission but haven't got
> the latest toolchains with the fixes/improvements in them.
> >
>
>
>
> If the other tools cannot be fixed quickly, then this would be a pragmatic
> interim solution.
>
>
>
> >
> >
> > How's this sound to everyone? Reasonable? Unreasonable? Need more
> details about the impact of these changes, etc? (I have examples with DWARF
> output, GDB behavior, etc)
>
> If you can post an example, that would help.
>
>
> OK, so, here's an example of how GDB, without fission, benefits from the
> two distinct CUs (rather than bundling all the types & functions into a
> single CU):
>
> given:
>   a.cpp:
>     namespace {
>     struct foo {
>       int i;
>     };
>     }
>     static void f1() {
>     }
>     void f2() {
>       f1();
>     }
>   b.cpp:
>     namespace {
>     struct foo {
>       float f;
>     };
>     }
>     static void f1() {
>     }
>     void f2();
>     int main() {
>       f1();
>       f2();
>     }
>   $ clang++ {a,b}.cpp -g -c -emit-llvm -S
>   $ llvm-link {a,b}.ll -S -o ab.ll
>
>   $ clang++-tot ab.ll
>   $ gdb a.out
>   (gdb) start
>   ..., main () at b.cpp:11
>   11        f1();
>   (gdb) ptype foo
>   type = struct (anonymous namespace)::foo {
>       float f;
>   }
>   (gdb) p &f1
>   $1 = (void (*)(void)) 0x400530 <f1()>
>   (gdb) n
>   12        f2();
>   (gdb) s
>   f2 () at a.cpp:10
>   10        f1();
>   (gdb) ptype foo
>   type = struct (anonymous namespace)::foo {
>       int i;
>   }
>   (gdb) p &f1
>   $2 = (void (*)(void)) 0x400500 <f1()>
>
> So in that case, the foo type and f1 function are properly identified
> depending on the context in which the expression is evaluated.
>
> If the types and subprograms are tied into the same CU (with a bit of
> manual IR editing), two distinct types and functions are emitted into the
> DWARF, but GDB only finds the first one, every time, in both contexts.
>
>
> To look into the larger motivation, cross-CU inlining and how it interacts
> with Fission:
>
>   a.cpp:
>     __attribute__((optnone)) void f1() {
>     }
>     __attribute__((always_inline)) void f2() {
>       f1();
>     };
>   b.cpp:
>     void f2();
>     void f3() {
>       f2();
>     }
>   c.cpp:
>     void f3();
>     int main() {
>       f3();
>     }
>
>   $ clang++ {a,b}.cpp -g -c -emit-llvm -S
>   $ llvm-link {a,b}.ll -S -o ab.ll
>   $ clang++-tot c.cpp ab.ll -gsplit-dwarf
>   $ dwp {c,ab}.dwo -o cab.dwp
>
> This produces two dwo files - one per object file (c.dwo, ab.dwo). If we
> look at the contents, they make a fair bit of sense:
>
>   0x0b: DW_TAG_compile_unit [1] *
>   0x19:   DW_TAG_subprogram [2]
>   0x25:   DW_TAG_subprogram [3]
>   0x31:   DW_TAG_subprogram [4]
>   0x43: DW_TAG_compile_unit [1] *
>   0x51:   DW_TAG_subprogram [5] *
>   0x5d:     DW_TAG_inlined_subroutine [6]
>               DW_AT_abstract_origin [DW_FORM_ref_addr]      (0x31 "_Z2f2v")
>
> The ref_addr can be resolved relative to this whole .dwo file & the
> abstract subprogram is found. Yay.
>
> So GDB can't currently handle this:
>
>   Could not find DWO CU ab.dwo(0x32dd6d7121dd1d9a) referenced by CU at
> offset 0x66 [in module a.out]
>
> And that's after some local fixes I have so it doesn't error out on the
> two-CUs-in-a-dwo situation (that fix at least allows the first example
> (local foo/f1, etc) to work with GDB - but fixing ref_addr to resolve
> correctly would require deeper changes I haven't figured out/prototyped
> yet).
>
> So, up until this point I was pretty satisfied there would be a way
> forward...
>
> Then I hit DWP files.
>
> DWP files represent multiple DWO files stacked together - mostly to
> deduplicate strings and type units as a sort of archival (like dsym) format.
>
> The way DWPs work, roughly (full/more details here:
> https://gcc.gnu.org/wiki/DebugFissionDWP ) is that all the sections are
> concatenated together, except the debug_str and debug_str_offsets sections
> which have special handling to do what you'd expect - deduplicate strings,
> and adjust the str_offsets to point to the right offsets in the
> deduplicated string section.
>
> The other thing that DWP has, is an index, or two. cu_index and tu_index.
> We'll just look at cu_index.
>
> A cu_index for the cab.dwp above, is:
>
> Index Signature          INFO     ABBR     LINE     STR_OFF
> ----- ------------------ -------- -------- -------- --------
>     2 0x7bd765349b7e7631 [2d, 65) [38, ae) [11, 22) [14, 3c)
>     8 0x66f4e160661d2687 [00, 2d) [00, 38) [00, 11) [00, 14)
>    11 0x32dd6d7121dd1d9a [65, 98) [38, ae) [11, 22) [14, 3c)
>
>
> What this is for is to tell the consumer, which portions of each section
> relate to which CUs - since the DIEs in the CU don't contain any
> relocations, for example the abbrev offset in the CU header is probably
> zero. It must be resolved relative to the ABBR section in the above table
> to find the chunk of the debug_abbrev that came from that CU.
>
> So, this is all good and well, except that the INFO range isn't the whole
> range of the debug info from the DWO file - it's /just/ the range of this
> CU. Which means there's no way to resolve the ref_addr relative to the
> whole range, you don't know where it starts.
>
>
>
> Ah thanks, that explains my question above!
>
>
>
> So this leads us to a few options - the 'simplest', that changes the
> semantics but not the syntax of the table - would be to widen the INFO
> range to cover the whole portion that came from the original DWO file.
>
> Consumers would have to be adjusted to cope with the fact that the INFO
> contribution for a given DWO ID would be a range that contains the CU
> somewhere, along with other CUs - and they'd have to search (in full LTO,
> this would mean searching through /all/ the CUs).
>
>
>
> Would generating a .dwo per CU help with this?
>
>
>
> This doesn't cover all the use cases I'd have in mind, but it would at
> least be possible to implement & support ref_addr. The full depth of design
> choices I'll likely leave to the dwarf-discuss thread, hopefully.
>
>
>
> Yes, given all the non-LLVM tools that will need to support this, that is
> the right forum to discuss this.
>
>
>
> -- adrian
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>
>
>