[llvm-dev] DWARF Fission + ThinLTO
David Blaikie via llvm-dev
llvm-dev at lists.llvm.org
Wed May 3 19:48:50 PDT 2017
On Wed, May 3, 2017 at 7:43 PM Adrian Prantl <aprantl at apple.com> wrote:
> On May 3, 2017, at 2:59 PM, David Blaikie <dblaikie at gmail.com> wrote:
>
>
>
> On Wed, May 3, 2017 at 2:09 PM Adrian Prantl <aprantl at apple.com> wrote:
>
>>
>> > On May 3, 2017, at 2:00 PM, David Blaikie <dblaikie at gmail.com> wrote:
>> >
>> > So Dehao and I have been dealing with some of the nitty gritty details
>> of debug info with ThinLTO, specifically with Fission (Split DWARF).
>> >
>> > This applies to LTO as well, so I won't single out ThinLTO here.
>> >
>> > 1) Multiple CUs in a .dwo file
>> > Clang/LLVM produces a CU for each original source file - these CUs are
>> kept through IR linking (thin or full) and produced as distinct CUs in the
>> resulting DWARF.
>> > This preserves semantics as much as possible - eg: file-local functions
>> and types (those in anonymous namespaces or declared file-static) can
>> co-exist even if their names collide.
>> > GDB does good things with this, except with Fission.
>>
>> Can you lay out the terminology you are using here? Is Fission as used
>> here different/more than creating .dwo files?
>>
>
> Fission as used in that sentence means .dwo files and anything beyond
> (.dwp files - https://gcc.gnu.org/wiki/DebugFissionDWP ). GDB
> warns/errors/complains if two CUs are in a single .dwo, and ignores all but
> the first.
>
>
> It sounds like having more than one CU per .dwo is outside of the
> intention of the DWARF specification (though not explicitly forbidden),
> since there is an implied 1-1 relationship between skeleton CU and .dwo.
>
Right - it's certainly 'fuzzy' territory at least.
> What do you think is the *right* solution here:
> - Allowing more than one CU in gdb?
> - Emitting many little .dwo files for the cross-inlined functions' CUs?
>
I'm not sure that this ^ would help - because if each CU was in a separate
.dwo, there would be no way to reference a subprogram in one CU from an
inlined_subroutine in another CU.
So I think going on from that implies the need for multiple CUs in a single
.dwo file.
Eric & I have bounced around a few ideas about whether, with that in mind,
the idea of a skeleton CU could be changed - where a single skeleton per
.dwo file is used, but relates to multiple CUs. (which gets awkward on the
other side - with multiple CUs with the same DWO id or the like)
I'm not sure what'll be the right long term solution in that regard.
>
>
>> > Binutils DWP produces usable DWPs from DWO files with multiple CUs
>> >
>> > 2) Cross-CU references
>> > This is where it gets trickier.
>> > LLVM produces cross-CU references (DW_FORM_ref_addr) to refer to types
>> or functions defined in one CU used from another. This only happens with
>> (thin or plain) LTO. LLVM's had this for a while.
>> > This helps fully express interesting cases outlined in (1) - it's
>> possible that a file-local function is inlined into another CU - by using
>> ref_addr the semantics described in (1) can be preserved, while also
>> explaining the inlining that has occurred. (similar cases can arise with
>> intra-CU type references too)
>> > GDB handles this, except with Fission (I mean, it doesn't get this far
>> - but even with patches to handle (1)+Fission, it's still not enough -
>> needs more work)
>> > Binutils DWP and the DWP format in general... maybe can't cope with
>> this.
>> >
>> > Talking about (2)+DWP:
>>
>> You mean (2)+DWO+DWP?
>>
>
> DWP is a package containing the contents of multiple DWOs - so, yes and
> no? Not sure how to answer.
>
> (2)+DWO could be made to work without changing much in the
> contents/format/etc, I think - consumers could be made to understand the
> 'obvious' form (no change to producers I think would be needed).
>
> But once you go to a DWP file, then there are representational problems
> that make it currently impossible to use cross-CU references. The semantics
> (& possibly the syntax) of DWP files would have to change to enable this
> functionality.
>
>
> Why can a .dwo cope with cross-CU-references but not a .dwp?
>
Answered below (as you noted - just flagging it here for threading, etc)
>
>
>
>>
>> > It looks like this may require adjusting at least the semantics, if not
>> the syntax of the indexes in the DWP file. I've started a thread on
>> dwarf-discuss to start trying to hash that out.
>> >
>> >
>> >
>> > So, to cut a long story short: Fission+LTO is currently unusable and
>> may take a while to make the DWARF side of this usable.
>> >
>> > In the interim, I'd like to propose adding a flag to LLVM to support
>> something not very nice: When merging modules (or importing things into a
>> module in ThinLTO), do not maintain separate CUs but redirect the cu
>> pointer in any DISubprogram (& probably DIGlobalVariable too - I forget if
>> ThinLTO can import them) metadata to refer to the original CU in the
>> destination module. (in the full LTO case, it'd also be necessary to import
>> any retained types)
>> >
>> > This flag could be passed to the task doing the merging/importing, or
>> could be passed to the frontend compile and stashed in per-CU metadata
>> (where it would be respected when importing/merging from that CU).
>> >
>> > The intent would be to support this flag until the DWARF functionality
>> can be decided on and implemented - and kept for some time after that for
>> backwards compatibility for those who want to use Fission but haven't got
>> the latest toolchains with the fixes/improvements in them.
>> >
>>
>
> If the other tools cannot be fixed quickly, then this would be a pragmatic
> interim solution.
>
I'll hopefully have a patch (starting with ThinLTO) to discuss later this
week.
>
>
>> >
>> > How's this sound to everyone? Reasonable? Unreasonable? Need more
>> details about the impact of these changes, etc? (I have examples with DWARF
>> output, GDB behavior, etc)
>>
>> If you can post an example, that would help.
>>
>
> OK, so, here's an example of how GDB, without fission, benefits from the
> two distinct CUs (rather than bundling all the types & functions into a
> single CU):
>
> given:
> a.cpp:
>   namespace {
>   struct foo {
>     int i;
>   };
>   }
>   static void f1() {
>   }
>   void f2() {
>     f1();
>   }
> b.cpp:
>   namespace {
>   struct foo {
>     float f;
>   };
>   }
>   static void f1() {
>   }
>   void f2();
>   int main() {
>     f1();
>     f2();
>   }
> $ clang++ {a,b}.cpp -g -c -emit-llvm -S
> $ llvm-link {a,b}.ll -S -o ab.ll
> $ clang++-tot ab.ll
> $ gdb a.out
> (gdb) start
> ..., main () at b.cpp:11
> 11 f1();
> (gdb) ptype foo
> type = struct (anonymous namespace)::foo {
> float f;
> }
> (gdb) p &f1
> $1 = (void (*)(void)) 0x400530 <f1()>
> (gdb) n
> 12 f2();
> (gdb) s
> f2 () at a.cpp:10
> 10 f1();
> (gdb) ptype foo
> type = struct (anonymous namespace)::foo {
> int i;
> }
> (gdb) p &f1
> $2 = (void (*)(void)) 0x400500 <f1()>
>
> So in that case, the foo type and f1 function are properly identified
> depending on the context in which the expression is evaluated.
>
> If the types and subprograms are tied into the same CU (with a bit of
> manual IR editing), two distinct types and functions are emitted into the
> DWARF, but GDB only finds the first one, every time, in both contexts.
>
>
> To look into the larger motivation, cross-CU inlining and how it interacts
> with Fission:
>
> a.cpp:
>   __attribute__((optnone)) void f1() {
>   }
>   __attribute__((always_inline)) void f2() {
>     f1();
>   };
> b.cpp:
>   void f2();
>   void f3() {
>     f2();
>   }
> c.cpp:
>   void f3();
>   int main() {
>     f3();
>   }
> $ clang++ {a,b}.cpp -g -c -emit-llvm -S
> $ llvm-link {a,b}.ll -S -o ab.ll
> $ clang++-tot c.cpp ab.ll -gsplit-dwarf
> $ dwp {c,ab}.dwo -o cab.dwp
>
> This produces two dwo files - one per object file (c.dwo, ab.dwo). If we
> look at the contents, they make a fair bit of sense:
>
> 0x0b: DW_TAG_compile_unit [1] *
> 0x19:   DW_TAG_subprogram [2]
> 0x25:   DW_TAG_subprogram [3]
> 0x31:   DW_TAG_subprogram [4]
> 0x43: DW_TAG_compile_unit [1] *
> 0x51:   DW_TAG_subprogram [5] *
> 0x5d:     DW_TAG_inlined_subroutine [6]
>             DW_AT_abstract_origin [DW_FORM_ref_addr] (0x31 "_Z2f2v")
>
> The ref_addr can be resolved relative to this whole .dwo file & the
> abstract subprogram is found. Yay.
>
> So GDB can't currently handle this:
>
> Could not find DWO CU ab.dwo(0x32dd6d7121dd1d9a) referenced by CU at
> offset 0x66 [in module a.out]
>
> And that's after some local fixes I have so it doesn't error out on the
> two-CUs-in-a-dwo situation (that fix at least allows the first example
> (local foo/f1, etc) to work with GDB - but fixing ref_addr to resolve
> correctly would require deeper changes I haven't figured out/prototyped
> yet).
>
> So, up until this point I was pretty satisfied there would be a way
> forward...
>
> Then I hit DWP files.
>
> DWP files represent multiple DWO files stacked together - mostly to
> deduplicate strings and type units as a sort of archival (like dsym) format.
>
> The way DWPs work, roughly (full/more details here:
> https://gcc.gnu.org/wiki/DebugFissionDWP ) is that all the sections are
> concatenated together, except the debug_str and debug_str_offsets sections
> which have special handling to do what you'd expect - deduplicate strings,
> and adjust the str_offsets to point to the right offsets in the
> deduplicated string section.
>
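A minimal sketch of the string handling described above, in Python. It is an illustration only, not the dwp tool's implementation. The string contents and the `merge_dwos` / `split_strings` names are hypothetical, and it assumes every str_offsets entry points at the start of a NUL-terminated string.

```python
def split_strings(section):
    """Map each string's starting offset to its bytes (NUL excluded)."""
    out, pos = {}, 0
    while pos < len(section):
        end = section.index(b"\0", pos)
        out[pos] = section[pos:end]
        pos = end + 1
    return out


def merge_dwos(dwos):
    """dwos: list of (debug_str bytes, str_offsets list), one pair per .dwo.

    Returns the deduplicated string section and the concatenated
    str_offsets, rewritten to point into the merged section.
    """
    merged = bytearray()
    where = {}        # string -> offset in the merged section
    new_offsets = []
    for debug_str, str_offsets in dwos:
        strings = split_strings(debug_str)
        for old in str_offsets:
            s = strings[old]
            if s not in where:          # first time this string is seen
                where[s] = len(merged)
                merged += s + b"\0"
            new_offsets.append(where[s])
    return bytes(merged), new_offsets


# Made-up inputs standing in for c.dwo and ab.dwo (hypothetical contents).
c_dwo = (b"main\0c.cpp\0", [0, 5])
ab_dwo = (b"f1\0main\0f2\0", [0, 3, 8])
merged_str, merged_offsets = merge_dwos([c_dwo, ab_dwo])
```

Here "main" appears in both inputs but is stored once, and the second input's offsets are rewritten rather than simply shifted, which is why these two sections cannot be plainly concatenated like the others.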
> The other thing a DWP has is an index, or two: cu_index and tu_index.
> We'll just look at cu_index.
>
> The cu_index for the cab.dwp above is:
>
> Index Signature INFO ABBR LINE STR_OFF
> ----- ------------------ -------- -------- -------- --------
> 2 0x7bd765349b7e7631 [2d, 65) [38, ae) [11, 22) [14, 3c)
> 8 0x66f4e160661d2687 [00, 2d) [00, 38) [00, 11) [00, 14)
> 11 0x32dd6d7121dd1d9a [65, 98) [38, ae) [11, 22) [14, 3c)
>
> The index tells the consumer which portions of each section relate to
> which CUs. The DIEs in the CU don't contain any relocations, so, for
> example, the abbrev offset in the CU header is probably zero. It must be
> resolved relative to the ABBR range in the above table to find the chunk
> of the debug_abbrev that came from that CU.
>
> So, this is all well and good, except that the INFO range isn't the whole
> range of the debug info from the DWO file - it's /just/ the range of this
> CU. That means there's no way to resolve the ref_addr relative to the
> whole range, because you don't know where that range starts.
>
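To make the arithmetic concrete, here is a small Python sketch using the INFO ranges from the cu_index above. Two things in it are inferences, not facts stated in the index: that the CU holding the inlined_subroutine is the one with signature 0x32dd6d7121dd1d9a (its size, 0x33, matches the second CU in the ab.dwo dump), and that ab.dwo's debug info begins at 0x2d in the DWP (the lower INFO start of the two rows sharing an ABBR range). The function names are hypothetical.

```python
# signature -> (INFO start, INFO end), copied from the cu_index table.
CU_INDEX = {
    0x7BD765349B7E7631: (0x2D, 0x65),
    0x66F4E160661D2687: (0x00, 0x2D),
    0x32DD6D7121DD1D9A: (0x65, 0x98),
}

REF_ADDR = 0x31  # DW_AT_abstract_origin value, relative to the whole ab.dwo
REFERENCING_CU = 0x32DD6D7121DD1D9A  # inferred: second CU of ab.dwo

# Inferred, and deliberately NOT derivable from the referencing CU's own row:
# where ab.dwo's .debug_info.dwo contribution starts inside the DWP.
AB_DWO_INFO_START = 0x2D


def naive_resolve(signature, ref_addr):
    """Use the only base the index offers: this CU's own INFO start."""
    start, _end = CU_INDEX[signature]
    return start + ref_addr


def intended_resolve(dwo_info_start, ref_addr):
    """Use the base the producer had in mind: the start of the whole DWO."""
    return dwo_info_start + ref_addr


def owning_cu(offset):
    """Return the signature of the CU whose INFO range contains offset."""
    for sig, (lo, hi) in CU_INDEX.items():
        if lo <= offset < hi:
            return sig
    return None


naive = naive_resolve(REFERENCING_CU, REF_ADDR)
intended = intended_resolve(AB_DWO_INFO_START, REF_ADDR)
```

The naive result lands inside the referencing CU itself, while the intended target is in the other CU from ab.dwo. Nothing in the referencing CU's row says which of the two bases to use.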
>
> Ah thanks, that explains my question above!
>
Great!
>
>
>
> So this leads us to a few options - the 'simplest', that changes the
> semantics but not the syntax of the table - would be to widen the INFO
> range to cover the whole portion that came from the original DWO file.
>
> Consumers would have to be adjusted to cope with the fact that the INFO
> contribution for a given DWO ID would be a range that contains the CU
> somewhere, along with other CUs - and they'd have to search (in full LTO,
> this would mean searching through /all/ the CUs).
>
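A rough Python sketch of the widened-INFO idea, under stated assumptions: rows that share an ABBR range are treated as coming from the same DWO (a real producer would know the DWO boundaries directly), the CUs use the DWARF v4 32-bit header layout, and the debug info bytes are zero-filled stand-ins built only so the unit lengths match the table. `widen`, `fake_cu` and `cu_starts` are hypothetical names.

```python
import struct

# (signature, INFO range, ABBR range), from the cu_index shown earlier.
ROWS = [
    (0x7BD765349B7E7631, (0x2D, 0x65), (0x38, 0xAE)),
    (0x66F4E160661D2687, (0x00, 0x2D), (0x00, 0x38)),
    (0x32DD6D7121DD1D9A, (0x65, 0x98), (0x38, 0xAE)),
]


def widen(rows):
    """Return signature -> INFO range covering every CU from the same DWO."""
    groups = {}
    for _sig, info, abbr in rows:
        lo, hi = groups.get(abbr, info)
        groups[abbr] = (min(lo, info[0]), max(hi, info[1]))
    return {sig: groups[abbr] for sig, _info, abbr in rows}


def fake_cu(size):
    """Zero-filled stand-in for a DWARF v4 32-bit CU that is size bytes long."""
    # unit_length (excludes itself), version, abbrev offset, address size.
    header = struct.pack("<IHIB", size - 4, 4, 0, 8)
    return header + bytes(size - len(header))


def cu_starts(debug_info, lo, hi):
    """Walk unit_length fields to list every CU start within [lo, hi)."""
    starts, pos = [], lo
    while pos < hi:
        starts.append(pos)
        (unit_length,) = struct.unpack_from("<I", debug_info, pos)
        pos += 4 + unit_length
    return starts


DEBUG_INFO = fake_cu(0x2D) + fake_cu(0x38) + fake_cu(0x33)
WIDE = widen(ROWS)
lo, hi = WIDE[0x32DD6D7121DD1D9A]
candidates = cu_starts(DEBUG_INFO, lo, hi)  # the search consumers would do
target = lo + 0x31                          # ref_addr now has a usable base
```

With the widened range, the ref_addr base is simply the start of the range. The cost is the search: the consumer still has to inspect each candidate CU to find the one matching the DWO id it was looking for (not shown here), and in full LTO the candidate list would be every CU.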
>
> Would generating a .dwo per CU help with this?
>
Then there would be no way to reference one CU from another to describe
cross-CU inlining, which wouldn't be any good.
>
> This doesn't cover all the use cases I'd have in mind, but it would at
> least be possible to implement & support ref_addr. The full depth of design
> choices I'll likely leave to the dwarf-discuss thread, hopefully.
>
>
> Yes, given all the non-LLVM tools that will need to support this, that is
> the right forum to discuss this.
>
> -- adrian
>