[llvm-dev] [Debuginfo][DWARF][LLD] Remove obsolete debug info in lld.

Thu Jun 4 14:30:54 PDT 2020

On 2020-06-03, James Henderson via llvm-dev wrote:
>It makes me sad that the linker (via a library or otherwise) has to be
>"DWARF-aware" to be able to effectively handle --gc-sections, COMDATs,
>--icf etc for debug info, without leaving large blocks of data kicking
>around.
>
>The patching to -1 (or equivalent) is probably a good lightweight solution
>(though I'd love it if it could be done based on section type in the future
>rather than section name, but that's probably outside the realm of DWARF),
>as it requires only minimal understanding in the linker, but anything
>beyond that seems to be complicated logic that is mostly due to the
>structure of DWARF. Patching to -1 does feel a bit like a sticking
>plaster/band aid to patch over the issue rather than properly solving it
>too - there will still be debug data (potentially significant amounts in
>COMDAT-heavy objects) that the linker has to write and the debugger has to
>somehow know how to skip (even if it knows that -1 is special-case due to
>the standard being updated, it needs to get as far as the -1), which is all
>wasted effort.
>
>We've already seen from Alexey's prototyping, and from our own experiences
>with the Sony proprietary linker (which tried to rewrite .debug_line only)
>that deconstructing the DWARF so that it can be more optimally reassembled
>at link time is slow going, and will probably inevitably be however much
>effort is put into optimising it. For a start, given the current standards,
>it's impossible to know how to deconstruct it without having to parse vast
>amounts of DWARF, which is typically going to mean a lot more parsing work
>than the linker would normally have to deal with. Additionally, much of
>this parsing work is wasted effort, since it seems unlikely in many links
>that large amounts of the DWARF will be redundant. Having an option to
>opt-in doesn't help much there, since it just means the logic exists
>without most people using it, due to it not being good enough, or
>potentially they don't even know it exists.
>
>I don't have particularly concrete suggestions as to how to solve the
>structural problems with DWARF at this point. The only thing that seems
>obvious to me is a more "blessed" approach to fragmentation of sections,
>similar to what I tried with my prototype mentioned earlier in the thread,
>although we'd need to figure out the previously stated performance issues.
>Other ideas might tie into this, like somehow sharing the various table
>headers a bit like CIEs in .eh_frame that could be merged by the linker -
>each object could have separate table header sections, which are referenced
>by the individual .debug_* blocks, which in turn are one per function/data
>piece and easily discardable/merged by the linker.
>
>Just some thoughts.
>
>James

Your proposed option --dead-reloc-addend=.debug_info=0xffffffffffffffff
seems like a good idea.  (I'd expect it to support signed -1 and -2 for
convenience & consistency in some other places (we sometimes use addends
as signed values)).

LLD only supports absolute relocation types (plus R_PPC64_DTPREL64 which
can go to .debug_addr, plus R_RISCV_{ADD,SUB}*).

The computed value is S + A.
We still consider the symbolic value S as zero, but override A with the
supplied option --dead-reloc-addend=.debug_info=-1
I particularly like that `addend` is part of the option name.

My mere complaint is that the relocation record is not dead, but rather
its referenced symbol is dead. However, I can't think of a better
name...

Checked with Martin Storsjö, this option may be useful for other binary
formats supporting DWARF. (binutils does not like ELF-specific options
not called -z foobar).
I think it is fine to add this option to LLD if GNU ld is also happy
with the name.  I'll check with them.

"There is a danger that one community won't accept an extension that
they haven't been involved in the design process for." :) (Coutesy of Peter)

The built-in rules of the linker are the following:

--dead-reloc-addend=.debug_loc=-2
--dead-reloc-addend=.debug_ranges=-2
--dead-reloc-addend=.debug_*=-1

They can be overridden.

>On Tue, 2 Jun 2020 at 19:24, David Blaikie via llvm-dev <
>llvm-dev at lists.llvm.org> wrote:
>
>> On Tue, May 19, 2020 at 7:17 AM Alexey Lapshin
>> <alapshin at accesssoftek.com> wrote:
>> >
>> > Hi David, please find my comments inside:
>> >
>> >
>> > >>>Broad question: Do you have any specific motivation/users/etc in
>> implementing this (if you can speak about it)?
>> >
>> > >>> - it might help motivate the work, understand what tradeoffs might
>> be suitable for you/your users, etc.
>> >
>> > >>There are two general requirements:
>> > >> 1) Remove (or clean) invalid debug info.
>> >
>> > >
>> > >Perhaps a simpler direct solution for your immediate needs might be a
>> much narrower,
>> > >and more efficient linker-DWARF-awareness feature:
>> > >
>> > > With DWARFv5, rnglists present an opportunity for a DWARF linker to
>> rewrite the ranges
>> > > without parsing the rest of the DWARF. /technically/ this isn't
>> guaranteed - rnglist entries
>> > > can be referenced either directly, or by index. If all rnglists are
>> referenced by index, then
>> > > a linker could parse only the debug_rnglists section and rewrite
>> ranges to remove any
>> > > address ranges that refer to optimized-out code.
>> > >
>> > > This would only be correct for rnglists that had no direct references
>> to them (that only were
>> > > referenced via the indexes) - but we could either implement it with
>> that assumption, or could
>> > > add an LLVM extension attribute on the CU that would say "I promise I
>> only referenced rnglists
>> > > via rnglistx forms/indexes). If this DWARF-aware linking would have to
>> read the CU DIE (not
>> > > all the other DIEs) it /could/ also then rewrite high/low_pc if the CU
>> wasn't using ranges...
>> > > but that wouldn't come up in the function-removal case, because then
>> you'd have ranges anyway,
>> > > so no need for that.
>> > >
>> > > Such a DWARF-aware rnglist linking could also simplify rnglists, in
>> cases where functions
>> > > ended up being laid out next to each other, the linker could coalesce
>> their ranges together.
>> > >
>> > > I imagine this could be implemented with very little overhead to
>> linking, especially compared
>> > > to the overhead of full DWARF-aware linking.
>> > >
>> > >Though none of this fixes Split DWARF, where the linker doesn't get a
>> chance to see the
>> > > addresses being used - but if you only want/need the CU-level ranges
>> to be correct, this
>> > > might be a viable fix, and quite efficient.
>> >
>> > Yes, we think about that alternative. This would resolve our problem of
>> invalid debug info
>> > and would work much faster. Thus, if we would not have good results for
>> D74169 then we
>> > will implement it. Do you think it could be useful to have this solution
>> in upstream?
>>
>> A pure rnglist rewriting - I think it'd be OK to have in upstream -
>> again, cost/benefit/etc would have to be weighed. I'm not sure it
>> would save enough space to be particularly valuable beyond the
>> correctness issue - and it doesn't completely solve the correctness
>> issue for zero-address usage or low-address usage (because you could
>> still have overlapping subprograms inside a CU - so if you were
>> symbolizing you could use the correct rnglist to filter, but then go
>> look inside the CU only to find two subprograms that had that address
>> & not know which one was the correct one an which one was the
>> discarded one).
>>
>> rnglist rewriting might be easy enough to prototype - but depends what
>> you want to spend your time on, I know this whole issue has been a
>> huge investment of your time already - but maybe this recent
>> revitalization of the conversation around having an explicit value in
>> the linker might be sufficient to address everyone's needs... *fingers
>> crossed*)
>>
>>
>> > >> 2) Optimize the DWARF size.
>> >
>> >
>> > > Do your users care much about this? I imagine if they had significant
>> DWARF size issues,
>> > > they'd have significant link time issues and the kind of cost to link
>> time this feature has would
>> > > be prohibitive - but perhaps they're sharing linked binaries much more
>> often than they're
>> > > actually performing linking.
>> >
>> > Yes, they do. They also have significant link-time issues.
>> > So current performance results of D74169 are not very acceptable.
>> > We hope to improve it.
>> >
>> >
>> >
>> > >>The specifics which our users have:
>> > >>  - embedded platform which uses 0 as start of .text section.
>> > >>  - custom toolset which does not support all features yet(f.e. split
>> dwarf).
>> > >>  - tolerant of the link-time increase.
>> > >>  - need a useful way to share debug builds.
>> >
>> >
>> > > Sharing two files (executable and dwp) is significantly less useful
>> than sharing one file?
>> >
>> > Probably not significantly, but yes, it looks less useful comparing to
>> D74169.
>> > Having only two files (executable and .dwp) looks significantly better
>> than having executable and multiple .dwo files.
>> > Having only one file(executable) with minimal size looks better than the
>> two files with a bigger size.
>> >
>> > clang compiled with -gsplitdwarf takes 0.9G for executable and 0.9G for
>> .dwp.
>> > clang compiled with -gc-debuginfo takes only 0.76G for single executable.
>> >
>> >
>> >
>> > >>For the first point: we have a problem "Overlapping address ranges
>> starting from 0"(D59553).
>> >
>> > >>We use custom solution, but the general solution like D74169 would be
>> better here.
>> >
>> >
>> > > If CU ranges are the only ones that need fixing, then I think the
>> above solution might be as
>> > > good/better - if more than CU ranges need fixing, then I think we
>> might want to start talking about
>> > > how to fix DWARF itself (split and non-split) to signal certain
>> addresses point to dead code with a
>> > > specific blessed value that linkers would need to implement - because
>> with Split DWARF there's
>> > > no way to solve the non-CU addresses at the linker.
>> >
>> > I think the worthful solution for that signal value would be LowPC >
>> HighPC.
>> > That does not require additional bits in DWARF.
>> > It would be natural to skip such address ranges since they explicitly
>> marked as invalid.
>> > It could be implemented in a linker very easily. Probably, it would make
>> sense to describe that
>> > usage in DWARF standard.
>> >
>> > As to the addresses which are not seen by the linker(since they are in
>> .dwo files) - yes,
>> > they need to have another solution. Could you show an example of such a
>> case, please?
>> >
>> >
>> >
>> > >>>2. Support of type units.
>> >
>> > >>>
>> >
>> > >>>>  That could be implemented further.
>> >
>> > >>>Enabling type units increases object size to make it easier to
>> deduplicate at link time by a DWARF-unaware
>> >
>> > >>>linker. With a DWARF aware linker it'd be generally desirable not to
>> have to add that object size overhead to
>> >
>> > >>>get the linking improvements.
>> >
>> > >>
>> >
>> > >>But, DWARFLinker should adequately work with type units since they are
>> already implemented.
>> >
>> >
>> > > Maybe - it'd be nice & all, but I don't think it's an outright
>> necessity - if someone knows they're using
>> > > a DWARF-aware linker, they'd probably not use type units in their
>> object files. It's possible someone
>> > > doesn't know for sure & maybe they have pre-canned debug object files
>> from someone else, etc.
>> >
>> > I see.
>> >
>> > >>Another thing is that the idea behind type units has the potential to
>> help Dwarf-aware linker to work faster.
>> >
>> > >>Currently, DWARFLinker analyzes context to understand whether types
>> are the same or not.
>> >
>> >
>> > >When you say "analyzes context" what do you mean? Usually I'd take that
>> to mean
>> > > "looks at things outside the type itself - like what namespace it's
>> in, etc" - which, yes,
>> > > it should do that, but it doesn't seem very expensive to do. But I
>> guess you actually
>> > > mean something about doing structural equivalence in some way, looking
>> at things inside the type?
>> >
>> > I think it could be useful for both cases. Currently, dsymutil does only
>> first thing
>> > (look at type name, namespace name, etc..) and does not do the second
>> thing
>> > (doing structural equivalence). Analyzing type names is currently quite
>> expensive
>> > (the only search in string pool takes ~10 sec from 70 sec of overall
>> time).
>> > That is expensive because of many things should be done to work with
>> strings:
>> > parse DWARF, search and resolve relocations, compute a hash for strings,
>> > put data into a string pool, create a fully qualified name(like
>> namespace::function::name).
>> > It looks like it could be optimized and finally require less time, but
>> it still would be a noticeable
>> > part of the overall time.
>> >
>> > If dsymutil starts to check for the structural equivalence, then the
>> process would be even more slowly.
>> > So, If instead of comparing types structure, there would be checked
>> single hash-id - then this process
>> > would also be faster.
>> >
>> > Thus I think using hash-id to compare types would allow to make current
>> implementation faster and would
>> > allow handling incomplete types by DWARFLinker without massive
>> performance degradation also.
>> >
>> > >> But the context is known when types are generated. So, no need to
>> spent the time analyzing it.
>> >
>> > >> If types could be compared without analyzing context, then
>> Dwarf-aware linker would work faster.
>> >
>> > >> That is just an idea(not for immediate implementation): If types
>> would be stored in some "type table"
>> >
>> > >> (instead of COMDAT section group) and could be accessed through
>> hash-id(like type units
>> >
>> > >> - then it would be the solution requiring fewer bits to store but
>> allowing to compare types
>> >
>> > >> by hash-id(not analysing context).
>> > >> In this case, size increasing would be small. And processing time
>> could be done faster.
>> > >>
>> > >> this is just an idea and could be discussed separately from the
>> problem of integrating of D74169.
>> >
>> > >> >> 6. -flto=thin
>> >
>> > >> >>    That problem was described in this review
>> https://reviews.llvm.org/D54747#1503720. It also exists in
>> >
>> > >> >> current DWARFLinker/dsymutil implementation. I think that problem
>> should be discussed more: it could
>> >
>> > >> >> probably be fixed by avoiding generation of such incomplete
>> declaration during thinlto,
>> >
>> > >> >> That would be costly to produce extra/redundant debug info in
>> ThinLTO - actually ThinLTO could be doing
>> >
>> > >> >> more to reduce that redundancy early on (actually removing
>> definitions from some llvm Modules if the type
>> >
>> > >> >> definition is known to exist in another Module, etc)
>> > >> >I don't know if it's a problem since that patch was reverted.
>> >
>> > >>
>> >
>> > >> Yes. That patch was reverted, but this patch(D74169) has the same
>> problem.
>> >
>> > >> if D74169 would be applied and --gc-debuginfo used then structure type
>> > >> definition would be removed.
>> >
>> > >> DWARFLinker could handle that case - "removing definitions from some
>> llvm Modules if the type
>> > >> definition is known to exist in another Module".
>> > >> i.e. DWARFLinker could replace the declaration with the definition.
>> >
>> > >> But that problem could be more easily resolved when debug info is
>> generated(probably without
>> > >> significant increase of debug info size):
>> >
>> > >> Here we have:
>> >
>> > >> DW_TAG_compile_unit(0x0000000b) - compile unit containing concrete
>> instance for function "f".
>> > >> DW_TAG_compile_unit(0x00000073) - compile unit containing abstract
>> instance root for function "f".
>> > >> DW_TAG_compile_unit(0x000000c1) - compile unit containing function
>> "f" definition.
>> >
>> > >> Code for function "f" was deleted. gc-debuginfo deletes compile unit
>> DW_TAG_compile_unit(0x000000c1)
>> > >> containing "f" definition (since there is no corresponding code). But
>> it has structure "Foo" definition
>> > >> DW_TAG_structure_type(0x0000011e) referenced from
>> DW_TAG_compile_unit(0x00000073)
>> > >> by declaration DW_TAG_structure_type(0x000000ae). That declaration is
>> exactly the case when definition
>> > >> was removed by thinlto and replaced with declaration.
>> >
>> > >> Would it cost too much if type definition would not be replaced with
>> declaration for "abstract instance root"?
>> > >> The number of concrete instances is bigger than number of abstract
>> instance roots.
>> > >> Probably, it would not be too costly to leave definition in abstract
>> instance root?
>> >
>> >
>> >
>> > >> Alternatively, Would it cost too much if type definition would not be
>> replaced with declaration when
>> > >> declaration references type from not used function? (lto could
>> understand that concrete function is not used).
>> >
>> >
>> > >I don't follow this example - could you provide a small concrete test
>> case I could reproduce?
>> >
>> > I would provide a test case if necessary. But it looks like this issue
>> is finally clear, and you already commented on that.
>> >
>> > > Oh, I guess this is happening perhaps because ThinLTO can't know for
>> sure that a standalone
>> > > definition of 'f' won't be needed - so it produces one in case one of
>> the inlining opportunities
>> > > doesn't end up inlining. Then it turns out all calls got inlined, so
>> the external definition wasn't needed.
>> >
>> > > Oh, you're suggesting that these 3 CUs got emitted into one object
>> file during LTO, but that DWARFLinker
>> > > drops a CU without any code in it - even though... So far as I know,
>> in LTO, LLVM directly references
>> > > types across units if the CUs are all emitted in the same object file.
>> (and if they weren't in the same
>> > > object file - then the abstract_origin couldn't be pointing cross-CU).
>> >
>> > > I guess some basic things to say:
>> >
>> > > With ThinLTO, the concrete/standalone function definition is emitted
>> in case some call sites don't end up
>> > > being inlined. So we know it'll be emitted (but might not be needed by
>> the actual linker)
>> > > ANy number of inline calls might exist - but we shouldn't put the type
>> information into those, because
>> > > they aren't guaranteed to emit it (if the inline function gets
>> optimized away, there would be nothing to
>> > > enforce the type being emitted) - and even if we forced the type
>> information to be emitted into one
>> > > object file that has an inline copy of the function - there's no
>> guarantee that object file will get linked in either.
>> >
>> > > So, no, I don't think there's much we can do to keep the size of
>> object files down, while guaranteeing
>> > > the type information will be emitted with the usual linker semantics.
>> >
>> > Then dsymutil/DWARFLinker could be changed to handle that(though it
>> would probably be not very efficient).
>> > If thinlto would understand that function is not used finally(and then
>> must not contain referenced type definition),
>> > then this situation could be handled more effectively.
>> >
>> > Thank you, Alexey.
>> >
>> >>>
>> >>>
>> >>>
>> >>>
>> >>> _______________________________________________
>> >>> LLVM Developers mailing list
>> >>> llvm-dev at lists.llvm.org
>> >>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>

>_______________________________________________
>LLVM Developers mailing list
>llvm-dev at lists.llvm.org
>https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev