[llvm-dev] [Debuginfo][DWARF][LLD] Remove obsolete debug info in lld.
David Blaikie via llvm-dev
llvm-dev at lists.llvm.org
Mon Jun 15 20:01:37 PDT 2020
Just a general point to this whole thread:
Linker-based DWARF redundancy/dead-DWARF elimination isn't really a
feature I/Google (in the parts of it I'm involved with) would use. We
use Split DWARF internally & mostly have issues with object size, not
so much with linked executable size - so on multiple fronts, this work
probably wouldn't be deployed in the parts of Google I work with.
& I'm not sure how much Alexey needs this - the original proposal to
remove dead DWARF was as a way to address the 0-as-a-valid-address
issue due to lack of a good tombstone. Now that we're moving forward
on the -2/-1 tombstone thing - I'm not sure if any of us (in this
community/thread) have a deeply pressing need to remove redundant/dead
DWARF any more than we did last week/month/etc.
That said, I do find it a fun/interesting topic & am enjoying playing
around with linker/object features & seeing what can be done here,
what the tradeoffs might be, etc - I just don't want to be misleading
about my level of investment here. I'm not sure about other folks, should
anyone end up fully prototyping this and finding the object size/linked
executable size tradeoffs worthwhile, etc. It'd be interesting to see.
On Tue, Jun 9, 2020 at 10:41 AM Fangrui Song <maskray at google.com> wrote:
>
> On 2020-06-08, David Blaikie via llvm-dev wrote:
> >On Thu, Jun 4, 2020 at 3:11 PM Robinson, Paul <paul.robinson at sony.com> wrote:
> >>
> >> + Ben Dunbobbin, whose name I take in vain below.
> >> He's my local expert on weird ELF features.
> >>
> >> > -----Original Message-----
> >> > From: David Blaikie <dblaikie at gmail.com>
> >> > Sent: Thursday, June 4, 2020 2:43 PM
> >> > To: Robinson, Paul <paul.robinson at sony.com>
> >> > Cc: jh7370.2008 at my.bristol.ac.uk; llvm-dev at lists.llvm.org
> >> > Subject: Re: [llvm-dev] [Debuginfo][DWARF][LLD] Remove obsolete debug info
> >> > in lld.
> >> >
> >> > On Thu, Jun 4, 2020 at 8:27 AM Robinson, Paul <paul.robinson at sony.com>
> >> > wrote:
> >> > >
> >> > >
> >> > >
> >> > > > -----Original Message-----
> >> > > > From: David Blaikie <dblaikie at gmail.com>
> >> > > > Sent: Wednesday, June 3, 2020 5:31 PM
> >> > > > To: Robinson, Paul <paul.robinson at sony.com>
> >> > > > Cc: jh7370.2008 at my.bristol.ac.uk; llvm-dev at lists.llvm.org
> >> > > > Subject: Re: [llvm-dev] [Debuginfo][DWARF][LLD] Remove obsolete debug
> >> > info
> >> > > > in lld.
> >> > > >
> >> > > > On Wed, Jun 3, 2020 at 6:34 AM Robinson, Paul <paul.robinson at sony.com>
> >> > > > wrote:
> >> > > > >
> >> > > > > DWARF was designed in an era when COMDAT and ICF were not a thing,
> >> > or at
> >> > > > least not common, certainly not when talking about function code. The
> >> > > > overhead of a unit occurred only once per translation unit, so that
> >> > > > expense was reasonably amortized.
> >> > > > >
> >> > > > >
> >> > > > >
> >> > > > > Splitting functions into their own object-file sections and making
> >> > them
> >> > > > excludable is an evolution of compiler/linker technology that DWARF
> >> > has
> >> > > > not kept up with. The linker-friendly solutions (COMDAT DWARF) would
> >> > put
> >> > > > function-related .debug_* contributions into a section-group along
> >> > with
> >> > > > the function .text itself; this multiplies the total number of
> >> > sections to
> >> > > > deal with, regardless of the tactics used for the content of each per-
> >> > > > function DWARF section. The fully DWARF-conformant solution would
> >> > create
> >> > > > one partial_unit per function, with the corresponding overhead of unit
> >> > > > headers (especially painful in the .debug_line section).
> >> > Alternatively we
> >> > > > fragment DWARF into sections without headers and rely on the linker to
> >> > > > make everything look right in the linked executable; this produces .o
> >> > > > files that are not DWARF conformant (unless we can standardize this in
> >> > > > DWARF v6) and would be a big hassle for consumers other than the
> >> > linker.
> >> > > >
> >> > > > "object files don't contain DWARF, but they contain stuff that the
> >> > > > linker will turn into DWARF" wouldn't seem like the worst thing to me
> >> > > > - what sort of pre-linking parsing of DWARF use cases do you have in
> >> > > > mind, other than for our own compiler development uses?
> >> > >
> >> > > No, that wouldn't seem like the worst thing. Obviously llvm-dwarfdump
> >> > > would want to be able to report what's actually happening, but indeed
> >> > > all the other use-cases that come to mind are not looking at .o files.
> >> > >
> >> > > > (notwithstanding in-object Split DWARF (where the .dwo sections would
> >> > > > have to remain usable without linking) or the MachO style debug
> >> > > > info distribution model which is similar)
> >> > >
> >> > > I expect Split DWARF would be incompatible with fragments. I don't
> >> > > know details about MachO but seems likely the same is true there.
> >> >
> >> > Yep, if they're sub-contribution regions, that wouldn't play well with
> >> > Split DWARF. (& full contribution isolation would have the DWARF header
> >> > overhead, etc)
> >> >
> >> > I'd still be concerned about the ELF header overhead even of this
> >> > sub-contribution scheme, but could be interesting to see how it plays
> >> > out in practice.
> >> >
> >> > All that said, to avoid burying the lede here, I'll splice something
> >> > from the end up here:
> >> >
> >> > > Although the point is not to avoid tombstone values, but to do a more
> >> > efficient job of editing the final DWARF to omit gc'd functions; it's no
> >> > problem at all to use a tombstone value in .debug_addr IMO.
> >> >
> >> > But the tombstone values are Alexey's underlying issue (this ongoing
> >> > design discussion for over a year now) & /sort/ of mine too recently
> >> > (which, unfortunately, is what's reinvigorated this discussion -
> >> > would've been nice if I/we/someone had identified this sooner &
> >> > could've helped Alexey in a more timely manner): Alexey is dealing
> >> > with a platform where 0 is a valid address so the lld/gold strategy of
> >> > resolving relocations to dead code to "0+addend" creates ambiguous
> >> > DWARF. I'm dealing with a case of zero-length functions ("int f1() {
> >> > }" or "void f2() { __builtin_unreachable(); }") causing early
> >> > termination of DWARFv4 range lists.
> >> >
> >> > The reason for the DWARF-aware linker proposal was because the "let's
> >> > choose a better tombstone" discussion didn't go anywhere & people sort
> >> > of encouraged in this direction of "what if we didn't need a
> >> > tombstone/the linker fixed up the debug info instead". So if the DWARF
> >> > redundancy elimination doesn't address the issue of zero as a valid
> >> > address, it doesn't address Alexey's needs, unfortunately. :/
> >>
> >> But, upthread we had a tombstone discussion IIRC, which seemed to converge
> >> on "-1 except .debug_loc/.debug_ranges use -2" didn't it? If we're still
> >> going on about having the linker rewriting DWARF, then the fragmenting
> >> idea is worth pursuing as an alternative to Alexey's current work.
> >>
> >> >
> >> > That said, I super appreciate the time you've put into writing this up
> >> > and it is valuable & I'd love to see some (even hand-crafted assembly)
> >> > prototypes, maybe do some back-of-the-envelope numbers to see whether
> >> > the ELF header overhead would be worth it, etc.
> >>
> >> It would be nice to verify that the section-fragment idea would produce
> >> something that looked usable. Hand-written assembly... would require
> >> research into how to specify the right section attributes, but would
> >> likely be less effort than trying to make LLVM do something plausible.
> >>
> >> I'll see about creating an internal task for this.
> >>
> >> >
> >> > > > But even then, I'm not sure how viable it would be - as Fangrui
> >> > > > pointed out on another thread about this: ELF section overhead itself
> >> > > > is non-trivial ("sizeof(Elf64_Shdr) = 64.") & it would probably be
> >> > > > rather difficult to reconstruct header-less slice-and-dicable sections
> >> > > > in some cases. For type information (a reduced overhead version of
> >> > > > -fdebug-types-section) I could see it - but for functions, they need
> >> > > > to refer to addresses - preferably in the debug_addr section, and
> >> > > > that's accessed by index, so taking chunks out of it would break other
> >> > > > references to it, etc... adding the header would be expensive, and how
> >> > > > would the CU construct its DW_AT_ranges value if that has to be sliced
> >> > > > and diced? Again, some amount of linker magic might solve some of
> >> > > > these problems - but I think there's still a lot of overhead to making
> >> > > > a solution that's workable with a DWARF-agnostic linker (or even with
> >> > > > a DWARF aware one, but in an efficient amount of time/space where it's
> >> > > > not only usable for small programs, or for linking when you're
> >> > > > shipping a final production binary, etc)
> >> > >
> >> > > The idea we have blue-skied internally would work something like this
> >> > > (initially explicated in terms of the .debug_info section, then seeing
> >> > > how that tactic applies to other sections):
> >> > >
> >> > > There's a top fragment, containing the CU header and the CU DIE itself.
> >> > > Linker magic makes this first in the output file.
> >> >
> >> > Quick curiosity: Is there existing linker magic for this? What does it
> >> > look like? I'd love to know so I can play around with hand crafted
> >> > prototypes/keep it in mind for such things.
> >>
> >> Ben Dunbobbin did research into this some time ago, under the auspices
> >> of a "COMDAT DWARF" investigation. He's part of Sony's linker team, and
> >> it was a discussion with that team where I became convinced that the
> >> fragmenting idea was feasible using existing defined ELF capabilities,
> >> although perhaps in ways nobody had really taken advantage of. It
> >> involved section groups and/or section ordering, but somebody much more
> >> familiar with ELF than I am would have to explain it. I've cc'd Ben.
> >>
> >> Regarding my discussion with our linker team:
> >> They asked me whether it was feasible to use sections to subset the
> >> DWARF, and I described the functional need (top & bottom fragments,
> >> arbitrary stuff in between) and they thought the ELF section-group
> >> and/or section-ordering features would be able to provide that.
> >>
> >> I'm not aware that anyone actually tried prototyping that. The work
> >> that James did (mentioned upthread) IIRC was using COMDAT and full
> >> units with unit headers. My fading memory suggests the discussion
> >> described just above was after that.
> >
> >Actually the thread Fangrui linked got me thinking & so I poked
> >around, according to <
> >https://docs.oracle.com/cd/E19683-01/816-1386/chapter6-94076/index.html>
> > it actually doesn't require as much magic as I thought it might:
> >
> >"In the absence of the sh_link ordering information, sections from a
> >single input file combined within one section of the output file will
> >be contiguous and have the same relative ordering as they did in the
> >input file. The contributions from multiple input files will appear in
> >link-line order."
>
> The official ELF specification (acknowledged by multiple parties,
> Linux, *BSD, HP-UX, Solaris, haiku, etc) is
> http://www.sco.com/developers/gabi/latest/contents.html We need to read
> the Solaris Linker and Libraries Guide with a grain of salt.
Thanks for the link!
> The special section indexes SHN_BEFORE and SHN_AFTER are currently
> Solaris specific and none of GNU ld, gold, LLD recognizes them.
> (If we find needs, we can consider them)
>
> The sh_link field of SHF_LINK_ORDER is currently used by !associated.
> I need to read more what we can do with the field. https://reviews.llvm.org/D72904
>
> Linkers do retain the input file order. AFAICT this is guaranteed by the
> ELF specification. In practice many linkers do this. (LLD has an option
> which changes this convention: --shuffle-sections=seed)
Great!
> >OK, so at the basic level, we could have a debug_info starting
> >section, some number of comdat debug_info sections, and the debug_info
> >tail section (the null terminator).
> >
> >Keeping types alive/cross-referencing them might be trickier (since
> >their liveness isn't just tied to an existing real comdat), but
> >do-able.
> >
> >Yeah, might try and work that up one of these days...
> >
> >Turns out I decided to do it now. Some of this relates to
> >threaded-later-than-this-email replies I've read by the time I'm
> >writing this email.
> >
> >Yep - for DIEs that don't need to be referenced (such as subprogram
> >DIEs - assuming they aren't the target of a call_site - and all global
> >variable DIEs (don't think there's any way to target them in LLVM's
> >current DWARF emission)) using comdats and relying on the linker's
> >guarantee (which is at least documented for some Unix linkers: "In the
> >absence of the sh_link ordering information, sections from a single
> >input file combined within one section of the output file will be
> >contiguous and have the same relative ordering as they did in the
> >input file. The contributions from multiple input files will appear in
> >link-line order." -
> >https://docs.oracle.com/cd/E19683-01/816-1386/chapter6-94076/index.html
> >"When not otherwise constrained, sections should be emitted in input
> >order." - https://web.stanford.edu/~ouster/cgi-bin/cs140-spring18/pintos/specs/sysv-abi-update.html/ch4.sheader.html
> >)
> >
> >So that works a treat, exactly as Paul suggested.
> >
> >Also works to modify the debug_ranges/rnglists section keeping list
> >fragments in the same comdat group too - though the object size
> >overhead there is probably higher than worthwhile, given the size of
> >the contributions versus the size of ELF sections/groups/etc. If
> >someone really wants to trim their ranges/rnglists down for size - I
> >think a linker feature would be suitable there, because the section is
> >small/format is simpler (hmm, except addrx forms - those would require
> >parsing CU DIEs to get the addr_base to know which addresses were
> >referenced by the range list, etc... :/).
> >
> >Doing this with types is a bit more difficult (yes, type units exist,
> >but I was wondering if we could avoid some of their overhead with
> >techniques like this) - I'm not sure how to make a hunk of .debug_info
> >that gets dropped if no symbol in it is referenced. (-gc-sections
> >didn't seem to activate on the hunk of .debug_info I tried... I guess
> >if that worked then all the debug_info would be dropped all the time -
> >so we'd need some way to specifically opt-in ). Attempts at using an
> >external symbol so type hunks could be deduplicated by comdat, with
> >cross-CU references resolved by symbol - but that doesn't seem to have
> >worked out (the final value used for the symbolic reference is not the
> >desired value... I'm probably holding it wrong).
>
> In general, --gc-sections retains a non-SHF_ALLOC section
> if it is not associated to another SHF_ALLOC section (via SHF_LINK_ORDER,
> SHT_RELA, or a section group).
Ah, well, that presents a workaround: an empty .text comdat group to
associate with the DWARF type fragment...
> If we want --gc-sections to be effectful on fragmented .debug_*, we'll
> need to carefully construct section references (via relocations).
What sort of relocations did you have in mind? I have managed to use
the above (empty .text comdat to associate with DWARF descriptions of
types - then using a relocation to a symbol in the .debug_info type
fragment from the .debug_info function fragment (using a function's
parameter type as the test case here)) - this does the right thing
(dropping the type if all the function fragments that refer to it are
dropped by -gc-sections, for instance).
Looks something roughly like:
.section .text,"axG",@progbits,_Z2f13foo,comdat,unique,1
...
.section .text,"axG",@progbits,_Z2f23foo,comdat,unique,1
...
.section .text,"axG",@progbits,_Z3foo,comdat,unique,1
# empty .text comdat to make the .debug_info type comdat droppable
.section .debug_info,"",@progbits,unique,1
# DWARF header/CU DIE/etc
.section .debug_info,"G",@progbits,_Z3foo,comdat,unique,1
# label for type DIE - we'd need a more advanced mangling scheme for
# this to ensure it doesn't overlap with C++, etc
_Z3foo:
# type DIEs
.section .debug_info,"G",@progbits,_Z2f13foo,comdat,unique,1
# DWARF fragment for 'f1'
...
.long _Z3foo # DW_AT_type
...
# similar fragment for f2
.section .debug_info,"",@progbits,unique,2
# DWARF footer
.byte 0 # End Of Children Mark
.Ldebug_info_end0:
But I have some trouble making that _Z3foo relocation do what I
want across multiple files. So if two files have code like the above -
the _Z3foo comdat is picked from one of them (if one of the ".long
_Z3foo" is retained from that file), and the ".long _Z3foo" references
are resolved correctly to refer to the offset in the .debug_info
_Z3foo comdat group. But the ".long _Z3foo" references from the other
input file are resolved to zero. Any chance of making that do what I
want?
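For anyone who wants to reason about the layout without a linker at hand,
here is a rough toy model of the behaviour described above: same-named
sections are concatenated in input order, only the first copy of each
comdat group is kept, and groups marked dead are dropped. Everything in it
(the Section class, link(), make_object(), the one-byte "payloads") is
made up for illustration. It is not how lld or any other linker is
implemented, and it does not model relocations or reachability at all.

```python
# Toy model only: hypothetical names, no relocations, no real ELF parsing.
from dataclasses import dataclass
from typing import Optional


@dataclass
class Section:
    name: str              # output section name, e.g. ".debug_info"
    data: bytes            # section contents (placeholder bytes here)
    group: Optional[str]   # comdat group signature, or None if ungrouped


def link(objects, dead_groups=frozenset()):
    """Concatenate same-named sections in input (link-line) order.

    Keeps only the first object's copy of each comdat group and drops
    every group listed in dead_groups (standing in for --gc-sections).
    """
    seen = set()
    out = {}
    for obj in objects:
        kept_here = set()
        for sec in obj:
            if sec.group is not None:
                if sec.group in dead_groups or sec.group in seen:
                    continue
                kept_here.add(sec.group)
            out.setdefault(sec.name, bytearray()).extend(sec.data)
        seen |= kept_here
    return {name: bytes(data) for name, data in out.items()}


def make_object(tag):
    """One object file shaped like the assembly sketch above."""
    return [
        Section(".debug_info", b"H" + tag, None),          # header / CU DIE
        Section(".debug_info", b"T", "_Z3foo"),            # type fragment
        Section(".debug_info", b"1" + tag, "_Z2f13foo"),   # fragment for f1
        Section(".debug_info", b"0", None),                # footer (null DIE)
    ]


objs = [make_object(b"a"), make_object(b"b")]
print(link(objs)[".debug_info"])                             # b'HaT1a0Hb0'
print(link(objs, dead_groups={"_Z2f13foo"})[".debug_info"])  # b'HaT0Hb0'
```

Note that the second object's contribution ends up as just header plus
footer: its copies of the type and f1 fragments are discarded, which is
exactly the situation where its ".long _Z3foo" would have to point into
the first object's contribution.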
(honestly, this is probably all too high object overhead for some
users - I mean, it doesn't apply to my/Google's use case at all, given
Split DWARF - but maybe other users would be happy with a
linker-agnostic small-linked-DWARF result even at the cost of
significant object size)
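A back-of-the-envelope way to put a number on that overhead, as a sketch
only: it uses the sizeof(Elf64_Shdr) = 64 figure quoted further down the
thread and Paul's worst-case count of five per-function fragments. The
function count is an assumption picked for illustration, and the estimate
ignores group sections, section name strings, symbols and relocations, so
the real cost would be higher.

```python
ELF64_SHDR_SIZE = 64         # sizeof(Elf64_Shdr), as quoted in the thread
FRAGMENTS_PER_FUNCTION = 5   # .debug_info, .debug_line, .debug_rnglists,
                             # .debug_loclists, .debug_aranges


def section_header_overhead(num_functions, fragments=FRAGMENTS_PER_FUNCTION):
    """Bytes of section headers alone added by per-function fragments."""
    return num_functions * fragments * ELF64_SHDR_SIZE


# Hypothetical object file with 1000 functions.
print(section_header_overhead(1000))  # 320000
```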
> Lookup table style sections (.debug_addr .debug_str_offsets) will
> definitely be difficult to merge. SHF_MERGE (constant merging) is an
> optional ELF feature which most linkers implement, but it does not allow
> section headers/footers (these DWARF v5 sections all have a header) or
> varying entry sizes.
Yeah, I think they're more or less a lost cause because of the
indexing in them. I don't have any great ideas for them. Generally you
want a DWARF-aware link of those sections to create a single more
efficient lookup table, so that can take into account dropped code,
etc, potentially.
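To make the indexing problem concrete, here's a minimal sketch of what
compacting an index-addressed table involves. The function name and the
list/set representation are hypothetical; a real implementation would
have to parse the table header and find which indexes are live by walking
the DIEs that use addrx-style forms.

```python
def compact_addr_table(addresses, live):
    """Drop dead entries from an indexed address table.

    addresses: list of addresses, indexed like a .debug_addr-style table
    live: set of old indexes still referenced after dead code is removed
    Returns (new_table, remap), where remap maps old index -> new index.
    """
    new_table = []
    remap = {}
    for old_index, addr in enumerate(addresses):
        if old_index in live:
            remap[old_index] = len(new_table)
            new_table.append(addr)
    return new_table, remap


table, remap = compact_addr_table([0x1000, 0x2000, 0x3000, 0x4000], {0, 2, 3})
print([hex(a) for a in table])  # ['0x1000', '0x3000', '0x4000']
print(remap)                    # {0: 0, 2: 1, 3: 2}
```

The table itself is the easy part. Every reference that used an old index
then has to be rewritten through remap, and finding those references is
what needs the DWARF awareness.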
- Dave
> SHF_MERGE
> The data in the section may be merged to eliminate duplication. Unless
> the SHF_STRINGS flag is also set, the data elements in the section are
> of a uniform size. The size of each element is specified in the section
> header's sh_entsize field. If the SHF_STRINGS flag is also set, the data
> elements consist of null-terminated character strings. The size of each
> character is specified in the section header's sh_entsize field.
>
> Since SHF_MERGE isn't usable, performing any constant merging requires
> some DWARF awareness :/
>
> >
> >
> >> > (basically the ability for an object file to say "here's the start and
> >> > end of my contribution to this section, and some bits that /can/ go in
> >> > the middle, but you can drop them if you like")
> >> >
> >> > > Types also go here; certainly base types, and other file-scope types
> >> > > can be included here or put into type units. (Type units aren't
> >> > > fragmented, they are their own thing same as always.)
> >> >
> >> > Separately, it might be worth considering putting types in such a
> >> > thing - but, yes, the "How do you reference them when they might be in
> >> > your unit or someone else's unit", etc, would have to be figured out.
> >> > I guess using an external symbol might be the solution there - again,
> >> > with a better understanding of the ^ mentioned linker magic, I'd
> >> > probably play around with hand crafting some examples just to see how
> >> > this could work.
> >> >
> >> > > There's a matching bottom fragment, which is just the terminating NULL
> >> > > for the CU DIE; linker magic makes this last in the output file.
> >> >
> >> > Last of all the contributions from this object file, not last in the
> >> > whole output file, right? (please excuse the pedantry, just double
> >> > checking)
> >>
> >> The object file would (loosely speaking) have a ".debug_info.first",
> >> some number of ".debug_info.excludable-middle", and a ".debug_info.last"
> >> which would all be glommed together in first-middle-last order in the
> >> output .debug_info section. I believe I was told that this would be
> >> per-object-file, otherwise yeah it wouldn't work at all.
> >>
> >> This is why we need input from somebody who actually knows ELF.
> >>
> >> >
> >> > > Each function has its own fragment, which is in the same link-group
> >> > > (COMDAT or whatever) as the function's .text section; that way, if the
> >> > > function is discarded, so is the .debug_info fragment. Offhand I can't
> >> > > think of any cases (other than DW_AT_specification, addressed below) of
> >> > > references to a subprogram DIE from elsewhere,
> >> >
> >> > The call_site DWARF would want to refer to a subprogram DIE, but that
> >> > could be handled by (first pass) having a declaration subprogram in
> >> > the initial fragment that the call_site could refer to using the usual
> >> > assembler-resolved CU-relative offset. Of course that'd mean a bunch
> >> > of (probably the bigger part) of the function's DWARF footprint
> >> > wouldn't be deduplicated, but would address this part of the address
> >> > tombstone issue (if not using debug_addr) & reduce some of the DWARF -
> >> > the addresses are pretty big (if you're not pooling them), etc.
> >>
> >> Ah, forgot about call_site. Yeah referring to a declaration should work.
> >>
> >> >
> >> > > so it should be fine to
> >> > > discard the entire function fragment as needed. Linker magic puts all
> >> > > function fragments between the top and bottom fragments, in some
> >> > > indeterminate order. Each function fragment is the usual complete
> >> > > subtree, rooted in DW_TAG_subprogram.
> >> >
> >> > Rooted at the top level (well, below the DW_TAG_compile_unit) DIE, as
> >> > you mention later - namespace, or whatever else.
> >>
> >> Right, each fragment would be a complete subtree that would ordinarily
> >> be a direct child of DW_TAG_compile_unit. With whatever DIE it needed.
> >>
> >> >
> >> > > References to types are either
> >> > > to type units as normal, or to types in the top fragment. Note that
> >> > > these references do not require relocations; type units are by signature
> >> > > as always, and for types in the top fragment, the offsets into the top
> >> > > fragment are known at compile time.
> >> > >
> >> > > Inlined functions are described as part of the function they have been
> >> > > inlined into, being children of the function DIE. DW_AT_specification
> >> > > refers to the abstract declaration which is in its own fragment (or the
> >> > > top fragment, but that keeps the declaration from being elided if all
> >> > > references go away).
> >> >
> >> > Yep, this overlaps with the call_site stuff I mentioned earlier - same
> >> > ideas. Either top fragment, or its own fragment. Keeping its own
> >> > fragment alive, and figuring out how to reference it (depending on
> >> > fragment layout/elision) would require some work, but I think it's
> >> > do-able. Might even be do-able so it can be deduplicated across CUs
> >> > (use a sec_offset form, use a linker-resolved relocation to it) - this
> >> > infrastructure would overlap with type deduplication without type
> >> > units too.
> >> >
> >> > Though linker resolved relocations add more bytes...
> >> >
> >> > > If functions are inside namespaces, each function fragment will need
> >> > > to have namespace DIEs around the function DIE. This adds overhead
> >> > > but it's pretty small.
> >> > >
> >> > > I hand-wave filling in the CU header's unit length. I'd expect a
> >> > > relocation with a reference to the bottom fragment should be able to
> >> > > compute the correct value.
> >> >
> >> > *nod*
> >> >
> >> > > That's the story for .debug_info; what about other sections?
> >> > >
> >> > > Sections referenced by index from .debug_info can't be fragmented;
> >> > > this would be: .debug_abbrev, .debug_addr, .debug_str_offsets.
> >> > >
> >> > > .debug_str doesn't need to be fragmented, linkers DTRT already.
> >> >
> >> > (linkers deduplicate debug_str - but can they be made to remove
> >> > unreferenced strings too? in that case we'd have an interesting
> >> > tradeoff of maybe using FORM_strp rather than strx - if we wanted the
> >> > linker to be able to drop strings from dropped function definitions,
> >> > etc)
> >>
> >> Future refinements are quite possible!
> >>
> >> >
> >> > > .debug_macro contents are not tied to functions and won't be fragmented.
> >> > >
> >> > > .debug_loclists and .debug_rnglists should be fragmentable the same
> >> > > way as .debug_info; they exist only as extensions of .debug_info, and
> >> > > the range list for the CU itself is merely a concatenated set of
> >> > > contributions from each constituent function, so that should Just Work
> >> > > (although it won't be optimal, adjacent ranges won't be coalesced).
> >> >
> >> > At least the way we currently emit loclists and rnglists is by using
> >> > an index (the header of loclists and rnglists has an index to offset
> >> > mapping) - like strx, this would make it hard/impossible for a
> >> > DWARF-agnostic linker to see through to find out which indexes were
> >> > actually used. We could potentially not use the loclistx/rnglistx
> >> > forms/indexes from fragments - instead using sec_offsets that would
> >> > make them relocatable/removable/etc. (so long as all the index-based
> >> > referenced lists came in the debug_loclist/debug_rnglist header
> >> > fragment)
> >>
> >> Ah, I hadn't looked at how we do those lists. But sounds solvable.
> >>
> >> >
> >> > > I believe the same is true for .debug_loc and .debug_ranges, although
> >> > > I haven't checked.
> >> >
> >> > Yep, those ones are easier - there's no contribution header, they can
> >> > only be referenced via sec_offset, so slicing and dicing them is
> >> > cheap.
> >> >
> >> > But the tombstone problem still exists for the CU's debug_ranges -
> >> > though /maybe/ it could be carefully constructed from fragments...
> >> > that's going to be a /lot/ of sections in the end though.
> >> >
> >> > > .debug_aranges is functionally equivalent to the CU rangelist.
> >> >
> >> > Yup. (as we've touched on before, we don't use aranges at Google -
> >> > instead relying on CU's ranges which are just a little more expensive
> >> > to retrieve - but no need to duplicate the data in both places - if
> >> > consumers really find the aranges worthwhile to avoid parsing a few
> >> > attributes on the CU DIE, perhaps a future spec could let
> >> > debug_aranges reference a range list? so that aranges and the CU could
> >> > share the same data?)
> >> >
> >> > > .debug_line can work the same way as .debug_info but is worth a word.
> >> > > The top fragment has the header, including the directory/file lists
> >> > > because those are referenced by index. DW_LNE_define_file can't be
> >> > > used. Each function has a fragment containing the sequence for that
> >> > > function, starting with set_address and ending with end_sequence.
> >> > > The bottom fragment is empty, existing only to allow the length to
> >> > > be computed.
> >> >
> >> > Yep - can't remove dead file and directory names, unfortunately - and
> >> > the line table's pretty compact, so not sure it'd be a great savings
> >> > (especially compared to the ELF section overhead - at the object file
> >> > size at least (though probably a small win for linked executable
> >> > size)). Chances are those strings (now in debug_line_str) would be
> >> > used /somewhere/ in the program, so linker string deduplication would
> >> > get most of the wins - just dead offset entries in the line table
> >> > header.
> >>
> >> Sony does squeeze out the sequences for dead functions; I think it's
> >> not a huge win, in terms of total debug info size, but the .debug_line
> >> section does not let you skip dead sequences; you still have to parse
> >> the whole thing. Our debugger guys were pleased at not having to
> >> spend time doing something that useless. (Yeah it does mean the
> >> linker has to parse the whole .debug_line section; but our theory is
> >> that you probably run the debugger more than you run the linker, and
> >> in any case you do it interactively, so debugger load time is probably
> >> more annoying than some fractional increase in build/link time.)
> >>
> >> The dir/file tables can't be squeezed, but one expects it's not a
> >> huge cost with .debug_line_str having lots of deduplication
> >> opportunities.
> >>
> >> >
> >> > > .debug_line_str is a string section and requires nothing special.
> >> > >
> >> > > .debug_names ... haven't looked at it but I suspect either it doesn't
> >> > > survive or it has to be generated post-link (or by the linker).
> >> >
> >> > Generally you're going to want a DWARF-aware linker for debug_names,
> >> > same as gdb-index, etc.
> >> >
> >> > > .debug_frame I *think* can be fragmented, but I haven't taken the
> >> > > time to look at it to make sure.
> >> > >
> >> > > Those are all the sections I see in DWARF v5 Appendix B.
> >> > >
> >> > > So that's the blue-sky vision of linker-magic COMDAT DWARF, which
> >> > > took me about an hour to write down just now. There is certainly
> >> > > a non-trivial overhead in terms of ELF sections; in the general
> >> > > case we would have 5 per-function fragments (for .debug_info,
> >> > > .debug_line, .debug_rnglists, .debug_loclists, .debug_aranges).
> >> > >
> >> > > Not small, but then other features in the works are using huge
> >> > > quantities of ELF sections too (section-per-basic-block).
> >> >
> >> > That work's being scoped to be fairly selective about which basic
> >> > blocks it puts in unique sections - just those that are especially
> >> > performance sensitive, so the cost isn't as high as you might
> >> > otherwise imagine. Adding 5 new sections per function would be
> >> > probably a significantly larger growth than anything else I'm aware
> >> > of, but I haven't run the numbers by any means.
> >>
> >> Doing it for *every* function would be the worst case, for when
> >> you're trying to squeeze everything (gc + icf). We could likely
> >> get wins if we did it just for the functions that today end up in
> >> a COMDAT section (inline functions, template instantiations) which
> >> previous research has found to be pretty significant (and major
> >> motivation for the Program Repository work that we've previously
> >> described at a Dev Meeting, https://llvm.org/devmtg/2016-11/#talk22)
> >>
> >> >
> >> > Thanks again for the write up!
> >>
> >> NP, it was fun to trot out this stuff.
> >> --paulr
> >>
> >> >
> >> > - Dave
> >> >
> >> > > > & as always, not sure how any of this would work for Split DWARF -
> >> > > > just a debug_addr section that has some addresses that point to
> >> > > > discardable functions... if we want those addresses themselves to be
> >> > > > discardable (so we don't have to use a tombstone value inserted by the
> >> > > > linker) then they'd need to be in separate debug_addr contributions
> >> > > > with headers, etc - the overhead just seems too high to me in all the
> >> > > > ways I can look at that.
> >> > >
> >> > > Yeah I think .dwo sections can't take advantage of fragmenting, and
> >> > > .debug_addr is referenced by index so it can't be fragmented. Although
> >> > > the point is not to avoid tombstone values, but to do a more efficient
> >> > > job of editing the final DWARF to omit gc'd functions; it's no problem
> >> > > at all to use a tombstone value in .debug_addr IMO.
> >> > > --paulr
> >> > >
> >> > > >
> >> > > > >
> >> > > > >
> >> > > > >
> >> > > > > Or we pay the cost of parsing, trimming, and rewriting all the DWARF
> >> > in
> >> > > > the linker.
> >> > > > >
> >> > > > > --paulr
> >> > > > >
> >> > > > >
> >> > > > >
> >> > > > > From: llvm-dev <llvm-dev-bounces at lists.llvm.org> On Behalf Of James
> >> > > > Henderson via llvm-dev
> >> > > > > Sent: Wednesday, June 3, 2020 3:48 AM
> >> > > > > To: David Blaikie <dblaikie at gmail.com>
> >> > > > > Cc: llvm-dev at lists.llvm.org
> >> > > > > Subject: Re: [llvm-dev] [Debuginfo][DWARF][LLD] Remove obsolete
> >> > debug
> >> > > > info in lld.
> >> > > > >
> >> > > > >
> >> > > > >
> >> > > > > It makes me sad that the linker (via a library or otherwise) has to
> >> > be
> >> > > > "DWARF-aware" to be able to effectively handle --gc-sections, COMDATs,
> >> > --
> >> > > > icf etc for debug info, without leaving large blocks of data kicking
> >> > > > around.
> >> > > > >
> >> > > > >
> >> > > > >
> >> > > > > The patching to -1 (or equivalent) is probably a good lightweight
> >> > > > solution (though I'd love it if it could be done based on section type
> >> > in
> >> > > > the future rather than section name, but that's probably outside the
> >> > realm
> >> > > > of DWARF), as it requires only minimal understanding in the linker,
> >> > but
> >> > > > anything beyond that seems to be complicated logic that is mostly due
> >> > to
> >> > > > the structure of DWARF. Patching to -1 does feel a bit like a sticking
> >> > > > plaster/band aid to patch over the issue rather than properly solving
> >> > it
> >> > > > too - there will still be debug data (potentially significant amounts
> >> > in
> >> > > > COMDAT-heavy objects) that the linker has to write and the debugger
> >> > has to
> >> > > > somehow know how to skip (even if it knows that -1 is special-case due
> >> > to
> >> > > > the standard being updated, it needs to get as far as the -1), which
> >> > is
> >> > > > all wasted effort.
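
A minimal sketch of the consumer side of the "patch to -1" approach, assuming the tombstone is the all-ones value for the unit's address size. The names `tombstone` and `skip_dead` and the (address, payload) entry shape are hypothetical, not taken from lld or any debugger:

```python
from typing import List, Tuple


def tombstone(address_size: int) -> int:
    """All-ones value for the given address size in bytes, i.e. -1 as unsigned."""
    return (1 << (8 * address_size)) - 1


def skip_dead(entries: List[Tuple[int, str]], address_size: int) -> List[Tuple[int, str]]:
    """Keep only entries whose address was not patched to the tombstone.

    Every entry is still visited, which is the cost described above: the
    dead data remains in the output and the reader has to get as far as
    the -1 before it can skip the entry.
    """
    dead = tombstone(address_size)
    return [(addr, payload) for addr, payload in entries if addr != dead]
```

For example, with 4-byte addresses an entry at 0xFFFFFFFF is dropped while an entry at 0x1000 is kept.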
> >> > > > >
> >> > > > >
> >> > > > >
> >> > > > > We've already seen from Alexey's prototyping, and from our own
> >> > > > experiences with the Sony proprietary linker (which tried to rewrite
> >> > > > .debug_line only) that deconstructing the DWARF so that it can be more
> >> > > > optimally reassembled at link time is slow going, and will probably
> >> > > > inevitably be however much effort is put into optimising it. For a
> >> > start,
> >> > > > given the current standards, it's impossible to know how to
> >> > deconstruct it
> >> > > > without having to parse vast amounts of DWARF, which is typically
> >> > going to
> >> > > > mean a lot more parsing work than the linker would normally have to
> >> > deal
> >> > > > with. Additionally, much of this parsing work is wasted effort, since
> >> > it
> >> > > > seems unlikely in many links that large amounts of the DWARF will be
> >> > > > redundant. Having an option to opt-in doesn't help much there, since
> >> > it
> >> > > > just means the logic exists without most people using it, due to it
> >> > not
> >> > > > being good enough, or potentially they don't even know it exists.
> >> > > > >
> >> > > > >
> >> > > > >
> >> > > > > I don't have particularly concrete suggestions as to how to solve
> >> > the
> >> > > > structural problems with DWARF at this point. The only thing that
> >> > seems
> >> > > > obvious to me is a more "blessed" approach to fragmentation of
> >> > sections,
> >> > > > similar to what I tried with my prototype mentioned earlier in the
> >> > thread,
> >> > > > although we'd need to figure out the previously stated performance
> >> > issues.
> >> > > > Other ideas might tie into this, like somehow sharing the various
> >> > table
> >> > > > headers a bit like CIEs in .eh_frame that could be merged by the
> >> > linker -
> >> > > > each object could have separate table header sections, which are
> >> > > > referenced by the individual .debug_* blocks, which in turn are one
> >> > per
> >> > > > function/data piece and easily discardable/merged by the linker.
> >> > > > >
> >> > > > >
> >> > > > >
> >> > > > > Just some thoughts.
> >> > > > >
> >> > > > >
> >> > > > >
> >> > > > > James
> >> > > > >
> >> > > > >
> >> > > > >
> >> > > > > On Tue, 2 Jun 2020 at 19:24, David Blaikie via llvm-dev <llvm-
> >> > > > dev at lists.llvm.org> wrote:
> >> > > > >
> >> > > > > On Tue, May 19, 2020 at 7:17 AM Alexey Lapshin
> >> > > > > <alapshin at accesssoftek.com> wrote:
> >> > > > > >
> >> > > > > > Hi David, please find my comments inside:
> >> > > > > >
> >> > > > > >
> >> > > > > > >>>Broad question: Do you have any specific motivation/users/etc
> >> > in
> >> > > > implementing this (if you can speak about it)?
> >> > > > > >
> >> > > > > > >>> - it might help motivate the work, understand what tradeoffs
> >> > might
> >> > > > be suitable for you/your users, etc.
> >> > > > > >
> >> > > > > > >>There are two general requirements:
> >> > > > > > >> 1) Remove (or clean) invalid debug info.
> >> > > > > >
> >> > > > > > >
> >> > > > > > >Perhaps a simpler direct solution for your immediate needs might
> >> > be a
> >> > > > much narrower,
> >> > > > > > >and more efficient linker-DWARF-awareness feature:
> >> > > > > > >
> >> > > > > > > With DWARFv5, rnglists present an opportunity for a DWARF linker
> >> > to
> >> > > > rewrite the ranges
> >> > > > > > > without parsing the rest of the DWARF. /technically/ this isn't
> >> > > > guaranteed - rnglist entries
> >> > > > > > > can be referenced either directly, or by index. If all rnglists
> >> > are
> >> > > > referenced by index, then
> >> > > > > > > a linker could parse only the debug_rnglists section and rewrite
> >> > > > ranges to remove any
> >> > > > > > > address ranges that refer to optimized-out code.
> >> > > > > > >
> >> > > > > > > This would only be correct for rnglists that had no direct
> >> > > > references to them (that only were
> >> > > > > > > referenced via the indexes) - but we could either implement it
> >> > with
> >> > > > that assumption, or could
> >> > > > > > > add an LLVM extension attribute on the CU that would say "I
> >> > promise
> >> > > > I only referenced rnglists
> >> > > > > > > via rnglistx forms/indexes"). If this DWARF-aware linking would
> >> > have
> >> > > > to read the CU DIE (not
> >> > > > > > > all the other DIEs) it /could/ also then rewrite high/low_pc if
> >> > the
> >> > > > CU wasn't using ranges...
> >> > > > > > > but that wouldn't come up in the function-removal case, because
> >> > then
> >> > > > you'd have ranges anyway,
> >> > > > > > > so no need for that.
> >> > > > > > >
> >> > > > > > > Such a DWARF-aware rnglist linking could also simplify rnglists,
> >> > in
> >> > > > cases where functions
> >> > > > > > > ended up being laid out next to each other, the linker could
> >> > > > coalesce their ranges together.
> >> > > > > > >
> >> > > > > > > I imagine this could be implemented with very little overhead to
> >> > > > linking, especially compared
> >> > > > > > > to the overhead of full DWARF-aware linking.
> >> > > > > > >
> >> > > > > > >Though none of this fixes Split DWARF, where the linker doesn't
> >> > get a
> >> > > > chance to see the
> >> > > > > > > addresses being used - but if you only want/need the CU-level
> >> > ranges
> >> > > > to be correct, this
> >> > > > > > > might be a viable fix, and quite efficient.
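
A minimal sketch of the rnglist rewriting idea, assuming the ranges have already been decoded into half-open (low, high) pairs and that the linker knows which of them refer to discarded code. The name `rewrite_rnglist` and the `live` flags are hypothetical. This models the idea only: it does not parse real DW_RLE_* encodings.

```python
from typing import List, Tuple

Range = Tuple[int, int]  # half-open [low, high)


def rewrite_rnglist(ranges: List[Range], live: List[bool]) -> List[Range]:
    """Drop ranges for discarded code, then coalesce ranges that touch or overlap."""
    kept = sorted(
        r for r, is_live in zip(ranges, live) if is_live and r[0] < r[1]
    )
    merged: List[Range] = []
    for low, high in kept:
        if merged and low <= merged[-1][1]:
            # Adjacent or overlapping with the previous range: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], high))
        else:
            merged.append((low, high))
    return merged
```

Two functions laid out back to back, such as (0x1000, 0x1010) and (0x1010, 0x1040), collapse into a single entry, and a range marked as not live disappears from the list.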
> >> > > > > >
> >> > > > > > Yes, we think about that alternative. This would resolve our
> >> > problem
> >> > > > of invalid debug info
> >> > > > > > and would work much faster. Thus, if we would not have good
> >> > results
> >> > > > for D74169 then we
> >> > > > > > will implement it. Do you think it could be useful to have this
> >> > > > solution in upstream?
> >> > > > >
> >> > > > > A pure rnglist rewriting - I think it'd be OK to have in upstream -
> >> > > > > again, cost/benefit/etc would have to be weighed. I'm not sure it
> >> > > > > would save enough space to be particularly valuable beyond the
> >> > > > > correctness issue - and it doesn't completely solve the correctness
> >> > > > > issue for zero-address usage or low-address usage (because you could
> >> > > > > still have overlapping subprograms inside a CU - so if you were
> >> > > > > symbolizing you could use the correct rnglist to filter, but then go
> >> > > > > look inside the CU only to find two subprograms that had that
> >> > address
> >> > > > > & not know which one was the correct one and which one was the
> >> > > > > discarded one).
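
A small sketch of the ambiguity described above, assuming a target where .text really starts at 0 and a discarded function whose address was resolved to 0. The function names and address values are made up for illustration:

```python
from typing import Dict, List, Tuple


def subprograms_at(addr: int, subprograms: Dict[str, Tuple[int, int]]) -> List[str]:
    """Return the names of all subprograms whose [low, high) range covers addr."""
    return sorted(
        name for name, (low, high) in subprograms.items() if low <= addr < high
    )


# Hypothetical CU contents after linking.
subprograms = {
    "live_fn": (0x0, 0x40),  # genuinely placed at address 0
    "dead_fn": (0x0, 0x30),  # discarded; its address was resolved to 0
}
```

Even if the CU-level rnglist is correct, looking up 0x10 inside the CU matches both `dead_fn` and `live_fn`, and nothing in the ranges says which one is real.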
> >> > > > >
> >> > > > > rnglist rewriting might be easy enough to prototype - but depends
> >> > what
> >> > > > > you want to spend your time on, I know this whole issue has been a
> >> > > > > huge investment of your time already - but maybe this recent
> >> > > > > revitalization of the conversation around having an explicit value
> >> > in
> >> > > > > the linker might be sufficient to address everyone's needs...
> >> > *fingers
> >> > > > > crossed*)
> >> > > > >
> >> > > > >
> >> > > > > > >> 2) Optimize the DWARF size.
> >> > > > > >
> >> > > > > >
> >> > > > > > > Do your users care much about this? I imagine if they had
> >> > > > significant DWARF size issues,
> >> > > > > > > they'd have significant link time issues and the kind of cost to
> >> > > > link time this feature has would
> >> > > > > > > be prohibitive - but perhaps they're sharing linked binaries
> >> > much
> >> > > > more often than they're
> >> > > > > > > actually performing linking.
> >> > > > > >
> >> > > > > > Yes, they do. They also have significant link-time issues.
> >> > > > > > So current performance results of D74169 are not very acceptable.
> >> > > > > > We hope to improve it.
> >> > > > > >
> >> > > > > >
> >> > > > > >
> >> > > > > > >>The specifics which our users have:
> >> > > > > > >> - embedded platform which uses 0 as start of .text section.
> >> > > > > > >> - custom toolset which does not support all features yet(f.e.
> >> > > > split dwarf).
> >> > > > > > >> - tolerant of the link-time increase.
> >> > > > > > >> - need a useful way to share debug builds.
> >> > > > > >
> >> > > > > >
> >> > > > > > > Sharing two files (executable and dwp) is significantly less
> >> > useful
> >> > > > than sharing one file?
> >> > > > > >
> >> > > > > > Probably not significantly, but yes, it looks less useful
> >> > compared to
> >> > > > D74169.
> >> > > > > > Having only two files (executable and .dwp) looks significantly
> >> > better
> >> > > > than having executable and multiple .dwo files.
> >> > > > > > Having only one file(executable) with minimal size looks better
> >> > than
> >> > > > the two files with a bigger size.
> >> > > > > >
> >> > > > > > clang compiled with -gsplit-dwarf takes 0.9G for executable and
> >> > 0.9G
> >> > > > for .dwp.
> >> > > > > > clang compiled with -gc-debuginfo takes only 0.76G for single
> >> > > > executable.
> >> > > > > >
> >> > > > > >
> >> > > > > >
> >> > > > > > >>For the first point: we have a problem "Overlapping address
> >> > ranges
> >> > > > starting from 0"(D59553).
> >> > > > > >
> >> > > > > > >>We use custom solution, but the general solution like D74169
> >> > would
> >> > > > be better here.
> >> > > > > >
> >> > > > > >
> >> > > > > > > If CU ranges are the only ones that need fixing, then I think
> >> > the
> >> > > > above solution might be as
> >> > > > > > > good/better - if more than CU ranges need fixing, then I think
> >> > we
> >> > > > might want to start talking about
> >> > > > > > > how to fix DWARF itself (split and non-split) to signal certain
> >> > > > addresses point to dead code with a
> >> > > > > > > specific blessed value that linkers would need to implement -
> >> > > > because with Split DWARF there's
> >> > > > > > > no way to solve the non-CU addresses at the linker.
> >> > > > > >
> >> > > > > > I think the worthwhile solution for that signal value would be
> >> > > > > > LowPC > HighPC.
> >> > > > > > That does not require additional bits in DWARF.
> >> > > > > > It would be natural to skip such address ranges since they
> >> > explicitly
> >> > > > marked as invalid.
> >> > > > > > It could be implemented in a linker very easily. Probably, it
> >> > would
> >> > > > make sense to describe that
> >> > > > > > usage in DWARF standard.
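
A minimal sketch of how a consumer could treat LowPC > HighPC as the "invalid range" signal, assuming ranges are available as decoded (low_pc, high_pc) pairs. The helper names are hypothetical, and which concrete values a linker would write for a dead range (here (1, 0)) is an assumption:

```python
from typing import List, Tuple


def is_marked_dead(low_pc: int, high_pc: int) -> bool:
    """A range with low_pc > high_pc cannot describe real code."""
    return low_pc > high_pc


def live_ranges(ranges: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Filter out ranges explicitly marked as invalid."""
    return [(low, high) for low, high in ranges if not is_marked_dead(low, high)]
```

The check needs no extra bits in the DWARF. An empty range (low_pc equal to high_pc) is not treated as dead by this sketch.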
> >> > > > > >
> >> > > > > > As to the addresses which are not seen by the linker(since they
> >> > are in
> >> > > > .dwo files) - yes,
> >> > > > > > they need to have another solution. Could you show an example of
> >> > such
> >> > > > a case, please?
> >> > > > > >
> >> > > > > >
> >> > > > > >
> >> > > > > > >>>2. Support of type units.
> >> > > > > >
> >> > > > > > >>>
> >> > > > > >
> >> > > > > > >>>> That could be implemented further.
> >> > > > > >
> >> > > > > > >>>Enabling type units increases object size to make it easier to
> >> > > > deduplicate at link time by a DWARF-unaware
> >> > > > > >
> >> > > > > > >>>linker. With a DWARF aware linker it'd be generally desirable
> >> > not
> >> > > > to have to add that object size overhead to
> >> > > > > >
> >> > > > > > >>>get the linking improvements.
> >> > > > > >
> >> > > > > > >>
> >> > > > > >
> >> > > > > > >>But, DWARFLinker should adequately work with type units since
> >> > they
> >> > > > are already implemented.
> >> > > > > >
> >> > > > > >
> >> > > > > > > Maybe - it'd be nice & all, but I don't think it's an outright
> >> > > > necessity - if someone knows they're using
> >> > > > > > > a DWARF-aware linker, they'd probably not use type units in
> >> > their
> >> > > > object files. It's possible someone
> >> > > > > > > doesn't know for sure & maybe they have pre-canned debug object
> >> > > > files from someone else, etc.
> >> > > > > >
> >> > > > > > I see.
> >> > > > > >
> >> > > > > > >>Another thing is that the idea behind type units has the
> >> > potential
> >> > > > to help Dwarf-aware linker to work faster.
> >> > > > > >
> >> > > > > > >>Currently, DWARFLinker analyzes context to understand whether
> >> > types
> >> > > > are the same or not.
> >> > > > > >
> >> > > > > >
> >> > > > > > >When you say "analyzes context" what do you mean? Usually I'd
> >> > take
> >> > > > that to mean
> >> > > > > > > "looks at things outside the type itself - like what namespace
> >> > it's
> >> > > > in, etc" - which, yes,
> >> > > > > > > it should do that, but it doesn't seem very expensive to do. But
> >> > I
> >> > > > guess you actually
> >> > > > > > > mean something about doing structural equivalence in some way,
> >> > > > looking at things inside the type?
> >> > > > > >
> >> > > > > > I think it could be useful for both cases. Currently, dsymutil
> >> > does
> >> > > > only first thing
> >> > > > > > (look at type name, namespace name, etc..) and does not do the
> >> > second
> >> > > > thing
> >> > > > > > (doing structural equivalence). Analyzing type names is currently
> >> > > > quite expensive
> >> > > > > > (the only search in string pool takes ~10 sec from 70 sec of
> >> > overall
> >> > > > time).
> >> > > > > > That is expensive because many things have to be done to work
> >> > with
> >> > > > strings:
> >> > > > > > parse DWARF, search and resolve relocations, compute a hash for
> >> > > > strings,
> >> > > > > > put data into a string pool, create a fully qualified name(like
> >> > > > namespace::function::name).
> >> > > > > > It looks like it could be optimized and finally require less time,
> >> > but
> >> > > > it still would be a noticeable
> >> > > > > > part of the overall time.
> >> > > > > >
> >> > > > > > If dsymutil starts to check for the structural equivalence, then
> >> > the
> >> > > > process would be even slower.
> >> > > > > > So, If instead of comparing types structure, there would be
> >> > checked
> >> > > > single hash-id - then this process
> >> > > > > > would also be faster.
> >> > > > > >
> >> > > > > > Thus I think using hash-id to compare types would allow to make
> >> > > > current implementation faster and would
> >> > > > > > allow handling incomplete types by DWARFLinker without massive
> >> > > > performance degradation also.
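
A minimal sketch of the hash-id comparison idea, assuming each type already carries a precomputed id, so that deduplication becomes a table lookup instead of building and comparing fully qualified names. The `TypeEntry` shape and `dedupe_types` are hypothetical, not DWARFLinker code:

```python
from typing import Dict, Iterable, List, Tuple

TypeEntry = Tuple[int, str]  # (precomputed hash-id, type description)


def dedupe_types(units: Iterable[List[TypeEntry]]) -> Dict[int, str]:
    """Keep the first type seen for each hash-id; later copies are dropped.

    No name resolution or structural comparison happens here: the hash-id
    is trusted as the identity of the type.
    """
    table: Dict[int, str] = {}
    for unit in units:
        for hash_id, description in unit:
            table.setdefault(hash_id, description)
    return table
```

The cost per type is one dictionary lookup, independent of how deeply the type is nested in namespaces.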
> >> > > > > >
> >> > > > > > >> But the context is known when types are generated. So, no need
> >> > to
> >> > > > spent the time analyzing it.
> >> > > > > >
> >> > > > > > >> If types could be compared without analyzing context, then
> >> > Dwarf-
> >> > > > aware linker would work faster.
> >> > > > > >
> >> > > > > > >> That is just an idea(not for immediate implementation): If
> >> > types
> >> > > > would be stored in some "type table"
> >> > > > > >
> >> > > > > > >> (instead of COMDAT section group) and could be accessed through
> >> > > > hash-id (like type units)
> >> > > > > >
> >> > > > > > >> - then it would be the solution requiring fewer bits to store
> >> > but
> >> > > > allowing to compare types
> >> > > > > >
> >> > > > > > >> by hash-id(not analysing context).
> >> > > > > > >> In this case, size increasing would be small. And processing
> >> > time
> >> > > > could be done faster.
> >> > > > > > >>
> >> > > > > > >> this is just an idea and could be discussed separately from the
> >> > > > problem of integrating of D74169.
> >> > > > > >
> >> > > > > > >> >> 6. -flto=thin
> >> > > > > >
> >> > > > > > >> >> That problem was described in this review
> >> > > > > > >> >> https://reviews.llvm.org/D54747#1503720 . It also exists in
> >> > > > > >
> >> > > > > > >> >> current DWARFLinker/dsymutil implementation. I think that
> >> > > > problem should be discussed more: it could
> >> > > > > >
> >> > > > > > >> >> probably be fixed by avoiding generation of such incomplete
> >> > > > declaration during thinlto,
> >> > > > > >
> >> > > > > > >> >> That would be costly to produce extra/redundant debug info
> >> > in
> >> > > > ThinLTO - actually ThinLTO could be doing
> >> > > > > >
> >> > > > > > >> >> more to reduce that redundancy early on (actually removing
> >> > > > definitions from some llvm Modules if the type
> >> > > > > >
> >> > > > > > >> >> definition is known to exist in another Module, etc)
> >> > > > > > >> >I don't know if it's a problem since that patch was reverted.
> >> > > > > >
> >> > > > > > >>
> >> > > > > >
> >> > > > > > >> Yes. That patch was reverted, but this patch(D74169) has the
> >> > same
> >> > > > problem.
> >> > > > > >
> >> > > > > > >> if D74169 would be applied and --gc-debuginfo used then
> >> > structure
> >> > > > type
> >> > > > > > >> definition would be removed.
> >> > > > > >
> >> > > > > > >> DWARFLinker could handle that case - "removing definitions from
> >> > > > some llvm Modules if the type
> >> > > > > > >> definition is known to exist in another Module".
> >> > > > > > >> i.e. DWARFLinker could replace the declaration with the
> >> > definition.
> >> > > > > >
> >> > > > > > >> But that problem could be more easily resolved when debug info
> >> > is
> >> > > > generated(probably without
> >> > > > > > >> significant increase of debug info size):
> >> > > > > >
> >> > > > > > >> Here we have:
> >> > > > > >
> >> > > > > > >> DW_TAG_compile_unit(0x0000000b) - compile unit containing
> >> > concrete
> >> > > > instance for function "f".
> >> > > > > > >> DW_TAG_compile_unit(0x00000073) - compile unit containing
> >> > abstract
> >> > > > instance root for function "f".
> >> > > > > > >> DW_TAG_compile_unit(0x000000c1) - compile unit containing
> >> > function
> >> > > > "f" definition.
> >> > > > > >
> >> > > > > > >> Code for function "f" was deleted. gc-debuginfo deletes compile
> >> > > > unit DW_TAG_compile_unit(0x000000c1)
> >> > > > > > >> containing "f" definition (since there is no corresponding
> >> > code).
> >> > > > But it has structure "Foo" definition
> >> > > > > > >> DW_TAG_structure_type(0x0000011e) referenced from
> >> > > > DW_TAG_compile_unit(0x00000073)
> >> > > > > > >> by declaration DW_TAG_structure_type(0x000000ae). That
> >> > declaration
> >> > > > is exactly the case when definition
> >> > > > > > >> was removed by thinlto and replaced with declaration.
> >> > > > > >
> >> > > > > > >> Would it cost too much if type definition would not be replaced
> >> > > > with declaration for "abstract instance root"?
> >> > > > > > >> The number of concrete instances is bigger than number of
> >> > abstract
> >> > > > instance roots.
> >> > > > > > >> Probably, it would not be too costly to leave definition in
> >> > > > abstract instance root?
> >> > > > > >
> >> > > > > >
> >> > > > > >
> >> > > > > > >> Alternatively, Would it cost too much if type definition would
> >> > not
> >> > > > be replaced with declaration when
> >> > > > > > >> declaration references type from not used function? (lto could
> >> > > > understand that concrete function is not used).
> >> > > > > >
> >> > > > > >
> >> > > > > > >I don't follow this example - could you provide a small concrete
> >> > test
> >> > > > case I could reproduce?
> >> > > > > >
> >> > > > > > I would provide a test case if necessary. But it looks like this
> >> > issue
> >> > > > is finally clear, and you already commented on that.
> >> > > > > >
> >> > > > > > > Oh, I guess this is happening perhaps because ThinLTO can't know
> >> > for
> >> > > > sure that a standalone
> >> > > > > > > definition of 'f' won't be needed - so it produces one in case
> >> > one
> >> > > > of the inlining opportunities
> >> > > > > > > doesn't end up inlining. Then it turns out all calls got
> >> > inlined, so
> >> > > > the external definition wasn't needed.
> >> > > > > >
> >> > > > > > > Oh, you're suggesting that these 3 CUs got emitted into one
> >> > object
> >> > > > file during LTO, but that DWARFLinker
> >> > > > > > > drops a CU without any code in it - even though... So far as I
> >> > know,
> >> > > > in LTO, LLVM directly references
> >> > > > > > > types across units if the CUs are all emitted in the same object
> >> > > > file. (and if they weren't in the same
> >> > > > > > > object file - then the abstract_origin couldn't be pointing
> >> > cross-
> >> > > > CU).
> >> > > > > >
> >> > > > > > > I guess some basic things to say:
> >> > > > > >
> >> > > > > > > With ThinLTO, the concrete/standalone function definition is
> >> > emitted
> >> > > > in case some call sites don't end up
> >> > > > > > > being inlined. So we know it'll be emitted (but might not be
> >> > needed
> >> > > > by the actual linker)
> >> > > > > > > Any number of inline calls might exist - but we shouldn't put
> >> > the
> >> > > > type information into those, because
> >> > > > > > > they aren't guaranteed to emit it (if the inline function gets
> >> > > > optimized away, there would be nothing to
> >> > > > > > > enforce the type being emitted) - and even if we forced the type
> >> > > > information to be emitted into one
> >> > > > > > > object file that has an inline copy of the function - there's no
> >> > > > guarantee that object file will get linked in either.
> >> > > > > >
> >> > > > > > > So, no, I don't think there's much we can do to keep the size of
> >> > > > object files down, while guaranteeing
> >> > > > > > > the type information will be emitted with the usual linker
> >> > > > semantics.
> >> > > > > >
> >> > > > > > Then dsymutil/DWARFLinker could be changed to handle that(though
> >> > it
> >> > > > would probably be not very efficient).
> >> > > > > > If thinlto would understand that function is not used finally(and
> >> > then
> >> > > > must not contain referenced type definition),
> >> > > > > > then this situation could be handled more effectively.
> >> > > > > >
> >> > > > > > Thank you, Alexey.
> >> > > > > >
> >> > > > > >>>
> >> > > > > >>>
> >> > > > > >>>
> >> > > > > >>>
> >> > > > > >>> _______________________________________________
> >> > > > > >>> LLVM Developers mailing list
> >> > > > > >>> llvm-dev at lists.llvm.org
> >> > > > > >>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> >_______________________________________________
> >LLVM Developers mailing list
> >llvm-dev at lists.llvm.org
> >https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev