[llvm-dev] [Debuginfo][DWARF][LLD] Remove obsolete debug info in lld.

Fri Jul 31 12:02:08 PDT 2020

Hi Alexey,

On Fri, Jul 31, 2020 at 4:02 AM Alexey Lapshin via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

>
> On 28.07.2020 19:28, David Blaikie wrote:
> > On Tue, Jul 28, 2020 at 8:55 AM Alexey Lapshin <avl.lapshin at gmail.com>
> wrote:
> >>
> >> On 28.07.2020 10:29, David Blaikie via llvm-dev wrote:
> >>> On Fri, Jun 26, 2020 at 9:28 AM Alexey Lapshin
> >>> <alapshin at accesssoftek.com> wrote:
> >>>>>>>>>>>> This idea goes in another direction than fragmenting dwarf
> >>>>>>>>>>>> using elf sections&tricks. It seems to me that the cost of
> fragmenting is too high.
> >>>>>>>>>>> I tend to agree - but I'm sort of leaning towards trying to
> use object
> >>>>>>>>>>> features as much as possible, then implementing just enough
> custom
> >>>>>>>>>>> handling in the linker to recoup overhead, etc. (eg: add some
> kind of
> >>>>>>>>>>> small header/brief description that makes it easy for the
> linker to
> >>>>>>>>>>> slice-and-dice - but hopefully a domain-specific such header
> can be a
> >>>>>>>>>>> bit more compact than the fully general ELF form)
> >>>>>>>>>> I think this indeed should be implemented and evaluated.
> >>>>>>>>>> So that various approaches could be compared.
> >>>>>>>>>>
> >>>>>>>>>>>> It is not only the sizes of structures describing fragments
> but also the complexity
> >>>>>>>>>>>> of tools that should be taught to work with fragmented DWARF.
> >>>>>>>>>>>> (f.e. llvm-dwarfdump applied to object file should be able to
> read fragmented DWARF,
> >>>>>>>>>>>> but applied to linked executable it should work with
> non-fragmented DWARF).
> >>>>>>>>>>>> That idea is for the tool which works the same way as
> dsymutil ODR.
> >>>>>>>>>>>>
> >>>>>>>>>>>> I will shortly describe the idea of making DWARF be easier
> processed by dsymutil/DWARFLinker:
> >>>>>>>>>>>>
> >>>>>>>>>>>> The idea is to have only one "type table" per object
> file(special section .debug_types_table).
> >>>>>>>>>>>> This "type table" would contain all types.
> >>>>>>>>>>>> There could be a special type of reference - type_offset -
> that offset points into the type table.
> >>>>>>>>>>>> Basic types could always be placed into the start of "type
> table" thus, offsets to basic types
> >>>>>>>>>>>> most often would be 1 byte. There also would be a special
> kind of reference - reference inside the type.
> >>>>>>>>>>>> Type units sig8 system - would not be used to reference types.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Types deduplication is assumed to be done, not by linker
> mechanism for COMDAT,
> >>>>>>>>>>>> but by a tool like dsymutil. This tool would create resulting
> .debug_types_table by putting there
> >>>>>>>>>>>> types from source .debug_types_table-s. Only one copy of the
> type would be placed into the
> >>>>>>>>>>>> resulting table. All references pointing to the deleted copy
> would be corrected to point
> >>>>>>>>>>>> to the single copy inside "type table". (that is how dsymutil
> works currently)
> >>>>>>>>>>> ^ that's the step that's probably a bit expensive for a
> general-use
> >>>>>>>>>>> tool - it implies parsing all the DWARF to find those
> references and
> >>>>>>>>>>> rewrite them, I think. For a high-performance solution that
> could be
> >>>>>>>>>>> run by the linker I think it'd be necessary to have a solution
> that
> >>>>>>>>>>> doesn't involve parsing all the DIEs.
> >>>>>>>>>> According to the current dsymutil processing,
> >>>>>>>>>> exactly this process is not the most time-consuming.
> >>>>>>>>>> That could be done relatively fast.
> >>>>>>>>> Fair enough - though I'd still imagine any solution that involves
> >>>>>>>>> parsing all the DIEs still wouldn't be fast enough (maybe an order of
> >>>>>>>>> magnitude faster than the current solution even - but that's still,
> >>>>>>>>> what, 6 or 7x slower than linking without the feature?) for most users
> >>>>>>>>> to consider it a good trade-off.
> >>>>>>>> It seems to me that even the current 6x-7x slowdown could be useful.
> >>>>>>>> Users who already use the dsymutil or llvm-dwp tools (assuming
> >>>>>>>> DWARFLinker would be taught to work with split DWARF) spend this time
> >>>>>>>> and, in some scenarios, waste disk space on intermediate files.
> >>>>>>> FWIW, dwp (llvm-dwp hasn't really been optimized compared to
> binutils
> >>>>>>> dwp) is designed to be very quick - by not needing to do a lot of
> >>>>>>> parsing/fixups. Which, yes, means larger output files than would be
> >>>>>>> possible with more parsing/etc. It also doesn't take any input from
> >>>>>>> the linker (so it can run in parallel with the linker) - so it
> can't
> >>>>>>> remove dead subprograms. Given Google's the major (perhaps only
> >>>>>>> significant?) user of Split DWARF - I can say that the needs don't
> >>>>>>> necessarily overlap well with something that would take
> significantly
> >>>>>>> longer to run or use significantly more memory. Faster/cheaper/with
> >>>>>>> somewhat bigger output files is probably the right tradeoff for
> >>>>>>> Google's use case, at least.
> >>>>>>>
> >>>>>>> I imagine Apple's use for dsymutil is somewhat similar - it's not
> used
> >>>>>>> in the iterative development cycle, only in final releases - well,
> >>>>>>> maybe their situation is more "neutral" - not a major pain point in
> >>>>>>> any case I'd guess.
> >>>>>>>
> >>>>>>>
> >>>>>> I see. FWIW, Comparison splitdwarf+dwp and DWARFLinker from lld:
> >>>>>>
> >>>>>> 1. split-dwarf+llvm-dwp = linking time for clang 6 sec,
> >>>>>>       generating time for .dwp 53 sec, clang=997M clang.dwp=1.1G.
> >>>>> FWIW, llvm-dwp is not very well optimized (which is to say: it is not
> >>>>> optimized), binutils dwp might be a better comparison (& even that
> >>>>> doesn't have the parallelism & some potential further memory savings
> >>>>> that lld has that we could take advantage of in a dwp-like tool)
> >>>>>
> >>>>> What build mode was the clang binary built in? Optimized or
> unoptimized?
> >>>> right, that is unoptimized build with -ffunction-sections.
> >>>>
> >>>>>> 2. DWARFLinker from lld = linking time for clang 72 sec, clang=760M.
> >>> And this is without Split DWARF? Without linker DWARF compression? -
> >>> that seems quite a bit surprising, that the deduplication of DWARF
> >>> could fit into less space than the wasted/reclaimed space in ranges (&
> >>> line)?
> >> that was without split dwarf, without linker compression.
> >>
> >>> Could you double check these numbers & provide a clearer summary?
> >> sure, I would re-check it.
> >>
> >>> Here's my attempt at numbers (all with
> function-sections+gc-sections)...
> >>>
> >>> Split DWARF tests didn't seem meaningful - gc-debuginfo + split DWARF
> >>> seemed to drop all the debug info (except gdb_index) so wasn't
> >>> working/comparison wasn't meaningful for Apples to Apples, but
> >>> included it for comparing gc'd non-split to non-gc'd split (disabled
> >>> gnu-pubnames/gdb-index (-gsplit-dwarf -gno-gnu-pubnames) (which turns
> >>> on by default with Split DWARF because gdb needs it - but a bit of an
> >>> unfair comparison without turning on gnu-pubnames/gdb-index in other
> >>> build modes too, since it... /shouldn't/ be necessary) which might've
> >>> been a factor in the data you were looking at)
> >> that might be the case. i.e. clang=997M for split dwarf(from my previous
> >> measurement) might include gnu-pubnames.
> >>
> >> I will recheck it, and if that is the case then it is an unfair comparison.
> >>
> >>
> >> My point was that "DWARFLinker from lld" takes less space than singleton
> >> split dwarf file+.dwp file.
> >>
> >> for -O0 uncompressed:
> >>
> >> - .dwp took 1.1G(if I built it correctly), singleton clang(from your
> >> measurements) 566 MB
> >>
> >>      overall 1.6G.
> > Oh, yeah, even if there are some measurement issues, linked executable
> > + .dwp is going to be larger than a linked executable using non-split
> > DWARF (in v5), since v5 uses all the same representations as non-split
> > DWARF, and split DWARF adds the indirection overhead of a split file,
> > etc.
> >
> > Even without DWARF linking, it's true that split DWARF has overhead
> > (dwp+executable will be larger than executable non-split).
> >
> > But maybe we've ended up down a bit of a tangent in any case.
> >
> > Trying to bring this back to "should this be committed to lld" seems
> > valuable, and I'm not sure what the right criteria are for that.
> I think it would be useful to do "removing obsolete debug info"
> in the linker. First, it would be the fastest way (no need to copy
> data, create temp files, or build an address map). Second, it would
> be a good separation of concerns: all debug info processing currently
> done in the linker (gdb_index, upcoming debug_names) could be moved
> into a separate library for processing debug info. When
> gdb_index/debug_names is built without "removing obsolete debug
> info", it would have the same performance as it currently has.
>
> We decided to give the idea of "removing obsolete debug info"
> another try and are going to implement it as a separate utility
> working on the built binary. Making it multi-threaded would
> probably show better performance, and then it could probably be
> considered acceptable to use from the linker.
>
>
I'm quite interested in this direction. One thought I had was to
incorporate such a library into dsymutil but with support for ELF. If you
get a proposal written up I'd love to take a look and comment.

Thanks!

-eric

> Alexey.
>
> >
> > Ray's the best person to weigh in on that. My 2c is that I think it
> > probably is worthwhile, even just as an experiment, assuming it's not
> > too intrusive to lld.
> >
> >> - The "DWARFLinker from lld" 820 MB(from your measurements).
> >>
> >>
> >> So "DWARFLinker from lld" looks two times better.
> >>
> >>
> >> Anyway, thank you for pointing me to possible mistake. I would recheck
> >> it and update results.
> >>
> >>
> >> Alexey.
> >>
> >>
> >>> * -O0: (baseline, just using strip -g: 356 MB)
> >>>     * compressed: 25% smaller with gc-debuginfo (481 MB / 641 MB)
> >>>       (407 MB split/non-gc)
> >>>     * uncompressed: 30% smaller (820 MB / 1.2 GB) (566 MB split/non-gc)
> >>> * -O3: (baseline: 116 MB)
> >>>     * compressed: 16% smaller (361 MB / 462 MB) (283 MB split/non-gc)
> >>>     * uncompressed: 22% smaller (1022 MB / 1.2 GB)
> >>>       (156 MB split/non-gc)
> >>>
> >>>
> >>>
> >>>
> >>> On Fri, Jun 26, 2020 at 9:28 AM Alexey Lapshin
> >>> <alapshin at accesssoftek.com> wrote:
> >>>>>> I see. FWIW, Comparison splitdwarf+dwp and DWARFLinker from lld:
> >>>>>>
> >>>>>> 1. split-dwarf+llvm-dwp = linking time for clang 6 sec,
> >>>>>>       generating time for .dwp 53 sec, clang=997M clang.dwp=1.1G.
> >>>>> FWIW, llvm-dwp is not very well optimized (which is to say: it is not
> >>>>> optimized), binutils dwp might be a better comparison (& even that
> >>>>> doesn't have the parallelism & some potential further memory savings
> >>>>> that lld has that we could take advantage of in a dwp-like tool)
> >>>>>
> >>>>> What build mode was the clang binary built in? Optimized or
> unoptimized?
> >>>> right, that is unoptimized build with -ffunction-sections.
> >>>>
> >>>>>> 2. DWARFLinker from lld = linking time for clang 72 sec, clang=760M.
> >>>>> It does seem a tad strange that the clang binary would be smaller
> >>>>> non-split with DWARF linking than it was split. Though I could imagine
> >>>>> this might be possible in an optimized build (where debug_ranges
> >>>>> become quite relatively expensive in the .o file contribution with
> >>>>> Split DWARF)
> >>>>> Could you compare the section sizes between these two clang binaries,
> >>>>> perhaps?
> >>>> .debug_ranges is three times bigger and .debug_line is twice as big.
> >>>>
> >>>>>>>> Thus if they would use this LLD feature in its current state
> >>>>>>>> - they would still receive benefits.
> >>>>>>>>
> >>>>>>>> Speaking of performance results - LLD is a multi-thread linker;
> >>>>>>>> it handles sections in parallel. DWARFLinker generates DWARF using
> >>>>>>>> AsmPrinter which is a stream - so it could make resulting DWARF
> only
> >>>>>>>> continuously. It is not surprising that the parallel solution
> works faster.
> >>>>>>>> Making DWARFLinker truly multi-threaded would probably allow us
> >>>>>>>> to make slowdown to be at 2x-4x range.
> >>>>>>> *nod* that's still a really expensive link - but I understand
> that's a
> >>>>>>> suitable tradeoff for your users
> >>>>>>>
> >>>>>> Btw, 2x or 7x is for pure linking time. Overall compilation slowdown
> >>>>>> is not so significant. Building LLVM codebase has only 20% slowdown.
> >>>>> Understood - that's still quite significant to most users, I'd
> imagine.
> >>>> I see.
> >>>>
> >>>>>>>>>> Anyway, I think the dsymutil approach is still valuable, and it
> >>>>>>>>>> would be useful to optimize it.
> >>>>>>>>>> Do you think it would be useful to make dsymutil/DWARFLinker
> truly multi-thread?
> >>>>>>>>>> (To make dsymutil/DWARFLinker able to process each object file
> in a separate thread)
> >>>>>>>>> Perhaps - that I'd probably leave up to the folks who are more
> >>>>>>>>> invested in dsymutil (Adrian Prantl et al). Maybe one day we'll
> get it
> >>>>>>>>> integrated into llvm-dwp and then I'll be interested in getting
> as
> >>>>>>>>> much performance out of it as lld - so multithreading and things
> would
> >>>>>>>>> be on the books.
> >>>>>>>> I think improving dsymutil is a valuable thing.
> >>>>>>>> Though there are several directions which might be considered
> >>>>>>>> to make it more robust:
> >>>>>>>>
> >>>>>>>> 1. support of latest DWARF - DWARF5/DWARF64...
> >>>>>>> I expect/thought some of the Apple folks had already worked on
> >>>>>>> DWARF5 support?
> >>>>>>> DWARF64 - that's been around for a while, and just hasn't been needed
> >>>>>>> by LLVM users thus far, it seems (until recently - where some
> >>>>>>> developers have started working on that)
> >>>>>> The debug_names table is already implemented, but debug_rnglists,
> >>>>>> debug_loclists, type units - are not implemented yet.
> >>>>> Superficially, type units wouldn't be on the list of features (like
> >>>>> DWARF64 - it's optional) I'd try to support in dsymutil - since their
> >>>>> size overhead is more justified for a DWARF-agnostic linker that's
> >>>>> using comdat groups. With a DWARF-aware linker I'd be specifically
> >>>>> hoping to avoid using type units to help
> >>>>>> The thing which
> >>>>>> should probably be changed is that dsymutil should not have its own
> >>>>>> version of the code generating DWARF tables. It should call the
> >>>>>> already existing DWARF5/DWARF64 implementations. Then dsymutil would
> >>>>>> always use the latest DWARF generators.
> >>>>> Possibly - I don't know what the architectural tradeoffs for that
> look
> >>>>> like - I'd imagine DWARFLinker has sufficiently different
> >>>>> needs/tradeoffs than LLVM's DWARF generation code (rewriting existing
> >>>>> DIEs compared to building new ones from scratch, etc) that it might
> be
> >>>>> hard for them to share a lot of their implementation.
> >>>> It is not easy, and would require some additions, but the benefit is
> >>>> that all of the format implementation would be in one place. Thus a
> >>>> change in that place would be reflected everywhere else. There are at
> >>>> least three implementations for .debug_ranges, .debug_aranges
> >>>> currently...
> >>>>
> >>>>
> >>>>>>>> 2. implement multi-threaded execution.
> >>>>>>>> 3. support of split DWARF.
> >>>>>>> Maybe, though I'm still not sure it'd be the right tradeoff -
> >>>>>>> especially if it involved having to wait to run the .dwo merger
> (call
> >>>>>>> it DWARF-aware dwp, or dsymutil with dwp support) until after the
> >>>>>>> linker ran.
> >>>>>>>
> >>>>>>>> 4. implement dsymutil for non-darwin platform.
> >>>>>>> That's probably, essentially (3), more-or-less. Split DWARF is
> >>>>>>> somewhat of a formalization of Apple's/MachO DWARF distribution model
> >>>>>>> (leave DWARF in files that aren't linked/use them from a debugger,
> >>>>>>> but also be able to merge them into some final file (dsym or dwp) for
> >>>>>>> archival purposes)
> >>>>>>>
> >>>>>>>> All of this is a massive piece of work.
> >>>>>>>> Our original investment was to solve two problems:
> >>>>>>>>
> >>>>>>>> 1. Overlapped address ranges, which is currently close to being
> solved. Thank you for helping with that!
> >>>>>>> Yeah, again, sorry that's taken quite so long/somewhat circuitous
> route.
> >>>>>>>
> >>>>>>>> 2. Size of debug info. That is still an issue, but we are unsure
> >>>>>>>>      whether we are ready to invest in solving all of problems 1-4
> >>>>>>>>      above, and how much the community is interested in it.
> >>>>>>> Fair, for sure - I don't think you'd need to sign up to solve all
> of
> >>>>>>> them (don't think they necessarily need solving). Potentially
> moving
> >>>>>>> the logic out into a separate tool as Fangrui's considering - a
> >>>>>>> post-link DWARF optimizer, rather than in-linker DWARF
> optimization.
> >>>>>>>
> >>>>>>> I really don't want to give you the runaround like this - but
> multiple
> >>>>>>> times slower links is something that seems pretty problematic for
> most
> >>>>>>> users, to the point of weighing the maintainability of lld against
> the
> >>>>>>> convenience of having this functionality in-linker rather than in a
> >>>>>>> post-link optimizer.
> >>>>>>>
> >>>>>>> (I know you've spoken a bit before about your users needs - but if
> >>>>>>> it's possible, could you explain (again :/) why they have such a
> >>>>>>> strong need for smaller DWARF? While DWARF size is an ongoing
> concern
> >>>>>>> for many users (Google certainly - hence the invention of Split
> DWARF,
> >>>>>>> use of type units and compressed DWARF, etc) - usually it's in
> rather
> >>>>>>> large programs, but it sounds like you're dealing with relatively
> >>>>>>> small ones (otherwise the increase in link time, I'd imagine,
> would be
> >>>>>>> prohibitive for your users?)?
> >>>>>> We have many large programs and keep Daily/Nightly debug builds,
> >>>>>> which take a lot of disk space. Compilation time for these programs
> >>>>>> is long.
> >>>>>> The scenario is "compile once" (not compile-debug-compile-debug).
> >>>>>> So we think that a solution (like dsymutil/DWARFLinker) would not
> >>>>>> slow down the overall build significantly (see the numbers above for
> >>>>>> the llvm codebase) and would allow us to reduce the disk space
> >>>>>> required to keep all of these builds.
> >>>>> Ah, OK - for archival purposes. So the interactive developers wouldn't
> >>>>> necessarily be using this feature. Makes sense - similar to dsymutil
> >>>>> and dwp, mostly used for archival purposes & you can debug straight
> >>>>> from .o/.dwos for interactive/iterative development.
> >>>>
> >>>>> In that case, it seems more likely that a separate tool might suffice.
> >>>> agreed: if work on this continues, then it makes sense to
> >>>> do it as a separate tool and make it fast enough. And if there is
> >>>> interest in it, then it would probably be possible to return to the
> >>>> idea of calling it from the linker.
> >>>>
> >>>>> Also, out of curiosity - have you tried just compressing the output
> >>>>> (-gz (I think that does the right thing for the linker level
> >>>>> compression too, otherwise -Wl,-compress-debug-sections might do it))
> >>>>> or are you already doing that in addition?
> >>>> sure. we use  -Wl,-compress-debug-sections.
> >>>>
> >>>> Thank you, Alexey.
> >>>>
> >>>>>>> You mentioned that the usability cost of
> >>>>>>> Split DWARF for your users was too high (or high enough to justify
> >>>>>>> this alternative work of DWARF-aware linking)? That all seems a bit
> >>>>>>> surprising to me - though I understand the deployment issues of
> Split
> >>>>>>> DWARF do present some challenges to users in more heterogenous
> >>>>>>> environments than Google's... still, I'd have thought there was
> some
> >>>>>>> hope there)
> >>>>>> Our tools do not support split dwarf yet, though we plan to
> >>>>>> implement it.
> >>>>>> When we have support for split dwarf, it would be
> >>>>>> convenient to have an easy way to share built debug binaries.
> >>>>>> llvm-dwp is the answer to this. DWARFLinker could probably be
> >>>>>> another answer.
> >>>>> Ah, fair enough - thanks for the context!
> >>>>>>>>> One way to do that would be to have a CU-local type indirection
> table.
> >>>>>>>>> DIEs reference local type numbers (like local address/string
> numbers -
> >>>>>>>>> addrx/strx/rnglistx) and that table contains either sig8 (no
> linker
> >>>>>>>>> fixups required) or the local type offsets you describe - the
> linker
> >>>>>>>>> would then only need to read this type number indirection table
> and
> >>>>>>>>> rewrite them to the final type numbers.
> >>>>>>>> Yes, that could be additionally done if this process would be
> time-consuming.
> >>>>>>>>
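A minimal sketch of the CU-local type indirection table described above, in Python. All names here (CompileUnit, link_type_tables, the string "signatures") are hypothetical stand-ins for illustration and are not part of DWARF, lld, or DWARFLinker. The point it tries to show is that the DIE-side references stay untouched and only the small per-CU tables are read and rewritten.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class CompileUnit:
    # Local type numbers stored in the DIEs (analogous to addrx/strx indexes).
    die_type_refs: List[int]
    # Local type number -> type identity (a string stands in for a sig8
    # or a local type offset; this representation is an assumption).
    type_table: List[str]


def link_type_tables(units: List[CompileUnit]) -> Tuple[List[str], List[List[int]]]:
    """Merge the per-CU tables into one global type list.

    Returns the global list plus, for each CU, its table rewritten to
    global type numbers. The DIEs themselves are never parsed or modified.
    """
    global_index: Dict[str, int] = {}
    global_types: List[str] = []
    remapped: List[List[int]] = []
    for cu in units:
        new_table = []
        for sig in cu.type_table:
            if sig not in global_index:
                global_index[sig] = len(global_types)
                global_types.append(sig)
            new_table.append(global_index[sig])
        remapped.append(new_table)
    return global_types, remapped


cu_a = CompileUnit(die_type_refs=[0, 1, 1], type_table=["int", "Foo"])
cu_b = CompileUnit(die_type_refs=[0, 2], type_table=["Foo", "int", "Bar"])
types, remapped = link_type_tables([cu_a, cu_b])
```

A consumer would resolve a DIE's type as types[remapped[cu][local_number]], so the cost of the fixup is proportional to the table sizes rather than to the number of DIEs.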
> >>>>>>>> David, thank you for all your comments and explanations. They are
> extremely helpful.
> >>>>>>> Sure thing - really appreciate your patience with all this -
> it's... a
> >>>>>>> lot of moving parts.
> >>>>>>> - Dave
> >>>>>>> Thank you, Alexey.
> >>>>>>>
> >>>>>>>> sig8 hash-id would be used to compare types and to deduplicate them.
> >>>>>>>> It would speed up the current dsymutil context analysis.
> >>>>>>>> Types having the same hash-id could be deduplicated.
> >>>>>>>> This would allow deduplicating more types than the current dsymutil.
> >>>>>>>> Incomplete type definitions having a similar set of members are not
> >>>>>>>> deduplicated by dsymutil currently.
> >>>>>>>> In this case they would have the same hash-id.
> >>>>>>>>
> >>>>>>>> This "type table" would take less space than the current "type units"
> >>>>>>>> and the current ODR solution.
> >>>>>>>>
> >>>>>>>> Above is just an idea on how to help a DWARF-aware linker (based on
> >>>>>>>> the idea of removing obsolete debug info)
> >>>>>>>> to work faster (if that is interesting).
> >>>>>>>>
> >>>>>>>> Alexey.
> >>>>>>>>
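The hash-id deduplication described above could be sketched roughly as follows, in Python. This is an illustration under stated assumptions, not how dsymutil works: hash_id is a stand-in 8-byte hash over a type's name and members (it is not the DWARF type signature algorithm), and the tuple representation of a type and the fixups map are hypothetical.

```python
import hashlib
from typing import Dict, List, Tuple

# A type is modelled as (name, ((member_name, member_type), ...)).
TypeDesc = Tuple[str, Tuple[Tuple[str, str], ...]]


def hash_id(name: str, members: Tuple[Tuple[str, str], ...]) -> bytes:
    """Stand-in 8-byte hash-id over a type's name and member list."""
    h = hashlib.md5()
    h.update(name.encode())
    for member_name, member_type in members:
        h.update(b"\0" + member_name.encode() + b"\0" + member_type.encode())
    return h.digest()[-8:]


def merge_type_tables(object_tables: List[List[TypeDesc]]):
    """Build one output type table, keeping a single copy per hash-id.

    Returns the merged table and a fixup map from
    (object index, local type index) to the index in the merged table.
    """
    merged: List[TypeDesc] = []
    index_of: Dict[bytes, int] = {}
    fixups: Dict[Tuple[int, int], int] = {}
    for obj_idx, table in enumerate(object_tables):
        for local_idx, (name, members) in enumerate(table):
            key = hash_id(name, members)
            if key not in index_of:
                index_of[key] = len(merged)
                merged.append((name, members))
            fixups[(obj_idx, local_idx)] = index_of[key]
    return merged, fixups


point = ("Point", (("x", "int"), ("y", "int")))
obj0 = [("int", ()), point]
obj1 = [point, ("int", ())]
merged, fixups = merge_type_tables([obj0, obj1])
```

Comparing fixed-size hashes avoids walking each type's context to decide equality, which is the speed-up the message above is hoping for.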
> >>>>>>>>> From: llvm-dev <llvm-dev-bounces at lists.llvm.org> On Behalf Of
> James Henderson via llvm-dev
> >>>>>>>>> Sent: Wednesday, June 3, 2020 3:48 AM
> >>>>>>>>> To: David Blaikie <dblaikie at gmail.com>
> >>>>>>>>> Cc: llvm-dev at lists.llvm.org
> >>>>>>>>> Subject: Re: [llvm-dev] [Debuginfo][DWARF][LLD] Remove obsolete
> debug info in lld.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> It makes me sad that the linker (via a library or otherwise) has
> to be "DWARF-aware" to be able to effectively handle --gc-sections,
> COMDATs, --icf etc for debug info, without leaving large blocks of data
> kicking around.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> The patching to -1 (or equivalent) is probably a good
> lightweight solution (though I'd love it if it could be done based on
> section type in the future rather than section name, but that's probably
> outside the realm of DWARF), as it requires only minimal understanding in
> the linker, but anything beyond that seems to be complicated logic that is
> mostly due to the structure of DWARF. Patching to -1 does feel a bit like a
> sticking plaster/band aid to patch over the issue rather than properly
> solving it too - there will still be debug data (potentially significant
> amounts in COMDAT-heavy objects) that the linker has to write and the
> debugger has to somehow know how to skip (even if it knows that -1 is
> special-case due to the standard being updated, it needs to get as far as
> the -1), which is all wasted effort.
> >>>>>>>>>
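To make the "patching to -1" point concrete, here is a small Python sketch of what a consumer might do with already-decoded address ranges. The function names and the pair representation are assumptions for illustration only; a real debugger would work on the encoded DWARF sections.

```python
from typing import List, Tuple


def tombstone(address_size: int) -> int:
    """The all-ones value ("-1") for a given address size in bytes."""
    return (1 << (8 * address_size)) - 1


def live_ranges(ranges: List[Tuple[int, int]], address_size: int = 8) -> List[Tuple[int, int]]:
    """Drop entries whose start address was patched to the tombstone.

    Every entry is still visited: the dead data remains in the file and
    the reader has to get as far as the -1 before it can skip it.
    """
    dead = tombstone(address_size)
    return [(lo, hi) for lo, hi in ranges if lo != dead]


DEAD = tombstone(8)
decoded = [(0x1000, 0x1010), (DEAD, DEAD), (0x2000, 0x2008)]
kept = live_ranges(decoded)
```

The filter is cheap, but as noted above the bytes for the dead entry are still written by the linker and still read by the consumer.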
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> We've already seen from Alexey's prototyping, and from our own
> experiences with the Sony proprietary linker (which tried to rewrite
> .debug_line only) that deconstructing the DWARF so that it can be more
> optimally reassembled at link time is slow going, and will probably
> inevitably remain so however much effort is put into optimising it. For a start,
> given the current standards, it's impossible to know how to deconstruct it
> without having to parse vast amounts of DWARF, which is typically going to
> mean a lot more parsing work than the linker would normally have to deal
> with. Additionally, much of this parsing work is wasted effort, since it
> seems unlikely in many links that large amounts of the DWARF will be
> redundant. Having an option to opt-in doesn't help much there, since it
> just means the logic exists without most people using it, due to it not
> being good enough, or potentially they don't even know it exists.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> I don't have particularly concrete suggestions as to how to
> solve the structural problems with DWARF at this point. The only thing that
> seems obvious to me is a more "blessed" approach to fragmentation of
> sections, similar to what I tried with my prototype mentioned earlier in
> the thread, although we'd need to figure out the previously stated
> performance issues. Other ideas might tie into this, like somehow sharing
> the various table headers a bit like CIEs in .eh_frame that could be merged
> by the linker - each object could have separate table header sections,
> which are referenced by the individual .debug_* blocks, which in turn are
> one per function/data piece and easily discardable/merged by the linker.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Just some thoughts.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> James
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On Tue, 2 Jun 2020 at 19:24, David Blaikie via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
> >>>>>>>>>
> >>>>>>>>> On Tue, May 19, 2020 at 7:17 AM Alexey Lapshin
> >>>>>>>>> <alapshin at accesssoftek.com> wrote:
> >>>>>>>>>> Hi David, please find my comments inside:
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>>>> Broad question: Do you have any specific
> motivation/users/etc in implementing this (if you can speak about it)?
> >>>>>>>>>>>>> - it might help motivate the work, understand what tradeoffs
> might be suitable for you/your users, etc.
> >>>>>>>>>>>> There are two general requirements:
> >>>>>>>>>>>> 1) Remove (or clean) invalid debug info.
> >>>>>>>>>>> Perhaps a simpler direct solution for your immediate needs
> might be a much narrower,
> >>>>>>>>>>> and more efficient linker-DWARF-awareness feature:
> >>>>>>>>>>>
> >>>>>>>>>>> With DWARFv5, rnglists present an opportunity for a DWARF
> linker to rewrite the ranges
> >>>>>>>>>>> without parsing the rest of the DWARF. /technically/ this
> isn't guaranteed - rnglist entries
> >>>>>>>>>>> can be referenced either directly, or by index. If all
> rnglists are referenced by index, then
> >>>>>>>>>>> a linker could parse only the debug_rnglists section and
> rewrite ranges to remove any
> >>>>>>>>>>> address ranges that refer to optimized-out code.
> >>>>>>>>>>>
> >>>>>>>>>>> This would only be correct for rnglists that had no direct
> references to them (that only were
> >>>>>>>>>>> referenced via the indexes) - but we could either implement it
> with that assumption, or could
> >>>>>>>>>>> add an LLVM extension attribute on the CU that would say "I
> promise I only referenced rnglists
> >>>>>>>>>>> via rnglistx forms/indexes". If this DWARF-aware linking would
> have to read the CU DIE (not
> >>>>>>>>>>> all the other DIEs) it /could/ also then rewrite high/low_pc
> if the CU wasn't using ranges...
> >>>>>>>>>>> but that wouldn't come up in the function-removal case,
> because then you'd have ranges anyway,
> >>>>>>>>>>> so no need for that.
> >>>>>>>>>>>
> >>>>>>>>>>> Such a DWARF-aware rnglist linking could also simplify
> rnglists, in cases where functions
> >>>>>>>>>>> ended up being laid out next to each other, the linker could
> coalesce their ranges together.
> >>>>>>>>>>>
> >>>>>>>>>>> I imagine this could be implemented with very little overhead
> to linking, especially compared
> >>>>>>>>>>> to the overhead of full DWARF-aware linking.
> >>>>>>>>>>>
> >>>>>>>>>>> Though none of this fixes Split DWARF, where the linker
> doesn't get a chance to see the
> >>>>>>>>>>> addresses being used - but if you only want/need the CU-level
> ranges to be correct, this
> >>>>>>>>>>> might be a viable fix, and quite efficient.
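To make the rnglist-only rewrite described above a bit more concrete, here is a minimal Python sketch. It does not parse real DWARF: the (section, size) entries and the layout map from section name to final address (None when the section was discarded) are hypothetical stand-ins for what a linker would learn from the relocations in .debug_rnglists. It drops entries for discarded code and coalesces ranges that end up adjacent.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical model: one range entry per function section, (section, size).
Entry = Tuple[str, int]
Range = Tuple[int, int]  # half-open [low, high)


def rewrite_rnglist(entries: List[Entry],
                    layout: Dict[str, Optional[int]]) -> List[Range]:
    """Drop ranges for discarded sections, then merge touching ranges."""
    live: List[Range] = []
    for section, size in entries:
        base = layout.get(section)
        if base is None:  # section was garbage-collected by the linker
            continue
        live.append((base, base + size))  # base 0 is a valid address here

    live.sort()
    merged: List[Range] = []
    for low, high in live:
        if merged and low <= merged[-1][1]:
            # Adjacent or overlapping: extend the previous range.
            merged[-1] = (merged[-1][0], max(merged[-1][1], high))
        else:
            merged.append((low, high))
    return merged
```

Note that the discarded-section check is explicit (`is None`) rather than a truthiness test, so a live section placed at address 0 is kept.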
> >>>>>>>>>> Yes, we think about that alternative. This would resolve our
> problem of invalid debug info
> >>>>>>>>>> and would work much faster. Thus, if we would not have good
> results for D74169 then we
> >>>>>>>>>> will implement it. Do you think it could be useful to have this
> solution in upstream?
> >>>>>>>>> A pure rnglist rewriting - I think it'd be OK to have in
> upstream -
> >>>>>>>>> again, cost/benefit/etc would have to be weighed. I'm not sure it
> >>>>>>>>> would save enough space to be particularly valuable beyond the
> >>>>>>>>> correctness issue - and it doesn't completely solve the
> correctness
> >>>>>>>>> issue for zero-address usage or low-address usage (because you
> could
> >>>>>>>>> still have overlapping subprograms inside a CU - so if you were
> >>>>>>>>> symbolizing you could use the correct rnglist to filter, but
> then go
> >>>>>>>>> look inside the CU only to find two subprograms that had that
> address
> >>>>>>>>> & not know which one was the correct one and which one was the
> >>>>>>>>> discarded one).
> >>>>>>>>>
> >>>>>>>>> rnglist rewriting might be easy enough to prototype - but
> depends what
> >>>>>>>>> you want to spend your time on, I know this whole issue has been
> a
> >>>>>>>>> huge investment of your time already - but maybe this recent
> >>>>>>>>> revitalization of the conversation around having an explicit
> value in
> >>>>>>>>> the linker might be sufficient to address everyone's needs...
> *fingers
> >>>>>>>>> crossed*
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>>>> 2) Optimize the DWARF size.
> >>>>>>>>>>> Do your users care much about this? I imagine if they had
> significant DWARF size issues,
> >>>>>>>>>>> they'd have significant link time issues and the kind of cost
> to link time this feature has would
> >>>>>>>>>>> be prohibitive - but perhaps they're sharing linked binaries
> much more often than they're
> >>>>>>>>>>> actually performing linking.
> >>>>>>>>>> Yes, they do. They also have significant link-time issues.
> >>>>>>>>>> So current performance results of D74169 are not very
> acceptable.
> >>>>>>>>>> We hope to improve it.
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>>> The specifics which our users have:
> >>>>>>>>>>>>    - embedded platform which uses 0 as start of .text section.
> >>>>>>>>>>>>    - custom toolset which does not support all features
> yet(f.e. split dwarf).
> >>>>>>>>>>>>    - tolerant of the link-time increase.
> >>>>>>>>>>>>    - need a useful way to share debug builds.
> >>>>>>>>>>> Sharing two files (executable and dwp) is significantly less
> useful than sharing one file?
> >>>>>>>>>> Probably not significantly, but yes, it looks less useful
> compared to D74169.
> >>>>>>>>>> Having only two files (executable and .dwp) looks significantly
> better than having executable and multiple .dwo files.
> >>>>>>>>>> Having only one file(executable) with minimal size looks better
> than the two files with a bigger size.
> >>>>>>>>>>
> >>>>>>>>>> clang compiled with -gsplit-dwarf takes 0.9G for executable and
> 0.9G for .dwp.
> >>>>>>>>>> clang compiled with --gc-debuginfo takes only 0.76G for a single
> executable.
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>>> For the first point: we have a problem "Overlapping address
> ranges starting from 0"(D59553).
> >>>>>>>>>>>> We use custom solution, but the general solution like D74169
> would be better here.
> >>>>>>>>>>> If CU ranges are the only ones that need fixing, then I think
> the above solution might be as
> >>>>>>>>>>> good/better - if more than CU ranges need fixing, then I think
> we might want to start talking about
> >>>>>>>>>>> how to fix DWARF itself (split and non-split) to signal
> certain addresses point to dead code with a
> >>>>>>>>>>> specific blessed value that linkers would need to implement -
> because with Split DWARF there's
> >>>>>>>>>>> no way to solve the non-CU addresses at the linker.
> >>>>>>>>>> I think a worthwhile choice for that signal value would be
> LowPC > HighPC.
> >>>>>>>>>> That does not require additional bits in DWARF.
> >>>>>>>>>> It would be natural to skip such address ranges since they
> are explicitly marked as invalid.
> >>>>>>>>>> It could be implemented in a linker very easily. Probably, it
> would make sense to describe that
> >>>>>>>>>> usage in the DWARF standard.
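As a rough illustration of the LowPC > HighPC convention proposed above, here is a small Python sketch of how a consumer might skip such ranges during address lookup. The (name, low_pc, high_pc) tuples are a hypothetical model with high_pc as an absolute, exclusive end address, and the sentinel pair (0x1, 0x0) is just one assumed way a linker could encode "invalid".

```python
from typing import List, Tuple

# Hypothetical model of a subprogram DIE: (name, low_pc, high_pc).
Subprogram = Tuple[str, int, int]


def is_marked_dead(low_pc: int, high_pc: int) -> bool:
    """Assumed convention from the thread: LowPC > HighPC means 'invalid'."""
    return low_pc > high_pc


def lookup(subprograms: List[Subprogram], address: int) -> List[str]:
    """Names of all live subprograms whose range covers the address."""
    return [
        name
        for name, low_pc, high_pc in subprograms
        if not is_marked_dead(low_pc, high_pc) and low_pc <= address < high_pc
    ]


# Dead code resolved to address 0 collides with live code placed at 0.
zeroed = [("live_fn", 0x0, 0x40), ("dead_fn", 0x0, 0x20)]
# The same unit with the dead range rewritten to a LowPC > HighPC pair.
marked = [("live_fn", 0x0, 0x40), ("dead_fn", 0x1, 0x0)]
```

With the zeroed list, a lookup of 0x10 matches both functions, which is the overlap problem for a .text section starting at 0; with the marked list only live_fn matches.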
> >>>>>>>>>>
> >>>>>>>>>> As to the addresses which are not seen by the linker(since they
> are in .dwo files) - yes,
> >>>>>>>>>> they need to have another solution. Could you show an example
> of such a case, please?
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>>>> 2. Support of type units.
> >>>>>>>>>>>>>>    That could be implemented further.
> >>>>>>>>>>>>> Enabling type units increases object size to make it easier
> to deduplicate at link time by a DWARF-unaware
> >>>>>>>>>>>>> linker. With a DWARF aware linker it'd be generally
> desirable not to have to add that object size overhead to
> >>>>>>>>>>>>> get the linking improvements.
> >>>>>>>>>>>> But, DWARFLinker should adequately work with type units since
> they are already implemented.
> >>>>>>>>>>> Maybe - it'd be nice & all, but I don't think it's an outright
> necessity - if someone knows they're using
> >>>>>>>>>>> a DWARF-aware linker, they'd probably not use type units in
> their object files. It's possible someone
> >>>>>>>>>>> doesn't know for sure & maybe they have pre-canned debug
> object files from someone else, etc.
> >>>>>>>>>> I see.
> >>>>>>>>>>
> >>>>>>>>>>>> Another thing is that the idea behind type units has the
> potential to help Dwarf-aware linker to work faster.
> >>>>>>>>>>>> Currently, DWARFLinker analyzes context to understand whether
> types are the same or not.
> >>>>>>>>>>> When you say "analyzes context" what do you mean? Usually I'd
> take that to mean
> >>>>>>>>>>> "looks at things outside the type itself - like what namespace
> it's in, etc" - which, yes,
> >>>>>>>>>>> it should do that, but it doesn't seem very expensive to do.
> But I guess you actually
> >>>>>>>>>>> mean something about doing structural equivalence in some way,
> looking at things inside the type?
> >>>>>>>>>> I think it could be useful for both cases. Currently, dsymutil
> does only first thing
> >>>>>>>>>> (look at type name, namespace name, etc..) and does not do the
> second thing
> >>>>>>>>>> (doing structural equivalence). Analyzing type names is
> currently quite expensive
> >>>>>>>>>> (the string pool search alone takes ~10 sec out of 70 sec of
> overall time).
> >>>>>>>>>> That is expensive because many things must be done to work
> with strings:
> >>>>>>>>>> parse DWARF, search and resolve relocations, compute a hash for
> strings,
> >>>>>>>>>> put data into a string pool, create a fully qualified name(like
> namespace::function::name).
> >>>>>>>>>> It looks like it could be optimized and finally require less
> time, but it still would be a noticeable
> >>>>>>>>>> part of the overall time.
> >>>>>>>>>>
> >>>>>>>>>> If dsymutil starts to check for structural equivalence,
> then the process would be even slower.
> >>>>>>>>>> So if, instead of comparing type structure, a single hash-id
> were checked, then this process
> >>>>>>>>>> would also be faster.
> >>>>>>>>>>
> >>>>>>>>>> Thus I think using a hash-id to compare types would make the
> current implementation faster and would
> >>>>>>>>>> also allow DWARFLinker to handle incomplete types without
> massive performance degradation.
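A minimal Python sketch of the difference being described, under the assumption that the producer has already attached a hash-id to every type. The (hash_id, scopes, name) records are hypothetical and say nothing about how dsymutil actually stores types; the point is only that the name-based path does string work per type while the hash-id path is a single integer lookup.

```python
from typing import Dict, List, Tuple

# Hypothetical type record: (hash_id, enclosing scopes, name).
TypeRecord = Tuple[int, Tuple[str, ...], str]


def dedup_by_name(types: List[TypeRecord]) -> Dict[str, TypeRecord]:
    """Context-based approach: build a fully qualified name per type."""
    unique: Dict[str, TypeRecord] = {}
    for record in types:
        _, scopes, name = record
        qualified = "::".join(scopes + (name,))  # string work per type
        unique.setdefault(qualified, record)
    return unique


def dedup_by_hash_id(types: List[TypeRecord]) -> Dict[int, TypeRecord]:
    """Hash-id approach: one integer lookup per type, no string building."""
    unique: Dict[int, TypeRecord] = {}
    for record in types:
        unique.setdefault(record[0], record)
    return unique
```

Both functions keep the first occurrence of each type, so they agree whenever equal hash-ids correspond to equal qualified names.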
> >>>>>>>>>>
> >>>>>>>>>>>> But the context is known when types are generated. So, no
> need to spend the time analyzing it.
> >>>>>>>>>>>> If types could be compared without analyzing context, then
> Dwarf-aware linker would work faster.
> >>>>>>>>>>>> That is just an idea(not for immediate implementation): If
> types would be stored in some "type table"
> >>>>>>>>>>>> (instead of COMDAT section group) and could be accessed
> through hash-id (like type units)
> >>>>>>>>>>>> - then it would be the solution requiring fewer bits to store
> but allowing to compare types
> >>>>>>>>>>>> by hash-id(not analysing context).
> >>>>>>>>>>>> In this case, the size increase would be small, and processing
> could be faster.
> >>>>>>>>>>>>
> >>>>>>>>>>>> this is just an idea and could be discussed separately from
> the problem of integrating of D74169.
> >>>>>>>>>>>>>> 6. -flto=thin
> >>>>>>>>>>>>>>      That problem was described in this review
> https://reviews.llvm.org/D54747#1503720. It also exists in
> >>>>>>>>>>>>>> current DWARFLinker/dsymutil implementation. I think that
> problem should be discussed more: it could
> >>>>>>>>>>>>>> probably be fixed by avoiding generation of such incomplete
> declaration during thinlto,
> >>>>>>>>>>>>>> That would be costly to produce extra/redundant debug info
> in ThinLTO - actually ThinLTO could be doing
> >>>>>>>>>>>>>> more to reduce that redundancy early on (actually removing
> definitions from some llvm Modules if the type
> >>>>>>>>>>>>>> definition is known to exist in another Module, etc)
> >>>>>>>>>>>>> I don't know if it's a problem since that patch was reverted.
> >>>>>>>>>>>> Yes. That patch was reverted, but this patch(D74169) has the
> same problem.
> >>>>>>>>>>>> if D74169 would be applied and --gc-debuginfo used then
> structure type
> >>>>>>>>>>>> definition would be removed.
> >>>>>>>>>>>> DWARFLinker could handle that case - "removing definitions
> from some llvm Modules if the type
> >>>>>>>>>>>> definition is known to exist in another Module".
> >>>>>>>>>>>> i.e. DWARFLinker could replace the declaration with the
> definition.
> >>>>>>>>>>>> But that problem could be more easily resolved when debug
> info is generated(probably without
> >>>>>>>>>>>> significant increase of debug info size):
> >>>>>>>>>>>> Here we have:
> >>>>>>>>>>>> DW_TAG_compile_unit(0x0000000b) - compile unit containing
> concrete instance for function "f".
> >>>>>>>>>>>> DW_TAG_compile_unit(0x00000073) - compile unit containing
> abstract instance root for function "f".
> >>>>>>>>>>>> DW_TAG_compile_unit(0x000000c1) - compile unit containing
> function "f" definition.
> >>>>>>>>>>>> Code for function "f" was deleted. gc-debuginfo deletes
> compile unit DW_TAG_compile_unit(0x000000c1)
> >>>>>>>>>>>> containing "f" definition (since there is no corresponding
> code). But it has structure "Foo" definition
> >>>>>>>>>>>> DW_TAG_structure_type(0x0000011e) referenced from
> DW_TAG_compile_unit(0x00000073)
> >>>>>>>>>>>> by declaration DW_TAG_structure_type(0x000000ae). That
> declaration is exactly the case when definition
> >>>>>>>>>>>> was removed by thinlto and replaced with declaration.
> >>>>>>>>>>>> Would it cost too much if type definition would not be
> replaced with declaration for "abstract instance root"?
> >>>>>>>>>>>> The number of concrete instances is bigger than number of
> abstract instance roots.
> >>>>>>>>>>>> Probably, it would not be too costly to leave definition in
> abstract instance root?
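The "replace the declaration with the definition" handling mentioned above could look roughly like the following Python sketch. It is only a toy model: units are keyed by their offsets from the example, each unit is a hypothetical list of (type name, is_declaration) pairs, and the function merely reports which definitions in dropped units are still needed by kept units.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical model: each unit maps to a list of (type name, is_declaration).
Units = Dict[str, List[Tuple[str, bool]]]


def definitions_to_keep(units: Units, dropped: Set[str]) -> List[Tuple[str, str]]:
    """Find (unit, type) definitions in dropped units needed by kept units."""
    declared: Set[str] = set()
    defined: Set[str] = set()
    for unit, types in units.items():
        if unit in dropped:
            continue
        for name, is_declaration in types:
            (declared if is_declaration else defined).add(name)

    needed = declared - defined  # declared in a kept unit, defined in none
    return [
        (unit, name)
        for unit in sorted(dropped)
        for name, is_declaration in units[unit]
        if not is_declaration and name in needed
    ]


# The three compile units from the example, reduced to their "Foo" entries.
example: Units = {
    "0x0000000b": [],
    "0x00000073": [("Foo", True)],   # declaration left by ThinLTO
    "0x000000c1": [("Foo", False)],  # definition, unit has no code left
}
```

For the example, dropping unit 0x000000c1 reports that its "Foo" definition would have to be moved or preserved.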
> >>>>>>>>>>
> >>>>>>>>>>>> Alternatively, Would it cost too much if type definition
> would not be replaced with declaration when
> >>>>>>>>>>>> declaration references type from not used function? (lto
> could understand that concrete function is not used).
> >>>>>>>>>>> I don't follow this example - could you provide a small
> concrete test case I could reproduce?
> >>>>>>>>>> I would provide a test case if necessary. But it looks like
> this issue is finally clear, and you already commented on that.
> >>>>>>>>>>
> >>>>>>>>>>> Oh, I guess this is happening perhaps because ThinLTO can't
> know for sure that a standalone
> >>>>>>>>>>> definition of 'f' won't be needed - so it produces one in case
> one of the inlining opportunities
> >>>>>>>>>>> doesn't end up inlining. Then it turns out all calls got
> inlined, so the external definition wasn't needed.
> >>>>>>>>>>> Oh, you're suggesting that these 3 CUs got emitted into one
> object file during LTO, but that DWARFLinker
> >>>>>>>>>>> drops a CU without any code in it - even though... So far as I
> know, in LTO, LLVM directly references
> >>>>>>>>>>> types across units if the CUs are all emitted in the same
> object file. (and if they weren't in the same
> >>>>>>>>>>> object file - then the abstract_origin couldn't be pointing
> cross-CU).
> >>>>>>>>>>> I guess some basic things to say:
> >>>>>>>>>>> With ThinLTO, the concrete/standalone function definition is
> emitted in case some call sites don't end up
> >>>>>>>>>>> being inlined. So we know it'll be emitted (but might not be
> needed by the actual linker)
> >>>>>>>>>>> Any number of inline calls might exist - but we shouldn't put
> the type information into those, because
> >>>>>>>>>>> they aren't guaranteed to emit it (if the inline function gets
> optimized away, there would be nothing to
> >>>>>>>>>>> enforce the type being emitted) - and even if we forced the
> type information to be emitted into one
> >>>>>>>>>>> object file that has an inline copy of the function - there's
> no guarantee that object file will get linked in either.
> >>>>>>>>>>> So, no, I don't think there's much we can do to keep the size
> of object files down, while guaranteeing
> >>>>>>>>>>> the type information will be emitted with the usual linker
> semantics.
> >>>>>>>>>> Then dsymutil/DWARFLinker could be changed to handle
> that(though it would probably be not very efficient).
> >>>>>>>>>> If thinlto would understand that function is not used
> finally(and then must not contain referenced type definition),
> >>>>>>>>>> then this situation could be handled more effectively.
> >>>>>>>>>>
> >>>>>>>>>> Thank you, Alexey.
> >>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> _______________________________________________
> >>>>>>>>>>>> LLVM Developers mailing list
> >>>>>>>>>>>> llvm-dev at lists.llvm.org
> >>>>>>>>>>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>