[PATCH] D42937: [DWARF] Make llvm-dwp handle DWARF v5 string offsets tables and indexed strings.

Tue Feb 27 10:02:23 PST 2018

On Tue, Feb 27, 2018 at 9:58 AM <wolfgang.pieb at sony.com> wrote:

> Currently we parse the CU headers even in a DWP+DWP merge to check for
> duplicates. So I don’t think it would be a regression.
>

Huh - wouldn't we be able to detect duplicates from only consulting the two
indexes? Rather than parsing the CUs?

>
>
> -- wolfgang
>
>
>
> *From:* David Blaikie [mailto:dblaikie at gmail.com]
> *Sent:* Tuesday, February 27, 2018 9:39 AM
> *To:* reviews+D42937+public+3fe0cbc66cf01928 at reviews.llvm.org
> *Cc:* Pieb, Wolfgang; aprantl at apple.com; jdevlieghere at apple.com;
> llvm-commits at lists.llvm.org; mgrang at codeaurora.org
> *Subject:* Re: [PATCH] D42937: [DWARF] Make llvm-dwp handle DWARF v5
> string offsets tables and indexed strings.
>
>
>
> Yeah, I'll just call it - I think the maintenance burden of
> committing/maintaining support for v4/v5 mixed str_offsets isn't worth it
> (but is worth it for llvm-dwarfdump - for investigation, etc) for llvm-dwp,
> etc.
>
> But I guess it's still a bit complicated - in a DWP+DWP merge, we'd still
> have to parse at least one unit header to decide which version to look for
> in str_offsets? (but that's still an improvement compared to having to read
> all the headers - which is a regression, right? Currently (without this
> patch) no headers need to be parsed when doing DWP+DWP, I think? (even
> though for any DWO file being merged in, we need to do lots of parsing to
> find its identifier, etc))
>
>
>
> On Mon, Feb 26, 2018 at 5:53 PM Paul Robinson via Phabricator <
> reviews at reviews.llvm.org> wrote:
>
> probinson added inline comments.
>
>
> ================
> Comment at: string_offsets_mixed.s:40-46
> +# With DWARF v5 we need to match up CUs with their corresponding string
> offsets table
> +# contributions (in order to properly process the contribution headers)
> and hence need
> +# to remap and write them unit-by-unit to the output file. The pre-v5
> implementation
> +# did not need to worry about this because it could just rewrite the
> input file's string
> +# offsets table in one chunk. The test therefore ensures that llvm-dwp
> rewrites the
> +# string offsets table contributions in the order they appear in the
> input file section,
> +# and not in the order in which their corresponding CUs appear in the
> index table.
> ----------------
> dblaikie wrote:
> > wolfgangp wrote:
> > > dblaikie wrote:
> > > > Oh... I hadn't realized/understood/thought about this.
> > > >
> > > > That's kind of awkward - mixing blobs with headers and blobs without
> headers in the same section (str_offsets) & then having to use the CU/TUs
> to disambiguate/dictate how to parse those chunks.
> > > >
> > > > Can we avoid that? Could we just say v4 and v5 are incompatible?
> Have a flag or something that checks.
> > > >
> > > > Then we could always walk the str_offsets alone, either expecting
> header'd sections (which I hope are self descriptive - once you know you're
> reading a v5 str_offsets, you don't need to consult the CU for anything to
> do that?) or non-header'd sections (where you just treat every word as a
> string offset without consideration for how those are divided into
> contributions)
> > > >
> > > >
> > > > (Ah, I see you mentioned/highlighted that in the patch description
> too)
> > > > That's kind of awkward - mixing blobs with headers and blobs without
> headers in the same section (str_offsets) & then having to use the CU/TUs
> to disambiguate/dictate how to parse those chunks.
> > > >
> > > > Can we avoid that? Could we just say v4 and v5 are incompatible?
> Have a flag or something that checks.
> > > >
> > > That would have been the simpler solution, but llvm-dwarfdump is
> already handling the same thing, so I thought it would be inconsistent if
> llvm-dwp didn't handle it too. We also would have to reject any dwp input
> files with mixed units that were created by non-llvm tools.
> > >
> > > It would certainly be easy to reject any mixing of v5 and v4 units.
> > > Pity, though, since it's already working...
> > I'd be happy to hear some other people's opinions on this (Adrian &
> Paul?).
> >
> > It seems pretty unfortunate to have this split between versions making a
> bunch of complexity to support like this.
> The sticking point for me is whether there are objects in the wild that
> use GNU-style .debug_str_offsets, and might get linked with proper v5
> .debug_str_offsets.  The linker will combine them without a second thought,
> and it's unreasonable to ask a linker to verify.
> I guess... given that you need to enable this stuff explicitly for gcc
> with DWARF v4, and the point is to reduce relocations which is an
> irrelevant consideration for fission, it's maybe not so likely that we have
> objects like this in the wild.  That makes it reasonable to put our
> collective foot down and say llvm-dwp won't mix-n-match.
>
> I do think it's worthwhile for llvm-dwarfdump to handle the mixed case,
> partly because it's already done and partly because it's a different kind
> of tool (diagnostic/analysis rather than production).
>
>
> https://reviews.llvm.org/D42937
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20180227/913873c8/attachment.html>