[PATCH] D34765: [DWARF] [NFC] Move a couple of member functions to DWARFUnit (baseclass) from DWARFCompileUnit (derived class)

Pieb, Wolfgang via llvm-commits llvm-commits at lists.llvm.org
Thu Jun 29 14:43:40 PDT 2017



From: David Blaikie [mailto:dblaikie at gmail.com]
Sent: Thursday, June 29, 2017 1:53 PM
To: Robinson, Paul; reviews+D34765+public+e78f04ee01ee41e1 at reviews.llvm.org; Pieb, Wolfgang; aprantl at apple.com
Cc: llvm-commits at lists.llvm.org
Subject: Re: [PATCH] D34765: [DWARF] [NFC] Move a couple of member functions to DWARFUnit (baseclass) from DWARFCompileUnit (derived class)


On Thu, Jun 29, 2017 at 12:42 PM Robinson, Paul <paul.robinson at sony.com> wrote:
Just to finish this thought, DW_AT_str_offsets_base is spec'd to point to the first entry for the unit, not the header, so the first contribution is at 8 or 16 depending on format (picky picky).  That is, the entire section has only one header,

I'm not sure how that would work in a non-fission, non-DWARF aware linker situation. Presumably each str_offsets section, with its header, would be concatenated together - so there would be one header per contribution, not one header for the whole section.

In a non-fission situation there is a str_offsets_base attribute in each unit and the linker will just relocate them. Concatenation of the str_offsets sections should just work.

In a regular one-source-file compilation you would presumably have one CU and possibly multiple TUs, each with their own contribution. Also, I assumed that in an LTO-type partial link there could be multiple CUs in one compilation, producing multiple contributions to the str_offsets section. With the str_offsets_base attribute I don’t see any problems with non-fission compilation.
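
To illustrate the concatenation argument, here is a minimal sketch in Python. It assumes 32-bit DWARF, little-endian encoding, and string offsets that are already final; the helper names (make_contribution, concatenate, lookup) are made up for illustration and are not LLVM or linker APIs. Each contribution keeps its own 8-byte header, and each unit's str_offsets_base is simply relocated to the first entry of its contribution.

```python
import struct

HEADER_SIZE_DWARF32 = 8  # 4-byte unit_length + 2-byte version + 2-byte padding


def make_contribution(offsets):
    """Build one DWARF32 string offsets contribution (header + 4-byte entries)."""
    body = struct.pack("<HH", 5, 0)  # version 5, padding
    body += b"".join(struct.pack("<I", o) for o in offsets)
    # unit_length counts everything after the length field itself
    return struct.pack("<I", len(body)) + body


def concatenate(contributions):
    """Concatenate contributions the way a non-DWARF-aware linker would.

    Returns the merged section and, per contribution, the relocated
    str_offsets_base (pointing at the first entry, past the header).
    """
    section = b""
    bases = []
    for contribution in contributions:
        bases.append(len(section) + HEADER_SIZE_DWARF32)
        section += contribution
    return section, bases


def lookup(section, base, index):
    """Read string offset number `index` relative to a unit's base."""
    (offset,) = struct.unpack_from("<I", section, base + 4 * index)
    return offset
```

With two contributions of 2 and 3 entries, the bases come out as 8 and 24, and each unit indexes only into its own table.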

Equally, when merging DWO files into a DWP, if each DWO has offsets relative to its str_offsets contribution - then the whole contribution (including the header) must be taken from each DWO (otherwise the offsets would have to be rewritten, which would require parsing and modifying the DIEs, etc, which DWP avoids doing for performance & is the reason for str_offsets).

With DWP we have the index tables that can help with the adjustment. Each DWO file gets its own (index table) contribution to the str_offsets.dwo section, and the DWARF consumer uses it to get at the correct offset. Note that ‘Contribution’ is overloaded here. DWP contribution (recorded in the index tables) vs. string offsets table contribution.

What DWP will have to do is adjust the string offsets in the str_offsets.dwo section when it is concatenating (like it does now), but it has to be aware of the str_offsets contribution headers. That’s why it’s important to clarify whether they’re actually supposed to exist in the str_offsets.dwo sections. Another question is whether we want to support mixing DWARF v5 CUs with current v4 CUs in a split scenario, i.e. an LTO-type link links 2 files, one compiled with -gdwarf-5 and the other with -gdwarf-4, and uses split DWARF for the result.
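
As a rough sketch of what "being aware of the headers" could mean, assuming 32-bit DWARF, little-endian encoding, and a hypothetical remap table from old to new string offsets (this is not how llvm-dwp is actually written): a v5 contribution skips its 8-byte header before rewriting entries, while a v4-style contribution has no header at all.

```python
import struct


def remap_contribution(contribution, remap, has_header):
    """Rewrite every 4-byte string offset through `remap`.

    `has_header` is True for a v5 contribution (8-byte DWARF32 header that
    must be left untouched) and False for a headerless v4-style table.
    """
    start = 8 if has_header else 0
    out = bytearray(contribution)
    for pos in range(start, len(contribution), 4):
        (old,) = struct.unpack_from("<I", contribution, pos)
        struct.pack_into("<I", out, pos, remap[old])
    return bytes(out)
```

Treating a v5 contribution as headerless would run the header bytes through the remap, which is why the tool needs to know which kind of contribution it is looking at.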

rather than one header per compilation-unit or type-unit; you get one header + table per translation-unit.  (Too many kinds of unit!)

What distinction between compilation/type unit and translation unit are you making? DWARF doesn't, as far as I know, have a definition of Translation Unit (& I've always basically modeled it as C++ Translation Unit ~= DWARF Compilation Unit (& yeah, Type Units are a bit weird/out there, I think of them as not being owned by any particular Compilation Unit, etc))

I think Paul was referring to a translation unit as an IR file that gets compiled down to a .o file. That IR file may have been composed from a number of smaller IR files via LTO and thus contain multiple DWARF CUs (and TUs).

-- wolfgang


  It seems not completely clear that it would be a net savings to maintain separate string pools (or string-offset pools) per unit, so I don't know whether you'd really want str_offsets_base in .dwo units.  Someone will have to run off and measure it at some point.

Oh, yeah, I'm not suggesting it'd totally be a win/necessary, just that it seems easy enough to leave it up to the implementation about how granular they get - allow a default for str_offsets_base (I'd be OK if there is no default) in Split DWARF, but let it be able to be specified (I think it'd fall out pretty naturally from consumers if it was specified anyway).

It's pretty awkward that the str_offsets_base points to the beginning of the table of strings, not the header... :/
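
For reference, a small sketch of where the "8 or 16" comes from and how the base relates to the start of a contribution. It assumes little-endian encoding; the function names are invented for illustration.

```python
import struct

DWARF64_ESCAPE = 0xFFFFFFFF


def str_offsets_header_size(contribution):
    """Return the header size of a string offsets contribution.

    DWARF32: 4-byte unit_length + 2-byte version + 2-byte padding = 8
    DWARF64: 4-byte escape + 8-byte unit_length + 2-byte version
             + 2-byte padding = 16
    """
    (initial,) = struct.unpack_from("<I", contribution, 0)
    return 16 if initial == DWARF64_ESCAPE else 8


def str_offsets_base(contribution_start, contribution):
    """The base points at the first entry, i.e. just past the header."""
    return contribution_start + str_offsets_header_size(contribution)
```

Going the other way (from a base back to its header) is the awkward part: a consumer that only has the base has to guess whether the header sits 8 or 16 bytes before it.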

--paulr

From: llvm-commits [mailto:llvm-commits-bounces at lists.llvm.org] On Behalf Of David Blaikie via llvm-commits
Sent: Thursday, June 29, 2017 2:51 PM
To: reviews+D34765+public+e78f04ee01ee41e1 at reviews.llvm.org; Pieb, Wolfgang; aprantl at apple.com
Cc: llvm-commits at lists.llvm.org
Subject: Re: [PATCH] D34765: [DWARF] [NFC] Move a couple of member functions to DWARFUnit (baseclass) from DWARFCompileUnit (derived class)


On Thu, Jun 29, 2017 at 11:48 AM Paul Robinson via Phabricator <reviews at reviews.llvm.org> wrote:
probinson added a comment.

In https://reviews.llvm.org/D34765#795601, @wolfgangp wrote:

> No you're right, my bad. Units in the .dwo sections (both type and CU) don't have a str_offsets_base, which implies that the .debug_str_offsets.dwo section has to consist of a monolithic table of string offsets (without the 8 or 16-byte header that's specified in section 7.26 of the DWARF 5 standard).  Section 7.26 seems to say the opposite, though. It seems I'll have to clarify this with the DWARF5 folks.


Worth clarifying on the dwarf-discuss list but I believe the idea is that the .dwo would have a single .debug_str_offsets.dwo contribution (complete with header), corresponding to the .debug_str.dwo contribution, and all units in the compilation would share it just like they would ordinarily share the single .debug_str section in a non-split compilation.

Yeah, seems to me like for DWO CUs the "DW_AT_str_off_base" (or whatever it's called) should be assumed/implicit 0, but can be present (if a producer wants to put multiple separate str_off for separate CUs in a single DWO - for example, to reduce the size of the str offsets (smaller numbers, shorter encoding, etc) in cases of many strings/many CUs, etc)



https://reviews.llvm.org/D34765
