[PATCH] D42021: [DWARF] v5 implementation of string offsets tables - producer side

Wolfgang Pieb via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Jan 17 18:42:27 PST 2018


wolfgangp added inline comments.


================
Comment at: lib/CodeGen/AsmPrinter/DwarfDebug.cpp:604-605
+  // Create the symbol that designates the start of the unit's contribution
+  // to the string offsets table. In a split DWARF scenario, only the skeleton
+  // unit has the DW_AT_str_offsets_base attribute (and hence needs the symbol).
+  if (useSegmentedStringOffsetsTable()) {
----------------
dblaikie wrote:
> This looks like it's inconsistent with the code.
> 
> The code looks like it sets the symbol for both the skeleton and non-skeleton holders in the case of split DWARF, doesn't it?
> 
> Perhaps it should be:
> 
>   if (useSplitDwarf)
>     SkeletonHolder...
>   else
>     InfoHolder...
> 
> ?
> 
> (or that could be rewritten as:
> 
>   (useSplitDwarf ? SkeletonHolder : InfoHolder).setStringOffsetsStartSym(...);
Right. I wasn't an issue since the label wasn't used but it was created unnecessarily.


================
Comment at: lib/CodeGen/AsmPrinter/DwarfDebug.cpp:2060-2062
+  if (useSegmentedStringOffsetsTable())
+    NewCU.addStringOffsetsStart();
+
----------------
dblaikie wrote:
> This is duplicated in a couple of places - could it be reasonably moved into some common place, like the DwarfUnit ctor? (not sure if that has some common setup code for adding attributes common to all units - maybe there is no such common place? not sure)
It's a bit awkward, since we don't know in the DwarfUnit ctor whether we are dealing with a split unit. And even in the DwarfCompileUnit ctor we don't know whether we're constructing a Skeleton unit or a (possibly split) compile unit. The only ctor we could move it to would be the DwarfTypeUnit ctor, but that would still only cover one of the 3 cases. Perhaps there is a way to refactor all this but I don't see a good way to centralize it at the moment.


================
Comment at: lib/CodeGen/AsmPrinter/DwarfDebug.h:265
+  /// Whether to use the DWARF v5 string offsets table.
+  bool UseSegmentedStringOffsetsTable;
+
----------------
dblaikie wrote:
> What's "segmented" mean/imply/add/clarify in this context?
I meant to indicate by the name that the DWARF v5 offsets tables is a sequence of contributions preceded by a headers rather than one simple array of string offsets. I added a comment.


================
Comment at: lib/CodeGen/AsmPrinter/DwarfUnit.cpp:251
+    unsigned Index = StringPoolEntry.getIndex();
+    if ((Index & 0xff) == Index)
+      IxForm = dwarf::DW_FORM_strx1;
----------------
dblaikie wrote:
> Would these comparisons be better written as "Index >= 0xff", etc?
Looks simpler. I reversed the order to check for strx4 first with the rationale that with a very large number of strings the majority will be strx4 and so we hit that first in the comparison chain.


================
Comment at: test/DebugInfo/Generic/string-offsets-multiple-cus.ll:49
+; BOTH-NOT:    {{DW_TAG|NULL}}
+; BOTH:        DW_AT_str_offsets_base [DW_FORM_sec_offset] (0x[[CU1_STROFF:[0-9a-f]+]])
+;
----------------
dblaikie wrote:
> Maybe rather than naming it CU1_STROFF, name it STROFFBASE or something, since it's meant to be the same for all the units?
Right.


================
Comment at: test/DebugInfo/Generic/string-offsets-multiple-cus.ll:71-72
+;
+; Verify that the type unit has the proper DW_AT_str_offsets_base attribute and 
+; a segment in the .debug_str_offsets section.
+;
----------------
dblaikie wrote:
> Is each type unit getting its own string offset? I'd expect type units to use the string offset table of the CU they were attached to, so this should be the same as CU1_STROFF?
> 
> I mean it's a tradeoff - using separate string offset sections per TU means they can be dropped if this copy of the TU is dropped. But it also means more duplication - the same string offset having to be duplicated in the CU and every TU if the same string appears in all/several of them.
At the moment all the units in a compilation share the same string offsets table. I corrected all the xxx_STROFF lit variables. I'm not sure if it's really worth it to drop the string offsets that belong to a dropped unit. I would think the gain would be small.


================
Comment at: test/DebugInfo/Generic/string-offsets-multiple-cus.ll:110-111
+; TYPEUNITS-NEXT: {{.*:}}
+; The string with index 6 should be "bglob"
+; TYPEUNITS-NEXT: {{.*:}} [[BGLOBOFF]]
+
----------------
dblaikie wrote:
> Would it be worth printing out the index number in the section dumping to make it easier to look them up? Then the matching wouldn't need 5 trivial matches, you could match the index from the unit dumping to the string here. (& similarly in the non-TU case above)
That would require an independent change to the dumper. I would prefer to do this in a follow-on patch, it that's OK, along with improving the test case.


https://reviews.llvm.org/D42021





More information about the llvm-commits mailing list