[PATCH] D49214: [DWARF v5] emit DWARF v5 range lists (no support for fission yet)

Wolfgang Pieb via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Jul 16 12:00:59 PDT 2018


wolfgangp added inline comments.


================
Comment at: llvm/trunk/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp:424
+    if (DD->getDwarfVersion() >= 5)
+      addRnglistsBase();
+  }
----------------
dblaikie wrote:
> It looks like this would end up adding a range list base address to the CU for every range emitted - wouldn't it?
> 
> Yeah, looks like it does:
> 
>               DW_AT_rnglists_base       (0x0000000c)
>               DW_AT_low_pc      (0x0000000000000000)
>               DW_AT_ranges      (0x0000002b
>                  [0x0000000000000000, 0x0000000000000033)
>                  [0x0000000000000040, 0x0000000000000046)
>                  [0x0000000000000000, 0x0000000000000006))
>               DW_AT_rnglists_base       (0x0000000c)
> 
> using this input:
> 
>   bool x;
>   void f1() {
>     f1();
>     if (bool b = x) {
>       f1();
>       f1();
>     }
>   }
>   __attribute__((section(".text.foo"))) void f2() { }
>   void f3() { }
> 
> Compile with dwarf5 to LLVM IR, reorder the 3 calls to f1 (inserting the first call between the second and third) to cause the scope of 'b' to have a hole in the middle (for that first call to f1) & so to be described by a range list rather than high/low pc.
> 
> Looks like the range list base attribute should be added in the same place DW_AT_GNU_ranges_base is set, I think?
Ah, thanks for finding this. I guess I missed that duplicates are not eliminated when attributes are added to a DIE. AT_rnglists_base should only be added if there is any use of the AT_ranges attribute and it seemed simpler to do this here but it's just driven by the number of ranges in the unit anyway. I'll fix this in a separate review since this one is already closed.




================
Comment at: llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.cpp:2060-2068
+  // Don't emit a rangelist table if there are no ranges.
+  if (llvm::all_of(CUMap,
+                   [](const decltype(CUMap)::const_iterator::value_type &Pair) {
+                     DwarfCompileUnit *TheCU = Pair.second;
+                     if (auto *Skel = TheCU->getSkeleton())
+                       TheCU = Skel;
+                     return TheCU->getRangeLists().empty();
----------------
dblaikie wrote:
> Is it the best tradeoff to produce one shared range list, rather than a separate range list per CU? (I ask/think about that here, because then you wouldn't need a scan through all the CUs to see if any are non-empty)
> 
> Looks like the tradeoff is whether the bytes spent in the header are saved by reduced bytes in the debug_info section of being able to use smaller indexes, encoded with shorter encodings. I would imagine in practice for builds with multiple CUs (optimized/LTO situations) there are many ranges and the tradeoff tips in favor of using multiple separate range list lists - one for each CU?
If we use DW_FORM_rnglistx with DW_AT_ranges, we would definitely save space in .debug_info when the number of ranges gets large, which is where it matters most. Perhaps  I will add a FIXME comment effectively saying that when we support DW_FORM_rnglistx (along with the offset table), we'll switch to one rnglist table per CU.


================
Comment at: llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.cpp:2092-2095
+  // FIXME: Generate the offsets table and use DW_FORM_rnglistx with the
+  // DW_AT_ranges attribute. Until then set the number of offsets to 0.
+  Asm->emitInt32(0);
+  Asm->OutStreamer->EmitLabel(RnglistTableBaseSym);
----------------
dblaikie wrote:
> To be honest, I'm not sure why the standard added this offset table - it seems to add extra bytes without any savings I can see (shorter encodings in the debug_info section - using indexes instead of offsets, but it doesn't save relocations or bytes overall, unless I'm missing something?). Meh. (@probinson - any ideas on what the purpose of this indirection is?)
My understanding is that DW_AT_ranges has to either use FORM_sec_offset (which needs to be relocated) or FORM_rnglistx, which is an index into the offset table. So with the offset table and FORM_rnglistx we'd be saving the relocations in .debug_info.


================
Comment at: llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.cpp:2115-2123
   if (!useRangesSection()) {
     assert(llvm::all_of(
                CUMap,
                [](const decltype(CUMap)::const_iterator::value_type &Pair) {
                  return Pair.second->getRangeLists().empty();
                }) &&
            "No debug ranges expected.");
----------------
dblaikie wrote:
> Looks like the precondition that there are some ranges is already tested here - so no need to check it in emitDebugRnglists? the check there could be another assert, perhaps?
I thought the check here was only used in case the -no-dwarf-ranges-section option was on,  so we still need the check for the normal case.


================
Comment at: llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.cpp:2130-2144
   // Start the dwarf ranges section.
   Asm->OutStreamer->SwitchSection(
       Asm->getObjFileLowering().getDwarfRangesSection());
 
   // Grab the specific ranges for the compile units in the module.
   for (const auto &I : CUMap) {
     DwarfCompileUnit *TheCU = I.second;
----------------
dblaikie wrote:
> Might be worth splitting this out into a separate function so it's symmetric with emitDebugRnglists?
Ok, sounds good.


================
Comment at: llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.cpp:2222
 // DW_AT_low_pc, DW_AT_high_pc, DW_AT_ranges, DW_AT_dwo_name, DW_AT_dwo_id,
-// DW_AT_addr_base, DW_AT_ranges_base.
+// DW_AT_addr_base, DW_AT_ranges_base or DW_AT_rnglists_base.
 DwarfCompileUnit &DwarfDebug::constructSkeletonCU(const DwarfCompileUnit &CU) {
----------------
dblaikie wrote:
> Missing oxford comma between 'DW_AT_ranges_base' and 'or'. Though it also reads a little ambiguously - in English 'or' can often be read as 'exclusive or' (why some people put 'and/or').
> 
> Maybe "This DIE can have any/all of the following attributes: x, y, and z"? No big deal, just some thoughts.
Seems like this comment needs to be reworked for DWARF 5 anyway with some information on  what is DWARF 5 specific and what's not. It also should go into the header, no? Let me take a stab at that in another review.


================
Comment at: llvm/trunk/test/DebugInfo/X86/rnglists-nobase.ll:11-19
+; class A
+; {
+; public:
+;   A();
+; };
+; 
+; static A glob;
----------------
dblaikie wrote:
> This example might obscure what's really going on by having implicit functions from the global static initializer - it may be more illustrative to provide a more explicit example using section attributes to show which functions are in which sections, etc.
> 
> The existence of 3 functions wouldn't itself create ranges (if they were all in a single .text section they'd be covered by a single high/low pc), it's the fact that the static initializer for 'glob' creates functions in a .text.startup section. And that those two static initializer functions aren't emitted together - one, then 'foo', then the other (splitting them up - otherwise there'd be only two ranges, instead of 3)
> 
>   __attribute__((section(".text.foo"))) void f1() {}
>   __attribute__((section(".text.bar"))) void f2() {}
>   __attribute__((section(".text.foo"))) void f3() {}
> 
> and I'd expect the optimal range list (in the absence of a debug_addr section) for this source to be:
> 
>   DW_RLE_base_address (.text.foo+0)
>   DW_RLE_offset_pair (f1 begin/end relative to .text.foo+0)
>   DW_RLE_offset_pair (f3 begin/end relative to .text.foo+0)
>   DW_RLE_start_length (f2 begin, f2 length)
> 
> (aside: shorter encodings might be achievable (though maybe not sufficiently worthwhile/only a small improvement) if RLE_start_length implicitly set the base address, and if offset_pair was offset_length instead - but at least there's room to improve this over versions (there's versioning in the header, there's space in the RLE_* enum space, etc))
> 
> I think the reason this is missing the base_address+offset_pair opportunity is because the code is predicated on UseDwarfRangesBaseAddressSpecifier - which defaults to off, because I found some DWARF consumers couldn't handle it, I think. But that property isn't relevant to DWARFv5 - so the implementation should be modified to do the same thing no matter the value of 'UseDwarfRangesBaseAddressSpecifier'.
I'll rework the example in a separate review.


Repository:
  rL LLVM

https://reviews.llvm.org/D49214





More information about the llvm-commits mailing list