[PATCH] D51081: [DWARF v5] Refactoring range list dumping to fold DWARF v4 functionality into v5 handling (almost NFC).

Tue Aug 28 17:45:13 PDT 2018

wolfgangp added inline comments.

================
Comment at: include/llvm/DebugInfo/DWARF/DWARFListTable.h:52
+  bool empty() const {
+    return Entries.empty() || Entries.begin()->isEndOfList();
+  }
----------------
dblaikie wrote:
> wolfgangp wrote:
> > dblaikie wrote:
> > > wolfgangp wrote:
> > > > probinson wrote:
> > > > > dblaikie wrote:
> > > > > > Why this new special case - when would the range list contain an empty range?
> > > > > It's explicitly legal to have a list that consists solely of the end-of-list marker.  Not saying any producer would actually emit it this way, but it's one way to describe an empty range.  The parser should handle this case smoothly.
> > > > Also , the DWARF5 rnglist implementation keeps the end-of-list entry in the list (unlike the DWARF4 implementation), which makes display a bit more straightforward.
> > > Good point - I guess I could narrow/clarify my comment somewhat: Why is the emptiness of a range list important? I'd figure the printing algorithm would work the same either way.
> > >Good point - I guess I could narrow/clarify my comment somewhat: Why is the emptiness of a > range list important? I'd figure the printing algorithm would work the same either way.
> > 
> > This was motivated by the code in DwarfLinker.cpp (dsymutil) that needs to check for the emptiness of a list during range patching. With the end-of-list entry as part of the Entries vector it seemed prudent to provide this method.
> Could this be addressed by keeping the DWARF4 end-of-list entry too, for consistency?
> 
> Though I'm still a bit confused - I'm not sure why DwarfLinker.cpp would need to worry about whether the list is empty or not? Maybe as an optimization, but otherwise I'd figure it would just iterate over the list & if it's empty, it would iterate over zero things.
> 
> & why does this change you're making cause this extra case to be added - I thought this code (DWARFListtype) was for DWARF5 rnglists already & now it was getting DWARF4 support - but it sounds like you had to add a case that is for the DWARF5 case... so I'm a bit confused?
> Could this be addressed by keeping the DWARF4 end-of-list entry too, for consistency?

Actually, that is what's happening with this change. I should have said "unlike the original DWARF4 implementation" in my previous comment.

> Though I'm still a bit confused - I'm not sure why DwarfLinker.cpp would need to worry
> about whether the list is empty or not? Maybe as an optimization, but otherwise I'd figure it
> would just iterate over the list & if it's empty, it would iterate over zero things.

Looking at the code it seems that in the context in question it performs a remapping of ranges and only seems to look at the first entry of a range list to determine whether it can validly remap the entire range list. It is not iterating over the entries, so I guess it just wants to check whether there is a valid first entry in the first place.

> & why does this change you're making cause this extra case to be added - I thought this 
> code (DWARFListtype) was for DWARF5 rnglists already & now it was getting DWARF4 
> support - but it sounds like you had to add a case that is for the DWARF5 case

Actually, this is dsymutil handling the .debug_ranges section, so basically DWARF 4 rangelists. 
With this patch DWARFDebugRnglist is now handling both DWARF 4 and DWARF 5 ranges, and since we're keeping the end-of-list entry in the list the extra case had to be added, since we didn't do this previously in the DWARF4 code.

As for dsymutil handling DWARF5 rnglists (i.e. the .debug_rnglists section), that is not supported at this point.

================
Comment at: include/llvm/DebugInfo/DWARF/DWARFListTable.h:72-77
+  /// Print a header for the list. Returns the indentation that is to be used
+  /// when subsequently dumping each list entry.
+  unsigned printListHeader(raw_ostream &OS, uint8_t AddrSize,
+                           DIDumpOptions DIDumpOpts) const {
+    return 0;
+  }
----------------
dblaikie wrote:
> wolfgangp wrote:
> > dblaikie wrote:
> > > This seems to be dead in this patch - maybe remove it from here and add it in whichever patch adds its use?
> > Actually, it's used further down in DWARFListTableBase<DWARFListType>::dump
> I guess perhaps I was thinking: this function seems to be a no-op & should be removed from this patch - and added back in when it's used/needed?
OK. It will be added back in later, because the corresponding function for the location lists will do something more meaningful.

================
Comment at: include/llvm/DebugInfo/DWARF/DWARFListTable.h:87-90
+    /// The DWARF version number. We assume version 5 until we either extract
+    /// the version from the section or explicitly set it (e.g. to fake a
+    /// table for DWARF v4 and earlier).
+    uint16_t Version = 5;
----------------
dblaikie wrote:
> wolfgangp wrote:
> > dblaikie wrote:
> > > That sounds like in all cases we know the version from the input, right? So when/why would this assumption be needed?
> > During extraction we're using getTableLength() (I've renamed it from just 'length()' in the first version of this review) for validation and also to determine if we can recover from an error and keep parsing the section. With the creation of an artificial table for DWARF 4 range lists the calculation of the table length is now dependent on the version, since for DWARF 5 we need to add the size of the length field to the length we extracted. In DWARF 4 the length is simply the size of the section as is. 
> > If we find an error in the table header before we get to extract the version, the version would obviously not be set to anything meaningful and would confuse our recovery in DWARFContext.cpp. Also, presetting the version allows us to use getTableLength() before we actually extract the version for validation.
> > But it's perhaps more appropriate to preset the version during extraction, so I'm doing it there now.
> > 
> I'm pretty confused by this - how does having a hardcoded (possibly incorrect) version help with error recovery? I guess it's correct based on the section being extracted from (though in the future it could be incorrect when we get to DWARFv6)?
> 
> What sort of recovery would be done when the version couldn't be read from the input?
Error recovery (in dumpListSection() in DWARFContext.cpp) relies on the correct interpretation of the length field in the header. In DWARF 5 we need to add sizeof(uint32_t) to the extracted length, in DWARF 4 the length field has simply been set to the section size and already represents the length.

If we find errors before we extract the version, e.g. while validating the extracted length against section size etc., we return without knowing the version, and getLength( ) doesn't know what to do. Hence I wanted to preset the version until we extract it from the data. 

The other relevant scenario is when we extract an invalid version from the data. In that case we report the error but need to set the version to something valid in order to have the length interpreted correctly.

I suppose we could handle this differently by providing a different interface to get the table length for recovery purposes. We always know the version in the context, so we would never have to preset the table version during extraction. Do you have any other suggestions?

https://reviews.llvm.org/D51081