[PATCH] D74560: Introduce DWARFDataExtractor::getInitialLength

Thu Feb 20 07:17:31 PST 2020

labath marked 3 inline comments as done.
labath added a comment.

Sorry about the delay (and thank you for the prompt response). It took a while before this got to the top of my stack.

I'll split this patch up with my next update, which will come when I figure out how to improve the error messages. My initial motivation for putting this into a single patch was to make the impact of this kind of change more obvious (I expected the change in error messages to be mildly controversial), but in retrospect, that could have also been achieved with separate linked patches. Having super-descriptive error messages is somewhat at odds with making the code reusable, but I think we can come up with a reasonable compromise.

================
Comment at: llvm/include/llvm/DebugInfo/DWARF/DWARFDataExtractor.h:45
+
+  std::pair<uint64_t, dwarf::DwarfFormat> getInitialLength(Cursor &C) const {
+    return getInitialLength(&getOffset(C), &getError(C));
----------------
ikudrin wrote:
> labath wrote:
> > ikudrin wrote:
> > > Would it be simpler if you implement `getInitialLength(uint64_t *, Error *)` via `getInitialLength(Cursor &)`? You anyway create a Cursor inside `getInitialLength(uint64_t *, Error *)`.
> > I don't think that would help very much (or at all). The second cursor comes from a desire to not modify the original offset in case we fail to read the entire initial length field. I am not sure if any of the potential users would really need this behavior, but that is how the rest of the DataExtractor functions operate, so I thought it'd be good to maintain that.
> > 
> > If we stick to that requirement, then I'd still need to have two cursors (or offsets, or something) and fiddle around with them. If we drop it, we wouldn't need the second Cursor even in the current setup.
> It looks like that behavior is not stated. I hardly can remember if I saw it was used anywhere, but once I fixed an issue which was (partially) caused by it, D71704. I guess we can easily drop it if that helps to simplify the code, maybe adding a warning note in the comments of the methods.
This behavior is stated (and painstakingly reiterated) on every DataExtractor method. It looks like **DWARF**DataExtractor does not have these comments, but then this class is much more lightly documented. Nonetheless, I think it would be good to preserve this behavior, as a matter of consistency.

================
Comment at: llvm/test/tools/llvm-dwarfdump/X86/debug_line_invalid.test:53
 # FATAL-NEXT: Line table prologue
-# FATAL-NEXT: total_length: 0xfffffffe
+# FATAL-NEXT: total_length: 0x00000000
 # FATAL-NOT:  debug_line
----------------
jhenderson wrote:
> I don't think this is a good change in behaviour. We should still be able to dump the total length field, with the reserved value.
Actually, I would say that here we should not print out the prologue at all (just some error about invalid initial length). Since the length field is not valid, it's hard to say whether we have any line table here at all -- and yet we will print out the whole prologue full of zeroes (even before this patch).

================
Comment at: llvm/test/tools/llvm-dwarfdump/X86/debug_line_invalid.test:195

-# RESERVED: warning: parsing line table prologue at offset 0x00000048 unsupported reserved unit length found of value 0xfffffffe
+# RESERVED: warning: Unsupported reserved unit length of value 0xfffffffe at offset 0x48

----------------
jhenderson wrote:
> This is a regression in the quality of the error, as it is not clear what section this error comes from. Same goes for many of the other changes in errors.
When this is run normally on the command line (so both stdout and stderr go to the terminal), this comes out as:
```
debug_line[0x00000048]
warning: Unsupported reserved unit length of value 0xfffffffe at offset 0x48
...
```
So, the section where this comes from should be fairly obvious. Nonetheless, it should be pretty easy to prepend the error message with the section name. Maybe something like "warning: parsing line table prologue at offset 0x00000048: unsupported reserved unit length found of value 0xfffffffe" or "warning: parsing line table prologue: unsupported reserved unit length of value 0xfffffffe at offset 0x48" ?

The question boils down to whether it should be the DataExtractor who embeds the offset into the message, or the line table code. 

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D74560/new/

https://reviews.llvm.org/D74560