[PATCH] D83050: [DebugInfo] Add more checks to parsing .debug_pub* sections.

Mon Jul 6 12:54:07 PDT 2020

ikudrin added inline comments.

================
Comment at: llvm/lib/DebugInfo/DWARF/DWARFDebugPubTable.cpp:27-28
   Sets.clear();
-  DataExtractor::Cursor C(0);
-  while (C && Data.isValidOffset(C.tell())) {
+  uint64_t Offset = 0;
+  while (Data.isValidOffset(Offset)) {
+    uint64_t SetOffset = Offset;
----------------
jhenderson wrote:
> ikudrin wrote:
> > jhenderson wrote:
> > > What's behind the reasoning for no longer using the `Cursor` throughout?
> > The method now reports all encountered errors through `RecoverableErrorHandler` and does not return `Error`. The `Cursor` requires its error state to be checked in any case. While the former code could simply return the error state, now this checking is a bit inconvenient, and, moreover, useless.
> I'm not sure I follow.
> 
> As far as my understanding of `Cursor` goes, you can have:
> 
> ```
>   DataExtractor::Cursor C(0);
>   while (C && Data.isValidOffset(C.tell())) {
>     // Parse the length
>     if (!C) { /* report invalid length, using C.takeError() */ return; }
>     // Parse the header
>     while (C) { /* parse entries */ }
>     if (C && C.tell() != Offset) { /* report bad terminator */ }
>   }
>   if (!C) { /* report parsing error using C.takeError() */
> ```
> 
> The `Cursor` is checked by either the final error check outside the loop in most cases, or by the invalid length report, so we're good (note that `C.takeError()` does not need calling if the `Cursor` is in a success state, much like `Expected`). The only case where it might be different is if `Cursor` is in an error state due to some error other than a running-off-the-end error, in which case it would abort early. If you want to continue instead, you could do almost the same as you've got:
> 
> ```
>   while (Offset) {
>     DataExtractor::Cursor C(Offset);
>     ... = Data.getInitialLength(C);
>     if (!C) { /* report invalid length, using C.takeError() */ return; }
>     // Parse the header
>     while (C) { /* parse entries */ }
>     if (C && C.tell() != Offset) { /* report bad terminator */ }
>     if (!C) { /* report parsing error using C.takeError() */
>   }
> ```
> 
> I'm not sure I see how the latter is any more complex or inconvenient than instantiating a different Error variable and passing pointers around?
I'll take the second one, thanks!

================
Comment at: llvm/test/tools/llvm-dwarfdump/X86/debug_pub_tables_invalid.s:1
 # RUN: llvm-mc -triple x86_64 %s -filetype=obj -o %t
 # RUN: not llvm-dwarfdump -debug-pubnames %t 2>&1 | FileCheck %s
----------------
jhenderson wrote:
> I'd probably fold in this test case now into the other file. I don't think there's any benefit having them separate. Alternatively, this lives separately, and move the other test case into the library testing. The idea is that we test the code in detail with the library tests, and at a high level in the tool tests (i.e. showing we handle the reported output). I don't mind either approach.
OK, I'll move that into the new test. I find using gtest unit tests for things like dumping and error reporting clumsy because they require lots of boilerplate code.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D83050/new/

https://reviews.llvm.org/D83050