[PATCH] D72155: [DebugInfo] Make incorrect debug line extended opcode length non-fatal

Mon Jan 27 02:28:35 PST 2020

jhenderson marked an inline comment as done.
jhenderson added a comment.

Sorry for the delayed response. Last week was pretty crazy, so I didn't get a chance to continue with this.

In D72155#1833571 <https://reviews.llvm.org/D72155#1833571>, @probinson wrote:

> In D72155#1827856 <https://reviews.llvm.org/D72155#1827856>, @dblaikie wrote:(out of curiosity, though not necessary, who is "we"?)
>
> >
>
>
> "We" is Sony, here.  James is part of our binutils/linker team.  We have local linker patches to edit the line table in response to GC'ing functions.  Also doing other fixups to alert the debugger to GC'd functions, but the part where we actually parse something is for the line table.

Thanks for clarifying @probinson!

> 
> 
>> So you'd like/are working on making libDebugInfoDWARF more pedantic/precise/detect more errors? Specifically only for the line table? Do you have any interest in adding fuzz testing for this, or are your needs less severe than would justify fuzzing?
> 
> Mostly we care about dealing with a possibly malformed/corrupt line-number program for one CU, and soldiering on to the next one.  The middle of a user's edit/build/debug cycle is not really the optimal time to abort and whine about a malformed lump of debug info.
>  I don't know if the linker team has contemplated fuzzing in this area.

Like @probinson said, much of our work has been focused on the line table, because we mess about with it in relation to GC-sections work in the linker, so hence why I've been focusing on that area. We need to detect dodgy output to avoid us doing bad stuff when doing this manipulation of the line tables, and soldiering on if we do find bad things is better than failing the link completely. For other DWARF sections, we currently have other strategies which don't require parsing the sections, so we're not so interested in the other areas, at least within our team, at this point (although making llvm-dwarfdump and llvm-symbolizer more robust/useful on bad inputs etc is always beneficial, just not such high priorities). Fuzzing might eventually be useful, but we're mostly interested in the low-hanging fruit as shown by some older testing we had, so haven't looked at it at this point.

================
Comment at: llvm/lib/DebugInfo/DWARF/DWARFDebugLine.cpp:686-687
       }
       // Make sure the stated and parsed lengths are the same.
-      // Otherwise we have an unparseable line-number program.
-      if (*OffsetPtr - ExtOffset != Len)
-        return createStringError(errc::illegal_byte_sequence,
-                           "unexpected line op length at offset 0x%8.8" PRIx64
-                           " expected 0x%2.2" PRIx64 " found 0x%2.2" PRIx64,
-                           ExtOffset, Len, *OffsetPtr - ExtOffset);
+      // If they aren't, continue from the claimed length.
+      uint64_t End = ExtOffset + Len;
----------------
dblaikie wrote:
> Might be worth finding some clearer words here - this mentions 3 lengths ("stated", "parsed", and "claimed") - I see "claimed" is the same as "stated", so at least collapsing those two to use the same word would help clarify this. But in theory all the lengths are parsed, so "parsed" isn't completely unambiguous. Not sure what the perfect words would be here by any means.
I'll change this to something better. How about "Make sure the length as recorded in the table and the standard length for the opcode match. If they don't, continue from the end as claimed by the table"?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D72155/new/

https://reviews.llvm.org/D72155