[PATCH] D72155: [DebugInfo] Make incorrect debug line extended opcode length non-fatal

Fri Jan 10 02:59:07 PST 2020

jhenderson added a comment.

In D72155#1813355 <https://reviews.llvm.org/D72155#1813355>, @dblaikie wrote:

> Out of curiosity: what's your broader goal with this work? (it'll help understand what's in-scope and out of scope, and better understand the framing when reviewing changes)

There's two parts to this:

1. Making it easier for consumers to continue and try to do something with slightly bad output. One of the problems with using the unrecoverable errors is that it prevents people even trying to iterate over later tables.
2. (this one's not specific to this patch, but some other patches in this area I've been doing are related): we have some local code that uses the DebugInfo library to read a line table. In order for this code to be sound, we need to make sure the line table makes sense. The parser in its current state doesn't pick up on a number of bad situations, so we added some local patches to detect these other errors (local because we didn't have time at that point to try to push them into the open source). We'd like to now get these into the opensource library, to avoid merge conflicts.

>> The parser now continues by assuming the larger of the claimed length and parsed length is correct.
> 
> This seems like a bit of a stretch to guess at which length is the one that's correct. I don't think there's a solid basis to choose either, and if the wrong one is chosen you could start reading from the middle of other opcodes & things, I would imagine, resulting in many follow-on error messages due to apparently further malformed input?

I was unsure what to do here, if I'm honest. The counter-argument to your point is that you might not start reading from the middle of things, and that it was just the one field that was bad for whatever reason. Essentially, I think you have fpur options in this sort of situation: a) treat the error as unrecoverable - this makes it impossible to parse later tables, regardless of the state of things, but would prevent other, potentially spurious, errors from coming out; b) use a "stated length wins" approach; c) use a "parsed length wins" approach; d) use a "largest length wins" approach. The right decision is going to differ in any given situation, but without attempting to look ahead and see what makes the most sense if parsing were to continue (which seems like it would be unnecessarily complicated), we have to just pick one. Previously, when I implemented these errors, I took approach a) on the basis of "you can't know what to do, so better not allow continuation, since it could lead to spurious information", but it was pointed out more recently to me that if we do this, it becomes impossible for people to see any later information at all.

I don't really have a strong opinion on this or the similar "bigger wins" patches I've been making. I'm happy to go with whatever the consensus is.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D72155/new/

https://reviews.llvm.org/D72155