[llvm-dev] [RFC][DWARF] Handling errors in DWARF .debug_line parsing

Tue Dec 17 02:07:33 PST 2019

Having recently done something similar for .debug_loc(lists) (a much
simpler data structure), I can certainly appreciate the complexity and
commitment necessary to pull something like that off.

Anyway, I just wanted to say that I think that adding new fields to the
returned Errors (or even creating completely new ErrorInfo types) seems
perfectly reasonable to me -- I think that's one of the main reasons for
introducing the Error class in the first place.
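
To make that concrete, here is a minimal sketch of an error that carries a category and an offset next to its message, so that a client can filter on something other than the string. It is plain standard C++ rather than the real llvm::Error/ErrorInfo machinery, and every name in it (LineTableErrorKind, LineTableError, filterErrors) is hypothetical and only there for illustration.

```cpp
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

// Hypothetical categories; not taken from the LLVM sources.
enum class LineTableErrorKind {
  UnsupportedVersion,
  BadPrologueLength,
  BadOpcode,
};

// Hypothetical stand-in for a custom error type: a category and the
// section offset travel with the human-readable message.
struct LineTableError {
  LineTableErrorKind Kind;
  std::uint64_t Offset;
  std::string Message;
};

// Returns only the errors for which the client's predicate is true.
std::vector<LineTableError>
filterErrors(const std::vector<LineTableError> &Errors,
             const std::function<bool(const LineTableError &)> &Wanted) {
  std::vector<LineTableError> Result;
  for (const LineTableError &E : Errors)
    if (Wanted(E))
      Result.push_back(E);
  return Result;
}
```

A client such as a debugger could then pass a predicate that rejects the kinds it does not care about, while a verifier would accept everything.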

cheers,
pl

On 17/12/2019 10:48, James Henderson wrote:
> Thanks for the comments. I should have done a bit more experimenting
> before writing up this RFC, it seems :) After sending it off, I played
> around with the existing code, to refresh my memory of what I did a
> while back in adding the Recoverable/Unrecoverable Error callbacks.
> These are written with the intent that the client could decide how to
> handle errors, as Jonas points out, making the additional strictness
> level a little unnecessary. Unrecoverable errors stop the parsing,
> whereas recoverable allow it to continue as best it can. I think the
> current parser is stricter than it needs to be to provide useful
> information, so I'm going to put a patch up soon to loosen it up a bit
> before I start adding more checks.
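
As a rough illustration of the recoverable/unrecoverable split described above, the sketch below is plain standard C++ and does not use the actual LLVM parser interface. The names Severity, Issue, ErrorCallback and parseTable are hypothetical, and the list of issues stands in for whatever a real parser would find while reading a table.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical classification of a problem found while parsing.
enum class Severity { Recoverable, Unrecoverable };

struct Issue {
  Severity Level;
  std::string Message;
};

using ErrorCallback = std::function<void(const std::string &)>;

// Walks a pre-baked list of issues that stands in for real parsing.
// Returns how many issues were visited before parsing stopped.
std::size_t parseTable(const std::vector<Issue> &Issues,
                       const ErrorCallback &OnRecoverable,
                       const ErrorCallback &OnUnrecoverable) {
  std::size_t Visited = 0;
  for (const Issue &I : Issues) {
    ++Visited;
    if (I.Level == Severity::Unrecoverable) {
      OnUnrecoverable(I.Message);
      break; // Parsing of this table stops here.
    }
    OnRecoverable(I.Message); // Report and carry on as best we can.
  }
  return Visited;
}
```

The client decides what each callback does: a dumper might print a warning from both and keep whatever rows it got, while a stricter client might treat the recoverable callback as fatal too.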
> 
> The Errors reported by the parser only contain strings, which means that
> it's difficult for a client to use them in some other way, for example
> to identify issues that it might need to know/completely ignore etc.
> Short of adding extra data to the Error, such as an enum, it's unclear
> to me how to improve that situation. Of course, if we did add the extra
> data, it would allow LLDB and other clients to decide what errors to
> ignore/pay attention to on a finer-grained basis, which partially
> achieves the lazy creation goal (only report the problems that apply to
> the things the client cares about), but doesn't improve responsiveness
> since the whole program will still be parsed.
> 
> Lazy creation of errors is tricky, at least for some cases, given the
> current code state. The problem is that the parser reads the name table,
> calculates the rows etc as it parses a program, and that's the only way
> of getting this data through the current interface. The only laziness
> available as things stand is to skip line tables that we don't want
> (e.g. to find the line table at a given offset). This ignores the body
> of the skipped programs. The alternative is to refactor the parser
> completely into much smaller pieces, which provide access to the desired
> bits individually, and delays reading until that point. I can definitely
> see benefits in this approach, and I think it would solve the needs of
> things at both ends of the scale (verifiers and dumpers) but it's a lot
> more work, which I'm not convinced I personally can take on, although
> I'd be happy to review it if others fancy attempting that option.
> 
> James
> 
> On Tue, 17 Dec 2019 at 08:36, Pavel Labath via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
> 
>     On 16/12/2019 23:47, David Blaikie via llvm-dev wrote:
>     >
>     >
>     > On Mon, Dec 16, 2019 at 2:40 PM Jonas Devlieghere
>     > <jonas at devlieghere.com> wrote:
>     >
>     >     Hey James,
>     >
>     >     This sounds really interesting. A few things that come to mind:
>     >
>     >      - I'm curious what kind of errors you'd be okay with in dump
>     >     mode but not in consumer mode. From LLDB's perspective, we
>     >     probably want to extract as much information as possible from a
>     >     malformed line table, as long as we don't end up lying of course.
>     >      - I need to take another look at the code, but don't we already
>     >     have something similar where the consumer can decide how to deal
>     >     with these kinds of issues? I'm bringing this up because I don't
>     >     think we really need different parsing modes. I think we need to
>     >     let the consumer decide what to do with the potential errors. The
>     >     verifier in dwarfdump would presumably stop after the first error
>     >     (~strict mode) while the dumper would just move on. Would the
>     >     parsing modes then be a set of "presets" with regards to how to
>     >     handle these issues?
>     >      - I'm in favor of creating these error messages lazily,
>     >     especially for LLDB where we care about responsiveness. However,
>     >     that does conflict with the behavior you want for the DWARF
>     >     verifier. We probably want a way to control this?
>     >
>     >
>     > For my part - I'd imagine the verifier would be an aggressive reader
>     > of DWARF - it'd use the same lazy/high quality error API, but the
>     > verifier code itself would try to walk all of the parts of the DWARF
>     > to manifest any lazy error cases, rather than needing any other
>     > codepath in the parser.
> 
>     That would be my ideal setup as well (disclaimer: I work on lldb) --
>     have the parser provide sufficient information so that the verification
>     can be done from the "outside".
> 
>     That is, if the goal is to have stronger verification of generated line
>     tables -- it's not fully clear (to me) whether that's your main
>     motivation for adding these checks.
> 
>     regards,
>     pavel