[llvm-dev] [RFC][DWARF] Handling errors in DWARF .debug_line parsing
Pavel Labath via llvm-dev
llvm-dev at lists.llvm.org
Tue Dec 17 02:07:33 PST 2019
Having recently done something similar for .debug_loc(lists) (a much
simpler data structure) I can certainly appreciate the complexity and
commitment necessary to pull something like that off.
Anyway, I just wanted to say that I think that adding new fields to the
returned Errors (or even creating completely new ErrorInfo types) seems
perfectly reasonable to me -- I think that's one of the main reasons for
introducing the Error class in the first place.
cheers,
pl
On 17/12/2019 10:48, James Henderson wrote:
> Thanks for the comments. I should have done a bit more experimenting
> before writing up this RFC, it seems :) After sending it off, I played
> around with the existing code, tor refresh my memory of what I did a
> while back in adding the Recoverable/Unrecoverable Error callbacks.
> These are written with the intent that the client could decide how to
> handle errors, as Jonas points out, making the additional strictness
> level a little unnecessary. Unrecoverable errors stop the parsing,
> whereas recoverable allow it to continue as best it can. I think the
> current parser is stricter than it needs to be to provide useful
> information, so I'm going to put a patch up soon to loosen it up a bit
> before I start adding more checks.
>
> The Errors reported by the parser only contain strings, which means that
> it's difficult for a client to use them in some other way, for example
> to identify issues that it might need to know/completely ignore etc.
> Short of adding extra data to the Error, such as an enum, it's unclear
> to me how to improve that situation. Of course, if we did add the extra
> data, it would allow LLDB and other clients to decide what errors to
> ignore/pay attention to on a finer grained basis, which partially
> achieves the lazy creation goal (only report the problems that apply to
> the things the client cares about), but doesn't improve responsiveness
> since the whole program will still be parsed.
>
> Lazy creation of errors is tricky, at least for some cases, given the
> current code state. The problem is that the parser reads the name table,
> calculates the rows etc as it parses a program, and that's the only way
> of getting this data through the current interface. The only laziness
> available as things stand is to skip line tables that we don't want
> (e.g. to find the line table at a given offset). This ignores the body
> of the skipped programs. The alternative is to refactor the parser
> completely into much smaller pieces, which provide access to the desired
> bits individually, and delays reading until that point. I can definitely
> see benefits in this approach, and I think it would solve the needs of
> things at both ends of the scale (verifiers and dumpers) but it's a lot
> more work, which I'm not convinced I personally can take on, although
> I'd be happy to review it if others fancy attempting that option.
>
> James
>
> On Tue, 17 Dec 2019 at 08:36, Pavel Labath via llvm-dev
> <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote:
>
> On 16/12/2019 23:47, David Blaikie via llvm-dev wrote:
> >
> >
> > On Mon, Dec 16, 2019 at 2:40 PM Jonas Devlieghere
> <jonas at devlieghere.com <mailto:jonas at devlieghere.com>
> > <mailto:jonas at devlieghere.com <mailto:jonas at devlieghere.com>>> wrote:
> >
> > Hey James,
> >
> > This sounds really interesting. A few things that come to mind:
> >
> > - I'm curious what kind of errors you'd be okay with in dump
> mode but
> > not in consumer mode. From LLDB's perspective, we probably want to
> > extract as much information as possible from a malformed line
> table,
> > as long as we don't end up lying of course.
> > - I need to take another look at the code, but don't we
> already have
> > something similar where the consumer can decide how to deal
> with these
> > kinds of issues? I'm bringing this up because I don't think we
> really
> > need different parsing modes. I think we need to let the consumer
> > decide what to do with the potential errors. The verifier in
> dwarfdump
> > would presumably stop after the first error (~strict mode)
> while the
> > dumper would just move on. Would the parsing modes then be a
> set of
> > "presets" with regards to how to handle these issues?
> > - I'm in favor of creating these error messages lazily,
> especially
> > for LLDB where we care about responsiveness. However, that does
> > conflict with the behavior you want for the DWARF verifier. We
> > probably want a way to control this?
> >
> >
> > For my part - I'd imagine the verifier would be an aggressive
> reader of
> > DWARF - it'd use the same lazy/high quality error API, but the
> verifier
> > code itself would try to walk all of the parts of the DWARF to
> manifest
> > any lazy error cases, rather than needing any other codepath in
> the parser.
>
> That would be my ideal setup as well (disclaimer: I work on lldb) --
> have the parser provide sufficient information so that the verification
> can be done from the "outside".
>
> That is, if the goal is to have stronger verification of generated line
> tables -- it's not fully clear (to me) whether that's your main
> motivation for adding these checks.
>
> regards,
> pavel
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
More information about the llvm-dev
mailing list